CTCI: A simple feed analysis

If you were integrating a feed of end-of-day stock price information (open, high, low, and closing price) for 5,000 companies, how would you do it? You are responsible for the development, rollout, and ongoing monitoring and maintenance of the feed. Describe the different methods you considered and why you would recommend your approach. The feed is delivered once per trading day in a comma-separated format via an FTP site. The feed will be used by 1,000 daily users in a web application.

 

Approach:

Let’s assume we have some scripts which are scheduled to fetch the data via FTP at the end of each trading day. Where do we store the data? And how do we store it in such a way that we can run various analyses on it?
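Before choosing a storage option, it helps to see what the scheduled script has to do once the file has been downloaded. The following is a minimal sketch of the parsing step only. The column order (company, open, high, low, close), the absence of a header row, and the names `DailyQuote` and `parse_feed` are assumptions made for illustration, since the question does not specify the file layout. The FTP download itself is left out.

```python
import csv
import io
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class DailyQuote:
    company: str
    open: Decimal
    high: Decimal
    low: Decimal
    close: Decimal


def parse_feed(text: str) -> list[DailyQuote]:
    """Parse one day's feed. The column order is an assumption, not a feed spec."""
    quotes = []
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue  # skip blank lines
        company, open_, high, low, close = (field.strip() for field in row)
        quote = DailyQuote(
            company, Decimal(open_), Decimal(high), Decimal(low), Decimal(close)
        )
        # Reject rows whose open/close fall outside the day's low/high range.
        lowest = min(quote.open, quote.close)
        highest = max(quote.open, quote.close)
        if not (quote.low <= lowest and highest <= quote.high):
            raise ValueError(f"inconsistent prices for {company}")
        quotes.append(quote)
    return quotes


# Two rows using the sample prices that appear later in this post.
sample = "foo,126.23,130.27,122.83,127.30\nbar,52.73,60.27,50.29,54.91\n"
quotes = parse_feed(sample)
```

Validating each row at this point means a malformed delivery is caught before it reaches whatever storage we pick, which matters because we also own the monitoring of the feed.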

Proposal #1

Keep the data in text files. This would be very difficult to manage and update, as well as very hard to query. Keeping unorganized text files would lead to a very inefficient data model.

Proposal #2

We could use a database. This provides the following benefits:

  • »  Logical storage of data
  • »  Facilitates an easy way of doing query processing over the data

    Example: return all stocks having open > N AND closing price < M

    Advantages:

  • »  Makes the maintenance easy once installed properly
  • »  Rollback, backups, and security can be provided using standard database features. We don’t have to “reinvent the wheel.”
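To make the database option concrete, here is a rough sketch using Python's built-in `sqlite3` module. The table name `daily_quote`, its column names, and the helper `stocks_between` are hypothetical, and SQLite stands in for whatever database would be used in production. The rows reuse the sample prices from this post.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory database, for the sketch only
conn.execute(
    """
    CREATE TABLE daily_quote (
        trade_date  TEXT NOT NULL,
        company     TEXT NOT NULL,
        open_price  REAL NOT NULL,
        high_price  REAL NOT NULL,
        low_price   REAL NOT NULL,
        close_price REAL NOT NULL,
        PRIMARY KEY (trade_date, company)
    )
    """
)

rows = [
    ("2008-10-12", "foo", 126.23, 130.27, 122.83, 127.30),
    ("2008-10-12", "bar", 52.73, 60.27, 50.29, 54.91),
]
# Used as a context manager, the connection commits on success and rolls
# back if the load fails, so a bad file does not leave a half-loaded day.
with conn:
    conn.executemany("INSERT INTO daily_quote VALUES (?, ?, ?, ?, ?, ?)", rows)


def stocks_between(conn, n, m):
    """Return companies with open > n AND closing price < m."""
    cursor = conn.execute(
        "SELECT company FROM daily_quote "
        "WHERE open_price > ? AND close_price < ? ORDER BY company",
        (n, m),
    )
    return [row[0] for row in cursor]


matches = stocks_between(conn, 50, 100)
```

The primary key on (trade_date, company) is one possible way to keep a re-run of the daily load from silently inserting duplicates.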

     

Proposal #3

If the requirements are not that broad and we just want to do a simple analysis and distribute the data, then XML could be another good option.

Our data has a fixed format and a fixed size: company_name, open, high, low, closing price. The XML could look like this:

<root>
  <date value="2008-10-12">
    <company name="foo">
      <open>126.23</open>
      <high>130.27</high>
      <low>122.83</low>
      <closingPrice>127.30</closingPrice>
    </company>
    <company name="bar">
      <open>52.73</open>
      <high>60.27</high>
      <low>50.29</low>
      <closingPrice>54.91</closingPrice>
    </company>
  </date>
  <date value="2008-10-11"> . . . </date>
</root>

Benefits:

  • »  Very easy to distribute. This is one reason that XML is a standard data model for sharing and distributing data.
  • »  Efficient parsers are available to parse the data and extract only the desired data.
  • »  We can add new data to the XML file by carefully appending data. We would not have to re-query the database.

    However, querying the data could be difficult.
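As a rough illustration of both the parsing benefit and the querying difficulty, the sketch below runs the "open > N AND closing price < M" query against the XML layout shown above, using Python's standard `xml.etree.ElementTree`. The function name `companies_matching` is hypothetical. ElementTree's limited XPath support can select a date by attribute, but the numeric comparison has to be done in application code.

```python
import xml.etree.ElementTree as ET

XML_DOC = """
<root>
  <date value="2008-10-12">
    <company name="foo">
      <open>126.23</open>
      <high>130.27</high>
      <low>122.83</low>
      <closingPrice>127.30</closingPrice>
    </company>
    <company name="bar">
      <open>52.73</open>
      <high>60.27</high>
      <low>50.29</low>
      <closingPrice>54.91</closingPrice>
    </company>
  </date>
</root>
"""


def companies_matching(xml_text, date, n, m):
    """Return company names for `date` with open > n and closing price < m."""
    root = ET.fromstring(xml_text.strip())
    matches = []
    # The attribute predicate picks the trading day; the price filter
    # cannot be expressed in this XPath subset, so it is done in Python.
    for company in root.findall(f"./date[@value='{date}']/company"):
        open_price = float(company.findtext("open"))
        close_price = float(company.findtext("closingPrice"))
        if open_price > n and close_price < m:
            matches.append(company.get("name"))
    return matches


result = companies_matching(XML_DOC, "2008-10-12", 50, 100)
```

Every such query parses the whole document and walks it by hand, which is workable for a single day of 5,000 companies but becomes awkward as more dates accumulate in the file.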
