Would you mind sharing a bird's eye view of what considerations are needed in the design of a trading platform? I've made it past "Hello World" programming but I'm very new to this and large project stuff is something I'm trying to get my head around.
I know this is a very broad question, so I'll make some very broad statements to see if it leads to something.
At its heart a trading platform needs to receive inputs (price update data, account update data, order update data, etc), do something with that data, and then output something (orders, a chart, a db write, etc).
On the input side, there is no universal standard for messages. Some may be FIX, some may be a broker API, some may be a datafeed API, some may be static data from a db, etc. "Connectors" that translate data/events into a common format are required.
The "core" of the trading platform processes data/events in that common format.
The "connectors" are then used on the output side, perhaps sending things back out as a FIX message or broker API-specific message, or being output to a graphics engine for charting or any trade visualization, etc.
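To make the input → core → output idea concrete, here is a minimal sketch. Everything in it is invented for illustration: the `PriceUpdate` class, the two feed formats, and the "chart" output are assumptions, not any real broker or datafeed API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class PriceUpdate:
    """Common internal format the core understands (hypothetical)."""
    symbol: str
    price: float
    size: int


def from_csv_feed(line: str) -> PriceUpdate:
    """Input connector for a made-up feed that sends 'SYMBOL,PRICE,SIZE'."""
    symbol, price, size = line.split(",")
    return PriceUpdate(symbol, float(price), int(size))


def from_dict_feed(msg: Dict[str, str]) -> PriceUpdate:
    """Input connector for a made-up feed that sends key/value messages."""
    return PriceUpdate(msg["sym"], float(msg["px"]), int(msg["qty"]))


class Core:
    """Processes only the common format and forwards events to outputs."""

    def __init__(self) -> None:
        self.outputs: List[Callable[[PriceUpdate], None]] = []
        self.last_price: Dict[str, float] = {}

    def on_update(self, update: PriceUpdate) -> None:
        self.last_price[update.symbol] = update.price
        for send in self.outputs:
            send(update)


core = Core()
chart_log: List[str] = []
# Output connector: formats the common event for a pretend chart.
core.outputs.append(lambda u: chart_log.append(f"{u.symbol}@{u.price:.2f}"))

core.on_update(from_csv_feed("ES,1250.25,3"))
core.on_update(from_dict_feed({"sym": "YM", "px": "11500", "qty": "1"}))
```

The core never sees the CSV line or the key/value message, only `PriceUpdate`, so adding a third feed means writing one more translation function and nothing else.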
So where would one even start if you wanted to build your own platform? A UML diagram and a specification document? Would you start at the "core" and work outwards?
What are the performance constraints, best practices, industry consensus, etc to keep in mind in all aspects of such a project?
I continue to look at Tradelink (tradelink - Project Hosting on Google Code) for general architecture ideas but I'm not smart enough to know if it is brilliant or not. What is actually important? The database structure? The messaging bus? Unique ways to handle concurrency and multithreading on a per "connector" basis?
Any thoughts, ideas, or suggested reading is most welcome.
Thanks in advance for your time.
The following 2 users say Thank You to MXASJ for this post:
This is a very broad topic, and from the way you ask I think some explanations may not stick well for lack of background knowledge.
First, I think, you need to settle on an architecture and on what you want to do. Plus WHY: there are some platforms out there already. They lack features, but identifying WHY you are doing this is core.
In the end you are dealing with a message-passing / event-driven architecture. You need a decent set of messages. FIX etc. is a start. They all lack interoperability, but at the end of the day an order is an order, so within your system you want ONE defined view of it, regardless of what the connected system at the other end says. I go with a large C# library defining classes for everything. They may not be able to carry all the data, but they carry all the data that I need.
Another tricky thing is how to deal with threads. Without multithreading you get into performance hell. With multithreading you get into programming hell, unless you plan it properly. That said, planning it is not that hard for something like trading.
Dealing with financial data is also challenging performance-wise, unless you go end user (few symbols). Updates even for ES are not that many... but taking a complete exchange is another ball game (one reason I do my own). Data storage is tricky here. Want something that is end-user usable? Flat files are pretty much it, or db blobs. Using a db server? Get a lot of discs. I currently run 6 discs for data, soon going to 8... only for the data files (logs and tempdb reside on other discs). Not only the volume is tricky, but also the rate of updates.
In the end, it is not a small project whichever side you tackle it from. It is not a project for someone just past his "hello world".
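As a rough illustration of "one view of an order", the sketch below translates two differently shaped incoming messages into the same internal class. It is written in Python rather than the C# mentioned above, and the `Order` class and the broker message layout are assumptions made up for the example. The only real-world detail is the FIX tag numbers (55 Symbol, 54 Side, 38 OrderQty, 44 Price).

```python
from dataclasses import dataclass
from enum import Enum


class Side(Enum):
    BUY = "buy"
    SELL = "sell"


@dataclass(frozen=True)
class Order:
    """The single internal view of an order (hypothetical field set)."""
    symbol: str
    side: Side
    quantity: int
    limit_price: float


def order_from_fix_like(fields: dict) -> Order:
    """Translate a FIX-style tag/value map (54=1 is buy, 54=2 is sell)."""
    side = Side.BUY if fields["54"] == "1" else Side.SELL
    return Order(fields["55"], side, int(fields["38"]), float(fields["44"]))


def order_from_broker_api(msg: dict) -> Order:
    """Translate a made-up broker message that uses a signed quantity."""
    qty = msg["signedQty"]
    side = Side.BUY if qty > 0 else Side.SELL
    return Order(msg["instrument"], side, abs(qty), msg["limit"])


a = order_from_fix_like({"55": "ES", "54": "2", "38": "2", "44": "1250.25"})
b = order_from_broker_api({"instrument": "ES", "signedQty": -2, "limit": 1250.25})
```

Both inputs describe the same sell order, and inside the system they end up as equal `Order` objects. The internal class carries only the fields this sketch needs, not everything either source could express.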
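One common way to plan the threading (an assumption here, not necessarily what any poster in this thread does) is to give each connector its own thread and have them all hand events to a single core thread through a thread-safe queue. The core then needs no locks of its own. Names such as `run_connector` and `run_core` are invented for this sketch.

```python
import queue
import threading

events = queue.Queue()   # thread-safe hand-off between connectors and core
STOP = object()          # sentinel telling the core thread to exit


def run_connector(name, prices):
    """Stands in for a feed thread: pushes events in arrival order."""
    for price in prices:
        events.put((name, price))


def run_core(results):
    """Single consumer: all state changes happen on this one thread."""
    while True:
        item = events.get()
        if item is STOP:
            break
        results.append(item)


results = []
core_thread = threading.Thread(target=run_core, args=(results,))
core_thread.start()

feeds = [
    threading.Thread(target=run_connector, args=("feedA", [1.0, 2.0, 3.0])),
    threading.Thread(target=run_connector, args=("feedB", [10.0, 20.0])),
]
for t in feeds:
    t.start()
for t in feeds:
    t.join()

events.put(STOP)         # sent only after every connector has finished
core_thread.join()
```

Events from different feeds may interleave in any order, but each feed's own events reach the core in the order they were produced.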
The following 3 users say Thank You to NetTecture for this post:
This is a big task, but you have to start somewhere.
I would start by looking at the different feeds you want to support. Look at how symbols are done, how historical and live data are provided, how orders are processed. I have looked at IQ, Rithmic, ZenFire, and FIX. They are all similar, but different enough that you need to transform their APIs into a common interface that the core uses. Your core code should not deal with any of the APIs directly.
Seems strange, but just dealing with the symbol universe is kind of a pain because each API does symbols differently. You need to pick a format that is your base, and then be able to map all of the other feeds to your symbol mapping. This is especially true if you are going to support multiple feeds concurrently (like MC). How do you store your symbol info?
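A two-way lookup table is probably the simplest way to do this kind of mapping. In the sketch below the feed names, the feed-side symbols, and the internal symbol format are all made up; real feeds each have their own conventions that you would have to look up.

```python
class SymbolMap:
    """Maps (feed, feed symbol) to an internal symbol and back."""

    def __init__(self):
        self._to_internal = {}
        self._to_feed = {}

    def register(self, feed, feed_symbol, internal):
        self._to_internal[(feed, feed_symbol)] = internal
        self._to_feed[(feed, internal)] = feed_symbol

    def to_internal(self, feed, feed_symbol):
        return self._to_internal[(feed, feed_symbol)]

    def to_feed(self, feed, internal):
        return self._to_feed[(feed, internal)]


symbols = SymbolMap()
# Hypothetical spellings of the same contract on two imaginary feeds.
symbols.register("feed_a", "@ES#", "ES.2011-03")
symbols.register("feed_b", "ESH1", "ES.2011-03")

internal = symbols.to_internal("feed_a", "@ES#")
outbound = symbols.to_feed("feed_b", internal)
```

The core only ever sees the internal symbol. Connectors call `to_internal` on the way in and `to_feed` on the way out, which is what makes running several feeds at once manageable.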
The next step is to decide how to store data. You can use flat files, files per date, a database, or whatever. Personally, I don't think a DB is the right choice as you are just adding another layer that does not provide much, but it could make sense based on your goals. One of the key issues here is keeping all ticks in order, but most feeds still use second-based timestamps. This is where straight files are nice, because you just write the ticks to the file in arrival order. In a db, you need to add another field to maintain the order of the ticks. Also, what timestamp granularity do you want to keep track of?
You can work through most of the above without even thinking about the GUI much, but you have to tackle that at some point as well. The GUI is also related to your threading model, which is critical to your performance. Are you going to go Windows only or include others like OS X/Linux? If you are going to go Windows, then I like the WPF architecture, as you can do so much with the GUI and the threading works well. The main drawback is that you are locked in to Windows/C#. The alternative is to go old school with C++ and use the older Windows model (like SierraCharts).
The next step is to work through the charting, order processing, DOM, position tracking, .....
This is a huge task, and before you even start, you have to have your goals laid out, and then start peeling the onion. As you attack pieces, be sure to come back to the top level occasionally to make sure you have not pigeonholed yourself.
The following 2 users say Thank You to aslan for this post:
I question your selection. FIX is ok, Rithmic too, IQ too. But Zen-Fire is redundant: internally it IS Rithmic and uses its API, so you gain little real insight.
I actually gained a lot by integrating (ongoing work) NxCore. VERY interesting.
That actually is painful. It goes further, as between technologies the symbols often also propagate to the broker. Trade YM on Rithmic and on TradeMaven, and the symbols are different ON THE ACCOUNT STATEMENTS. Nice, isn't it.
No feed I know of uses second timestamp granularity. Neither Rithmic nor Zen-Fire does (Ninja does, but that happens after the feed and it is a major issue with the platform, together with the fact that they do NOT maintain order: bid/ask and quotes are stored in separate streams, apparently). No idea about IQ. IB does 1/10th of a second snapshots, NxCore gives 25 ms timestamps with order, Rithmic microseconds as they arrive.
The following user says Thank You to NetTecture for this post:
I started out with ZenFire, so it is what I knew. You are right though, in that ZF is completely built on top of Rithmic, so if you can deal with the Rithmic API, ZenFire gets you nothing, unless your broker won't give you the Rithmic connection strings for the ZenFire server. The two APIs do a few things differently, but in general the R API is much cleaner.
I especially dislike "Tradestation does". Tradestation is not a feed. This is like saying "NinjaTrader does", but in the end this is NinjaTrader fucking up the stored data. So how do you know that the TradeStation "feed" (to which, to my knowledge, you have no API) does it?
When I started my quest back into trading, I decided I am not interested in a feed that comes with an API requiring more than a token local application / window (i.e. something like Nanex installing a local price SERVER is ok, since it has a window showing data throughput etc., but something like IB, where you run requests through a local trading application, is not).
Regarding the general design of a trading platform I am absolutely with Aslan so far (already built a platform on my own and in the process of building it learnt some lessons the hard way):
- Main point is to abstract from any existing API, symbology, data formats. Otherwise the platform will be very hard to adapt for changing environments.
- Symbology has to be universal, logical and straightforward
- Storing data has to be as simple as possible. Trying to be sophisticated is detrimental.
- Store every single bit of information that is available at the maximum resolution possible. You will at some time in the future find a use for it.
- If the data feed does not supply timestamps, use the processor clock and store that timestamp together with the data.
Some ideas for the design:
Have a format that can be used consistently for everything. Examples (hopefully self explaining):
Store every tick from a day in just one plain file (binary) with a name like TICKDATA 2010 12 21.DAT.
Header contains a list of all symbols together with a unique code that identifies them in the data file.
This makes retrieval a little more difficult.
But it makes data storage for longer timeframes and backup much easier.
You see what you have and it's easy to transfer data to another computer, exchange or join data.
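A minimal sketch of such a one-file-per-day layout follows. The exact byte layout is an assumption chosen for the example (the post does not specify one): a symbol table in the header, then fixed-size records of symbol code, timestamp, price and size. It writes to an in-memory buffer so it runs anywhere; a real file opened in binary mode would work the same way.

```python
import io
import struct

# One tick record: symbol code (uint16), timestamp in ns (int64),
# price (float64), size (uint32). Little-endian, no padding.
RECORD = struct.Struct("<HqdI")


def write_day(stream, symbols, ticks):
    """Write the header (symbol name to code table), then every tick."""
    stream.write(struct.pack("<H", len(symbols)))
    for name, code in symbols.items():
        raw = name.encode("ascii")
        stream.write(struct.pack("<HB", code, len(raw)))
        stream.write(raw)
    for code, ts, price, size in ticks:
        stream.write(RECORD.pack(code, ts, price, size))


def read_day(stream):
    """Read the header back, then decode ticks in their stored order."""
    (count,) = struct.unpack("<H", stream.read(2))
    symbols = {}
    for _ in range(count):
        code, length = struct.unpack("<HB", stream.read(3))
        symbols[code] = stream.read(length).decode("ascii")
    ticks = []
    while True:
        chunk = stream.read(RECORD.size)
        if len(chunk) < RECORD.size:
            break
        code, ts, price, size = RECORD.unpack(chunk)
        ticks.append((symbols[code], ts, price, size))
    return ticks


buf = io.BytesIO()  # stands in for a file such as "TICKDATA 2010 12 21.DAT"
write_day(
    buf,
    {"ES": 1, "YM": 2},
    [(1, 1000, 1250.25, 3), (2, 1000, 11500.0, 1), (1, 1001, 1250.5, 1)],
)
buf.seek(0)
loaded = read_day(buf)
```

Because ticks for all symbols are appended in arrival order, the file preserves ordering for free. Pulling out a single symbol means scanning the whole day, which is the retrieval cost mentioned above.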
Don't know why such misinformation should be disseminated:
- Zenfire has millisecond timestamps.
- IB has no timestamps, ticks come irregularly. Sometimes 2 snapshots/sec, sometimes even 12/sec depending on market activity, symbol and other factors that are only known by IB.
Yes, very sure about IQ. TS actually has more granularity in the feed, but they store data with sec granularity.
Totally agree that TS is NOT a good starting point.
The key point here is that the granularity does not matter, because you still have to maintain the order of quotes, and unless you have an unrealistically fine granularity you will always have a few quotes with the same timestamp. So you have a sorting issue if you rely on the timestamp field alone (i.e. in a db).
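The sorting problem is easy to reproduce. In this sketch (the quote strings and the `stamp` helper are invented), three events share one timestamp; sorting on the timestamp alone cannot restore their arrival order, while adding an arrival sequence number as a second key can.

```python
from itertools import count

arrival = count()  # monotonically increasing arrival counter


def stamp(timestamp, payload):
    """Attach the arrival sequence number next to the feed timestamp."""
    return (timestamp, next(arrival), payload)


# Arrival order as received from the feed; three events share timestamp 100.
rows = [
    stamp(100, "bid 1250.00"),
    stamp(100, "ask 1250.25"),
    stamp(100, "trade 1250.25"),
    stamp(101, "bid 1250.25"),
]

# A database without an ORDER BY on a unique key may return rows in any order.
shuffled = [rows[2], rows[0], rows[3], rows[1]]

by_ts_only = sorted(shuffled, key=lambda r: r[0])
by_ts_and_seq = sorted(shuffled, key=lambda r: (r[0], r[1]))
```

`by_ts_and_seq` matches the original arrival order, whereas `by_ts_only` leaves the three same-timestamp events in whatever order the storage layer happened to return them.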
The following user says Thank You to aslan for this post:
Reality check: DateTime does NOT store milliseconds. Really. It stores TICKS. I suggest reading the documentation you point to for clarification. MS is just the smallest increment exposed OUTSIDE TICKS. Ticks are 100ns large, finer than any clock available. Not milliseconds. NICE try.
No clue where you got the idea System.DateTime stores milliseconds. Again, maybe Zen-Fire cuts off the ticks. My own Rithmic code did not (now pulling data from Nanex).
I have quite some stored Rithmic data here from development, and the timestamps go a lot finer than milliseconds. Not sure whether Zen-Fire cuts them off (never used the API for long, way too clumsy with the object model). I now use NxCore, which has a theoretical millisecond resolution but gives the "minimum effective resolution" at the start of every daily tape, which is currently set at 25 (i.e. the milliseconds counter only moves in increments of 25).
Actually I am thinking of storing the order of items in the ticks. Even below a millisecond that gives me 10,000 ticks per millisecond, 250,000 if using up the complete 25 ms resolution, so I can store the timestamp and maintain order in one field.
Another item: I second storing everything with as much resolution as possible. Order and delay MAY come in handy when writing a decent simulator. Yes, your order takes time, but the exact order of events is necessary to make sure you can give good simulated fills.
Final note: if you are multithreaded, the default Windows scheduler/timer resolution is about 15.6 ms on NT-based systems (the often-quoted 55 ms figure comes from the older DOS/Windows 9x timer). That is the smallest task switch interval by default. Want more? Use a high-priority thread that never switches and spinlocks when needed. That effectively burns tons of power (CPU core at 100%) and blocks one core. OTOH it is the only way to go smaller, IF that is needed. IO etc. still make it a mostly futile attempt.
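The arithmetic behind that idea can be sketched as follows, assuming 100 ns ticks (10,000 per millisecond, as in System.DateTime) and a feed timestamp in whole milliseconds. The `pack` and `unpack` helpers are hypothetical names for this example, not part of any feed API.

```python
TICKS_PER_MS = 10_000  # one tick = 100 ns


def pack(ms_timestamp, seq, resolution_ms=1):
    """Combine a feed timestamp and an arrival sequence into one tick count.

    The sequence number lives in the ticks below the feed's effective
    resolution, so it never collides with the next real timestamp.
    """
    slots = TICKS_PER_MS * resolution_ms
    if ms_timestamp % resolution_ms:
        raise ValueError("timestamp is finer than the stated resolution")
    if not 0 <= seq < slots:
        raise ValueError("sequence number does not fit in the spare ticks")
    return ms_timestamp * TICKS_PER_MS + seq


def unpack(ticks, resolution_ms=1):
    """Split a packed tick count back into (ms_timestamp, seq)."""
    slots = TICKS_PER_MS * resolution_ms
    seq = ticks % slots
    return (ticks - seq) // TICKS_PER_MS, seq


# 1 ms resolution leaves 10,000 slots; 25 ms resolution leaves 250,000.
one_ms = pack(7, 3)
coarse = pack(50, 12_345, resolution_ms=25)
restored = unpack(coarse, resolution_ms=25)
```

Packed values sort by timestamp first and arrival order second, so a single integer column is enough to keep ticks in sequence.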