Multi-threaded Custom Financial Database - Matlab, R project and Python | futures io social day trading


Multi-threaded Custom Financial Database
Updated: Views / Replies:2,244 / 16
Created: by dcooke888 Attachments:6


  #11 (permalink)
Elite Member
Boston, MA
 
Futures Experience: Beginner
Platform: IB
Favorite Futures: Stocks
 
Posts: 29 since Feb 2012
Thanks: 6 given, 21 received

Bare Disruptor tests

Looks like the latency is specific to my application. Latency tests on a similar diamond Disruptor configuration with a much smaller memory footprint appear to have far fewer multi-threaded latency issues.

Tests for a Disruptor running multi-threaded: ~10-11 million messages/second
Run 0, Disruptor=10,785,159 ops/sec
Run 1, Disruptor=10,907,504 ops/sec
Run 2, Disruptor=11,059,500 ops/sec
Run 3, Disruptor=10,725,010 ops/sec
Run 4, Disruptor=11,037,527 ops/sec
Run 5, Disruptor=10,703,200 ops/sec
Run 6, Disruptor=11,156,978 ops/sec

Run #, 50/90/99/99.9/99.99/worst (s)
Run 0, 3.0 / 598.0 / 741.0 / 1513.0 / 4840.0 / 16430
Run 1, 86.0 / 598.0 / 746.0 / 1784.0 / 3068.0 / 3451
Run 2, 1.0 / 567.0 / 725.0 / 1352.0 / 3278.0 / 4265
Run 3, 1.0 / 593.0 / 724.0 / 1091.0 / 3077.0 / 5660
Run 4, 1.0 / 573.0 / 700.0 / 1189.0 / 3299.0 / 10503
Run 5, 1.0 / 593.0 / 721.0 / 1737.0 / 3687.0 / 5372
Run 6, 1.0 / 558.0 / 698.0 / 1790.0 / 3096.0 / 3278


The test I'm running as a baseline is a slightly modified version of "OneToThreeDiamondSequencedThroughputTest" from the Disruptor perf tests (Disruptor by LMAX-Exchange), specifically this test: https://github.com/LMAX-Exchange/disruptor/blob/master/src/perftest/java/com/lma...encedThroughputTest.java

So the good and bad news is that the latency mostly originates in my program; I'm just not sure how best to resolve it. I'm not going to spend much more time on this before moving on, but I'm going to look around to see whether some larger changes make a difference, or whether single-threaded is the way to go.

  #12 (permalink)

A better understanding of latency

Figured out the problem with my latency tests. I had been testing both latency and throughput, trying to publish as many messages as possible to my program and measuring the latencies I'd expect when it was maxed out.

I feel that a better latency test is to send a message, pause, send a message, pause, and so on, measuring the latency quantiles and overall throughput throughout the test, since this should replicate the bursty traffic we can expect to see from the market.

So, adding a 1s pause between sending each event resulted in latencies that are only 10X the single-threaded case, versus the 10,000X I was finding before. Throughput is still ~12 MB/s, which is way above what I'd expect to see from IQFeed even with 1,800 symbols being watched in real time. The other thing to take into account is that I'm currently running this on a dual-core computer, so until I run it on a machine with 4 or more cores I won't have a true picture of the latency profile.
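A minimal sketch of the send/pause/measure loop described above, in Python rather than the Java used for the actual program. The names `run_paced_test` and `percentile` and the `handler` callback are hypothetical, and `time.sleep` is far too coarse for very short pauses, so this only illustrates the structure of the test, not its timing accuracy.

```python
import time


def percentile(sorted_samples, pct):
    """Nearest-rank percentile of an already sorted list of samples."""
    if not sorted_samples:
        raise ValueError("no samples")
    # Smallest sample with at least pct% of all samples at or below it.
    rank = int(-(-len(sorted_samples) * pct // 100))  # ceiling division
    return sorted_samples[max(1, rank) - 1]


def run_paced_test(handler, messages, pause_seconds):
    """Send one message, pause, repeat; report latency quantiles and throughput."""
    latencies_ns = []
    start = time.perf_counter_ns()
    for msg in messages:
        t0 = time.perf_counter_ns()
        handler(msg)  # hypothetical stand-in for publishing into the pipeline
        latencies_ns.append(time.perf_counter_ns() - t0)
        time.sleep(pause_seconds)  # gap between sends
    elapsed_s = max((time.perf_counter_ns() - start) / 1e9, 1e-9)

    latencies_ns.sort()
    report = {p: percentile(latencies_ns, p) for p in (50, 90, 99, 99.9, 99.99)}
    report["worst"] = latencies_ns[-1]
    report["msgs_per_sec"] = len(messages) / elapsed_s
    return report
```

Calling `run_paced_test(handler, messages, pause_seconds)` returns a dict keyed by quantile plus `"worst"` and `"msgs_per_sec"`, which mirrors the 50/90/99/99.9/99.99/worst layout used in the results.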

Messages Per Second 92,943
Message Size(bytes): 130
TotalThroughput (KB/S): 12,082.6
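
The total throughput figure follows directly from the other two numbers. A quick check in Python, assuming 1 KB = 1,000 bytes (which is what reproduces the posted value); the function name is hypothetical:

```python
def throughput_kb_per_sec(messages_per_sec, message_size_bytes):
    """Throughput in KB/s, taking 1 KB as 1,000 bytes (an assumption)."""
    return messages_per_sec * message_size_bytes / 1000.0


kb_s = throughput_kb_per_sec(92_943, 130)
print(f"{kb_s:,.1f} KB/s")
```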

Journaler
50.0% took 21.0 s, 90.0% took 73917.0 s, 99.0% took 97146.0 s, 99.9% took 225502.0 s, 99.99% took 298069.0 s, worst took 299526 s
Parser
50.0% took 14.0 s, 90.0% took 31.0 s, 99.0% took 608.0 s, 99.9% took 34895.0 s, 99.99% took 121863.0 s, worst took 126712 s
Serializer
50.0% took 22.0 s, 90.0% took 102.0 s, 99.0% took 1659.0 s, 99.9% took 59190.0 s, 99.99% took 156052.0 s, worst took 171280 s

OverallTime
50.0% took 35.0 s, 90.0% took 84655.0 s, 99.0% took 125872.0 s, 99.9% took 306603.0 s, 99.99% took 331352.0 s, worst took 334334 s

  #13 (permalink)

Next Step: Building out the TIC database


At this point I've got a Connector that takes IQFeed messages in, writes them to disk, parses to objects and fires them off. The goal for this section was low latency, and since I'm not doing any really high frequency trading I believe my current connector will suffice at current latency / throughput levels.

End uses
Backtesting
- Throughput is most important, latency doesn't really matter
- Ideally the program will be massively parallel streaming to multiple receiving computers

Live Trading
- Program state will be kept in memory using the stream of event objects
- Reading TIC data from disk will not happen often in live trading; however, in case of program failure I will need to be able to quickly reload the current program state, ideally with some sort of failover protection.

Overall design requirements:
- Updates in real-time (data can be compressed / verified at the EOD)
- Preserves event received order
- Data can be requested by day, symbol and type (trade, Bid/Ask TOB, Bid/Ask depth)
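
One way to picture these requirements is a store that stamps every event with a receive-order sequence number and files it under a (day, symbol, type) key. The sketch below is an in-memory illustration in Python only; `TickEvent`, `TickStore`, and the field names are hypothetical and say nothing about the on-disk layout.

```python
import heapq
from collections import defaultdict
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class TickEvent:
    seq: int         # global receive-order sequence number
    day: str         # e.g. "2014-03-28"
    symbol: str
    feed_type: str   # "trade", "tob", or "depth"
    payload: tuple


class TickStore:
    def __init__(self):
        self._seq = count()
        self._by_key = defaultdict(list)

    def append(self, day, symbol, feed_type, payload):
        """Store an event as it arrives; each per-key list stays in receive order."""
        event = TickEvent(next(self._seq), day, symbol, feed_type, payload)
        self._by_key[(day, symbol, feed_type)].append(event)
        return event

    def query(self, day, symbols, feed_types):
        """Merge the requested streams back into the original receive order."""
        streams = [self._by_key.get((day, s, t), [])
                   for s in symbols for t in feed_types]
        return list(heapq.merge(*streams, key=lambda e: e.seq))
```

Because each per-key list is already sorted by `seq`, `heapq.merge` can recombine any subset of symbols and feed types without re-sorting the whole day.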

Performance Measurement: based on throughput given various conditions
- One ticker, one feed type
- Multiple tickers, one feed type
- One Ticker, multiple feed types
- Multiple tickers, multiple feed types
- Startup performance (stream the whole market for 1 day, multiple tickers / multiple feeds)

Most of my backtesting occurs across an entire portfolio, using multiple feed types. I will likely bias the design toward multiple tickers and multiple feed types, since this is my main use case and should give the best startup / reboot times.

  #14 (permalink)
Site Administrator
Manta, Ecuador
 
Futures Experience: Advanced
Platform: My own custom solution
Favorite Futures: E-mini ES S&P 500
 
Big Mike's Avatar
 
Posts: 46,238 since Jun 2009
Thanks: 29,350 given, 83,218 received

The only thing I don't understand is all the focus on latency, when IQFeed is not a latency focused app. It is not designed to be latency sensitive.

  #15 (permalink)


Quote (Big Mike):
The only thing I don't understand is all the focus on latency, when IQFeed is not a latency focused app. It is not designed to be latency sensitive.

You're right, IQFeed is not particularly latency sensitive. I've looked at latency mainly because I want the code base to be modular, so that I can plug in different feeds as I see fit, and also as a general learning experience.

All the work that I have done, except for the parsing from String to Java object, is generic. Even the objects are designed as generic objects: I parse an IQFeed L1 message to a lastEvent (last price/size/time) and two TOBEvents (bid/ask price/size/time), and the same goes for other types of messages. Therefore the latency measurements will be useful if I move to a different feed and need to know what kind of latency I can expect from my program.
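
To illustrate the idea of feed-specific parsing into generic events, here is a small Python sketch. The comma-separated layout is invented purely for the example and is NOT the real IQFeed L1 message format; `LastEvent`, `TOBEvent`, and `parse_l1` are hypothetical names modelled on the lastEvent / TOBEvent objects mentioned above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LastEvent:
    symbol: str
    price: float
    size: int
    time: str


@dataclass(frozen=True)
class TOBEvent:
    symbol: str
    side: str    # "bid" or "ask"
    price: float
    size: int
    time: str


def parse_l1(line):
    """Parse one made-up L1 line into a LastEvent and two TOBEvents.

    Assumed layout (hypothetical, not IQFeed's):
    symbol,last_price,last_size,last_time,bid,bid_size,bid_time,ask,ask_size,ask_time
    """
    f = line.strip().split(",")
    if len(f) != 10:
        raise ValueError(f"expected 10 fields, got {len(f)}")
    symbol = f[0]
    last = LastEvent(symbol, float(f[1]), int(f[2]), f[3])
    bid = TOBEvent(symbol, "bid", float(f[4]), int(f[5]), f[6])
    ask = TOBEvent(symbol, "ask", float(f[7]), int(f[8]), f[9])
    return last, bid, ask
```

Only `parse_l1` knows about the feed's wire format; everything downstream sees the generic event types, which is what would make swapping feeds a matter of writing a new parser.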

The second reason is learning. My background is mechanical engineering and I only had one computer science class at university, so there is a lot I don't know about the field. I really didn't understand the relationship between buffers, latency and throughput, and the testing has helped me form a clearer mental model of it. I didn't know what performance (latency / throughput) I could expect from any of the different types of classes, nor did I understand how multi-threading would affect performance. This has at least helped me realize the complexity required to tackle the lower latencies, and at the same time given me a sense of the latencies that are realistic to achieve on cheap commodity hardware without a ton of work.

  #16 (permalink)

Updates on Tic Database

The goal for the TIC database is throughput. Since application state will be kept in memory, the database does not need to be latency sensitive.

Since I've already tested out serialization libraries, now I can test out the different database stream speeds. A database stream essentially looks like the following:

Writing to Disk
- Object input stream (Takes in a series of objects)
- Serialize Objects to bytes (Turns objects to bytes)
- Compress Bytes
- Write bytes to file

Reading from Disk
- Read Bytes
- Decompress bytes
- Deserialize bytes to an object
- Fire Object event off
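
The two pipelines above can be sketched end to end in a few lines of Python. This is only an illustration of the stages: `struct` and `zlib` from the standard library stand in for the Unsafe serialization and LZ4/Snappy compression discussed in this thread, the whole file is compressed as a single block for simplicity, and the record layout and function names are hypothetical.

```python
import os
import struct
import tempfile
import zlib

# Hypothetical fixed-size record: price (float64), time in ns (int64), size (int32).
RECORD = struct.Struct("<dqi")


def write_events(path, events):
    """Objects -> serialized bytes -> compressed bytes -> file."""
    raw = b"".join(RECORD.pack(price, time_ns, size)
                   for price, time_ns, size in events)
    with open(path, "wb") as fh:
        fh.write(zlib.compress(raw))


def read_events(path, on_event):
    """File -> decompressed bytes -> deserialized objects -> fire callback."""
    with open(path, "rb") as fh:
        raw = zlib.decompress(fh.read())
    for fields in RECORD.iter_unpack(raw):
        on_event(fields)


def demo():
    """Round-trip two events through a temporary file."""
    events = [(1850.25, 1_000, 3), (1850.50, 2_000, 7)]
    received = []
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "ticks.bin")
        write_events(path, events)
        read_events(path, received.append)
    return received
```

Timing `write_events` and `read_events` separately over a large event list gives the kind of single-threaded events/second figure this test is after.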

The above is pretty straightforward and can be tested in a single-threaded manner to determine the highest overall throughput of events. Below are the results.

[Result attachments (images) not available in this copy of the thread]


It's amazing to me how important it is to use a compression library in order to achieve higher read/write performance. File I/O is really slow.

Two key Takeaways:
- Compression - Use it and LZ4 slightly beats out Snappy
- Serialization - Unsafe is 3x faster than using direct buffers for read performance, definitely use it.

  #17 (permalink)

Finally a Tic Stream Database

I've written a Tic datastore using LZ4 Compression and Unsafe serialization that stores Trades, TopOfBook Updates (Bid/Ask), Depth Updates, and News Events.

My goals were the following:
- Ensure no duplicates - When adding files to the database, ensure I don't accidentally double up on the same events.
- High Stream Rate - Use LZ4 and Unsafe but allow for changes in the future if another library performs better
- Asynchronous - Data stream would be too large to dump into memory, so force asynchronous
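
A rough Python sketch of the first and third goals. It rejects a batch whose content digest has already been seen, and it hands records out lazily instead of loading everything into memory. The `IngestLog` class is hypothetical, a plain generator stands in for the asynchronous stream, and hashing whole batches is just one possible way to detect duplicates, not necessarily the method used here.

```python
import hashlib


class IngestLog:
    """Keeps a digest of every batch already added, so re-adding is a no-op."""

    def __init__(self):
        self._seen = set()
        self._batches = []

    def add_batch(self, raw_bytes):
        """Return True if the batch was new and stored, False if a duplicate."""
        digest = hashlib.sha256(raw_bytes).hexdigest()
        if digest in self._seen:
            return False
        self._seen.add(digest)
        self._batches.append(raw_bytes)
        return True

    def stream(self, record_size):
        """Yield fixed-size records one at a time instead of materializing them."""
        for batch in self._batches:
            for off in range(0, len(batch) - record_size + 1, record_size):
                yield batch[off:off + record_size]
```

A consumer simply iterates over `stream(record_size)`, so memory use is bounded by one batch rather than the full data set.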

This will be used for both backtesting and data analysis. All of my data bars will be based on this information. After completing the database, here is the performance that I'm seeing:

Stream Type : Events/sec : MB/sec
TOB_One_Ticker : 1,897,198 : 189.7
Trade/TOB/Depth_One_Ticker : 1,852,469 : 185.2
TOB/Depth_One_Ticker : 1,979,646 : 197.9
Depth_One_Ticker : 2,024,561 : 202.4
Trades_One_Ticker : 1,552,441 : 155.2
Trades/TOB_One_Ticker : 1,944,720 : 194.4
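
Dividing the two columns gives the implied size of one event in the stream. A quick Python check, assuming 1 MB = 1,000,000 bytes (an assumption; the unit convention is not stated in the post):

```python
# (events/sec, MB/sec) for each stream type, copied from the table above.
RESULTS = {
    "TOB_One_Ticker": (1_897_198, 189.7),
    "Trade/TOB/Depth_One_Ticker": (1_852_469, 185.2),
    "TOB/Depth_One_Ticker": (1_979_646, 197.9),
    "Depth_One_Ticker": (2_024_561, 202.4),
    "Trades_One_Ticker": (1_552_441, 155.2),
    "Trades/TOB_One_Ticker": (1_944_720, 194.4),
}


def bytes_per_event(events_per_sec, mb_per_sec):
    """Implied bytes per event, taking 1 MB as 1,000,000 bytes (an assumption)."""
    return mb_per_sec * 1_000_000 / events_per_sec


for name, (eps, mbs) in RESULTS.items():
    print(f"{name:28s} {bytes_per_event(eps, mbs):6.1f} bytes/event")
```

Every row works out to roughly 100 bytes per event, so under that assumption the MB/sec column tracks the event rate at a near-constant per-event size.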

Performance Relative to Tickers added:
[Chart attachment not available in this copy of the thread]


The short list of performance data (single 5200rpm HDD):

Stream 1 week of data - 1 symbol - 8 seconds
Stream 1 week of data - 476 symbols - 38 seconds (25 largest futures contracts and 450 highest dollar volume equities)
File size for 476 symbols: 3.9 GB
