I don't think I have ever seen a DOM data sample. Do you have a sample of what you want to store that you can share?
It strikes me as something for which MySQL would be better suited, but it depends on the data resolution, your reasons for storing the data, and your longer-term plans for it.
Given the data resolution, it really depends on how you intend to store the data. If you are going to store it in single-day files, you can use Excel.
But if you intend to use Excel to store multiple days in one single file, chances are that after a few weeks or months of data gathering you will exceed Excel's limit of 1,048,576 rows, at which point MySQL would be more suitable.
Again, it really depends on your intentions, as well as your proficiency in both Excel and SQL.
In SQL you can build views (predefined queries that aggregate, calculate, and filter data in any way you can possibly imagine). So the approach you want is to send rows over to your SQL Server at whatever frequency you like, and then query the views back into your application when you need them. If your application is running on the same machine as your SQL Server and you have a decent memory allocation, this process will be the fastest and the cheapest on memory.
I might do a post about this eventually and show how I do it....
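In the meantime, here is a rough sketch of the shape of it, with Python's built-in sqlite3 standing in for SQL Server; the table, view, and column names (dom_updates, depth_by_level) are invented for illustration, not anyone's actual schema:

```python
# Batching rows into the database, then reading a view back out.
# sqlite3 is a stand-in here -- the same pattern applies to SQL Server/MySQL.
import sqlite3

conn = sqlite3.connect("dom.db")
conn.execute(
    "CREATE TABLE IF NOT EXISTS dom_updates "
    "(ts REAL, side TEXT, level INTEGER, px REAL, qty INTEGER)"
)

# A view is a predefined query: here, total resting size per side and level.
conn.execute("""
    CREATE VIEW IF NOT EXISTS depth_by_level AS
    SELECT side, level, SUM(qty) AS total_qty
    FROM dom_updates
    GROUP BY side, level
""")

# "Send rows over at whatever frequency you like" -- e.g. a buffered batch:
batch = [
    (1700000000.000, "bid", 0, 4500.25, 10),
    (1700000000.050, "ask", 0, 4500.50, 7),
    (1700000000.125, "bid", 1, 4500.00, 42),
]
conn.executemany("INSERT INTO dom_updates VALUES (?, ?, ?, ?, ?)", batch)
conn.commit()

# ...and query the view back from your application when you need it:
for side, level, total_qty in conn.execute(
    "SELECT side, level, total_qty FROM depth_by_level"
):
    print(side, level, total_qty)
```

Because the view is computed inside the database engine, the aggregation never touches your application's memory until you ask for the (much smaller) result set.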
Excel by contrast only holds about 1.05 million rows, and starts to hemorrhage after around 500K rows. It can't calculate, aggregate, or filter quickly at all, and there is no chance you could do any analysis and send it back to your application in any reasonable time.
Hope this helps.
Ian
In the analytical world there is no such thing as art, there is only the science you know and the science you don't know. Characterizing the science you don't know as "art" is a fool's game.
1. To do the initial analysis (not in real time, so no need to optimize for latency or execution speed), I just print to the NinjaTrader output window and copy batches of 100K records at a time into Excel. The raw data is about 3x the size of the final analyzed dataset, so for one day in the ES you will have around 150K to 200K raw rows. I typically keep a spreadsheet for each day, where I compute my various bets and test different things. In case it's not obvious, I play in the HFT space.... So I don't need months and years of history at a macro level to test my bets. I need weeks up to at most a few months of very, very detailed information to validate my current betting models. This might be a different approach from the needs of some traders.
2. Now in a production environment, optimizing for real-time data feeds and execution speed, you would need a different approach to work with this much data. If the goal is to synthesize 10K to 50K rows of data into your decision engine in real time to get alpha signals to trade off of, then you will certainly need to take the SQL approach. Here the idea would be to do an insert statement for each row into your database table, and then fetch various views in SQL as needed to determine your analytics (see the sketch after this list).
So if you are just getting started and want to do the analysis, then printing the output to an output window and copying it straight into Excel will work fine during the initial analysis phase.... But for real-time trading, there is no way to move between Excel and your application fast enough for it to have any value; you will have to go with SQL for this step.
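To make step 2 concrete, here is a minimal sketch of the real-time pattern, again with sqlite3 standing in for SQL Server; the schema and the imbalance view are made up for illustration, and only the pattern (one INSERT per incoming row, periodic reads from a view) is the point:

```python
# One INSERT per incoming DOM row, with a view queried back as an alpha input.
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dom_updates (ts REAL, side TEXT, level INTEGER, px REAL, qty INTEGER)"
)
# Toy signal: net resting size at the top of the book.
conn.execute("""
    CREATE VIEW bid_ask_imbalance AS
    SELECT SUM(CASE WHEN side = 'bid' THEN qty ELSE -qty END) AS imbalance
    FROM dom_updates
    WHERE level = 0
""")

def on_dom_update(ts, side, level, px, qty):
    """Called once per incoming DOM row by your (hypothetical) feed handler."""
    conn.execute("INSERT INTO dom_updates VALUES (?, ?, ?, ?, ?)",
                 (ts, side, level, px, qty))

def fetch_signal():
    """Query the view back into the decision engine."""
    (imbalance,) = conn.execute("SELECT imbalance FROM bid_ask_imbalance").fetchone()
    return imbalance

# Feed a couple of fake updates, then read the signal back:
on_dom_update(time.time(), "bid", 0, 4500.25, 12)
on_dom_update(time.time(), "ask", 0, 4500.50, 9)
print(fetch_signal())  # 12 - 9 = 3
```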
There has been interest in some of these topics before from others, and I have been saying I would do it for a while now.... So maybe I will start a microstructure thread and share how to do some of this data modeling, analysis, betting logic, etc.
Ian
If you have any programming experience, I would suggest storing the raw data in flat binary files with a fixed record structure (one file per instrument per day). The ES alone has 2-3 million Level 1 updates (best bid/ask/trades) per day and more than 4 million Level 2 updates (10 levels of order book on each side). It's way too much data even for SQL. But as other posters said, it depends on whether you want to store the whole thing.
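For illustration, a minimal sketch of that layout in Python using the struct module; the five-field Level 1 record here (timestamp, bid, ask, bid size, ask size) is an assumed layout, not a prescription:

```python
# Fixed-size binary records, one file per instrument per day.
import struct

# Little-endian: 3 doubles + 2 ints = 32 bytes per Level 1 update.
RECORD = struct.Struct("<dddii")

def write_day(path, updates):
    """updates: iterable of (ts, bid, ask, bid_qty, ask_qty) tuples,
    written to one file per instrument per day, e.g. 'ES_20240115.l1'."""
    with open(path, "wb") as f:
        for row in updates:
            f.write(RECORD.pack(*row))

def read_day(path):
    """Yield records back in order; record N starts at byte N * RECORD.size."""
    with open(path, "rb") as f:
        buf = f.read()
    for off in range(0, len(buf), RECORD.size):
        yield RECORD.unpack_from(buf, off)
```

At 2-3 million 32-byte records that is on the order of 65-95 MB per instrument per day, and a sequential read of a flat file like this is far cheaper than fetching the same rows back out of a database.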