ES Tick Data

May 15th, 2022, 12:26 PM

I keep seeing little bits and pieces of ES tick data here on the forum, but I'm trying to find a solid 10 to 15 years of ES tick.

Has anyone here compiled this?
Is there a torrent out there?

May 15th, 2022, 07:01 PM

Not sure why you want tick data; are you trying to figure out, based on opening ticks, what levels typically fall into a trend day vs a range day? If so, you can just gather data for maybe 3 months on the highest tick within the first 30 minutes of RTH and label each day as a trend day vs a range day. That should give you somewhat of a statistical edge; then track daily for a period of time to fine-tune. I have another method to determine how to trade for the day myself, based on a likely range or trend day. The overall data is not always perfectly accurate, but the divergences typically work and you don't need historical data for that. Why are you gathering tick data?

May 16th, 2022, 04:06 PM

I get irritated when people don't answer the question I ask and instead question why I want to know something. Consequently, I'll answer your question up front. I don't know where you can get 10 to 15 years of ES tick data other than to purchase it from a data broker.

However, as someone who has wallowed around extensively in tick data, I can offer you some info that may be useful.

I use Sierra Chart with a proprietary data feed from Infiniti Futures. Sierra Chart has an optional bar structure that allows me to get tick data by keying in an otherwise ridiculous bar parameter. Then, I can download the data from the chart into a comma-delimited text file. That file shows, for each transaction in sequence, the date, the time, the price, the volume, and whether the transaction was at bid or ask.
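A file like the one described can be loaded with a few lines of Python. This is only a minimal sketch: the column order (date, time, price, volume, side), the timestamp format, and the "B"/"A" codes for bid and ask are assumptions for illustration, not the platform's documented export layout, and the sample rows are invented.

```python
import csv
import io
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Tick:
    ts: datetime
    price: float
    volume: int
    side: str  # "B" = traded at bid, "A" = traded at ask (assumed encoding)


def read_ticks(lines):
    """Parse rows of: date, time, price, volume, side (assumed column order)."""
    ticks = []
    for row in csv.reader(lines):
        if not row:  # skip blank lines
            continue
        date_s, time_s, price_s, vol_s, side = (field.strip() for field in row)
        ts = datetime.strptime(f"{date_s} {time_s}", "%Y-%m-%d %H:%M:%S.%f")
        ticks.append(Tick(ts, float(price_s), int(vol_s), side))
    return ticks


# Invented sample rows, standing in for a real export file.
SAMPLE = """\
2021-01-03, 18:00:00.012, 3748.25, 3, B
2021-01-03, 18:00:00.012, 3748.25, 1, A
2021-01-03, 18:00:00.250, 3748.50, 2, A
"""

ticks = read_ticks(io.StringIO(SAMPLE))
print(len(ticks), ticks[0])
```

For a real file, pass an open file handle instead of the `io.StringIO` wrapper and adjust the column order and time format to whatever your export actually contains.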

It is not uncommon, although not anywhere near the average, to have over 200 ES trades of various volumes happening within a one-second time frame. The first one-second bar for the trade day of Jan 4, 2021 (which began on Jan 3 at the open in Tokyo) had 194 separate trades, Bid Volume of 950, Ask Volume of 76 (Total Volume 1026) and 39 price changes (40 prices) over a range of 3.0 (12 ticks). Many retail traders would be happy to capture those 12 ticks but one must ask, "Can I really react fast enough in a one-second time period to capture those 12 ticks or even a significant fraction of them?" To me, that looks like a job for an algorithm, and an algorithm operating on a high-speed data line at that.
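The per-second figures quoted above (trade count, bid and ask volume, price changes, range) can be tallied along these lines. This is a sketch only: the trades are assumed to arrive as (price, volume, side) tuples with "B"/"A" side codes, and the sample trades are invented. The 0.25 tick size follows from the post's own figure of 3.0 points being 12 ticks.

```python
TICK_SIZE = 0.25  # ES minimum price increment (3.0 points = 12 ticks)


def summarize(trades):
    """trades: list of (price, volume, side) in sequence; side is 'B' or 'A'."""
    prices = [p for p, _, _ in trades]
    bid_vol = sum(v for _, v, s in trades if s == "B")
    ask_vol = sum(v for _, v, s in trades if s == "A")
    # A price change is any trade whose price differs from the trade before it.
    changes = sum(1 for a, b in zip(prices, prices[1:]) if a != b)
    rng = max(prices) - min(prices)
    return {
        "trades": len(trades),
        "bid_volume": bid_vol,
        "ask_volume": ask_vol,
        "total_volume": bid_vol + ask_vol,
        "price_changes": changes,
        "range_points": rng,
        "range_ticks": round(rng / TICK_SIZE),
    }


# Invented trades falling inside a single clock second.
one_second = [
    (100.00, 5, "B"),
    (100.00, 2, "A"),
    (99.75, 10, "B"),
    (99.50, 4, "B"),
    (99.75, 1, "A"),
]
summary = summarize(one_second)
print(summary)
```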

Tick data can show you features that can't be seen any other way, but for the non-automated retail trader, those features are usually cautionary, not actionable. They include the speed at which price changes can occur and price gaps that are not discernible within almost any kind of price bar.

While the first second of trading for Jan 4, 2021 mentioned above was not unusual, it was also not typical. In fact, the notion of "typical" is best established over specific time frames relating to trading sessions or trading hours within trading sessions. For example: What is typical during the London session? What is typical during the overlap between the London and New York sessions? What is typical in the morning of the New York session and what is typical in the afternoon of the New York session? "Typical" relates to trading activity, which occurs differently during different periods of the day based on when the market participants typically show up. Even within those subdivisions, "typical" varies significantly from day to day.

Focusing on tick data gets you into the arena of big data in a hurry, and you need to have, or intend to get, the programming skills to deal with it. Otherwise, you can use other practical alternatives, which I will suggest shortly.

To give you a sense of the scale we are talking about, consider the following data for one day. For Jan 4, 2021, I show 1,154,512 individual trades over the period starting at the Tokyo open on Sunday, Jan 3 and ending at the cash close of Jan 4. To shrink the data load, I programmed my computer to consolidate the bid volume, the ask volume, the number of bid trades and the number of ask trades by price. Each time the price changed, the tallies zeroed and began accumulating again. That process reduced the 1,154,512 transactions down to 82,819 data records, which gives you a sense of how many transactions happen at the same price before the price changes. You can get the same consolidated data without doing the programming by specifying a .25 range bar within your trading platform, assuming your data feed provides the underlying details. Having access to the million plus data points that dropped away through this summarizing process gave less insight than I gained by looking at the consolidated data.
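The consolidation described above can be sketched as a run-length grouping: consecutive trades at the same price collapse into one record, and the tallies restart whenever the price changes. This is not the author's actual program, just one way to get the same kind of output; the (price, volume, side) tuple layout, the "B"/"A" codes, and the sample trades are assumptions for illustration.

```python
from itertools import groupby


def consolidate(trades):
    """Collapse consecutive trades at the same price into one record each."""
    records = []
    # groupby starts a new group every time the key (the price) changes.
    for price, run in groupby(trades, key=lambda t: t[0]):
        run = list(run)
        bid = [v for _, v, s in run if s == "B"]
        ask = [v for _, v, s in run if s == "A"]
        records.append({
            "price": price,
            "bid_volume": sum(bid),
            "ask_volume": sum(ask),
            "bid_trades": len(bid),
            "ask_trades": len(ask),
        })
    return records


# Invented trades: (price, volume, side)
trades = [
    (100.00, 5, "B"),
    (100.00, 2, "A"),
    (99.75, 10, "B"),
    (99.50, 4, "B"),
    (99.75, 1, "A"),
]
records = consolidate(trades)
print(len(trades), "trades ->", len(records), "records")

# Average run length implied by the figures in the post.
trades_per_record = 1_154_512 / 82_819
print(round(trades_per_record, 1))
```

On the figures quoted in the post, the last line works out to roughly 14 trades per consolidated record.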

If you look at the pencil in your hand at the atomic level, viewing each individual atom, you are unlikely to recognize that collection of atoms as a pencil any longer. Trading data has a similar dynamic.

Price will often stay at one price for a number of trades, then change by one tick for a period of time and then return to the previous price level. One has to question how significant a price change of a single tick can be. A price change of one tick can only mean something important when it presages a significant move in one direction and nearly all price changes of one tick do no such thing. This argues for the use of range bars to explore how much price change may presage a significant directional move. When viewed using a .50 (2 ticks) range bar, the 1,154,512 trades on Jan 4, 2021 yielded 4,631 data records. Similarly, a .75 (3 ticks) range bar yielded 2,429 data records. These numbers are much easier to deal with and the lost resolution is relatively meaningless.
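For anyone who wants to build range bars from raw ticks rather than on the platform, here is a rough sketch. Platforms define range bars in different ways, so the rule used here is an assumption: a bar of size N ticks holds at most N distinct price levels, which is chosen so that a .25 bar reproduces the per-price consolidation mentioned earlier. The sample prices are invented.

```python
TICK_SIZE = 0.25  # ES minimum price increment


def range_bars(prices, bar_range):
    """Group a price sequence into range bars.

    Assumed rule: a bar may span fewer than bar_range points, i.e. at most
    bar_range / TICK_SIZE price levels. All bar fields are in ticks;
    multiply by TICK_SIZE to get prices back.
    """
    max_levels = round(bar_range / TICK_SIZE)
    bars = []
    for price in prices:
        level = round(price / TICK_SIZE)  # integer ticks avoid float drift
        if bars:
            bar = bars[-1]
            lo, hi = min(bar["low"], level), max(bar["high"], level)
            if hi - lo < max_levels:  # still fits inside the current bar
                bar["low"], bar["high"], bar["close"] = lo, hi, level
                bar["trades"] += 1
                continue
        # Price strayed outside the bar range (or first trade): start a new bar.
        bars.append({"open": level, "high": level, "low": level,
                     "close": level, "trades": 1})
    return bars


# Invented price sequence with one-tick flicker followed by a small move.
prices = [100.00, 100.25, 100.00, 100.25, 100.50, 100.75, 100.50]
for size in (0.25, 0.50, 0.75):
    print(size, len(range_bars(prices, size)))
```

With this rule, the one-tick back-and-forth that dominates the raw data is absorbed as soon as the bar is two ticks wide, which is the effect the record counts above illustrate.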

Range bars are not time dependent. They change only as often as the price strays from the bar range, regardless of whether that process takes one second or four hours. If you feel that the passage of time is important (and I strongly believe that it isn't, at least not with respect to price levels; the passage of one minute does not affect price in any particular way), you can explore candlesticks of various numbers of seconds, which will provide known amounts of data. A one-second candlestick gives 86,400 data points per 24-hour day, assuming trades occur within each second on the clock. (Surprisingly, they don't in ES. Some clock seconds pass without a single trade.) Similarly, a 2-second candlestick yields a max of 43,200 data records per day, a 3-second candlestick a max of 28,800, etc.
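The bar counts above are just the number of seconds in a day divided by the bar length, and they are maximums because a bucket with no trades produces no bar. A small sketch, assuming ticks are given as (seconds since session start, price) pairs, with invented sample values:

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def max_bars_per_day(bar_seconds):
    """Upper bound on the number of time bars in a 24-hour day."""
    return SECONDS_PER_DAY // bar_seconds


def time_bars(ticks, bar_seconds):
    """ticks: (seconds_since_session_start, price) pairs in time order.

    Returns {bar_index: [open, high, low, close]}. Buckets in which no
    trade occurred are simply absent from the result.
    """
    bars = {}
    for t, price in ticks:
        idx = int(t // bar_seconds)
        if idx not in bars:
            bars[idx] = [price, price, price, price]
        else:
            o, h, l, _ = bars[idx]
            bars[idx] = [o, max(h, price), min(l, price), price]
    return bars


print([max_bars_per_day(n) for n in (1, 2, 3)])

# Invented ticks: nothing trades during seconds 1 and 2.
ticks = [(0.1, 100.0), (0.7, 100.5), (3.2, 100.25)]
one_sec = time_bars(ticks, 1)
print(sorted(one_sec))  # bar indexes that actually contain trades
```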

Rather than, or at least before, you dive into tick data, I would urge you to explore range bars of various sizes and time-based bars in various numbers of seconds. I strongly recommend that you also include volume in those studies. This data is immediately accessible through your platform and will provide most of the insight you might gain from tick data and doesn't require ancillary programming.

Too often, we are distracted with the computational power of computers and don't pay enough attention to the meaning of the data. The meaning of the data is constructed through thinking about the data rather than computing the data. Computation can have value in that it can apply a structure to the data that allows us to think about it better, but we still must do the thinking. It is very seductive to believe that a new way of assembling data will somehow immediately show us a previously unseen dynamic within the data. This belief most often leads to endless, nonproductive programming that is constrained only by our unlimited imaginations, which is no constraint at all. Thinking must guide all exploration rather than random searches for computed patterns.

At this point in my explorations, I see tick data as valuable only for exploring algorithm effectiveness, nothing more. Creating an algorithm requires the same insight you can gain through the methods I have described above.

I also suggest that you limit your explorations, at least in the beginning, to no more than a year and probably less. My reasoning for that is this: If you can make trading sense over 10 years of data but then your notion of sense falls apart over the eleventh and most recent year, are you likely to trade in the prescribed manner going forward? If you can make trading sense over the last year of data but that sense falls apart when applied to the preceding ten years, are you going to abandon your sense of the last year because ancient history doesn't conform? It seems likely to me that if you can't make trading sense of the last year, then you need to focus on that before getting distracted with more data from the past.

The countervailing argument is that there are repetitive patterns in the data that can only be teased out through year-over-year comparisons and various applications of mathematics and statistics. After much study, I conclude that this is not the case, that the market is so dynamic (speculative) that it does not have repetitive rhythms over time cycles. If there are rhythms, they are between price levels (not over time) and those price levels are approximate at that. Consequently, the value of extended history is limited to providing insight into price behavior, and it contains no deterministic elements that can be modelled mathematically. This is such a powerful and unprovable statement that I am willing to admit the possibility of being wrong, but this is consistent with my extensive study and I share that conclusion freely and with conviction.

Machine learning may open up new frontiers in trading but that is a whole other background to acquire.

While I haven't answered your question very satisfactorily, I hope I have given you food for thought that will save you many hours in your trading explorations. Best of luck to you and may you discover something groundbreaking.

May 17th, 2022, 06:15 AM

That's a lot of info. I mainly trade NQ, but do trade ES and others. What I have found is that the algos are programmed to move in a certain manner and can make random movements around target levels, especially when reading news, large and small, based on volatility. I have also found that algos have memory (which you may consider as fair value or previous fair value areas). They don't necessarily follow any particular indicator, but do respect the 20 & 50 EMAs to an extent on trend days. I believe algos are programmed to capture the most profitable ticks/points from the majority of traders the majority of the time based on order flow (I'm talking about algos that drive the market from institutions and big banks). Although I find trading difficult, observing rules around the algo patterns is what has made me profitable. Good luck with your pursuit of more tick data.

May 18th, 2022, 10:03 AM

I will definitely explore range bars. Thank you for this tip!
