I've been trying out IQFeed data streaming via Python and Amibroker. As IQFeed gives unfiltered data (From IQFeed's website: "...IQFeed provides a TRUE, tick-by-tick datafeed. IQFeed feed is completely unfiltered, allowing you to see EVERY TRADE..."), bad ticks are actually quite frequent.
When I stream 1-min data into Amibroker, those bad ticks appear as very long 'tails' on certain bars that, to the human eye, are obviously wrong. An obvious way to filter them would be to use some sort of standard deviation or ATR (average true range) threshold to decide whether a bar's high or low is plausible. But I feel that this is rather arbitrary.
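To make the ATR approach concrete, here is a rough sketch of what I have in mind. It assumes bars are plain (open, high, low, close) tuples and uses a simple average of the true range rather than Wilder's smoothing. The names flag_long_tails, period and k are my own and do not come from IQFeed or Amibroker.

```python
def true_range(high, low, prev_close):
    """True range of a bar, given the previous bar's close."""
    return max(high - low, abs(high - prev_close), abs(low - prev_close))


def flag_long_tails(bars, period=14, k=3.0):
    """Return indices of bars whose upper or lower tail exceeds k * ATR.

    bars   : list of (open, high, low, close) tuples, oldest first
    period : number of preceding bars used for the ATR (assumed value)
    k      : tail-to-ATR multiple treated as suspicious (assumed value)
    """
    flagged = []
    # Start at period + 1 so every true range has a previous close available.
    for i in range(period + 1, len(bars)):
        ranges = [
            true_range(bars[j][1], bars[j][2], bars[j - 1][3])
            for j in range(i - period, i)
        ]
        atr = sum(ranges) / period
        o, h, l, c = bars[i]
        upper_tail = h - max(o, c)
        lower_tail = min(o, c) - l
        if max(upper_tail, lower_tail) > k * atr:
            flagged.append(i)
    return flagged
```

Both period and k have to be picked by hand, which is the arbitrariness I mentioned. Also, a bar that gets flagged still inflates the ATR of the bars that follow it unless it is repaired first.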
I have 2 very preliminary ideas that I would like to bounce off the community. They are technically feasible because IQFeed provides a lot more than just bar data: it also provides bid-ask spreads and tick-level trades, among other data.
Preliminary Idea #1
Using Python, I stream bars at a much finer timescale (e.g. 1-sec) than what I really need (i.e. 1-min). At the finer timescale, trades that are out of whack with the rest can be easily identified and ignored. I then programmatically aggregate the 1-sec bars into 1-min bars.
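A minimal sketch of Idea #1, under my own assumptions: each 1-sec bar is a Bar tuple with an epoch-second timestamp, a bar is "out of whack" if its high or low is more than max_dev (a fraction) away from the median close of the preceding few bars, and such bars are dropped entirely. The Bar type, the function names and the default thresholds are all hypothetical, and nothing here talks to IQFeed itself.

```python
from collections import namedtuple
from statistics import median

# ts is the bar's start time in epoch seconds (assumed layout).
Bar = namedtuple("Bar", "ts open high low close volume")


def drop_outlier_bars(bars, window=5, max_dev=0.01):
    """Drop 1-sec bars whose high or low strays more than max_dev
    (as a fraction) from the median close of the preceding `window` bars."""
    kept = []
    for i, bar in enumerate(bars):
        recent = [b.close for b in bars[max(0, i - window):i]]
        if not recent:
            kept.append(bar)  # nothing to compare against yet
            continue
        ref = median(recent)
        limit = max_dev * ref
        if abs(bar.high - ref) <= limit and abs(bar.low - ref) <= limit:
            kept.append(bar)
    return kept


def aggregate_to_minutes(bars):
    """Aggregate time-ordered 1-sec bars into 1-min bars."""
    minutes = {}
    for bar in bars:
        key = bar.ts - bar.ts % 60  # start of the minute this bar falls in
        if key not in minutes:
            minutes[key] = Bar(key, bar.open, bar.high, bar.low,
                               bar.close, bar.volume)
        else:
            m = minutes[key]
            minutes[key] = Bar(key, m.open, max(m.high, bar.high),
                               min(m.low, bar.low), bar.close,
                               m.volume + bar.volume)
    return [minutes[k] for k in sorted(minutes)]
```

Usage would be something like aggregate_to_minutes(drop_outlier_bars(one_second_bars)). I use the median rather than the mean so that one bad close in the window does not drag the reference price with it. Dropping the whole bar also discards its volume, which may or may not be acceptable.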
Preliminary Idea #2
Another idea is to monitor the bid-ask spread in real time and, when a trade appears at a price far outside the prevailing bid-ask range, ignore that trade.
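Idea #2 could look roughly like the sketch below. It assumes quotes and trades arrive as one time-ordered stream of tuples, ("Q", bid, ask) or ("T", price). That event format is made up for illustration and is not IQFeed's message format. A trade is kept if it falls within the latest bid-ask range widened by a multiple of the spread. spread_mult and min_slack are placeholder values I have not tuned.

```python
def is_plausible_trade(price, bid, ask, spread_mult=3.0, min_slack=0.01):
    """True if price lies within [bid - slack, ask + slack], where
    slack = max(spread_mult * spread, min_slack)."""
    if bid <= 0 or ask <= 0 or ask < bid:
        # No usable quote (missing or crossed): assumption is to let
        # the trade through rather than reject it blindly.
        return True
    slack = max(spread_mult * (ask - bid), min_slack)
    return bid - slack <= price <= ask + slack


def filter_trades(events, spread_mult=3.0, min_slack=0.01):
    """Walk a time-ordered stream of ("Q", bid, ask) and ("T", price)
    events and return the trade prices that pass the quote check."""
    bid = ask = 0.0  # no quote seen yet
    kept = []
    for event in events:
        if event[0] == "Q":
            _, bid, ask = event
        elif event[0] == "T":
            price = event[1]
            if is_plausible_trade(price, bid, ask, spread_mult, min_slack):
                kept.append(price)
    return kept
```

One thing I am unsure about: a trade that is reported late could legitimately print outside the current quote, so this check may throw away some genuine trades.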
I apologize if my 2 ideas above sound hazy and lack detail - as mentioned, they're very preliminary, so please feel free to comment on them.
I would also be happy to hear from anyone who has other methods of filtering bad ticks from streaming data.