Several users have posted threads on charting spreads and calculating z-scores, such as the one started by MXASJ, and on calculating correlation or, better yet, co-integration in either R or MATLAB. One thing I think has been alluded to but not presented is how to find stocks grouped by industry.
I'm starting this thread to see if I've overlooked an approach that has already been presented, or to see if someone can improve upon mine.
Yahoo divides the world of stocks into 9 sectors and further into 215 industries, so the approach I've taken is to scrape the stocks in each of the 215 industries into a file. Perhaps this could be done entirely with the Yahoo APIs, but I did not easily find everything I wanted, so I used Python to scrape the web pages directly. There is probably room for improvement there.
I further wanted to remove over-the-counter and pink-sheet stocks, and to filter on volume, price and whether the stock is optionable. Personally, I don't want to trade pairs on illiquid instruments.
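To make that filtering step concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration and not taken from the actual scripts: the Stock record and its field names, the exchange labels in OTC_EXCHANGES, and the default price and volume thresholds.

```python
from dataclasses import dataclass

# Hypothetical labels for over-the-counter / pink-sheet listings;
# the real data source may use different codes.
OTC_EXCHANGES = {"OTC", "PNK"}


@dataclass
class Stock:
    """One scraped row (field names are assumed for this sketch)."""
    symbol: str
    exchange: str
    price: float
    avg_volume: int
    optionable: bool


def is_tradable(stock, min_price=5.0, min_volume=500_000):
    """Return True if the stock passes every liquidity filter."""
    return (
        stock.exchange not in OTC_EXCHANGES
        and stock.price >= min_price
        and stock.avg_volume >= min_volume
        and stock.optionable
    )


def filter_stocks(stocks, min_price=5.0, min_volume=500_000):
    """Keep only the stocks that pass is_tradable, preserving order."""
    return [s for s in stocks if is_tradable(s, min_price, min_volume)]
```

The thresholds are keyword arguments so they can be loosened or tightened per run without touching the filter itself.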
I also wanted the ability to easily format the output file so it can be used directly by an R program I've written, pasted into PairTrade Finder, or imported into ArbMaker. I'm beta testing ArbMaker and think it will become a very useful method of testing for co-integration, as they use Engle-Granger rather than Augmented Dickey-Fuller.
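As a rough idea of what the formatting step might look like, the sketch below turns a mapping of industry to symbols into comma-separated candidate pairs within each industry. The column layout (industry, symbol_a, symbol_b) and the function names are assumptions made for this example. It is not the documented import format of PairTrade Finder or ArbMaker, so check what each tool actually expects before using it.

```python
import csv
import io
from itertools import combinations


def industry_pairs(by_industry):
    """Yield (industry, symbol_a, symbol_b) for every pair within an industry."""
    for industry, symbols in sorted(by_industry.items()):
        # Sorting makes the output deterministic and avoids A/B vs B/A duplicates.
        for a, b in combinations(sorted(symbols), 2):
            yield industry, a, b


def pairs_to_csv(by_industry):
    """Return the candidate pairs as CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["industry", "symbol_a", "symbol_b"])  # assumed layout
    writer.writerows(industry_pairs(by_industry))
    return buf.getvalue()
```

Returning the text rather than writing straight to disk keeps the same function usable for saving to a file (for R or an import) and for copying to the clipboard (for pasting).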
Putting all this together is done in a number of steps, with intermediate files to allow debugging at each step. It is certainly possible to combine some of the scripts.