How Can I Go About Applying Machine Learning Algorithms to Stock Markets?


I am not sure whether this question fits here.

I have recently begun reading and learning about machine learning. Can someone shed some light on how to go about it, or share their experience and a few basic pointers on how to at least start applying it to data sets and see some results? How ambitious does this sound?

Also, please mention any standard algorithms that should be tried or looked at while doing this.



There seems to be a basic fallacy that someone can come along, learn some machine learning or AI algorithms, set them up as a black box, hit go, and sit back while they retire.

My advice to you:

Learn statistics and machine learning first, then worry about how to apply them to a given problem. There is no free lunch here. Data analysis is hard work. Read “The Elements of Statistical Learning” (the PDF is available for free on the book's website), and don’t start trying to build a model until you understand at least the first 8 chapters.

Once you understand the statistics and machine learning, you then need to learn how to backtest and build a trading model, accounting for transaction costs and the like, which is a whole other area.
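To give a feel for what "backtesting with transaction costs" means, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the price list is made up, the moving-average rule is a toy signal, and `cost_rate` is a hypothetical proportional fee charged on each change in position. It is not a realistic trading model, only a picture of the bookkeeping.

```python
def backtest(prices, positions, cost_rate=0.001):
    """Return the final equity multiple of a strategy (1.0 = break even).

    prices    : closing prices, one per period
    positions : positions[t] is the exposure (0 = flat, 1 = long) held from
                period t to t + 1, decided using data up to t only
    cost_rate : hypothetical proportional cost per unit of turnover
    """
    equity = 1.0
    prev_pos = 0.0
    for t in range(len(prices) - 1):
        pos = positions[t]
        # Pay costs on the change in position before earning the return.
        equity *= 1.0 - cost_rate * abs(pos - prev_pos)
        period_return = prices[t + 1] / prices[t] - 1.0
        equity *= 1.0 + pos * period_return
        prev_pos = pos
    return equity


def moving_average_signal(prices, window=3):
    """Toy rule: long when the price is above its trailing mean, else flat."""
    signal = []
    for t in range(len(prices)):
        if t + 1 < window:
            signal.append(0)  # not enough history yet
            continue
        mean = sum(prices[t + 1 - window:t + 1]) / window
        signal.append(1 if prices[t] > mean else 0)
    return signal


# Made-up prices, purely for illustration.
prices = [100, 101, 103, 102, 99, 98, 100, 104, 107, 105, 103, 106]
signal = moving_average_signal(prices)

gross = backtest(prices, signal, cost_rate=0.0)
net = backtest(prices, signal, cost_rate=0.005)
buy_and_hold = prices[-1] / prices[0]
print(f"gross={gross:.4f} net={net:.4f} buy_and_hold={buy_and_hold:.4f}")
```

The two things to notice are that the signal at time t only uses prices up to t (no look-ahead), and that the same strategy is evaluated both before and after costs and compared against simply holding the asset.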

After you have a handle on both the analysis and the finance, it will be somewhat obvious how to apply it. The entire point of these algorithms is to fit a model to data in a way that produces low bias and low variance in prediction (i.e. the training and test prediction errors should both be low and similar to each other). Here is an example of a trading system using a support vector machine in R, but keep in mind that you will be doing yourself a huge disservice if you don’t spend the time to understand the basics before trying to apply something esoteric.
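The point about training and test error being "low and similar" can be seen in a small experiment. The sketch below is in Python and uses a plain k-nearest-neighbours regressor rather than the support vector machine mentioned above, simply because it fits in a few lines. The "returns" are synthetic random noise, so by construction there is nothing to predict; the seed, sample sizes, and values of k are arbitrary choices for illustration.

```python
import random


def knn_predict(train_x, train_y, x, k):
    """Average the targets of the k training points closest to x."""
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))[:k]
    return sum(train_y[i] for i in nearest) / k


def mse(train_x, train_y, xs, ys, k):
    """Mean squared error of k-NN predictions on the points (xs, ys)."""
    errors = [(knn_predict(train_x, train_y, x, k) - y) ** 2
              for x, y in zip(xs, ys)]
    return sum(errors) / len(errors)


random.seed(0)
# Synthetic daily returns: pure noise, so there is no real signal to learn.
returns = [random.gauss(0.0, 0.01) for _ in range(301)]
features = returns[:-1]  # previous day's return
targets = returns[1:]    # next day's return

# Chronological split: fit on the past, evaluate on the future.
split = 200
train_x, train_y = features[:split], targets[:split]
test_x, test_y = features[split:], targets[split:]

for k in (1, 25):
    train_err = mse(train_x, train_y, train_x, train_y, k)
    test_err = mse(train_x, train_y, test_x, test_y, k)
    print(f"k={k:2d} train_mse={train_err:.6f} test_mse={test_err:.6f}")
```

With k = 1 the model memorises the training set, so its training error is zero while its test error is not. That gap, not the impressive training fit, is what tells you the model has learned nothing useful.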


Just to add an entertaining update: I recently came across this master’s thesis: “A Novel Algorithmic Trading Framework Applying Evolution and Machine Learning for Portfolio Optimization” (2012). It’s an extensive review of different machine learning approaches compared against buy-and-hold. After almost 200 pages, they reach the basic conclusion: “No trading system was able to outperform the benchmark when using transaction costs.” Needless to say, this does not mean that it can’t be done (I haven’t spent any time reviewing their methods to check the validity of the approach), but it certainly provides some more evidence in favor of the no-free-lunch theorem.

