How hedge funds use machine learning

“Man vs machine” has been a popular debate in the investment management industry in the last few years. Proponents of applying cutting-edge technology to portfolio management claim that algorithms are more efficient and less prone to emotional biases than human investors. Some even believe that in the not-so-distant future, artificial intelligence will take over the entire active asset management industry, leaving intuition-based fundamental investors behind.

Their opponents, equally smart and experienced financial experts, argue that AI, no matter how advanced, will never be able to figure out ever-changing markets. Their key argument is that markets are very different from a game of chess: the mere actions of market participants constantly change the rules of the game, and thus patterns found in history are useless.

To make things even more confusing, media coverage of emotionless algorithms is, ironically, often very emotional and lacks important technical details. After all, the headline “Robots beat humans in the Wall Street game” will probably attract a bigger audience than “Statistical analysis of multidimensional big data sets helps identify short term market price dislocations”.

Machine learning by itself is not a trading strategy. Many different investment strategies across various asset classes can benefit from machine learning capabilities, so generalizing based merely on the use of the technology is rather meaningless. What matters is the market inefficiency the strategy aims to exploit and whether machine learning adds value to that specific process.

Statistical arbitrage, a short-term trading strategy that employs mean reversion models, became one of the earliest practical applications of machine learning in investment management. The strategy does not rely on directional bets or exposure to broader market moves, focusing instead on the dynamics of the relationships between factors and prices.

The idea of mean reversion assumes that stock prices, regardless of fluctuations, eventually return to normal. The opportunity lies in finding the “errors”, cases where a stock’s price behavior differs from the “normal” explained by historical relationships, as such “errors” are supposed to disappear over time. The tricky part is to identify the factors or benchmarks that represent the market equilibrium.
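To make a benchmark-defined “normal” concrete, here is a minimal sketch in Python. It assumes, purely for illustration, that “normal” is a straight-line relationship between a stock’s daily returns and a single benchmark’s returns, fitted by ordinary least squares. The “error” is then the part of today’s move that the fitted line does not explain. The return series and the names `fit_line` and `unexplained_move` are hypothetical, and real stat arb models use many factors rather than one.

```python
from statistics import mean

def fit_line(x, y):
    """Ordinary least squares fit of y ~ alpha + beta * x; returns (alpha, beta)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x has no variation, so the slope is undefined")
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    beta = sxy / sxx
    alpha = my - beta * mx
    return alpha, beta

def unexplained_move(stock_return, benchmark_return, alpha, beta):
    """Part of the stock's return that the historical relationship does not explain."""
    return stock_return - (alpha + beta * benchmark_return)

# Hypothetical history (illustrative numbers only): the stock has moved
# 1.5x the benchmark on every past day.
benchmark_history = [0.010, -0.005, 0.007, -0.012, 0.004, 0.009]
stock_history = [1.5 * r for r in benchmark_history]

alpha, beta = fit_line(benchmark_history, stock_history)

# Today the benchmark gains 1% but the stock loses 0.5%.
error_today = unexplained_move(-0.005, 0.010, alpha, beta)
```

With these made-up numbers the fitted slope is 1.5 and today’s “error” is about minus two percentage points: the stock sits roughly 2% below where the historical relationship would put it. Whether that gap closes is exactly the bet a mean reversion strategy makes.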

But why would such opportunities even exist in seemingly efficient markets? The reason lies in the actions of the largest market participants: mutual funds, ETFs and large fundamental hedge funds. They dominate the markets in terms of capital under management, and even in diversified portfolios their individual positions are very large. When such giants rebalance their portfolios to express their long-term views, they don’t care that much about market impact or the precise execution price. With a long investment horizon, they operate at a big-picture level, relying on fundamental considerations in their research. Large players’ trades move the market away from equilibrium, and that is something shorter-term traders can benefit from by exploiting the temporary imbalance between what certain correlations are supposed to be and what they are at the moment. In other words, arbitrage opportunities come from other people’s actions and reactions to price moves. With so many active market participants, this source of opportunities is here to stay. The profit margin on each individual arbitrage trade is slim, especially after accounting for trading costs, but even a tiny edge can be enough to build a profitable strategy.

Years ago, stat arb funds exploited simple pair trades, relying on rather obvious correlations. For example, suppose stock A has historically done better than stock B in a bull market, but in a recent market rally they both added 25%. A trader would then buy stock A and sell stock B, anticipating that their relative performance will return to its historical norm. But those trades no longer work. As more people chase the same arbitrage opportunities, already thin profit margins fade away. That doesn’t mean there are no more inefficiencies and opportunities, though. Markets constantly evolve and become not only more efficient but also more complex and interconnected. There are plenty of factors that impact market prices; any information that can be digitized and tested potentially represents a source of market-moving signals.
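A classic pair trade of this kind can be sketched as a simple rule: measure how far the current performance gap between two stocks sits from its historical average, in standard deviations, and bet on the gap closing once it becomes extreme. This is a minimal illustration, not a description of any fund’s actual model. The function name `pair_signal`, the entry threshold of two standard deviations and the sample numbers are all assumptions.

```python
from statistics import mean, stdev

def pair_signal(gap_history, current_gap, entry_z=2.0):
    """Return ((first_leg, second_leg), z) with positions +1 long, -1 short, 0 flat.

    A "gap" is the first stock's return minus the second stock's return
    over some window (for example, one market rally).
    """
    mu, sigma = mean(gap_history), stdev(gap_history)
    if sigma == 0:
        raise ValueError("gap history has no variation, so the z-score is undefined")
    z = (current_gap - mu) / sigma
    if z > entry_z:
        # First stock is unusually far ahead: short it, buy the second.
        return (-1, +1), z
    if z < -entry_z:
        # First stock is unusually far behind: buy it, short the second.
        return (+1, -1), z
    return (0, 0), z

# Hypothetical history: in past rallies the first stock beat the second
# by about 5 percentage points on average.
gap_history = [0.04, 0.06, 0.05, 0.05, 0.03, 0.07]

# In the latest rally both stocks gained the same amount, so the gap is zero.
positions, z = pair_signal(gap_history, current_gap=0.0)
```

Here the z-score comes out at roughly -3.5, so the rule goes long the stock that usually leads and short the other one. The position is market neutral by construction: it profits only if the relative gap moves back toward its average, whichever way the market itself goes.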

An investment strategy focused on exploiting short-term market inefficiencies thus became a natural application of machine learning. Algorithms can identify subtle multidimensional anomalies that an investor can’t see with the unaided eye. Of course, there is a risk of identifying false, coincidental correlations (a commonly criticized form of quant wishful thinking called overfitting), but with the right testing process in place, this risk can be kept under control. For statistically significant signals, a lack of intuitive explainability is not necessarily a deal breaker. The good thing about using deep factors that are not easily explainable is that those signals are rarely overcrowded.
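The overfitting risk is easy to reproduce. The sketch below generates returns and 200 candidate “signals” that are all pure random noise, picks the signal with the strongest correlation to returns in the first half of the data, and then checks the same signal on the second half, which it has never seen. Everything here is an assumption made for illustration: the sample sizes, the seed and the simple half-and-half split stand in for what would be a far more careful testing process in practice.

```python
import random
from statistics import mean

def correlation(x, y):
    """Pearson correlation between two equally long sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

rng = random.Random(0)  # fixed seed so the experiment is repeatable
n_days, n_signals, split = 250, 200, 125

# Returns and signals are independent noise: no signal has any real edge.
returns = [rng.gauss(0, 0.01) for _ in range(n_days)]
signals = [[rng.gauss(0, 1) for _ in range(n_days)] for _ in range(n_signals)]

# "Research": pick the signal that looks best on the first half of the data.
in_sample = [correlation(s[:split], returns[:split]) for s in signals]
best = max(range(n_signals), key=lambda i: abs(in_sample[i]))
best_in = in_sample[best]

# "Validation": evaluate the winner on the second half it never saw.
best_out = correlation(signals[best][split:], returns[split:])

print(f"in-sample: {best_in:+.3f}  out-of-sample: {best_out:+.3f}")
```

Because the winner was selected from many candidates, its in-sample correlation looks meaningful even though it is noise, while the out-of-sample number has no reason to be anything but small. Holding back data that the selection step never touches is the simplest version of the “right testing process” mentioned above.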

Identifying inefficiencies is just the beginning of the process though. Most of the signals are just not strong enough to support sizable bets. To produce solid results, thousands of strategies need to be combined and assigned carefully calibrated weights. The optimal portfolio with the target risk profile is ultimately a unified system, not a random collection of strategies.
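A back-of-the-envelope calculation shows why combining many weak signals helps and why the weights matter. The sketch assumes, unrealistically, that the strategies are completely uncorrelated and that each has been scaled to the same volatility, so that a strategy’s expected return per unit of risk is simply its Sharpe ratio. The function `combined_sharpe` and all the numbers are hypothetical. Real signals are correlated, so actual gains are smaller.

```python
import math

def combined_sharpe(sharpes, weights):
    """Sharpe ratio of a weighted mix of uncorrelated, unit-volatility strategies."""
    if len(sharpes) != len(weights):
        raise ValueError("need exactly one weight per strategy")
    expected = sum(w * s for w, s in zip(weights, sharpes))
    # With zero correlation, portfolio variance is the sum of squared weights.
    vol = math.sqrt(sum(w * w for w in weights))
    if vol == 0:
        raise ValueError("all weights are zero")
    return expected / vol

# A thousand equally weak signals, equally weighted.
many_weak = combined_sharpe([0.05] * 1000, [1 / 1000] * 1000)

# Three signals of unequal quality: equal weights vs weights
# proportional to each signal's own Sharpe ratio.
sharpes = [0.10, 0.05, 0.02]
equal = combined_sharpe(sharpes, [1 / 3] * 3)
tuned = combined_sharpe(sharpes, sharpes)
```

Under these assumptions, a thousand signals with a Sharpe ratio of 0.05 each combine into a portfolio with a Sharpe ratio of about 1.58, the single-signal figure multiplied by the square root of the number of signals. In the three-signal case, equal weighting gives about 0.098, slightly worse than the best signal alone, while calibrated weights give about 0.114. That gap is the reason the weights need to be set carefully rather than arbitrarily.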

On top of that, execution limitations should be taken into account. If the model’s suggestions are impractical (for example, it suggests short selling a stock that can’t be borrowed), the realized profits of such a strategy will be very disappointing. One of the biggest execution limitations is market impact. The larger the trade, the more it will move the market and reduce the profit margin. Modeling market impact is yet another important application of machine learning in the quant investing process.
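To see how quickly market impact can eat a thin edge, the sketch below uses the square-root rule of thumb, under which impact grows with the square root of the order’s share of daily volume. It is a simple stand-in chosen for illustration, not the machine learning impact model a fund would actually build. The coefficient of 1.0 and the edge, volatility and volume figures are assumptions.

```python
import math

def impact_bps(order_shares, adv_shares, daily_vol_bps, coefficient=1.0):
    """Estimated price impact in basis points under a square-root rule of thumb."""
    if adv_shares <= 0:
        raise ValueError("average daily volume must be positive")
    if order_shares < 0:
        raise ValueError("order size cannot be negative")
    return coefficient * daily_vol_bps * math.sqrt(order_shares / adv_shares)

def net_edge_bps(gross_edge_bps, order_shares, adv_shares, daily_vol_bps):
    """Expected edge left after paying the estimated impact."""
    return gross_edge_bps - impact_bps(order_shares, adv_shares, daily_vol_bps)

# Hypothetical stock: 1,000,000 shares traded per day, 2% (200 bps) daily
# volatility, and a signal worth 10 bps before costs.
small = net_edge_bps(10.0, order_shares=100, adv_shares=1_000_000, daily_vol_bps=200.0)
large = net_edge_bps(10.0, order_shares=10_000, adv_shares=1_000_000, daily_vol_bps=200.0)
```

With these numbers the small order keeps about 8 of its 10 basis points, while the order one hundred times larger pays about 20 basis points of impact and turns a winning signal into a losing trade. The same signal is profitable or not depending purely on size.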

Perhaps the biggest misconception about machine learning funds is that all it takes to succeed is to buy a commercialized dataset and an off-the-shelf machine learning algorithm. This myth leads to an unreasonable expectation that high tech will make competitive alpha generation easier. In reality, it is quite the opposite: the proliferation of big data and machine learning will further raise the entry barrier and make the hedge fund industry more competitive.

Even for the most talented quant teams, it may take years of hard work and millions of dollars of R&D investment to build a highly complex custom-made system to collect and process data, identify factors, test signals, create an optimal portfolio and execute with the minimum possible market impact. But once the whole system is combined into an elegant monolithic model, economies of scale become apparent, and it may become a self-improving machine that just requires data as fuel.

If machine learning works for investment research, then why hasn’t it taken over the industry yet, and why do investors still work with fundamental researchers and discretionary fund managers? The reason lies in the nature of the trading strategies machine learning is most suitable for (at least for now). A high-Sharpe strategy exploiting short-term opportunities faces a tradeoff between scale and profitability. Statistical arbitrage is limited in capacity; some successful quant teams can’t even reinvest profits to achieve compounding growth without shrinking profit margins. And when they face a choice between taking 20% of an investor’s profit and retaining 100% of the return, simply borrowing money through leverage seems like a much cheaper option.
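The arithmetic behind that choice can be sketched in a few lines. The example compares what a team earns from one slot of strategy capacity if it fills the slot with outside money and charges a 20% performance fee, versus borrowing the same amount and keeping the full return net of financing costs. The 15% strategy return, 3% borrowing rate and $100 million slot are hypothetical, and the comparison ignores management fees and the extra risk the team takes on when it uses leverage.

```python
def performance_fee_income(external_capital, strategy_return, fee_rate=0.20):
    """Manager's income from running outside money: a share of investors' profit."""
    return external_capital * strategy_return * fee_rate

def leveraged_income(borrowed_capital, strategy_return, borrowing_rate):
    """Manager's income from borrowing instead: full return minus financing cost."""
    return borrowed_capital * (strategy_return - borrowing_rate)

capacity_slot = 100.0  # hypothetical capacity, in $ millions

fee_income = performance_fee_income(capacity_slot, strategy_return=0.15)
loan_income = leveraged_income(capacity_slot, strategy_return=0.15, borrowing_rate=0.03)
```

On these assumptions the fee route earns the team about $3 million and the leverage route about $12 million from the same scarce capacity, which is why a capacity-constrained strategy has little reason to accept outside capital.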

One way for successful stat arb teams to accommodate external investors without diluting their high-Sharpe strategy is to offer a separate product with a longer holding period and higher capacity. Of course, the returns of such scalable longer-term programs will not be the same as those of short-term high Sharpe funds, but investors can still get an attractive level of risk adjusted return and benefit from the state-of-the-art infrastructure and research process powered by machine learning.

Stat arb funds are an example of early adopters of machine learning, but theirs is far from the only investment strategy that can benefit from the technology. A growing number of portfolio managers who seemingly have nothing to do with quant trading have already started using products of natural language processing and image recognition as inputs into their research process. In the near future, virtually all asset managers will use machine learning techniques, either by developing their own tools or by consuming some sort of information product created by a third-party provider using elements of machine learning.

Managing Director at
