Last week at the Beryl Elites Alternative Investment Conference 2018, we started our panel discussion about finding an informational edge with the definition of alternative data. It turns out that in a highly heterogeneous and unstructured alternative data market, even the definition is not standardized.
Most people think of alternative data as a niche, esoteric segment of unstructured data that was created recently and has not been used in the investment process before. The most common examples include geolocation data, satellite images and social media feeds. This is not incorrect, but the definition is far from comprehensive, and it leads to the false perception that alternative data is only relevant to a handful of quants.
I think the definition should be much broader. What we refer to as “alternative data” is, in fact, any data used for a purpose other than the one originally intended. It includes:
(1) Unstructured data that hasn’t been traditionally employed in investment decision making, such as web traffic used to track consumer behavior or IoT data in logistics used to estimate shipping activity across a supply chain.
(2) Unstructured data that has always been an essential source of hedge funds’ informational edge but is now collected and used in a different manner. Examples include sell-side recommendations collected via alpha capture programs and turned into scores, and retail sentiment collected via crowdsourcing.
(3) Structured traditional numerical data repurposed for alternative use cases. Examples include order-book data used for alpha generation, and options or debt market data used in equity trading.
Another common perception is that the value of an alternative data source comes exclusively from its scarcity, and that widely available data is useless in alpha generation. This is not always accurate. The ultimate goal of data-driven research is not finding data but finding alpha in the data, and that takes expertise and creativity. By applying a proprietary feature engineering approach and combining different data sets to enhance the signal, one investment manager can find much more value than others in the same sources of data, just as musicians create very different music using the same keyboard.
Alternative data is high maintenance compared to traditional data sources. The alpha decay effect is real, and implementing a new alternative data set is just the beginning of the process. To maintain an acceptable ROI, an investment manager needs to monitor the data library and constantly search for ways to optimize it. Dropping data sets with a weak signal-to-noise ratio and adding those that contain uncorrelated features are the obvious steps. The less obvious way to increase efficiency is to replace data sets that perform well with alternative sources that cover the same factors at a lower cost.
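To make the library review concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the data set names, costs and signal values are made up, signal strength is approximated by the correlation between a signal and subsequent returns, and the two thresholds are arbitrary. A real review would use far more history and a proper backtest.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class DataSet:
    name: str           # hypothetical data set label
    annual_cost: float  # hypothetical licensing cost
    signal: list        # one signal value per period


def pearson(x, y):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def review_library(library, forward_returns, min_ic=0.05, min_overlap=0.9):
    """Return (names to drop, list of (data set, cheaper substitute)).

    min_ic and min_overlap are illustrative thresholds, not industry norms.
    """
    # Signal strength proxy: correlation of each signal with later returns.
    ic = {d.name: pearson(d.signal, forward_returns) for d in library}
    drop = [d.name for d in library if abs(ic[d.name]) < min_ic]
    keep = [d for d in library if d.name not in drop]

    # A cheaper data set that moves almost identically is a substitute candidate.
    substitutes = [
        (a.name, b.name)
        for a in keep
        for b in keep
        if b.annual_cost < a.annual_cost
        and abs(pearson(a.signal, b.signal)) >= min_overlap
    ]
    return drop, substitutes


# Made-up example data: six periods of returns and three data sets.
forward_returns = [0.01, -0.02, 0.03, -0.01, 0.02, -0.03]
library = [
    DataSet("card_panel", 100_000, [1, -2, 3, -1, 2, -3]),
    DataSet("web_traffic", 20_000, [1.1, -1.9, 2.8, -1.2, 2.1, -2.9]),
    DataSet("noise_feed", 50_000, [2, 1, 0, 0, 0, 0]),
]
drop, substitutes = review_library(library, forward_returns)
print("drop:", drop)
print("cheaper substitutes:", substitutes)
```

In this toy library the noise feed is flagged for removal because its signal is unrelated to returns, and the cheaper web traffic set is flagged as a possible replacement for the card panel because the two signals move almost identically.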
Interest in alternative data keeps growing. It is no longer only quants but also mainstream fundamental hedge fund managers who recognize that alternative data is a source of informational edge and competitive advantage. Even private equity funds have started using alternative data for deal sourcing and for monitoring the performance of their portfolio companies.
I expect that a year from now, speaking at an alternative investment conference, we will not need to start the panel discussion with definitions, as everybody in the audience, from hedge funds to family offices, will have some exposure to the data-driven economy. I also expect that the conversation will shift from data quality issues towards use cases as industry standards for data quality mature. And finally, in the near future, we can even drop the term “alternative”, as no one will be able to imagine a successful investment decision-making process without big data insights.