Twitter Data Outperforms Other Investment Strategies
March 27, 2012

From palm readers to the Weather Channel to Dionne Warwick and her Psychic Friends, people have always been interested in ways to foresee the future. Sometimes, as in the case of meteorology, these methods are based in science; at other times, the predictions rest on something quite different.

A University of California, Riverside professor recently looked to Twitter to help predict the traded volume and value of a stock for the following day. Using a trading model based on statistical data culled from the micro-blogging site, Professor Vagelis Hristidis and a small team of researchers were able to outperform not only the Dow Jones Industrial Average, but also other baseline strategies by between 1.4 percent and nearly 11 percent during a four-month simulation.

Hristidis noted several weaknesses in the study, which was presented last month at the Fifth ACM International Conference on Web Search & Data Mining in Seattle, including the fact that it covered only a period in which the Dow Jones dropped in value. Also, the simulated Twitter-based model did not start outperforming all other models until 30 days after the simulation began.

In the study, researchers focused on both the volume of tweets and how the content of each message relates to the stock. They randomly selected 150 stocks from the S&P 500 index and obtained both the daily closing price and the daily number of trades for the first half of 2010.

After filtering for tweets relevant to those companies during that time period, the researchers found that the number of stock trades was correlated less strongly with tweet volume than with what the study referred to as “connected components.” These components are tweets that all occur on the same trading day, but are about unique topics related to a particular stock.
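To make the idea of counting “connected components” concrete, here is a minimal sketch in Python. It is not the study's method: the article does not say how tweets are linked to one another, so the sketch assumes two tweets belong to the same component when they share a topic tag, directly or through a chain of other tweets. The function name and the sample tweets are hypothetical.

```python
def count_connected_components(tweets):
    """Count groups of tweets linked, directly or transitively, by a shared tag.

    `tweets` is a list of sets of topic tags. Linking tweets by shared tags
    is an assumption made for illustration, not a detail from the study.
    """
    parent = list(range(len(tweets)))

    def find(i):
        # Follow parent pointers to the group's root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def union(i, j):
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_j] = root_i

    first_seen = {}  # tag -> index of the first tweet that used it
    for idx, tags in enumerate(tweets):
        for tag in tags:
            if tag in first_seen:
                union(first_seen[tag], idx)
            else:
                first_seen[tag] = idx

    return len({find(i) for i in range(len(tweets))})


# Hypothetical tweets about one stock on one trading day.
day_tweets = [
    {"earnings"},
    {"earnings", "guidance"},
    {"lawsuit"},
    {"guidance"},
    {"recall", "lawsuit"},
    {"ceo"},
]

components = count_connected_components(day_tweets)
print(components)
```

Under this assumption, a day with many tweets about a single topic yields one component, while a day with fewer tweets spread across several unrelated topics yields several, which is why the count can diverge from raw tweet volume.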

Hristidis's team also found a somewhat looser association between these connected components and stock price by simulating a series of investments between March 1, 2010 and June 30, 2010. During that time frame, the Dow Jones Industrial Average fell 4.2 percent, while the investment model the researchers developed using Twitter data lost 2.4 percent on average.

The researchers also compared their Twitter-based model to several other models. The Twitter model outperformed an auto-regression model (which lost 13.1 percent), a random model (5.5 percent), and a fixed model (3.8 percent) based on a combination of market cap, company size, and total debt.
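The size of each gap can be checked with simple arithmetic. The short Python sketch below uses only the loss figures reported in this article and assumes the margins are plain differences in percentage points; the variable and function names are illustrative.

```python
# Simulated four-month returns (percent), as reported in this article.
returns = {
    "Twitter model": -2.4,
    "Dow Jones Industrial Average": -4.2,
    "Fixed model": -3.8,
    "Random model": -5.5,
    "Auto-regression model": -13.1,
}


def margins_over(strategy, results):
    """Gap, in percentage points, between `strategy` and every other entry."""
    base = results[strategy]
    return {
        name: round(base - value, 1)
        for name, value in results.items()
        if name != strategy
    }


margins = margins_over("Twitter model", returns)
for name, gap in sorted(margins.items(), key=lambda item: item[1]):
    print(f"{name}: {gap:+.1f} points")
```

Read this way, the smallest gap is 1.4 points (against the fixed model) and the largest is 10.7 points (against the auto-regression model), which matches the range of 1.4 percent to nearly 11 percent quoted earlier.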

Others have looked to the micro-blogging site to forecast other events, using somewhat less scientific means. Many of these prognostic models treat Twitter as a kind of “hive mind” with a greater collective consciousness or predictive ability.

Researchers from both private companies and public universities have looked at Twitter-based data to predict everything from Grammy winners to box office sales. A study at the USC Annenberg Innovation Lab inaccurately predicted this year's Academy Award winners after a year of collecting data from Twitter. A major reason for this missed prediction might be demographics. An L.A. Times study showed Oscar voters to be mostly older, white men, while Twitter users tend to be under 35 years old and evenly split between men and women.

The published paper that outlines the findings can be found here.