A couple of days ago, when Mark Zuckerberg, the billionaire founder and chief executive of Facebook, faced Congress on Capitol Hill for two days of detailed questioning by more than 100 lawmakers, he didn’t break a sweat. In basketball lingo, it was a serious mismatch in the paint, and a posterizing was inevitable. The CEO of the biggest social network came well prepared, and he spent the two halves of a game that lasted more than 20 hours explaining to long-serving senators how a social network (and the internet) actually works. Mr Zuckerberg completed his job successfully, once again. He protected his empire, which he stated “has no known competitors”, and investors gave him a thumbs-up on the stock exchange the next morning. The biggest takeaway in the press was this: senators don’t understand how Facebook works. And they are not alone. If you have a Facebook profile, chances are you don’t understand it either.
In this blog post, we are going to forecast the Bitcoin price based on text data from Twitter and Reddit. The observed Bitcoin price is formed by some supply and demand function, so if we model the demand side and assume that the supply side behaves in a reasonably stable way, we may end up with strong forecasting results. Social media data is already widely used in the financial industry, and it requires algorithms that can scale. However, social media data is unstructured and noisy, and supervised learning techniques are strongly domain dependent: they need a massive amount of labeled data to generalize well. We are going to tackle this problem by mapping the vectorized text data and sentiment directly to future price movements of Bitcoin, so that the price itself provides the labels. Economic theory claims that the price of an asset is composed of its utility value and its speculation value. In 2017, we observed a cryptocurrency market that skyrocketed even though no blockchain killer application has emerged so far. We therefore assume that at least 90% of that rise was driven by speculation and at most 10% by utility. This assumption strongly encourages our project.
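To make the idea of mapping vectorized text directly to future price movements more concrete, here is a minimal sketch in plain Python. It is not the pipeline used in this project. The posts and prices are made-up toy values, the helper names (`tokenize`, `build_vocab`, `vectorize`, `label_moves`, `train_logreg`, `predict_up_probability`) are hypothetical, and the bag-of-words features and logistic regression are stand-ins chosen only because they are easy to read. The point it illustrates is that the labels come from the price series itself, so no manual annotation of the text is needed.

```python
import math
from collections import Counter


def tokenize(text):
    # Lowercase and keep purely alphabetic tokens (a crude cleaner for noisy posts).
    return [tok for tok in text.lower().split() if tok.isalpha()]


def build_vocab(texts):
    # Map every token seen in the training texts to a column index.
    vocab = sorted({tok for text in texts for tok in tokenize(text)})
    return {tok: i for i, tok in enumerate(vocab)}


def vectorize(text, vocab):
    # Bag-of-words count vector; tokens outside the vocabulary are ignored.
    vec = [0.0] * len(vocab)
    for tok, count in Counter(tokenize(text)).items():
        if tok in vocab:
            vec[vocab[tok]] = float(count)
    return vec


def label_moves(prices):
    # Label period i with 1 if the NEXT price is higher than the current one, else 0.
    return [1 if prices[i + 1] > prices[i] else 0 for i in range(len(prices) - 1)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logreg(X, y, lr=0.5, epochs=200):
    # Plain stochastic gradient descent on the logistic loss.
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b


def predict_up_probability(text, vocab, w, b):
    # Estimated probability that the price rises in the period after this text.
    x = vectorize(text, vocab)
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


# Toy data, invented for illustration only: one post per period,
# and one more price than posts so every post has a "next" price.
posts = [
    "bitcoin to the moon buy now",
    "huge crash sell everything",
    "bullish breakout buy",
    "panic sell crash incoming",
]
prices = [100.0, 110.0, 95.0, 105.0, 90.0]

labels = label_moves(prices)  # the price series supplies the labels
vocab = build_vocab(posts)
X = [vectorize(post, vocab) for post in posts]
w, b = train_logreg(X, labels)

print(predict_up_probability("buy bullish", vocab, w, b))
print(predict_up_probability("crash sell", vocab, w, b))
```

With real data, each row would aggregate many tweets and Reddit posts per time window, a sentiment score could be appended as an extra feature, and the model would have to be evaluated on periods that come strictly after the training window to avoid look-ahead bias.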