We aim to predict the winner of the FIFA World Cup solely from data. The method applied is not fancy at all, but it should do the trick and yield some neat results (spoiler alert: Germany wins!). We use three datasets obtained from Kaggle which contain the outcomes of specific pairings between teams, the teams' rank and points, and the weighted point difference with the opponent. From these, we build a model to predict the outcome of each match of the 2018 FIFA World Cup. To make the results more appealing, we translate the outcome probabilities into fair odds.
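Translating probabilities into fair odds is a one-liner: a bookmaker with no margin would quote the inverse of each outcome probability as the decimal odds. A minimal sketch (the function name and the example probabilities are illustrative, not from the post):

```python
def fair_odds(p_home, p_draw, p_away):
    # Fair decimal odds are the inverse of the outcome probability:
    # an event with probability 0.5 corresponds to odds of 2.0.
    probs = {"home": p_home, "draw": p_draw, "away": p_away}
    return {outcome: round(1.0 / p, 2) for outcome, p in probs.items()}

fair_odds(0.5, 0.3, 0.2)
# {'home': 2.0, 'draw': 3.33, 'away': 5.0}
```

Real bookmaker odds are lower than these fair odds, since the margin is baked into the quoted prices.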
In this post, we are going to build an RNN-LSTM completely from scratch using only numpy (coding like it's 1999). LSTMs belong to the family of recurrent neural networks, which are very useful for learning from sequential data such as text, time series, or video. While a traditional feedforward network consists of an input layer, a hidden layer, an output layer, and the weights, biases, and activation functions between each layer, an RNN additionally incorporates a hidden state that is connected to itself (hence "recurrent").
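The recurrent hidden state is the whole trick. A minimal numpy sketch of one forward pass through a vanilla RNN cell (the dimensions and weight names `W_xh`, `W_hh`, `b_h` are illustrative choices, not code from the post):

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hidden, seq_len = 4, 8, 5

W_xh = rng.normal(scale=0.1, size=(n_hidden, n_in))      # input -> hidden
W_hh = rng.normal(scale=0.1, size=(n_hidden, n_hidden))  # hidden -> hidden (recurrent)
b_h = np.zeros(n_hidden)

h = np.zeros(n_hidden)                 # initial hidden state
xs = rng.normal(size=(seq_len, n_in))  # a toy input sequence

for x in xs:
    # The previous hidden state feeds back into the update --
    # this self-connection is what makes the network "recurrent".
    h = np.tanh(W_xh @ x + W_hh @ h + b_h)
```

An LSTM replaces this single `tanh` update with a gated cell (input, forget, and output gates plus a cell state), but the recurrence over the sequence has exactly this shape.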
In this blog post, we are going to forecast the Bitcoin price based on text data from Twitter and Reddit. Since the observed Bitcoin price is formed by some supply and demand function, modeling the demand side while assuming that the supply side behaves roughly stable may yield some outstanding forecasting results. Social media data has been used massively in the financial industry and requires algorithms that can scale. However, social media data is unstructured and noisy, and supervised learning techniques are strongly domain dependent and need massive amounts of labeled training data to generalize well. We tackle this problem by mapping vectorized text data and sentiment directly to future price movements of Bitcoin. Economic theory holds that the price of an asset is a composition of its utility and its speculative value. In 2017, we observed a crypto-currency market that skyrocketed; in the absence of a blockchain killer application so far, it is safe to assume that this was driven at least 90% by speculation and only 10% by utility. This assumption strongly motivates our project.
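The core idea of mapping vectorized text plus a sentiment score to price movements can be sketched in a few lines of numpy. The corpus, labels, and tiny sentiment lexicon below are made-up toy data, and the logistic-regression fit is a stand-in for whatever model the post actually trains:

```python
import numpy as np

# Toy tweets and whether the price went up (1) or down (0) afterwards.
corpus = ["bitcoin to the moon", "huge crash incoming",
          "buy the dip now", "sell everything"]
labels = np.array([1, 0, 1, 0])

# Tiny hand-made sentiment lexicon (illustrative only).
lexicon = {"moon": 1.0, "buy": 0.5, "dip": -0.2, "crash": -1.0, "sell": -1.0}
vocab = sorted({w for doc in corpus for w in doc.split()})

def featurize(doc):
    words = doc.split()
    counts = [words.count(w) for w in vocab]             # bag-of-words vector
    sentiment = sum(lexicon.get(w, 0.0) for w in words)  # lexicon sentiment score
    return np.array(counts + [sentiment])

X = np.vstack([featurize(d) for d in corpus])

# Fit a logistic regression with plain gradient descent.
w = np.zeros(X.shape[1])
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-X @ w))
    w -= 0.1 * X.T @ (p - labels) / len(labels)

preds = (1.0 / (1.0 + np.exp(-X @ w)) > 0.5).astype(int)
```

In practice the vectorizer would be TF-IDF or learned embeddings over millions of posts rather than a four-document bag of words, but the mapping from (text vector, sentiment) to a price-movement label is the same.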
Almost a year ago, with my laptop and a sleeping bag in my backpack, I attended an AI hackathon in Germany. Right after the kick-off meeting at 9:00 AM, I teamed up with two UX designers and one business developer. We immediately started brainstorming to identify a potential project using open data and AI. Our first idea was to find a new particle, or even new physics, in the CERN data. However, we dropped that idea rather quickly and decided to build a service for visually impaired people instead. The idea was basically to create an audiobook from any video content. Hackathons usually aren't long enough to create something entirely from scratch. Nevertheless, as I had been working with Deep Learning models for quite some time, it wouldn't take too long to recycle a couple of thousand lines of code and wrap them around a video feed. Given my professional experience, my task was to develop the back-end of a minimalistic prototype within 24 hours, while my teammates focused on the user interface, the presentation, and a bulletproof business case.