In this article, we present some interesting papers from the premier machine learning and data mining conference in Europe: the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD), which took place in Würzburg, Germany, from the 16th to the 20th of September 2019.
At the MIDAS workshop, Argimiro Arratia and Eduardo Sepulveda presented their work on “Convolutional Neural Networks, Image Recognition, and Financial Time Series Forecasting”. In it, they converted financial information into images and fed these into a convolutional neural network (CNN). This improved classification performance compared to feeding the raw time series into a CNN. To convert the time series into images, they used recurrence plots.
The following images show the conversion of an S&P 500 time series into recurrence plots.
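To make the idea concrete, here is a minimal sketch of how a recurrence plot can be computed from a one-dimensional series: a point (i, j) is marked when the (delay-embedded) states at times i and j lie closer than a threshold. The threshold, delay, and embedding dimension below are illustrative assumptions, not the paper's settings.

```python
import numpy as np

def recurrence_plot(series, eps=0.25, delay=1, dim=1):
    """Build a binary recurrence plot from a 1-D time series."""
    n = len(series) - (dim - 1) * delay
    # Time-delay embedding: each row is one reconstructed state vector.
    states = np.array([series[i:i + dim * delay:delay] for i in range(n)])
    # Pairwise Euclidean distances between all state vectors.
    dists = np.linalg.norm(states[:, None, :] - states[None, :, :], axis=-1)
    return (dists < eps).astype(np.uint8)

# Toy example: a sine wave stands in for a price series.
t = np.linspace(0, 8 * np.pi, 200)
rp = recurrence_plot(np.sin(t))
print(rp.shape)  # (200, 200)
```

Plotting `rp` as an image yields the characteristic diagonal-line patterns that periodic or quasi-periodic dynamics leave in a recurrence plot.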
These recurrence plots were used in a first experiment to predict the direction of the S&P 500 index price. To do this, they compared a Conv1D CNN fed with the raw series against a Conv2D CNN that took the recurrence plots as input, and obtained the following average results. The table shows the performance of the CNN with recurrence plots (RP) for S&P 500 price prediction:
| Model | Acc | Loss | 10-fold CV | AUC | Matthews cor. |
|-------|-----|------|------------|-----|---------------|
| Conv2D + RP | 0.63 | 2.69 | 63.22% (±0.93%) | -0.66 | -0.0026 |
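A Conv2D model needs image-shaped inputs, so each window of the price series has to become one recurrence-plot image labelled with the next step's direction. The sketch below shows one plausible way to build such a dataset; the window length, threshold, and normalisation are assumptions for illustration, not the authors' choices.

```python
import numpy as np

def make_rp_dataset(prices, window=32, eps=0.1):
    """Slice a price series into windows, turn each window into a
    recurrence plot, and label it with the next step's direction."""
    X, y = [], []
    for start in range(len(prices) - window - 1):
        w = prices[start:start + window]
        w = (w - w.mean()) / (w.std() + 1e-8)  # normalise each window
        rp = (np.abs(w[:, None] - w[None, :]) < eps).astype(np.float32)
        X.append(rp[..., None])                # channel axis for a Conv2D input
        direction = prices[start + window + 1] > prices[start + window]
        y.append(int(direction))
    return np.stack(X), np.array(y)

rng = np.random.default_rng(0)
prices = np.cumsum(rng.normal(size=300))  # random walk standing in for S&P 500 closes
X, y = make_rp_dataset(prices)
print(X.shape, y.shape)  # (267, 32, 32, 1) (267,)
```

The resulting `X` can be fed directly to any standard Conv2D image classifier, while the un-converted windows would go to the Conv1D baseline.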
In a second experiment, they tried to predict the possibility of bankruptcy for a set of U.S. banks, again comparing a Conv1D against a Conv2D with recurrence plots as input. They obtained the following results:
| Model | Acc | Loss | 10-fold CV | AUC | Matthews cor. |
|-------|-----|------|------------|-----|---------------|
| Conv2D + RP | 0.91 | 1.01 | 93.75% (±0.07%) | 0.83 | 0.67 |
These results confirmed the outcome of the S&P 500 experiment and showed that converting financial information into images can improve the accuracy of a CNN, as well as reduce the loss.
The highlight of the conference was Guillaume Doquet and Michèle Sebag receiving the Best Paper award for their paper on “Agnostic Feature Selection”.
In their work, they introduced the AgnoS algorithm, which combines an AutoEncoder with structural regularizations, and can be applied to difficult feature selection tasks.
The following image shows the neural net structure of AgnoS-S, the variant of AgnoS with LASSO regularization.
f1, …, fD represent the input features, a1, …, aD the feature weights, Φ the AutoEncoder network, and f̂1, …, f̂D the reconstructed features.
The regularization term is added to the loss function during the backpropagation of the AutoEncoder. This term pushes the weights a1, …, aD towards a sparse vector, which performs the feature selection. After training, the features are ranked by decreasing absolute value of the weights a1, …, aD.
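The mechanism can be sketched with a toy linear AutoEncoder in numpy (this is an illustration of the idea, not the authors' implementation): each feature is scaled by a learnable weight, an L1 penalty on those weights is added to the reconstruction loss, and gradients are propagated by hand. All sizes and the learning rate are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
n, D, h, lam, lr = 200, 5, 3, 0.1, 0.05

X = rng.normal(size=(n, D))         # toy data standing in for real features
a = np.ones(D)                      # per-feature weights (the selection vector)
W1 = 0.1 * rng.normal(size=(D, h))  # encoder
W2 = 0.1 * rng.normal(size=(h, D))  # decoder

losses = []
for _ in range(200):
    Xa = X * a                      # scale each feature by its weight
    Z = Xa @ W1                     # encode
    Xhat = Z @ W2                   # decode
    E = Xhat - X
    recon = 0.5 * np.mean(np.sum(E ** 2, axis=1))
    losses.append(recon + lam * np.sum(np.abs(a)))  # LASSO term on a

    # Manual backpropagation of the regularized loss.
    dXhat = E / n
    dW2 = Z.T @ dXhat
    dZ = dXhat @ W2.T
    dW1 = Xa.T @ dZ
    dXa = dZ @ W1.T
    da = (dXa * X).sum(axis=0) + lam * np.sign(a)

    W1 -= lr * dW1
    W2 -= lr * dW2
    a -= lr * da

# Features ranked by decreasing |a|, as described above.
ranking = np.argsort(-np.abs(a))
```

The L1 gradient `lam * np.sign(a)` is what shrinks uninformative weights towards zero, so the final ranking by |a| acts as the feature selection.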
They compared this algorithm to four unsupervised feature selection baselines on different datasets taken from the scikit-feature database and showed that AgnoS can better recover the whole feature set than the unsupervised baselines, but suffers from a higher computational cost.
Another interesting paper presented was “Fast Gradient Boosting Decision Trees with Bit-Level Data Structures” by Laurens Devos, Wannes Meert, and Jesse Davis.
The Gradient Boosting Decision Tree model is a powerful machine learning method, but training can be slow because it iteratively constructs decision trees to form an additive ensemble model. The majority of the time spent constructing a single tree goes into evaluating candidate splits. To overcome this problem, they presented the BitBoost algorithm, which represents the data and gradients using bitsets and bitslices. The following table shows the training times and accuracies on different data sets.
Consequently, the BitBoost algorithm can speed up the construction of the model 2 to 10 times compared to other state-of-the-art systems (LightGBM, XGBoost, CatBoost), without harming the predictive performance.
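The bitset part of the idea can be illustrated in a few lines of plain Python (a toy sketch, not BitBoost's actual C-level implementation): boolean masks over the instances are packed into integers, so evaluating how many instances a candidate split sends left becomes a single AND plus a popcount instead of a per-instance loop.

```python
def to_bitset(bools):
    """Pack a list of booleans into one Python int used as a bitset."""
    bits = 0
    for i, b in enumerate(bools):
        if b:
            bits |= 1 << i
    return bits

def popcount(x):
    """Count the set bits of an int."""
    return bin(x).count("1")

# Instances currently in a tree node, and a candidate split "feature f is 1".
in_node = [True, True, False, True, True, False, True, True]
feature = [True, False, False, True, False, True, True, False]

node_bits, feat_bits = to_bitset(in_node), to_bitset(feature)

# One AND + popcount replaces the per-instance loop of naive split evaluation.
left_count = popcount(node_bits & feat_bits)
print(left_count)  # 3
```

Bitslices extend the same trick to the gradients by storing each bit position of a low-precision gradient representation in its own bitset, so gradient sums also reduce to bitwise operations.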
The code can be found on GitHub.
Devos, L., Meert, W., & Davis, J. (2019). Fast Gradient Boosting Decision Trees with Bit-Level Data Structures. In Proceedings of ECML PKDD. Springer.
Doquet, G., & Sebag, M. (2019). Agnostic Feature Selection. In Proceedings of ECML PKDD. Springer.
Arratia, A., & Sepulveda, E. (2019). Convolutional Neural Networks, Image Recognition, and Financial Time Series Forecasting. In MIDAS Workshop at ECML PKDD.