Explainable Machine Learning Exploiting News and Domain-Specific Lexicon for Stock Market Forecasting

In this manuscript, we propose a Machine Learning approach to tackle a binary classification problem whose goal is to predict the magnitude (high or low) of future stock price variations for individual companies of the S&P 500 index. Sets of lexicons are generated from globally published art...

Full description

Bibliographic Details
Main Authors: Salvatore M. Carta, Sergio Consoli, Luca Piras, Alessandro Sebastian Podda, Diego Reforgiato Recupero
Format: Article
Language:English
Published: IEEE 2021-01-01
Series:IEEE Access
Subjects:
Online Access:https://ieeexplore.ieee.org/document/9355141/
Description
Summary:In this manuscript, we propose a Machine Learning approach to tackle a binary classification problem whose goal is to predict the magnitude (high or low) of future stock price variations for individual companies of the S&P 500 index. Sets of lexicons are generated from globally published articles with the goal of identifying the most impactful words on the market in a specific time interval and within a certain business sector. A feature engineering process is then performed out of the generated lexicons, and the obtained features are fed to a Decision Tree classifier. The predicted label (high or low) represents the underlying company's stock price variation on the next day, being either higher or lower than a certain threshold. The performance evaluation we have carried out through a walk-forward strategy, and against a set of solid baselines, shows that our approach clearly outperforms the competitors. Moreover, the devised Artificial Intelligence (AI) approach is explainable, in the sense that we analyze the white-box behind the classifier and provide a set of explanations on the obtained results.
ISSN:2169-3536