Financial Latent Dirichlet Allocation (FinLDA): Feature Extraction in Text and Data Mining for Financial Time Series Prediction

News has been an important source for many financial time series predictions based on fundamental analysis. However, digesting a massive amount of news and data published on the Internet to predict a market can be burdensome. This paper introduces a topic model based on latent Dirichlet allocation (...


Bibliographic Details
Main Authors: Nont Kanungsukkasem, Teerapong Leelanupab
Format: Article
Language: English
Published: IEEE 2019-01-01
Series: IEEE Access
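Subjects: Bayesian method; data mining; data preparation; data processing; feature extraction; financial time series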
Online Access: https://ieeexplore.ieee.org/document/8726415/
id doaj-2bced07196e24caaa12bc8171574fe3f
record_format Article
spelling doaj-2bced07196e24caaa12bc8171574fe3f; 2021-03-30T00:05:14Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2019-01-01; vol. 7, pp. 71645-71664; DOI 10.1109/ACCESS.2019.2919993; IEEE document 8726415
Financial Latent Dirichlet Allocation (FinLDA): Feature Extraction in Text and Data Mining for Financial Time Series Prediction
Nont Kanungsukkasem (https://orcid.org/0000-0002-2534-3662), Faculty of Information Technology, King Mongkut’s Institute of Technology Ladkrabang, Bangkok, Thailand
Teerapong Leelanupab (https://orcid.org/0000-0002-8117-0612), Faculty of Information Technology, King Mongkut’s Institute of Technology Ladkrabang, Bangkok, Thailand
https://ieeexplore.ieee.org/document/8726415/
Bayesian method; data mining; data preparation; data processing; feature extraction; financial time series
collection DOAJ
language English
format Article
sources DOAJ
author Nont Kanungsukkasem
Teerapong Leelanupab
title Financial Latent Dirichlet Allocation (FinLDA): Feature Extraction in Text and Data Mining for Financial Time Series Prediction
publisher IEEE
series IEEE Access
issn 2169-3536
publishDate 2019-01-01
description News has been an important source for many financial time series predictions based on fundamental analysis. However, digesting the massive amount of news and data published on the Internet to predict a market can be burdensome. This paper introduces Financial LDA (FinLDA), a topic model based on latent Dirichlet allocation (LDA) that discovers features from a combination of text, especially news articles, and financial time series. The features from FinLDA serve as additional input features for any machine learning algorithm to improve the prediction of the financial time series. We provide the posterior distributions used in Gibbs sampling for two variants of FinLDA and propose a framework for applying FinLDA in text and data mining for financial time series prediction. The experimental results show that the features from FinLDA empirically add value to the prediction and give better results than the comparative features, including topic distributions from the common LDA.
topic Bayesian method
data mining
data preparation
data processing
feature extraction
financial time series
url https://ieeexplore.ieee.org/document/8726415/