Predicting Stock Price Movements with Text Data using Labeling based on Financial Theory

Fredrik Ahnve, Kasper Fantenberg, Gustav Svensson, Daniel Hardt*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review

Abstract

We apply Natural Language Processing and supervised Machine Learning to predict stock price movements from approximately 30,000 ad-hoc disclosures issued by publicly traded companies on Nasdaq OMX Stockholm. Three labeling methods, each based on financial theory, are defined and assessed. The best results, obtained with Logistic Regression on TF-IDF features over character n-grams, reach an accuracy 6.3 percentage points above a majority class baseline. These results show that corporate ad-hoc disclosures, which are regulated to represent novel and value-relevant information, are particularly well-suited for this task. Furthermore, the most sophisticated labeling technique used, Jensen’s Alpha in the context of the Capital Asset Pricing Model, helps the model achieve its highest accuracy. The results therefore show that financial theory can help isolate the effect of an informational event on stock prices, improving the supervised Machine Learning approach. Finally, an algorithmic trading strategy is simulated with the best model, yielding positive abnormal returns.
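As an illustration of the labeling idea named in the abstract, the following is a minimal sketch of a Jensen's Alpha label under the Capital Asset Pricing Model. It is not the authors' implementation: the function names, the "up"/"down" label values, the zero threshold, and the assumption that beta, the market return, and the risk-free rate are already available for the event window are all hypothetical choices made for this example.

```python
def jensens_alpha(stock_return, market_return, risk_free_rate, beta):
    """Realised return minus the return CAPM would predict.

    CAPM expected return: r_f + beta * (r_m - r_f).
    All inputs are simple returns over the same event window (assumption).
    """
    expected_return = risk_free_rate + beta * (market_return - risk_free_rate)
    return stock_return - expected_return


def label_disclosure(stock_return, market_return, risk_free_rate, beta,
                     threshold=0.0):
    """Assign a hypothetical binary label from the sign of Jensen's Alpha.

    The threshold of 0.0 is an assumption for this sketch; the abstract
    does not state how the labels were cut.
    """
    alpha = jensens_alpha(stock_return, market_return, risk_free_rate, beta)
    return "up" if alpha > threshold else "down"


# A stock that rises 0.5% while CAPM implies 1.2% is labeled "down":
# the raw price moved up, but the move attributable to the event is negative.
example_label = label_disclosure(0.005, 0.01, 0.0, 1.2)
```

The point of the sketch is that the label depends on the return left over after the market-driven component is removed, which is how a theory-based label can differ from one built on the raw price change.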
Original language: English
Title of host publication: Proceedings - 2020 IEEE International Conference on Big Data. Big Data 2020
Editors: Xintao Wu, Chris Jermaine, Li Xiong, Xiaohua Hu, Olivera Kotevska, Siyuan Lu, Weija Xu, Srinivas Aluru, Chengxiang Zhai, Eyhab Al-Masri, Zhiyuan Chen, Jeff Saltz
Number of pages: 8
Place of Publication: Los Alamitos, CA
Publisher: IEEE
Publication date: 2020
Pages: 4365-4372
Article number: 9378054
ISBN (Print): 9781728162522
ISBN (Electronic): 9781728162515
Publication status: Published - 2020
Event: Eighth IEEE International Conference on Big Data. IEEE BigData 2020 - Virtual Event
Duration: 10 Dec 2020 - 13 Dec 2020
Conference number: 8
https://bigdataieee.org/BigData2020/

Conference

Conference: Eighth IEEE International Conference on Big Data. IEEE BigData 2020
Number: 8
Location: Virtual Event
Period: 10/12/2020 - 13/12/2020

Keywords

  • Algorithmic trading
  • Machine learning
  • Stock price prediction
  • Ad-hoc disclosures
  • Natural language processing
  • Text mining
  • Finance
