Abstract
We apply Natural Language Processing and supervised Machine Learning to predict stock price movements, based on approximately 30,000 ad-hoc disclosures issued by publicly traded companies on Nasdaq OMX Stockholm. Three different labeling methods, based on financial theory, are defined and assessed. The best results, using Logistic Regression and TF-IDF with character n-grams, achieve an increase of 6.3 percentage points above a majority class baseline. These results show that corporate ad-hoc disclosures, which are regulated to represent novel and value-relevant information, are particularly well-suited for this task. Furthermore, the most sophisticated labeling technique used, Jensen’s Alpha in the context of the Capital Asset Pricing Model, helps the model achieve its highest accuracy. The results therefore show that financial theory can help isolate the effect of an informational event on stock prices, improving the supervised Machine Learning approach. Finally, an algorithmic trading strategy is simulated with the best model, yielding positive abnormal returns.
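
To illustrate the labeling idea mentioned in the abstract, the following is a minimal sketch of how a disclosure could be labeled with Jensen's Alpha under the Capital Asset Pricing Model, using the textbook definition: the realized return minus the return that CAPM would predict. The function names, the `threshold` parameter, the "up"/"down" label strings, and all numeric values are assumptions made for illustration; the abstract does not specify the paper's return window, beta estimation, or exact labeling rule.

```python
def jensens_alpha(stock_return, market_return, risk_free_rate, beta):
    """Realized stock return minus the CAPM-expected return.

    CAPM-expected return = risk_free_rate + beta * (market_return - risk_free_rate)
    """
    expected_return = risk_free_rate + beta * (market_return - risk_free_rate)
    return stock_return - expected_return


def label_disclosure(stock_return, market_return, risk_free_rate, beta, threshold=0.0):
    """Hypothetical binary label: 'up' if alpha exceeds the threshold, else 'down'."""
    alpha = jensens_alpha(stock_return, market_return, risk_free_rate, beta)
    return "up" if alpha > threshold else "down"


# Illustrative numbers only (not taken from the paper).
# The stock rose 1%, but the market rose 1.5% and the stock has beta 1.2,
# so CAPM would have predicted roughly 1.8%: the disclosure is labeled 'down'.
underperformer = label_disclosure(
    stock_return=0.010, market_return=0.015, risk_free_rate=0.0, beta=1.2
)

# Same market move, but the stock rose 2%, beating the CAPM expectation.
outperformer = label_disclosure(
    stock_return=0.020, market_return=0.015, risk_free_rate=0.0, beta=1.2
)
```

The first case shows why such a label can differ from one based on the raw return: a positive price change is still labeled "down" when it falls short of what the market movement alone would explain.
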
Original language | English |
---|---|
Title of host publication | Proceedings - 2020 IEEE International Conference on Big Data. Big Data 2020 |
Editors | Xintao Wu, Chris Jermaine, Li Xiong, Xiaohua Hu, Olivera Kotevska, Siyuan Lu, Weija Xu, Srinivas Aluru, Chengxiang Zhai, Eyhab Al-Masri, Zhiyuan Chen, Jeff Saltz |
Number of pages | 8 |
Place of Publication | Los Alamitos, CA |
Publisher | IEEE |
Publication date | 2020 |
Pages | 4365-4372 |
Article number | 9378054 |
ISBN (Print) | 9781728162522 |
ISBN (Electronic) | 9781728162515 |
Publication status | Published - 2020 |
Event | Eighth IEEE International Conference on Big Data. IEEE BigData 2020, Virtual Event. Duration: 10 Dec 2020 → 13 Dec 2020. Conference number: 8. https://bigdataieee.org/BigData2020/ |
Conference
Conference | Eighth IEEE International Conference on Big Data. IEEE BigData 2020 |
---|---|
Number | 8 |
Location | Virtual Event |
Period | 10/12/2020 → 13/12/2020 |
Internet address | https://bigdataieee.org/BigData2020/ |
Keywords
- Algorithmic trading
- Machine learning
- Stock price prediction
- Ad-hoc disclosures
- Natural language processing
- Text mining
- Finance