Abstract
Fake news has grown into a multi-billion-dollar problem that the World Economic Forum ranks among the world’s top global risks. Research on fake news detection has predominantly focused on high-resource languages, whereas low-resource languages remain understudied because labeled fake news claims are scarce. Since research on fake news detection for low-resource languages is insufficient, and the global rise of disinformation also extends to Swedish and Norwegian, a deeper understanding of the field is needed. Hence, this study attempts to bridge the research gap in fake news detection for the Nordic low-resource languages by leveraging transfer learning from the high-resource English language using multilingual BERT. Data from esteemed media outlets and fact-checking organizations, such as the Norwegian Faktisk.no and the Swedish Källkritikbyrån, are used in combination with the English third-party fact-checked dataset MultiFC. The findings demonstrate that combining transfer learning from a high-resource language with transfer from linguistically similar languages yields the highest ability to distinguish fake claims from true claims. The extended cross-lingual learning setting displays an increase in predictive power of 11.07 and 6.40 percentage points over the best monolingual baselines for Norwegian and Swedish, respectively. The authors further propose how business value could be derived from these insights by complementing established fake news solutions with a broader language scope and more efficient utilization of resources.
Education | MSc in Business Administration and Information Systems (Graduate Programme), Final Thesis |
---|---|
Language | English |
Publication date | 2021 |
Number of pages | 94 |
Supervisors | Daniel Hardt |