Classifying Hate Speech on Social Media: Using Deep Bidirectional Transformers for Natural Language Processing

Jonas Vagn Jensen, Malte Lunde Hæstrup & Rasmus Rosenmeyer Poulsen

Student thesis: Master's thesis


Hate speech is more relevant now than ever. Widespread calls for action against online hate speech have pointed to social media companies as the ones who should act. This thesis takes a technical lens to assess the prevalence of hate speech and to analyse potential countermeasures through the field of machine learning, as well as researching the potential value of such measures. This leads to the research question: ‘Is it possible to classify online hate speech using Machine Learning, and can such classification methods create value for social media companies and society?’

We apply Naïve Bayes, Support Vector Machine and Logistic Regression classifiers to multiple hate speech datasets and compare their ability to identify hate speech against BERT, a natural language processing model developed by Google. Furthermore, we use different strategic business frameworks to identify how these classification methods can create value for social media companies and thereby an incentive for their development.

We found that the state-of-the-art BERT model had mixed results when tested against traditional models such as Logistic Regression and Support Vector Machines. Four natural language processing tasks were completed to test BERT, but ultimately BERT did not provide any significant increase in performance over the traditional models. We do, however, believe there is room to optimize the BERT model further. Lastly, we conducted two cross-domain experiments. Appending one dataset to another increased the scope of the models’ hate speech detection abilities, whereas training on one dataset and testing on another worsened the results of all models, owing to differences in data and annotation.
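The baseline comparison described above can be sketched in a few lines with scikit-learn: TF-IDF features fed into Naïve Bayes, a linear SVM, and Logistic Regression. The toy corpus and labels below are illustrative placeholders, not the thesis's datasets, and the scoring is a minimal stand-in for the thesis's evaluation setup.

```python
# Minimal sketch of the traditional-baseline comparison: three classifiers
# trained on TF-IDF features over a tiny placeholder corpus (hypothetical data).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Placeholder examples standing in for an annotated hate speech corpus.
train_texts = [
    "i hate you and your kind",
    "you people are disgusting",
    "go back to where you came from",
    "what a lovely day outside",
    "great game last night",
    "thanks for the helpful answer",
]
train_labels = [1, 1, 1, 0, 0, 0]  # 1 = hateful, 0 = not hateful

models = {
    "naive_bayes": MultinomialNB(),
    "svm": LinearSVC(),
    "logistic_regression": LogisticRegression(max_iter=1000),
}

scores = {}
for name, clf in models.items():
    # Unigram + bigram TF-IDF features, then the classifier.
    pipe = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), clf)
    pipe.fit(train_texts, train_labels)
    # Training accuracy only, for illustration; the thesis evaluates on
    # held-out and cross-domain test sets.
    scores[name] = pipe.score(train_texts, train_labels)
```

In practice each model would be tuned and scored on a held-out split (and, for the cross-domain experiments, on a test set drawn from a different dataset than the training data).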
Furthermore, the thesis presents a scenario analysis of a self-regulating entity that serves the collective interests of social media companies, lawmakers, social media users, and social media advertisers. By acting as an enforcer of certain Standards of Practice, and as a researcher, developer, and distributor of hate speech management technologies, such an organisation could create significant value for social media companies and society.

Education: MSc in Business Administration and Information Systems (Graduate Programme), Final Thesis
Publication date: 2021
Number of pages: 171
Supervisor: Daniel Hardt