How Can a Machine Learning-based LDA Model Help in Literature Search in Systematic Literature Review?

Amila Akagić, Selma Kadic-Maglajlic

Publication: Contribution to book/anthology/report; Contribution to book/anthology; Research; Peer-reviewed

Abstract

The systematic literature review (SLR) is an important method for summarizing previous research findings, and as such, it is relevant to both scholars and practitioners. The critical decision in an SLR is determining the keywords that define the articles to be analyzed further. Although the keywords are carefully selected, the SLR is highly biased due to a possible human error in the selection of keywords, which may lead to various omissions in further analysis. In addition, the number of articles published each year is increasing exponentially and studies are becoming more interdisciplinary, making it increasingly difficult to identify all relevant articles. In this study, we show how machine-learning algorithms can help identify relevant articles by using the latent Dirichlet allocation (LDA) model. This model is based on an unsupervised machine-learning process that enables the identification of articles and topics based on the semantic similarity of the entire article (body) text rather than only keywords. In this study, we demonstrate the application of the LDA method on the COVID-19 Open Research Dataset (CORD-19) database of over 750,000 scientific articles. We describe the main features of the LDA method and provide step-by-step instructions so that readers without a technical background can understand the LDA process. Finally, we provide access to the model trained on the CORD-19 database that enables rapid identification of marketing and management research topics within the database, including a set of “do-it-yourself” options that can help non-technical readers in their initial exercises with LDA.
Original language: English
Title: How to Achieve Societal Impact Through Engaged and Collaborative Scholarship: A Guide to Purposeful Marketing Research
Editors: Michel van der Borgh, Adam Lindgreen, Tobias Schäfers
Number of pages: 21
Place of publication: Cheltenham
Publisher: Edward Elgar Publishing
Publication date: 2024
Pages: 190-210
Chapter: 10
ISBN (Print): 9781800888524
ISBN (Electronic): 9781800888531
Status: Published - 2024
Series name: How To Guides

Keywords

  • Systematic literature review
  • Machine learning
  • Latent Dirichlet allocation
  • LDA
  • COVID-19 Open Research Dataset
