Recommending Articles Using Implicit Feedback Data

Mikkel Sikker Sørensen

Student thesis: Master thesis

Abstract

This paper explores different recommender systems, their implications, and their business relevance. The experiments are based on a dataset provided by the Danish news outlet Jyllands-Posten, containing millions of observations on thousands of articles, and they investigate how Jyllands-Posten can use this data to improve its subscribers' online customer experience. Different recommender systems and approaches to developing them have been tested and discussed. A ranking factorization model based on a collaborative filtering approach performed best, ahead of the item-similarity and popularity-based models that were also tested. The popularity-based approach served as the baseline for the other models' performance, since it is the simplest approach and works similarly to the solution Jyllands-Posten currently has in place. The popularity-based model recommends the same articles to all users accessing jyllands-posten.dk and thus lacks the element of personalization. This may also be why the ranking factorization model outperformed the popularity-based model by about 10-15% in every evaluation. The recommendation to implement the ranking factorization model comes with a catch: it should only start recommending to users who have read at least 80 articles on jyllands-posten.dk. With 3 recommendations given and a threshold of at least 80 read articles, the recommender system had an accuracy of 33.5%, so it is fair to assume that, statistically, at least one of the three recommended articles is a valuable recommendation. By giving personalized recommendations to its users, Jyllands-Posten can improve their satisfaction and better utilize the long tail of thousands of articles available on its site, which, regardless of their publishing date, might be relevant to some users.
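
The reasoning behind the 80-article threshold can be illustrated with a minimal Python sketch. It is not the thesis's implementation: it assumes that the reported "accuracy" is a precision-style measure over the 3 recommended articles, and every function name and the (user, article) data layout are hypothetical choices made for illustration.

```python
from collections import Counter


def popularity_recommendations(interactions, k=3):
    """Non-personalized baseline: the k most-read articles, identical for every user.

    `interactions` is an iterable of (user_id, article_id) pairs (assumed layout).
    """
    counts = Counter(article for _, article in interactions)
    return [article for article, _ in counts.most_common(k)]


def eligible_users(interactions, min_reads=80):
    """Users who have read at least `min_reads` distinct articles."""
    reads = Counter(user for user, _ in set(interactions))
    return {user for user, n in reads.items() if n >= min_reads}


def precision_at_k(recommended, relevant, k=3):
    """Share of the top-k recommended articles that the user actually read."""
    top_k = recommended[:k]
    return sum(1 for article in top_k if article in relevant) / k


def expected_hits(precision, k=3):
    """Expected number of valuable recommendations among k, given a precision."""
    return precision * k


if __name__ == "__main__":
    # With 3 recommendations at 33.5%, roughly one is expected to be valuable.
    print(expected_hits(0.335, k=3))
```

Under this reading, 3 recommendations at 33.5% give about 1.005 expected valuable articles per user, which is where the "at least one" argument comes from. It is an average, not a guarantee for each individual user.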

Educations: MSc in Business Administration and Information Systems, (Graduate Programme) Final Thesis
Language: English
Publication date: 2017
Number of pages: 61
Supervisors: Daniel Hardt