Predicting News Headline Popularity with Syntactic and Semantic Knowledge Using Multi-task Learning

Daniel Hardt, Dirk Hovy, Sotiris Lamprinidis

Research output: Chapter in Book/Report/Conference proceeding, Article in proceedings, Research, peer-review

Abstract

Newspapers need to attract readers with headlines, anticipating their readers’ preferences. These preferences rely on topical, structural, and lexical factors. We model each of these factors in a multi-task GRU network to predict headline popularity. We find that pre-trained word embeddings provide significant improvements over untrained embeddings, as does the combination of two auxiliary tasks, news-section prediction and part-of-speech tagging. However, we also find that performance is very similar to that of a simple Logistic Regression model over character n-grams. Feature analysis reveals structural patterns of headline popularity, including the use of forward-looking deictic expressions and second-person pronouns.
Language: English
Title of host publication: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
Editors: Ellen Riloff, David Chiang, Julia Hockenmaier, Jun’ichi Tsujii
Number of pages: 6
Place of Publication: Brussels
Publisher: Association for Computational Linguistics
Date: 2018
Pages: 659-664
State: Published - 2018
Event: 2018 Conference on Empirical Methods in Natural Language Processing - Square Meeting Center, Brussels, Belgium
Duration: 31 Oct 2018 to 4 Nov 2018
http://emnlp2018.org/

Conference

Conference: 2018 Conference on Empirical Methods in Natural Language Processing
Location: Square Meeting Center
Country: Belgium
City: Brussels
Period: 31/10/2018 to 04/11/2018
Internet address: http://emnlp2018.org/

Cite this

Hardt, D., Hovy, D., & Lamprinidis, S. (2018). Predicting News Headline Popularity with Syntactic and Semantic Knowledge Using Multi-task Learning. In E. Riloff, D. Chiang, J. Hockenmaier, & J. Tsujii (Eds.), Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing (pp. 659-664). Brussels: Association for Computational Linguistics.
Hardt, Daniel ; Hovy, Dirk ; Lamprinidis, Sotiris. / Predicting News Headline Popularity with Syntactic and Semantic Knowledge Using Multi-task Learning. Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. editor / Ellen Riloff ; David Chiang ; Julia Hockenmaier ; Jun’ichi Tsujii. Brussels : Association for Computational Linguistics, 2018. pp. 659-664
@inproceedings{b7aa3f7aba55440988071c6c4e0ed961,
title = "Predicting News Headline Popularity with Syntactic and Semantic Knowledge Using Multi-task Learning",
abstract = "Newspapers need to attract readers with headlines, anticipating their readers’ preferences. These preferences rely on topical, structural, and lexical factors. We model each of these factors in a multi-task GRU network to predict headline popularity. We find that pre-trained word embeddings provide significant improvements over untrained embeddings, as does the combination of two auxiliary tasks, news-section prediction and part-of-speech tagging. However, we also find that performance is very similar to that of a simple Logistic Regression model over character n-grams. Feature analysis reveals structural patterns of headline popularity, including the use of forward-looking deictic expressions and second-person pronouns.",
author = "Daniel Hardt and Dirk Hovy and Sotiris Lamprinidis",
year = "2018",
language = "English",
pages = "659--664",
editor = "Ellen Riloff and David Chiang and Julia Hockenmaier and Jun’ichi Tsujii",
booktitle = "Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing",
publisher = "Association for Computational Linguistics",
address = "United States",

}

Hardt, D, Hovy, D & Lamprinidis, S 2018, Predicting News Headline Popularity with Syntactic and Semantic Knowledge Using Multi-task Learning. in E Riloff, D Chiang, J Hockenmaier & J Tsujii (eds), Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Brussels, pp. 659-664, 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, 31/10/2018.

Predicting News Headline Popularity with Syntactic and Semantic Knowledge Using Multi-task Learning. / Hardt, Daniel; Hovy, Dirk; Lamprinidis, Sotiris.

Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. ed. / Ellen Riloff; David Chiang; Julia Hockenmaier; Jun’ichi Tsujii. Brussels : Association for Computational Linguistics, 2018. p. 659-664.

TY - GEN

T1 - Predicting News Headline Popularity with Syntactic and Semantic Knowledge Using Multi-task Learning

AU - Hardt, Daniel

AU - Hovy, Dirk

AU - Lamprinidis, Sotiris

PY - 2018

Y1 - 2018

AB - Newspapers need to attract readers with headlines, anticipating their readers’ preferences. These preferences rely on topical, structural, and lexical factors. We model each of these factors in a multi-task GRU network to predict headline popularity. We find that pre-trained word embeddings provide significant improvements over untrained embeddings, as does the combination of two auxiliary tasks, news-section prediction and part-of-speech tagging. However, we also find that performance is very similar to that of a simple Logistic Regression model over character n-grams. Feature analysis reveals structural patterns of headline popularity, including the use of forward-looking deictic expressions and second-person pronouns.

M3 - Article in proceedings

SP - 659

EP - 664

BT - Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

PB - Association for Computational Linguistics

CY - Brussels

ER -

Hardt D, Hovy D, Lamprinidis S. Predicting News Headline Popularity with Syntactic and Semantic Knowledge Using Multi-task Learning. In Riloff E, Chiang D, Hockenmaier J, Tsujii J, editors, Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Brussels: Association for Computational Linguistics. 2018. p. 659-664.