Scanpath Based N-Gram Models for Predicting Reading Behavior

Abhijit Mishra, Pushpak Bhattacharyya, Michael Carl

    Research output: Chapter in Book/Report/Conference proceeding > Conference abstract in proceedings > Research > peer-review

    Abstract

    Predicting reading behavior is a difficult task. Reading behavior depends on various linguistic factors (e.g., sentence length and structural complexity) and other factors (e.g., an individual's reading style and age). Ideally, a reading model should be similar to a language model, where the model is built upon overlapping word sequences of a fixed length (n-grams). However, it is difficult to decide which representation of gaze data (the unit of the n-grams) correlates best with the cognitive effort associated with reading. Moreover, the randomness associated with gaze data leads to data sparsity, making it difficult for gaze-based n-gram models to handle real test scenarios.
    It has already been observed that some important eye-movement phenomena are captured better by scanpaths than by individual fixations, saccades, and pauses. In this talk, we propose and validate an n-gram-based gaze model for reading. The units contributing to each n-gram are scanpaths, taken in temporal order. We describe different scanpath extraction techniques and choose the one that minimizes the entropy/perplexity of the system. To handle data sparsity, we cluster the scanpaths into several groups, assign an id to each group, and use n-grams of cluster ids instead of exact scanpaths.
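
    As an illustration only, and not the authors' implementation, the cluster-id n-gram idea described in the abstract could be sketched as follows. Everything specific here is an assumption: each scanpath is reduced to a hypothetical two-number feature vector (number of fixations, number of regressions), the clustering step is replaced by nearest-centroid assignment to hand-picked centroids, the model is a bigram model with add-one smoothing, and all function names are invented for this sketch. Perplexity on held-out sequences is the quantity one would compare across scanpath extraction techniques.

```python
import math
from collections import Counter


def ngrams(seq, n):
    """Return the overlapping n-grams of a sequence as tuples."""
    return [tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)]


def nearest_cluster(features, centroids):
    """Return the index of the centroid closest to a scanpath feature vector."""
    dists = [sum((f - c) ** 2 for f, c in zip(features, cen)) for cen in centroids]
    return dists.index(min(dists))


def to_cluster_ids(scanpaths, centroids):
    """Map a temporally ordered list of scanpath feature vectors to cluster ids."""
    return [nearest_cluster(s, centroids) for s in scanpaths]


def train_bigram(sequences, vocab_size):
    """Build an add-one-smoothed bigram model over cluster-id sequences.

    Returns a function prob(a, b) estimating P(b | a).
    """
    bigram_counts = Counter()
    context_counts = Counter()
    for seq in sequences:
        for a, b in ngrams(seq, 2):
            bigram_counts[(a, b)] += 1
            context_counts[a] += 1

    def prob(a, b):
        return (bigram_counts[(a, b)] + 1) / (context_counts[a] + vocab_size)

    return prob


def perplexity(prob, sequences):
    """Compute the bigram perplexity of cluster-id sequences under prob."""
    log_sum, count = 0.0, 0
    for seq in sequences:
        for a, b in ngrams(seq, 2):
            log_sum += math.log2(prob(a, b))
            count += 1
    if count == 0:
        raise ValueError("need at least one bigram to compute perplexity")
    return 2 ** (-log_sum / count)


if __name__ == "__main__":
    # Hypothetical centroids and scanpath features, made up for illustration.
    centroids = [(2.0, 0.0), (5.0, 1.0), (9.0, 3.0)]
    train = [
        to_cluster_ids([(2, 0), (5, 1), (2, 0), (6, 1)], centroids),
        to_cluster_ids([(3, 0), (5, 1), (9, 3)], centroids),
    ]
    held_out = [to_cluster_ids([(2, 0), (4, 1), (8, 3)], centroids)]
    model = train_bigram(train, vocab_size=len(centroids))
    print("held-out perplexity:", perplexity(model, held_out))
```

    Replacing exact scanpaths with a small set of cluster ids keeps the n-gram vocabulary small, which is what makes the counts dense enough to estimate; the smoothing covers bigrams that still go unseen.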
    Original language: English
    Title of host publication: Book of Abstracts: 17th European Conference on Eye Movements, 11-16 August 2013, Lund, Sweden
    Editors: Kenneth Holmqvist, Fiona Mulvey, Roger Johansson
    Place of Publication: Lund
    Publisher: Lund University
    Publication date: 14 Aug 2013
    Pages: 448
    Publication status: Published - 14 Aug 2013
    Event: 17th European Conference on Eye Movements. 2013 - Lund University, Lund, Sweden
    Duration: 11 Aug 2013 to 16 Aug 2013
    Conference number: 17
    http://ecem2013.eye-movements.org/

    Conference

    Conference: 17th European Conference on Eye Movements. 2013
    Number: 17
    Location: Lund University
    Country: Sweden
    City: Lund
    Period: 11/08/2013 to 16/08/2013
    Internet address
    Series: Journal of Eye Movement Research
    Number: 3
    Volume: 6
    ISSN: 1995-8692

    Cite this

    Mishra, A., Bhattacharyya, P., & Carl, M. (2013). Scanpath Based N-Gram Models for Predicting Reading Behavior. In K. Holmqvist, F. Mulvey, & R. Johansson (Eds.), Book of Abstracts: 17th European Conference on Eye Movements, 11-16 August 2013, Lund, Sweden (p. 448). Lund: Lund University. Journal of Eye Movement Research, No. 3, Vol. 6.
    @inbook{9ad067c1e06a4957aa073ae9bd8e6a91,
    title = "Scanpath Based N-Gram Models for Predicting Reading Behavior",
    author = "Abhijit Mishra and Pushpak Bhattacharyya and Michael Carl",
    year = "2013",
    month = "8",
    day = "14",
    language = "English",
    series = "Journal of Eye Movement Research",
    publisher = "Lund University",
    number = "3",
    volume = "6",
    pages = "448",
    editor = "Holmqvist, Kenneth and Mulvey, Fiona and Johansson, Roger",
    booktitle = "Book of Abstracts",
    }

