Analysis of MultiWord Expression Translation Errors in Statistical Machine Translation

Natalia Klyueva, Jeevanthi Liyanapathirana

    Research output: Chapter in Book/Report/Conference proceeding; Article in proceedings; Research; peer-review

    Abstract

    In this paper, we analyse the usage of multiword expressions (MWE) in Statistical Machine Translation (SMT). We use the Moses SMT toolkit to train models for the French-English and Czech-Russian language pairs. For each language pair, we built two models: a baseline model without additional MWE data and a model enhanced with information on MWE. For the French-English pair, we tried three methods of introducing the MWE data. For the Czech-Russian pair, we used just one method: adding automatically extracted data as a parallel corpus.
    Original language: English
    Title of host publication: Workshop Proceedings. Multi-Word Units in Machine Translation and Translation Technologies. MUMTTT 2015
    Editors: Gloria Corpas Pastor, Johanna Monti, Violeta Seretan, Ruslan Mitkov
    Place of Publication: Geneve
    Publisher: Tradulex
    Publication date: 2015
    Pages: 87-91
    ISBN (Print): 9782970073697
    Publication status: Published - 2015
    Event: The 2nd Workshop on Multi-word Units in Machine Translation and Translation Technology. MUMTTT 2015: Part of EUROPHRAS 2015 - Malaga, Spain
    Duration: 1 Jul 2015 - 2 Jul 2015
    Conference number: 2
    http://typo.uni-konstanz.de/parseme/index.php/events/118-mumttt-workshop-at-europhras-15

    Conference

    Conference: The 2nd Workshop on Multi-word Units in Machine Translation and Translation Technology. MUMTTT 2015
    Number: 2
    Country: Spain
    City: Malaga
    Period: 01/07/2015 - 02/07/2015

    Cite this

    Klyueva, N., & Liyanapathirana, J. (2015). Analysis of MultiWord Expression Translation Errors in Statistical Machine Translation. In G. C. Pastor, J. Monti, V. Seretan, & R. Mitkov (Eds.), Workshop Proceedings. Multi-Word Units in Machine Translation and Translation Technologies. MUMTTT 2015 (pp. 87-91). Geneve: Tradulex.
    @inproceedings{08d47a5702a24d1491ca63d76c3eb83a,
    title = "Analysis of MultiWord Expression Translation Errors in Statistical Machine Translation",
    abstract = "In this paper, we analyse the usage of multiword expressions (MWE) in Statistical Machine Translation (SMT). We use the Moses SMT toolkit to train models for the French-English and Czech-Russian language pairs. For each language pair, we built two models: a baseline model without additional MWE data and a model enhanced with information on MWE. For the French-English pair, we tried three methods of introducing the MWE data. For the Czech-Russian pair, we used just one method: adding automatically extracted data as a parallel corpus.",
    author = "Natalia Klyueva and Jeevanthi Liyanapathirana",
    year = "2015",
    language = "English",
    isbn = "9782970073697",
    pages = "87--91",
    editor = "Pastor, {Gloria Corpas} and Johanna Monti and Violeta Seretan and Ruslan Mitkov",
    booktitle = "Workshop Proceedings. Multi-Word Units in Machine Translation and Translation Technologies. MUMTTT 2015",
    publisher = "Tradulex",
    address = "Switzerland",
    }




    TY  - GEN
    T1  - Analysis of MultiWord Expression Translation Errors in Statistical Machine Translation
    AU  - Klyueva, Natalia
    AU  - Liyanapathirana, Jeevanthi
    PY  - 2015
    Y1  - 2015
    AB  - In this paper, we analyse the usage of multiword expressions (MWE) in Statistical Machine Translation (SMT). We use the Moses SMT toolkit to train models for the French-English and Czech-Russian language pairs. For each language pair, we built two models: a baseline model without additional MWE data and a model enhanced with information on MWE. For the French-English pair, we tried three methods of introducing the MWE data. For the Czech-Russian pair, we used just one method: adding automatically extracted data as a parallel corpus.
    M3  - Article in proceedings
    SN  - 9782970073697
    SP  - 87
    EP  - 91
    BT  - Workshop Proceedings. Multi-Word Units in Machine Translation and Translation Technologies. MUMTTT 2015
    A2  - Pastor, Gloria Corpas
    A2  - Monti, Johanna
    A2  - Seretan, Violeta
    A2  - Mitkov, Ruslan
    PB  - Tradulex
    CY  - Geneve
    ER  -
