Predicting Post-Editor Profiles from the Translation Process

Karan Singla, David Orrego-Carmona, Ashleigh Rhea Gonzales, Michael Carl, Srinivas Bangalore

    Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review


    The purpose of the current investigation is to predict post-editor profiles based on user behaviour and demographics using machine learning techniques to gain a better understanding of post-editor styles. Our study extracts process unit features from the CasMaCat LS14 database in the CRITT Translation Process Research Database (TPR-DB). The analysis has two main research goals: we create n-gram models based on user activity and part-of-speech sequences to automatically cluster post-editors, and we use discriminative classifier models to characterize post-editors based on a diverse range of translation process features. The classification and clustering of participants resulting from our study suggest this type of exploration could be used as a tool to develop new translation tool features or customization possibilities.
    Original language: English
    Title of host publication: Proceedings of the Workshop on Interactive and Adaptive Machine Translation
    Editors: Francisco Casacuberta, Marcello Federico, Philipp Koehn
    Number of pages: 10
    Publisher: Association for Machine Translation in the Americas (AMTA)
    Publication date: 2014
    Publication status: Published - 2014
    Event: The 11th Conference of the Association for Machine Translation in the Americas 2014 - Vancouver, Canada
    Duration: 22 Oct 2014 → 26 Oct 2014
    Conference number: 11


    Conference: The 11th Conference of the Association for Machine Translation in the Americas 2014