Abstract
In this paper, we analyse the use of multiword expressions (MWEs) in Statistical Machine Translation (SMT). We use the Moses SMT toolkit to train models for the French-English and Czech-Russian language pairs. For each language pair, two models were built: a baseline model without additional MWE data and a model enhanced with information on MWEs. For the French-English pair, we tried three methods of introducing the MWE data. For the Czech-Russian pair, we used just one method: adding automatically extracted data as a parallel corpus.
Original language | English |
---|---|
Title of host publication | Workshop Proceedings. Multi-Word Units in Machine Translation and Translation Technologies. MUMTTT 2015 |
Editors | Gloria Corpas Pastor, Johanna Monti, Violeta Seretan, Ruslan Mitkov |
Place of Publication | Geneve |
Publisher | Tradulex |
Publication date | 2015 |
Pages | 87-91 |
ISBN (Print) | 9782970073697 |
Publication status | Published - 2015 |
Event | The 2nd Workshop on Multi-word Units in Machine Translation and Translation Technology. MUMTTT 2015: Part of EUROPHRAS 2015 - Malaga, Spain. Duration: 1 Jul 2015 → 2 Jul 2015. Conference number: 2. http://typo.uni-konstanz.de/parseme/index.php/events/118-mumttt-workshop-at-europhras-15 |
Conference
Conference | The 2nd Workshop on Multi-word Units in Machine Translation and Translation Technology. MUMTTT 2015 |
---|---|
Number | 2 |
Country/Territory | Spain |
City | Malaga |
Period | 01/07/2015 → 02/07/2015 |
Internet address |