Abstract
In this paper, we analyse the use of multiword expressions (MWEs) in Statistical Machine Translation (SMT). We use the Moses SMT toolkit to train models for the French-English and Czech-Russian language pairs. For each language pair, two models were built: a baseline model without additional MWE data and a model enhanced with information on MWEs. For the French-English pair, we tried three methods of introducing the MWE data. For the Czech-Russian pair, we used just one method: adding automatically extracted data as a parallel corpus.
Original language | English |
---|---|
Title | Workshop Proceedings. Multi-Word Units in Machine Translation and Translation Technologies. MUMTTT 2015 |
Editors | Gloria Corpas Pastor, Johanna Monti, Violeta Seretan, Ruslan Mitkov |
Place of publication | Geneva |
Publisher | Tradulex |
Publication date | 2015 |
Pages | 87-91 |
ISBN (Print) | 9782970073697 |
Status | Published - 2015 |
Event | The 2nd Workshop on Multi-word Units in Machine Translation and Translation Technology. MUMTTT 2015 : Part of EUROPHRAS 2015 - Malaga, Spain. Duration: 1 Jul 2015 → 2 Jul 2015. Conference number: 2. http://typo.uni-konstanz.de/parseme/index.php/events/118-mumttt-workshop-at-europhras-15 |
Conference
Conference | The 2nd Workshop on Multi-word Units in Machine Translation and Translation Technology. MUMTTT 2015 |
---|---|
Number | 2 |
Country/Territory | Spain |
City | Malaga |
Period | 1 Jul 2015 → 2 Jul 2015 |
Internet address |