Abstract
Automatic speech recognition (ASR) relies on three resources: audio, orthographic transcriptions and a pronunciation dictionary. The dictionary or lexicon maps orthographic words to sequences of phones or phonemes that represent the pronunciation of the corresponding word. The quality of a speech recognition system depends heavily on the dictionary and the transcriptions therein. This paper presents an analysis of phonetic/phonemic features that are salient for current Danish ASR systems. This preliminary study consists of a series of experiments using an ASR system trained on the DK-PAROLE corpus. The analysis indicates that transcribing features such as stress or vowel duration has a negative impact on performance. The best performance is obtained with coarse phonetic annotation, which improves word error rate by 1% and sentence error rate by 3.8%.
Original language | English |
---|---|
Title | Proceedings of the 19th Nordic Conference of Computational Linguistics (NODALIDA 2013) |
Editors | Stephan Oepen, Kristin Hagen, Janne Bondi Johannessen |
Place of publication | Linköping |
Publisher | Linköping University Electronic Press |
Publication date | 2013 |
Pages | 321-330 |
ISBN (Print) | 9789175195896 |
Status | Published - 2013 |
Event | NODALIDA 2013: The 19th Nordic Conference of Computational Linguistics - University of Oslo, Oslo, Norway. Duration: 22 May 2013 → 24 May 2013. Conference number: 19. http://www.hf.uio.no/iln/english/research/news-and-events/events/conferences/2013/nodalida/index.html |
Conference
Conference | NODALIDA 2013 |
---|---|
Number | 19 |
Location | University of Oslo |
Country | Norway |
City | Oslo |
Period | 22/05/2013 → 24/05/2013 |
Internet address |
Name | NEALT (Northern European Association of Language Technology) Proceedings Series |
---|---|
Volume | 16 |
ISSN | 1736-6305 |
Keywords
- Automatic speech recognition
- Phonetics
- Phonology
- Speech
- Phonetic transcription