Analysis of Phonetic Transcriptions for Danish Automatic Speech Recognition

Andreas Søeborg Kirkedal

    Publication: Contribution to book/anthology/report > Conference contribution in proceedings > Research > Peer-reviewed

    Abstract

    Automatic speech recognition (ASR) relies on three resources: audio, orthographic transcriptions and a pronunciation dictionary. The dictionary or lexicon maps orthographic words to sequences of phones or phonemes that represent the pronunciation of the corresponding word. The quality of a speech recognition system depends heavily on the dictionary and the transcriptions therein. This paper presents an analysis of phonetic/phonemic features that are salient for current Danish ASR systems. This preliminary study consists of a series of experiments using an ASR system trained on the DK-PAROLE corpus. The analysis indicates that transcribing, e.g., stress or vowel duration has a negative impact on performance. The best performance is obtained with coarse phonetic annotation, which improves performance by 1% word error rate and 3.8% sentence error rate.
    Original language: English
    Title: Proceedings of the 19th Nordic Conference of Computational Linguistics (NODALIDA 2013)
    Editors: Stephan Oepen, Kristin Hagen, Janne Bondi Johannessen
    Place of publication: Linköping
    Publisher: Linköping University Electronic Press
    Publication date: 2013
    Pages: 321-330
    ISBN (Print): 9789175195896
    Status: Published - 2013
    Event: NODALIDA 2013: The 19th Nordic Conference of Computational Linguistics - University of Oslo, Oslo, Norway
    Duration: 22 May 2013 - 24 May 2013
    Conference number: 19
    http://www.hf.uio.no/iln/english/research/news-and-events/events/conferences/2013/nodalida/index.html

    Conference

    Conference: NODALIDA 2013
    Number: 19
    Location: University of Oslo
    Country: Norway
    City: Oslo
    Period: 22/05/2013 - 24/05/2013
    Name: NEALT (Northern European Association of Language Technology) Proceedings Series
    Volume: 16
    ISSN: 1736-6305

    Keywords

    • Automatic speech recognition
    • Phonetics
    • Phonology
    • Speech
    • Phonetic transcription
