Aalto system for the 2017 Arabic multi-genre broadcast challenge
Access rights
openAccess
acceptedVersion
A4 Article in a conference publication
This publication is imported from Aalto University research portal.
View publication in the Research portal (opens in new window)
View/Open full text file from the Research portal (opens in new window)
Unless otherwise stated, all rights belong to the author. You may download, display and print this publication for your own personal use. Commercial use is prohibited.
Language
en
Series
Automatic Speech Recognition and Understanding (ASRU), IEEE Workshop on, pp. 338-345
Abstract
We describe the speech recognition systems we built for MGB-3, the third Multi-Genre Broadcast challenge. This year's task was to build a system for transcribing Egyptian Dialect Arabic speech using a large audio corpus of primarily Modern Standard Arabic speech and only a small amount (5 hours) of Egyptian adaptation data. Our system, a combination of different acoustic models, language models and lexical units, achieved a Multi-Reference Word Error Rate of 29.25%, the lowest in the competition. On the older MGB-2 task, which was run again to indicate progress, we also achieved the lowest error rate: 13.2%. The result comes from combining state-of-the-art speech recognition methods: simple dialect adaptation for a Time-Delay Neural Network (TDNN) acoustic model (-27% errors compared to the baseline), Recurrent Neural Network Language Model (RNNLM) rescoring (an additional -5%), and system combination with Minimum Bayes Risk (MBR) decoding (yet another -10%). We also explored morph and character language models, which were particularly beneficial in providing a rich pool of systems for the MBR decoding.
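The figures quoted in the abstract are word error rates. As a rough illustration only, the sketch below computes the standard single-reference word error rate (word-level edit distance divided by the number of reference words). It is an assumption that this conveys the general idea; the Multi-Reference variant used in the challenge is not defined on this page and is not reproduced here. The function name `word_error_rate` is hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Standard single-reference WER: (substitutions + deletions + insertions)
    divided by the number of reference words. Not the challenge's
    multi-reference metric; illustrative only."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)
```

For example, `word_error_rate("a b c d", "a x c")` counts one substitution and one deletion against four reference words.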
Citation
Smit, P, Gangireddy, S, Enarvi, S, Virpioja, S & Kurimo, M 2018, Aalto system for the 2017 Arabic multi-genre broadcast challenge. in Automatic Speech Recognition and Understanding (ASRU), IEEE Workshop on. IEEE, pp. 338-345, IEEE Automatic Speech Recognition and Understanding Workshop, Okinawa, Japan, 16/12/2017. https://doi.org/10.1109/ASRU.2017.8268955