Aalto system for the 2017 Arabic multi-genre broadcast challenge

Smit, Peter; Gangireddy, Siva; Enarvi, Seppo; Virpioja, Sami; Kurimo, Mikko

doi:10.1109/ASRU.2017.8268955

Files

smit2017mgb.pdf (307.08 KB)

Access rights

openAccess

acceptedVersion

A4 Article in conference proceedings

This publication is imported from the Aalto University research portal.

Unless otherwise stated, all rights belong to the author. You may download, display and print this publication for Your own personal use. Commercial use is prohibited.

Date

2018

Department

Department of Signal Processing and Acoustics

Language

en

Series

Automatic Speech Recognition and Understanding (ASRU), IEEE Workshop on, pp. 338-345

Abstract

We describe the speech recognition systems we built for MGB-3, the third Multi-Genre Broadcast challenge. This year's task was to build a system for transcribing Egyptian Dialect Arabic speech, using a large audio corpus of primarily Modern Standard Arabic speech and only a small amount (5 hours) of Egyptian adaptation data. Our system, a combination of different acoustic models, language models and lexical units, achieved a Multi-Reference Word Error Rate of 29.25%, the lowest in the competition. On the older MGB-2 task, which was run again to indicate progress, we also achieved the lowest error rate: 13.2%. These results come from combining state-of-the-art speech recognition methods: simple dialect adaptation for a Time-Delay Neural Network (TDNN) acoustic model (-27% errors compared to the baseline), Recurrent Neural Network Language Model (RNNLM) rescoring (an additional -5%), and system combination with Minimum Bayes Risk (MBR) decoding (yet another -10%). We also explored morph and character language models, which were particularly beneficial in providing a rich pool of systems for the MBR decoding.

DOI

10.1109/ASRU.2017.8268955

Citation

Smit, P, Gangireddy, S, Enarvi, S, Virpioja, S & Kurimo, M 2018, Aalto system for the 2017 Arabic multi-genre broadcast challenge. in Automatic Speech Recognition and Understanding (ASRU), IEEE Workshop on. IEEE, pp. 338-345, IEEE Automatic Speech Recognition and Understanding Workshop, Okinawa, Japan, 16/12/2017. https://doi.org/10.1109/ASRU.2017.8268955

Permanent link to this item

https://urn.fi/URN:NBN:fi:aalto-201802091512

Collections

[article-cris] Sähkötekniikan korkeakoulu / ELEC
