Mispronunciation detection and diagnostics for spoken Finnish
School of Science
Master's thesis
Electronic archive copy is available via Aalto Thesis Database.
Language
en
Pages
60
Abstract
This thesis investigates mispronunciation detection and diagnosis (MDD) for spoken Finnish, leveraging recent advances in end-to-end neural speech models. We explore how techniques such as entropy regularization and structured decoding (e.g., beam search) affect the ability of a system to detect and diagnose learner pronunciation errors. Our approach builds on the pretrained XLS-R model, fine-tuned using the Finnish Parliament Corpus and evaluated on the Digitala test set.
We frame MDD as a phoneme-level classification task and report detection and diagnosis metrics such as F1-score and Diagnosis Accuracy Rate (DAR). Additionally, we incorporate distance-based metrics and statistical significance testing (the Wilcoxon signed-rank test) to evaluate transcription fidelity and robustness. Results show that beam decoding significantly reduces insertion errors and the overall character error rate (CER), while entropy regularization further stabilizes predictions. These improvements were especially evident in short utterances produced by L2 learners.
By revisiting and extending findings from prior work, we validate the potential of these modeling strategies in educational settings. This thesis contributes to Finnish language learning by providing a more transparent, learner-sensitive, and motivating feedback system, with insights applicable to other low-resource or morphologically rich languages.
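As a rough illustration of the distance-based metrics mentioned in the abstract, the sketch below computes a character error rate from a Levenshtein alignment and reports substitutions, deletions, and insertions separately. It is only a minimal sketch: the function names `edit_ops` and `cer` and the example strings are hypothetical, and the scoring tools actually used in the thesis are not stated on this page.

```python
def edit_ops(ref, hyp):
    """Count (substitutions, deletions, insertions) in one minimal
    Levenshtein alignment that turns `ref` into `hyp`."""
    n, m = len(ref), len(hyp)
    # cost[i][j] = edit distance between ref[:i] and hyp[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = cost[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            cost[i][j] = min(diag, cost[i - 1][j] + 1, cost[i][j - 1] + 1)

    # Walk back from the bottom-right corner to recover operation counts.
    subs = dels = ins = 0
    i, j = n, m
    while i > 0 or j > 0:
        mismatch = i > 0 and j > 0 and ref[i - 1] != hyp[j - 1]
        if i > 0 and j > 0 and cost[i][j] == cost[i - 1][j - 1] + mismatch:
            subs += int(mismatch)
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            dels += 1  # a reference character is missing from the hypothesis
            i -= 1
        else:
            ins += 1   # the hypothesis contains an extra character
            j -= 1
    return subs, dels, ins


def cer(ref, hyp):
    """Character error rate: (S + D + I) divided by the reference length."""
    if not ref:
        raise ValueError("reference must be non-empty")
    subs, dels, ins = edit_ops(ref, hyp)
    return (subs + dels + ins) / len(ref)


# Made-up example strings, not taken from the thesis data.
print(edit_ops("talo", "taloo"), cer("talo", "taloo"))
```

Keeping the three operation counts separate, rather than reporting only the total distance, is what makes a statement about insertion errors in particular checkable.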
Supervisor
Kurimo, Mikko
Thesis advisor
Phan, Nhan
Grósz, Tamás