Mispronunciation detection and diagnostics for spoken Finnish

School of Science | Master's thesis
Electronic archive copy is available via Aalto Thesis Database.

Language

en

Pages

60

Abstract

This thesis investigates mispronunciation detection and diagnosis (MDD) for spoken Finnish, leveraging recent advances in end-to-end neural speech models. We explore how techniques such as entropy regularization and structured decoding (e.g., beam search) affect a system's ability to detect and diagnose learner pronunciation errors. Our approach builds on the pretrained XLS-R model, fine-tuned on the Finnish Parliament Corpus and evaluated on the Digitala test set. We frame MDD as a phoneme-level classification task and report detection and diagnosis metrics such as F1-score and Diagnosis Accuracy Rate (DAR). Additionally, we use distance-based metrics and statistical significance testing (Wilcoxon signed-rank test) to evaluate transcription fidelity and robustness. Results show that beam decoding significantly reduces insertion errors and the overall character error rate (CER), while entropy regularization further stabilizes predictions. These improvements are especially evident in short utterances produced by L2 learners. By revisiting and extending findings from prior work, we validate the potential of these modeling strategies in educational settings. This thesis contributes to Finnish language learning by providing a more transparent, learner-sensitive, and motivating feedback system, with insights applicable to other low-resource or morphologically rich languages.
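
To make the transcription metrics mentioned in the abstract concrete, the following is a minimal sketch of how a character error rate and its insertion, deletion, and substitution counts can be computed from a Levenshtein alignment. It is not taken from the thesis: the function names `edit_ops` and `cer` and the example strings are hypothetical, and the thesis may use a different toolkit or a different normalization.

```python
def edit_ops(ref: str, hyp: str) -> dict:
    """Count edit operations in a minimum-cost alignment of hyp against ref.

    Hypothetical helper for illustration only; ties between equal-cost
    alignments are broken by preferring fewer substitutions, then deletions.
    """
    n, m = len(ref), len(hyp)
    # dp[i][j] = (cost, substitutions, deletions, insertions) for ref[:i] vs hyp[:j]
    dp = [[(0, 0, 0, 0)] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = (i, 0, i, 0)  # every reference character deleted
    for j in range(1, m + 1):
        dp[0][j] = (j, 0, 0, j)  # every hypothesis character inserted
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            c, s, d, k = dp[i - 1][j - 1]
            if ref[i - 1] == hyp[j - 1]:
                diag = (c, s, d, k)          # match, no cost
            else:
                diag = (c + 1, s + 1, d, k)  # substitution
            c, s, d, k = dp[i - 1][j]
            dele = (c + 1, s, d + 1, k)      # reference character missing in hyp
            c, s, d, k = dp[i][j - 1]
            ins = (c + 1, s, d, k + 1)       # extra character in hyp
            dp[i][j] = min(diag, dele, ins)  # tuples compare by cost first
    cost, subs, dels, inss = dp[n][m]
    return {"cost": cost, "sub": subs, "del": dels, "ins": inss}


def cer(ref: str, hyp: str) -> float:
    """Character error rate: (S + D + I) divided by the reference length."""
    if not ref:
        raise ValueError("reference must be non-empty")
    return edit_ops(ref, hyp)["cost"] / len(ref)


# Hypothetical example: an extra vowel in the hypothesis counts as one insertion.
ops = edit_ops("talo", "taalo")
print(ops, cer("talo", "taalo"))
```

The same counts can be computed over phoneme sequences instead of characters by passing lists of phoneme symbols, since the functions only rely on indexing and equality.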

Description

Supervisor

Kurimo, Mikko

Thesis advisor

Phan, Nhan
Grósz, Tamás
