Dysarthric speech classification from coded telephone speech using glottal features

Nonavinakere Prabhakera, Narendra; Alku, Paavo

doi:10.1016/j.specom.2019.04.003

Files

ELEC_narendra_alku_speech_communication.pdf (660.97 KB)

Access rights

openAccess

acceptedVersion

A1 Original article in a scientific journal

This publication is imported from the Aalto University research portal.

Unless otherwise stated, all rights belong to the author. You may download, display and print this publication for Your own personal use. Commercial use is prohibited.

Authors

Nonavinakere Prabhakera, Narendra

Alku, Paavo

Date

2019-07

Department

Department of Signal Processing and Acoustics

Language

en

Pages

9

Series

Speech Communication, Volume 110, pp. 47-55

Abstract

This paper proposes a new method for classifying dysarthric speech from coded telephone speech using glottal features. The glottal features are efficiently estimated from coded telephone speech using a recently proposed deep neural network-based glottal inverse filtering method. Two sets of glottal features are considered: (1) time- and frequency-domain parameters and (2) parameters based on principal component analysis (PCA). In addition, acoustic features are extracted from the coded telephone speech using the openSMILE toolkit. The acoustic and glottal features extracted from coded speech utterances, together with the corresponding dysarthric/healthy labels, are used to train support vector machine classifiers. Separate classifiers are trained on the glottal and acoustic features individually and on their combination. The coded telephone speech used in the experiments is generated using the adaptive multi-rate codec, which operates in two transmission bandwidths: narrowband (300 Hz - 3.4 kHz) and wideband (50 Hz - 7 kHz). The experiments were conducted using dysarthric and healthy speech utterances from the TORGO and universal access speech (UA-Speech) databases. The classification accuracies indicate that glottal features are effective in identifying dysarthria from coded telephone speech. The results also show that combining the glottal features with the openSMILE-based acoustic features resulted in improved classification accuracies, which validates the complementary nature of the glottal features. The proposed method could potentially be employed in telemonitoring applications to identify the presence of dysarthria from coded telephone speech.
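
The training setup described in the abstract (separate classifiers for glottal features, acoustic features, and their combination) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the feature vectors are synthetic random numbers rather than glottal or openSMILE features, the feature dimensions are arbitrary, and a linear SVM trained by plain sub-gradient descent on the hinge loss stands in for the paper's support vector machine classifiers, whose kernel and settings are not given here. All function names are hypothetical.

```python
import random


def dot(w, x):
    """Inner product of two equal-length lists."""
    return sum(wj * xj for wj, xj in zip(w, x))


def make_toy_features(n_per_class=40, seed=0):
    """Synthetic stand-ins for per-utterance features (NOT real speech data).

    Labels: -1 = healthy, +1 = dysarthric. Dimensions (3 and 4) are arbitrary.
    """
    rng = random.Random(seed)
    glottal, acoustic, labels = [], [], []
    for label in (-1, 1):
        for _ in range(n_per_class):
            glottal.append([rng.gauss(0.8 * label, 1.0) for _ in range(3)])
            acoustic.append([rng.gauss(0.8 * label, 1.0) for _ in range(4)])
            labels.append(label)
    return glottal, acoustic, labels


def train_linear_svm(X, y, lam=0.01, eta=0.01, epochs=100, seed=0):
    """Linear SVM via stochastic sub-gradient descent on the regularised hinge loss."""
    rng = random.Random(seed)
    w, b = [0.0] * len(X[0]), 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            violated = y[i] * (dot(w, X[i]) + b) < 1
            # L2 shrinkage on the weights (the bias is left unregularised).
            w = [wj * (1 - eta * lam) for wj in w]
            if violated:
                # Hinge-loss sub-gradient step for a margin violation.
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
                b += eta * y[i]
    return w, b


def accuracy(model, X, y):
    """Fraction of samples whose predicted sign matches the label."""
    w, b = model
    hits = sum(1 for xi, yi in zip(X, y) if (1 if dot(w, xi) + b >= 0 else -1) == yi)
    return hits / len(y)


def evaluate_feature_sets(n_per_class=40, seed=0):
    """Train one classifier per feature set and return held-out accuracies."""
    glottal, acoustic, labels = make_toy_features(n_per_class, seed)
    feature_sets = {
        "glottal": glottal,
        "acoustic": acoustic,
        # Combination = concatenating the two vectors of each utterance.
        "combined": [g + a for g, a in zip(glottal, acoustic)],
    }
    train_idx = range(0, len(labels), 2)  # even indices for training
    test_idx = range(1, len(labels), 2)   # odd indices held out
    results = {}
    for name, X in feature_sets.items():
        model = train_linear_svm([X[i] for i in train_idx], [labels[i] for i in train_idx])
        results[name] = accuracy(model, [X[i] for i in test_idx], [labels[i] for i in test_idx])
    return results


if __name__ == "__main__":
    for name, acc in evaluate_feature_sets().items():
        print(f"{name}: {acc:.2f}")
```

Because the data are random, the printed accuracies say nothing about dysarthria detection; the sketch only shows the structure of comparing individual feature sets against their concatenation on the same train/test split.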

Keywords

Dysarthric speech, glottal parameters, glottal source estimation, glottal inverse filtering, openSMILE, support vector machines, telemonitoring

DOI

10.1016/j.specom.2019.04.003

Citation

Nonavinakere Prabhakera, N & Alku, P 2019, 'Dysarthric speech classification from coded telephone speech using glottal features', Speech Communication, vol. 110, pp. 47-55. https://doi.org/10.1016/j.specom.2019.04.003

Permanent link to this item

https://urn.fi/URN:NBN:fi:aalto-201907304483

Collections

[article-cris] School of Electrical Engineering / ELEC
