Using cepstral coefficients for Inhalation pause detection in spontaneous speech

Research output: Chapter in Book/Report/Conference proceeding › Paper in conference proceeding

Abstract

A method for recognizing inhalations in spontaneous speech is presented. It is similar to template matching: a distance measure is calculated between a reference sound and an equally long portion of the sound being tracked. The feature representation consists of the standard Mel Frequency Cepstral Coefficients (MFCCs), obtained by performing a discrete cosine transform of the mel-scaled filterbank spectrum. MFCCs are calculated every 5 ms. The comparison is then done by computing the Euclidean distance between the cepstral coefficients of each frame of the two sounds. A low distance value means that the two compared sounds are likely to be similar, that is, that the tracked portion is likely to be an inhalation. The method can detect inhalations in both male and female spontaneous speech. It is best suited to signals with low noise and high average intensity (studio recordings) but can also be used on noisier recordings with lower average intensity, albeit with poorer results.
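The frame-by-frame comparison described in the abstract can be illustrated with a minimal sketch. It assumes the MFCC matrices have already been computed (one row per 5 ms frame) and is not the authors' implementation. The function names, the use of the mean over frames to combine per-frame distances, the threshold value, and the synthetic data are all assumptions made for illustration; the abstract does not specify them.

```python
import numpy as np

FRAME_STEP_S = 0.005  # 5 ms between MFCC frames, as stated in the abstract


def sliding_template_distance(template, track):
    """Compare a reference inhalation with every equally long window of a track.

    template: (n_template_frames, n_coeffs) MFCC matrix of the reference sound
    track:    (n_track_frames, n_coeffs) MFCC matrix of the signal being searched
    Returns one distance per window start frame.
    Assumption: per-frame Euclidean distances are averaged over the window.
    """
    template = np.asarray(template, dtype=float)
    track = np.asarray(track, dtype=float)
    n = template.shape[0]
    if track.shape[0] < n or track.shape[1] != template.shape[1]:
        raise ValueError("track must be at least as long as template, "
                         "with the same number of coefficients")
    n_windows = track.shape[0] - n + 1
    distances = np.empty(n_windows)
    for start in range(n_windows):
        window = track[start:start + n]
        # Euclidean distance between the coefficient vectors of each frame pair
        frame_dist = np.linalg.norm(window - template, axis=1)
        distances[start] = frame_dist.mean()
    return distances


def detect_candidates(distances, threshold):
    """Return the window start frames whose distance falls below the threshold."""
    return np.flatnonzero(distances < threshold)


# Synthetic demonstration: random "MFCCs" with a noisy copy of the template
# planted at frame 40 (hypothetical data, not real speech).
rng = np.random.default_rng(0)
template = rng.normal(size=(20, 13))
track = rng.normal(size=(200, 13))
track[40:60] = template + 0.05 * rng.normal(size=(20, 13))

d = sliding_template_distance(template, track)
hits = detect_candidates(d, threshold=1.0)  # threshold chosen arbitrarily
hit_times_s = hits * FRAME_STEP_S           # start times of candidate windows
```

In this sketch a low value in `d` marks a window that resembles the reference inhalation. With real recordings the threshold would have to be tuned, and the abstract suggests results degrade on noisy, low-intensity signals.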

Details

Research areas and keywords

Subject classification (UKÄ)

  • General Language Studies and Linguistics

Keywords

  • breathing pauses, inhalations, inhalation pause, cepstral coefficient, pause, spontaneous speech
Original language: English
Title of host publication: Proceedings of SPECOM 2005
Editors: G. Kokkinakis, N. Fakotakis, E. Dermatas, R. Potapova
Publisher: University of Patras
Pages: 143-146
Volume: 1
ISBN (Print): 5-7452-0110-x
Publication status: Published - 2005
Publication category: Research
Peer-reviewed: Yes
Event: SPECOM 2005 - Patras, Greece

Publication series

Volume: 1

Conference

Conference: SPECOM 2005
Country: Greece
City: Patras

Bibliographic note

The information about affiliations in this record was updated in December 2015. The record was previously connected to the following departments: Linguistics and Phonetics (015010003), Structural Mechanics (011032000)
