What Else is New Than the Hamming Window? Robust MFCCs for Speaker Recognition via Multitapering

Tomi Kinnunen, Rahim Saeidi, Johan Sandberg, Maria Sandsten

Research output: Chapter in Book/Report/Conference proceeding › Paper in conference proceeding › peer-review

Abstract

Mel-frequency cepstral coefficients (MFCCs) are usually derived from a Hamming-windowed DFT spectrum. In this paper, we advocate using a so-called multitaper method instead. Multitaper methods form a spectrum estimate using multiple window functions and frequency-domain averaging. Multitapers provide a robust spectrum estimate but have not received much attention in speech processing. Our speaker recognition experiment on NIST 2002 yields equal error rates (EERs) of 9.66 % (clean data) and 16.41 % (-10 dB SNR) for the conventional Hamming method, and 8.13 % (clean data) and 14.63 % (-10 dB SNR) using multitapers. Multitapering is a simple and robust alternative to the Hamming window method.
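
The contrast the abstract draws between a single Hamming window and "multiple window functions and frequency-domain averaging" can be sketched in a few lines. The sketch below is illustrative only and is not taken from the paper: the choice of sine tapers, the number of tapers (6), the uniform weights, the FFT size (512), and all function names are assumptions made here for the example.

```python
import numpy as np

def sine_tapers(frame_len, num_tapers):
    """Return a (num_tapers, frame_len) array of orthonormal sine tapers.

    Assumption: sine tapers are used purely for illustration; the abstract
    does not say which taper family the authors used.
    """
    n = np.arange(1, frame_len + 1)
    k = np.arange(1, num_tapers + 1)[:, None]
    return np.sqrt(2.0 / (frame_len + 1)) * np.sin(np.pi * k * n / (frame_len + 1))

def hamming_spectrum(frame, nfft=512):
    """Conventional estimate: squared magnitude of one Hamming-windowed DFT."""
    window = np.hamming(len(frame))
    # Scale to unit energy so the result is comparable to the taper-based estimate.
    window = window / np.sqrt(np.sum(window ** 2))
    return np.abs(np.fft.rfft(window * frame, nfft)) ** 2

def multitaper_spectrum(frame, num_tapers=6, nfft=512, weights=None):
    """Multitaper estimate: weighted average of per-taper power spectra."""
    tapers = sine_tapers(len(frame), num_tapers)
    if weights is None:
        # Assumption: uniform weights; other weighting schemes are possible.
        weights = np.full(num_tapers, 1.0 / num_tapers)
    # One power spectrum per taper, shape (num_tapers, nfft // 2 + 1).
    eigenspectra = np.abs(np.fft.rfft(tapers * frame, nfft, axis=-1)) ** 2
    # Frequency-domain averaging across tapers.
    return weights @ eigenspectra
```

In an MFCC front end, either function would supply the per-frame power spectrum that is then passed to the mel filterbank, logarithm, and DCT. Averaging several tapered spectra lowers the variance of the estimate compared with a single windowed periodogram, at the cost of some frequency resolution.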
Original language: English
Title of host publication: Interspeech 2010
Pages: 2734-2737
Publication status: Published - 2010
Event: Interspeech 2010 - Makuhari, Japan
Duration: 0001 Jan 2 → …

Conference

Conference: Interspeech 2010
Country/Territory: Japan
City: Makuhari
Period: 0001/01/02 → …

Subject classification (UKÄ)

  • Probability Theory and Statistics

Free keywords

  • speaker verification
  • multiple window method
