Abstract
Many state-of-the-art multichannel speech enhancement methods rely on second-order statistics of the desired speech signal, the noise signal, or both. Estimating these statistics is difficult in practice, so the methods typically perform well below their theoretical potential. We propose two multichannel enhancement techniques that instead rely on a model for voiced speech. That is, the proposed methods are driven by the signals' fundamental frequencies, which may be accurately estimated even in noisy scenarios. The first method is designed independently of the microphone array geometry and source position, whereas the second approach uses both. This lets us investigate when such information is worth exploiting in the presence of localization errors and violations of the spatial assumptions. Numerical results show that the proposed method is able to outperform competing methods in terms of both output SNR and PESQ score.
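The premise that a fundamental frequency can be recovered from a noisy harmonic signal can be illustrated with a small single-channel sketch. This is not the paper's method: it assumes a plain sum-of-cosines model for voiced speech and uses a generic harmonic-summation grid search. The function names, the amplitudes, the 0 dB noise level, and the search grid are all hypothetical choices made for illustration.

```python
import numpy as np


def harmonic_signal(f0, amps, fs, n_samples):
    """Sum of cosines at integer multiples of f0 (assumed voiced-speech model)."""
    n = np.arange(n_samples)
    orders = np.arange(1, len(amps) + 1)
    return np.asarray(amps) @ np.cos(2 * np.pi * f0 * np.outer(orders, n) / fs)


def estimate_f0(x, fs, n_harmonics, f0_grid):
    """Pick the grid candidate whose harmonics capture the most signal power."""
    n = np.arange(len(x))
    orders = np.arange(1, n_harmonics + 1)
    costs = []
    for f0 in f0_grid:
        # Complex exponentials at the candidate's harmonic frequencies.
        basis = np.exp(-2j * np.pi * f0 * np.outer(orders, n) / fs)
        # Harmonic summation: total periodogram power over the harmonics.
        costs.append(np.sum(np.abs(basis @ x) ** 2))
    return f0_grid[int(np.argmax(costs))]


# Illustrative parameters (assumptions, not taken from the paper).
fs = 8000.0
true_f0 = 200.0
amps = [1.0, 0.5, 0.25]

clean = harmonic_signal(true_f0, amps, fs, n_samples=400)

# White Gaussian noise with the same power as the clean signal (0 dB SNR).
rng = np.random.default_rng(0)
noise = rng.normal(scale=np.sqrt(np.mean(clean ** 2)), size=clean.shape)
noisy = clean + noise

f0_hat = estimate_f0(noisy, fs, n_harmonics=3, f0_grid=np.arange(80.0, 401.0))
print(f"true f0 = {true_f0} Hz, estimated f0 = {f0_hat} Hz")
```

The grid search is the simplest possible choice here. The sketch also assumes the number of harmonics is known, which a practical estimator would have to determine from the data.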
Original language | English |
---|---|
Title of host publication | 2017 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2017 - Proceedings |
Publisher | IEEE - Institute of Electrical and Electronics Engineers Inc. |
Pages | 501-505 |
Number of pages | 5 |
ISBN (Electronic) | 9781509041176 |
Publication status | Published - 2017 Jun 16 |
Event | 42nd IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2017 - New Orleans, United States. Duration: 2017 Mar 5 → 2017 Mar 9. http://www.ieee-icassp2017.org/ |
Conference
Conference | 42nd IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2017 |
---|---|
Abbreviated title | ICASSP 2017 |
Country/Territory | United States |
City | New Orleans |
Period | 2017/03/05 → 2017/03/09 |
Internet address | http://www.ieee-icassp2017.org/ |
Subject classification (UKÄ)
- Probability Theory and Statistics
- Signal Processing
Free keywords
- multichannel speech enhancement
- voiced speech
- MMSE filtering
- harmonic filters
- DOA mismatch