Speech Signal Processing

Audio Examples

Improved MMSE-Based Noise PSD Tracking Using Temporal Cepstrum Smoothing

Here we compare the proposed noise power spectral density (PSD) estimators [1]-[2] to the bias-compensated MMSE approach (MMSE-BC) [3] and the Minimum Statistics approach [4]. The approach based on temporal cepstrum smoothing (TCS) [1] efficiently reduces musical noise, even when, as in this example, the decision-directed approach is used for estimating the a priori SNR. The approach based on speech presence probability (SPP) estimation [2] performs similarly to the MMSE-BC approach [3], meaning that it tracks changes in the noise level quickly. Further, this approach has an extremely low computational complexity and is thus well suited for mobile applications. The code for this approach can be found here.
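
To illustrate why an SPP-based noise PSD tracker can be so cheap, the following is a minimal Python sketch of a per-bin update in the spirit of [2]. It is not the authors' released code. The function name `spp_noise_update` and the parameter values (fixed a priori SNR of 15 dB under speech presence, smoothing constants 0.8 and 0.9, the 0.99 cap, equal priors for speech presence and absence) are assumptions chosen for illustration and should be checked against [2] before use.

```python
import math


def spp_noise_update(periodogram, noise_psd, spp_smoothed,
                     xi_h1_db=15.0, alpha_psd=0.8, alpha_spp=0.9):
    """One frame update of the noise PSD estimate for a single frequency bin.

    periodogram  : |Y|^2 of the current noisy frame in this bin
    noise_psd    : noise PSD estimate from the previous frame (must be > 0)
    spp_smoothed : recursively smoothed speech presence probability
    All default parameter values are illustrative assumptions.
    Returns the updated (noise_psd, spp_smoothed).
    """
    xi = 10.0 ** (xi_h1_db / 10.0)  # fixed a priori SNR under speech presence

    # A posteriori speech presence probability, assuming equal priors
    # for speech presence and speech absence.
    snr_post = periodogram / noise_psd
    spp = 1.0 / (1.0 + (1.0 + xi) * math.exp(-snr_post * xi / (1.0 + xi)))

    # Guard against stagnation: if speech has seemed present for a long time,
    # cap the probability so the noise estimate can still be updated.
    spp_smoothed = alpha_spp * spp_smoothed + (1.0 - alpha_spp) * spp
    if spp_smoothed > 0.99:
        spp = min(spp, 0.99)

    # Soft decision between the noisy periodogram (speech absent)
    # and the previous noise PSD estimate (speech present).
    noise_periodogram = (1.0 - spp) * periodogram + spp * noise_psd

    # First-order recursive smoothing over time.
    noise_psd = alpha_psd * noise_psd + (1.0 - alpha_psd) * noise_periodogram
    return noise_psd, spp_smoothed
```

Each frame and bin needs only one exponential and a handful of multiplications, with no minimum search over a long window. In a complete system this update would be run for every frequency bin of every STFT frame, with the noise PSD initialised from the first few frames.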

Modulated Gaussian noise

TCS-based noise [1]
SPP-based noise [2]
Minimum Statistics [4]


[1] T. Gerkmann and R. C. Hendriks, "Improved MMSE-based noise PSD tracking using temporal cepstrum smoothing," IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Kyoto, Japan, Mar. 2012.

[2] T. Gerkmann and R. C. Hendriks, "Unbiased MMSE-based noise power estimation with low complexity and low tracking delay," IEEE Transactions on Audio, Speech, and Language Processing, vol. 20, no. 4, pp. 1383-1393, May 2012.

[3] R. C. Hendriks, R. Heusdens, and J. Jensen, "MMSE based noise PSD tracking with low complexity," IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 4266-4269, Mar. 2010. 

[4] R. Martin, "Noise power spectral density estimation based on optimal smoothing and minimum statistics," IEEE Transactions on Speech and Audio Processing, vol. 9, no. 5, pp. 504-512, Jul. 2001.