Multi-band long-term signal variability features for robust voice activity detection

Publication, Conference
Tsiartas, A; Chaspari, T; Katsamanis, N; Ghosh, P; Li, M; Van Segbroeck, M; Potamianos, A; Narayanan, SS
Published in: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
January 1, 2013

In this paper, we propose robust features for voice activity detection (VAD). In particular, we extend the long-term signal variability (LTSV) feature to accommodate multiple spectral bands. The multi-band approach is motivated by the non-uniform frequency scale of speech phonemes and noise characteristics. Our analysis shows that the multi-band approach offers advantages over the single-band LTSV for voice activity detection. In terms of classification accuracy, we show 0.3%-61.2% relative improvement over the best accuracy of the baselines considered for 7 out of 8 different noisy channels. Experimental results and error analysis are reported on the DARPA RATS corpora of noisy speech. Copyright © 2013 ISCA.
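The abstract does not spell out how the feature is computed. As a rough illustration only, the sketch below assumes the commonly cited single-band LTSV definition (the entropy of each frequency bin's normalized power over the last R frames, followed by the variance of those entropies across bins) and applies it separately to contiguous groups of FFT bins. The function names `ltsv` and `multiband_ltsv`, the window length `R`, and the equal-width band split are assumptions made for this example; they are not taken from the paper, which may use a different band layout and parameters.

```python
import numpy as np


def ltsv(power_spec, R):
    """Single-band LTSV sketch (assumed definition, not the paper's code).

    power_spec: array of shape (n_frames, n_bins), non-negative power values.
    R: number of consecutive frames in the long-term window.
    Returns an array of shape (n_frames - R + 1,).
    """
    n_frames, _ = power_spec.shape
    eps = 1e-12  # avoids log(0) and division by zero on silent bins
    out = np.empty(n_frames - R + 1)
    for m in range(R - 1, n_frames):
        window = power_spec[m - R + 1 : m + 1] + eps  # (R, n_bins)
        # Normalize each bin over time so it behaves like a distribution.
        p = window / window.sum(axis=0, keepdims=True)
        # Entropy per frequency bin over the R-frame window.
        entropy = -(p * np.log(p)).sum(axis=0)
        # LTSV: variance of the per-bin entropies across frequency.
        out[m - R + 1] = entropy.var()
    return out


def multiband_ltsv(power_spec, R, n_bands):
    """Hypothetical multi-band variant: LTSV computed per band of bins.

    Bands here are equal-width contiguous bin groups (an assumption).
    Returns an array of shape (n_frames - R + 1, n_bands).
    """
    bands = np.array_split(np.arange(power_spec.shape[1]), n_bands)
    return np.stack([ltsv(power_spec[:, idx], R) for idx in bands], axis=1)
```

Under this definition a perfectly stationary spectrum gives the same entropy in every bin, so the feature is close to zero, while a spectrum that fluctuates in only some bins gives a positive value. Splitting into bands lets each frequency region contribute its own value instead of being averaged into one number.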

Published In

Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

EISSN

1990-9772

ISSN

2308-457X

Publication Date

January 1, 2013

Start / End Page

718 / 722
 

Citation

APA: Tsiartas, A., Chaspari, T., Katsamanis, N., Ghosh, P., Li, M., Van Segbroeck, M., … Narayanan, S. S. (2013). Multi-band long-term signal variability features for robust voice activity detection. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH (pp. 718–722).

Chicago: Tsiartas, A., T. Chaspari, N. Katsamanis, P. Ghosh, M. Li, M. Van Segbroeck, A. Potamianos, and S. S. Narayanan. “Multi-band long-term signal variability features for robust voice activity detection.” In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, 718–22, 2013.

ICMJE: Tsiartas A, Chaspari T, Katsamanis N, Ghosh P, Li M, Van Segbroeck M, et al. Multi-band long-term signal variability features for robust voice activity detection. In: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. 2013. p. 718–22.

MLA: Tsiartas, A., et al. “Multi-band long-term signal variability features for robust voice activity detection.” Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, 2013, pp. 718–22.

NLM: Tsiartas A, Chaspari T, Katsamanis N, Ghosh P, Li M, Van Segbroeck M, Potamianos A, Narayanan SS. Multi-band long-term signal variability features for robust voice activity detection. Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. 2013. p. 718–722.
