
Speech bandwidth expansion based on deep neural networks

Publication, Conference
Wang, Y; Zhao, S; Liu, W; Li, M; Kuang, J
Published in: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
January 1, 2015

This paper proposes a new speech bandwidth expansion method that uses Deep Neural Networks (DNNs) to build high-order eigenspaces between the low-frequency and high-frequency components of the speech signal. A four-layer DNN is trained layer by layer from a cascade of Neural Networks (NNs) and two Gaussian-Bernoulli Restricted Boltzmann Machines (GBRBMs). The two GBRBMs model the distributions of the low-frequency and high-frequency spectral envelopes, respectively. The NNs model the joint distribution of the hidden variables extracted from the two GBRBMs. The proposed method thus exploits the strong ability of GBRBMs to model spectral-envelope distributions. Both objective and subjective test results show that the proposed method outperforms the conventional GMM-based method.
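The cascade described in the abstract can be pictured with a minimal inference-only sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the layer sizes (24 low-band and 16 high-band envelope coefficients, 64 hidden units), the single sigmoid layer standing in for the NNs, the class and function names, and the common GBRBM parameterization p(h=1|v) = sigmoid(b + (v/sigma)W), E[v|h] = a + sigma * (W h). Weights are random and untrained, so the output only demonstrates the data flow, not a usable expansion.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class GBRBM:
    """Gaussian-Bernoulli RBM with random, untrained parameters (illustrative only)."""

    def __init__(self, n_visible, n_hidden, rng):
        self.W = 0.01 * rng.standard_normal((n_visible, n_hidden))
        self.a = np.zeros(n_visible)      # visible biases
        self.b = np.zeros(n_hidden)       # hidden biases
        self.sigma = np.ones(n_visible)   # per-dimension visible std (assumed fixed)

    def hidden_probs(self, v):
        # p(h = 1 | v) under the assumed parameterization
        return sigmoid(self.b + (v / self.sigma) @ self.W)

    def visible_mean(self, h):
        # E[v | h]: mean of the Gaussian visible units
        return self.a + self.sigma * (h @ self.W.T)


def expand_envelope(low_env, rbm_low, nn_W, nn_b, rbm_high):
    """Low-band envelope -> low hidden -> high hidden -> high-band envelope."""
    h_low = rbm_low.hidden_probs(low_env)    # encode with the low-band GBRBM
    h_high = sigmoid(h_low @ nn_W + nn_b)    # hypothetical one-layer NN between hidden spaces
    return rbm_high.visible_mean(h_high)     # decode with the high-band GBRBM


rng = np.random.default_rng(0)
rbm_low = GBRBM(24, 64, rng)     # assumed sizes
rbm_high = GBRBM(16, 64, rng)
nn_W = 0.01 * rng.standard_normal((64, 64))
nn_b = np.zeros(64)

low_env = rng.standard_normal((5, 24))   # 5 frames of made-up low-band features
high_env = expand_envelope(low_env, rbm_low, nn_W, nn_b, rbm_high)
print(high_env.shape)  # (5, 16)
```

In this sketch the two GBRBMs act as an encoder and a decoder for the two bands, and the mapping between their hidden layers is the only part that links low-band to high-band information. Training (for example contrastive divergence for the GBRBMs followed by supervised fitting of the mapping) is omitted.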


Published In: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
EISSN: 1990-9772
ISSN: 2308-457X
Publication Date: January 1, 2015
Volume: 2015-January
Start / End Page: 2593 / 2597
 

Citation

APA: Wang, Y., Zhao, S., Liu, W., Li, M., & Kuang, J. (2015). Speech bandwidth expansion based on deep neural networks. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH (Vol. 2015-January, pp. 2593–2597).

Chicago: Wang, Y., S. Zhao, W. Liu, M. Li, and J. Kuang. “Speech bandwidth expansion based on deep neural networks.” In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, 2015-January:2593–97, 2015.

ICMJE: Wang Y, Zhao S, Liu W, Li M, Kuang J. Speech bandwidth expansion based on deep neural networks. In: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. 2015. p. 2593–7.

MLA: Wang, Y., et al. “Speech bandwidth expansion based on deep neural networks.” Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, vol. 2015-January, 2015, pp. 2593–97.

NLM: Wang Y, Zhao S, Liu W, Li M, Kuang J. Speech bandwidth expansion based on deep neural networks. Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. 2015. p. 2593–2597.
