LSPnet: an ultra-low bitrate hybrid neural codec

Publication, Conference
Zhang, B; McLoughlin, I; Miao, X; Madhukumar, AS
Published in: Proceedings of the Annual Conference of the International Speech Communication Association Interspeech
January 1, 2025

This paper presents an ultra-low bitrate speech codec that achieves high-fidelity speech coding at 1.2 kbps while maintaining low computational complexity. Building on the LPCNet framework combined with a parametric encoder, we introduce three key improvements: incorporating line spectral pairs (LSP) to improve quantization error performance; eliminating explicit LPC estimation by having a deep neural network directly predict the probability distribution of audio samples; and employing a joint time-frequency training strategy that combines a short-time Fourier transform (STFT) loss with a cross-entropy (CE) loss. The codec is suitable for real-time applications in resource-constrained environments. Experimental results show that the proposed codec not only outperforms traditional speech codecs but also achieves better speech quality than state-of-the-art end-to-end codecs, offering a compelling balance between quality and computational cost.
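As a rough illustration of the joint time-frequency training objective described in the abstract, the sketch below adds a cross-entropy term over per-sample class distributions to a log-magnitude STFT term. It is not the authors' implementation: the 8-bit mu-law quantization (borrowed from the LPCNet convention), the toy frame length and hop, the log-magnitude L1 form of the STFT loss, the `stft_weight` parameter, and all function names are assumptions made for this example.

```python
import cmath
import math


def mulaw_encode(x, levels=256):
    """Map a sample in [-1, 1] to an integer class in [0, levels - 1].

    Assumption: 8-bit mu-law classes, as in LPCNet-style sample models.
    """
    mu = levels - 1
    y = math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)
    return min(mu, int((y + 1) / 2 * mu + 0.5))


def cross_entropy(probs, target):
    """Negative log-likelihood of the target class under one distribution."""
    return -math.log(max(probs[target], 1e-12))


def stft_mag(signal, frame_len=8, hop=4):
    """Hann-windowed magnitude STFT using a naive DFT (toy sizes only)."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        seg = [signal[start + n] * window[n] for n in range(frame_len)]
        frames.append([
            abs(sum(seg[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                    for n in range(frame_len)))
            for k in range(frame_len // 2 + 1)
        ])
    return frames


def stft_loss(reference, decoded, eps=1e-5):
    """Mean absolute difference between log-magnitude spectra."""
    ref, dec = stft_mag(reference), stft_mag(decoded)
    diffs = [abs(math.log(r + eps) - math.log(d + eps))
             for fr, fd in zip(ref, dec) for r, d in zip(fr, fd)]
    return sum(diffs) / len(diffs)


def joint_loss(probs_per_sample, reference, decoded, stft_weight=1.0):
    """Average CE over samples plus a weighted STFT term (weight is assumed)."""
    targets = [mulaw_encode(x) for x in reference]
    ce = sum(cross_entropy(p, t)
             for p, t in zip(probs_per_sample, targets)) / len(targets)
    return ce + stft_weight * stft_loss(reference, decoded)


# Toy usage: a short sine wave, a uniform (uninformed) predictor, and a
# decoded signal identical to the reference, so only the CE term contributes.
reference = [0.5 * math.sin(2 * math.pi * n / 8) for n in range(16)]
uniform = [[1 / 256] * 256 for _ in reference]
loss = joint_loss(uniform, reference, reference)
```

In this toy setup the time-domain term penalizes poor per-sample probability estimates, while the frequency-domain term penalizes spectral mismatch between the reference and decoded waveforms; how the two terms are actually weighted and computed in the paper is not stated in the abstract.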

Published In

Proceedings of the Annual Conference of the International Speech Communication Association Interspeech

DOI

10.21437/Interspeech.2025-1335

EISSN

2958-1796

ISSN

2308-457X

Publication Date

January 1, 2025

Start / End Page

614 / 618

Citation

APA
Zhang, B., McLoughlin, I., Miao, X., & Madhukumar, A. S. (2025). LSPnet: an ultra-low bitrate hybrid neural codec. In Proceedings of the Annual Conference of the International Speech Communication Association Interspeech (pp. 614–618). https://doi.org/10.21437/Interspeech.2025-1335

Chicago
Zhang, B., I. McLoughlin, X. Miao, and A. S. Madhukumar. “LSPnet: an ultra-low bitrate hybrid neural codec.” In Proceedings of the Annual Conference of the International Speech Communication Association Interspeech, 614–18, 2025. https://doi.org/10.21437/Interspeech.2025-1335.

ICMJE
Zhang B, McLoughlin I, Miao X, Madhukumar AS. LSPnet: an ultra-low bitrate hybrid neural codec. In: Proceedings of the Annual Conference of the International Speech Communication Association Interspeech. 2025. p. 614–8.

MLA
Zhang, B., et al. “LSPnet: an ultra-low bitrate hybrid neural codec.” Proceedings of the Annual Conference of the International Speech Communication Association Interspeech, 2025, pp. 614–18. Scopus, doi:10.21437/Interspeech.2025-1335.

NLM
Zhang B, McLoughlin I, Miao X, Madhukumar AS. LSPnet: an ultra-low bitrate hybrid neural codec. Proceedings of the Annual Conference of the International Speech Communication Association Interspeech. 2025. p. 614–618.
