Real-time online singing voice separation from monaural recordings using robust low-rank modeling

Publication, Conference
Sprechmann, P; Bronstein, A; Sapiro, G
Published in: Proceedings of the 13th International Society for Music Information Retrieval Conference, ISMIR 2012
December 1, 2012

Separating the lead vocals from the musical accompaniment is a challenging task that arises naturally in several music processing applications. Robust principal component analysis (RPCA) has recently been applied to this problem with very successful results. The method decomposes the signal into a low-rank component, corresponding to the accompaniment with its repetitive structure, and a sparse component, corresponding to the voice with its quasi-harmonic structure. In this paper we first introduce a non-negative variant of RPCA, termed robust low-rank non-negative matrix factorization (RNMF), a framework that better suits audio applications. We then propose two efficient feed-forward architectures that approximate RPCA and RNMF with low latency and a fraction of the complexity of the original optimization method. These approximants allow incorporating elements of unsupervised, semi-supervised, and fully supervised learning into the RPCA and RNMF frameworks. Our basic implementation shows a speedup of several orders of magnitude over the exact solvers with no performance degradation, and allows online and faster-than-real-time processing. Evaluation on the MIR-1K dataset demonstrates state-of-the-art performance. © 2012 International Society for Music Information Retrieval.
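
The low-rank plus sparse decomposition described in the abstract can be illustrated with a minimal sketch. It is a textbook iterative RPCA solver (principal component pursuit via ADMM), that is, the kind of exact, slow optimization the paper sets out to approximate; it is not the authors' feed-forward architecture and not their RNMF variant. The function names, the default weight `lam = 1/sqrt(max(shape))`, and the step size `mu` are assumptions taken from common practice, not from the paper. In the audio setting, `m` would be a magnitude spectrogram, `low` the accompaniment estimate, and `sparse` the voice estimate.

```python
import numpy as np


def soft_threshold(x, tau):
    """Shrink every entry of x toward zero by tau (proximal map of the l1 norm)."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)


def svd_threshold(x, tau):
    """Shrink the singular values of x by tau (proximal map of the nuclear norm)."""
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    return (u * soft_threshold(s, tau)) @ vt


def rpca(m, lam=None, mu=None, n_iter=500, tol=1e-7):
    """Split m into low + sparse by minimizing ||low||_* + lam * ||sparse||_1.

    Hypothetical helper for illustration; parameter defaults are assumptions.
    """
    m = np.asarray(m, dtype=float)
    if lam is None:
        lam = 1.0 / np.sqrt(max(m.shape))  # commonly used default weight
    if mu is None:
        mu = 0.25 * m.size / (np.abs(m).sum() + 1e-12)  # common step-size heuristic
    low = np.zeros_like(m)
    sparse = np.zeros_like(m)
    dual = np.zeros_like(m)
    norm_m = np.linalg.norm(m) + 1e-12
    for _ in range(n_iter):
        # Update the low-rank part, then the sparse part, then the multiplier.
        low = svd_threshold(m - sparse + dual / mu, 1.0 / mu)
        sparse = soft_threshold(m - low + dual / mu, lam / mu)
        residual = m - low - sparse
        dual += mu * residual
        if np.linalg.norm(residual) / norm_m < tol:
            break
    return low, sparse
```

Each iteration requires a full singular value decomposition of the input matrix, which helps explain why an exact solver of this kind is costly and why a fixed-depth feed-forward approximation is attractive for low-latency, online use.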

Start / End Page

67 / 72
 

Citation

APA: Sprechmann, P., Bronstein, A., & Sapiro, G. (2012). Real-time online singing voice separation from monaural recordings using robust low-rank modeling. In Proceedings of the 13th International Society for Music Information Retrieval Conference, ISMIR 2012 (pp. 67–72).

Chicago: Sprechmann, P., A. Bronstein, and G. Sapiro. “Real-time online singing voice separation from monaural recordings using robust low-rank modeling.” In Proceedings of the 13th International Society for Music Information Retrieval Conference, ISMIR 2012, 67–72, 2012.

ICMJE: Sprechmann P, Bronstein A, Sapiro G. Real-time online singing voice separation from monaural recordings using robust low-rank modeling. In: Proceedings of the 13th International Society for Music Information Retrieval Conference, ISMIR 2012. 2012. p. 67–72.

MLA: Sprechmann, P., et al. “Real-time online singing voice separation from monaural recordings using robust low-rank modeling.” Proceedings of the 13th International Society for Music Information Retrieval Conference, ISMIR 2012, 2012, pp. 67–72.

NLM: Sprechmann P, Bronstein A, Sapiro G. Real-time online singing voice separation from monaural recordings using robust low-rank modeling. Proceedings of the 13th International Society for Music Information Retrieval Conference, ISMIR 2012. 2012. p. 67–72.