
Audio-Based Piano Performance Evaluation for Beginners with Convolutional Neural Network and Attention Mechanism

Publication, Journal Article
Wang, W; Pan, J; Yi, H; Song, Z; Li, M
Published in: IEEE/ACM Transactions on Audio Speech and Language Processing
January 1, 2021

In this paper, we propose two audio-based piano performance evaluation systems for beginners. The first is a sequential, modularized system with three steps: Convolutional Neural Network (CNN)-based acoustic feature extraction, matching via dynamic time warping (DTW), and performance score regression. The second is an end-to-end system built on CNNs and the attention mechanism; it takes two acoustic feature sequences as input and directly predicts a performance score. We evaluate both proposed methods on our new open-access Yingcai Piano Performance Evaluation Phase III Dataset (YCU-PPE-III), which contains more than 2000 piano audio pieces recorded in multiple real test sessions. Experimental results show that the modularized system achieves a mean absolute error (MAE) of 3.79 on a 0-100 point scale. The end-to-end system achieves an MAE of 4.40, which shows that it is possible to train a robust end-to-end piano performance evaluation system with only about two thousand audio pieces.
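
The DTW matching step and the MAE metric named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes plain Python lists of feature vectors in place of CNN-derived acoustic features, a Euclidean frame distance, and the textbook DTW recursion without path constraints. The function names `dtw_distance` and `mean_absolute_error` are hypothetical and chosen only for illustration.

```python
import math


def dtw_distance(ref, perf):
    """Accumulated cost of the best DTW alignment between two sequences.

    ref, perf: lists of equal-dimension feature vectors (lists of floats).
    Assumption: Euclidean distance between frames; no slope or window constraint.
    """
    n, m = len(ref), len(perf)
    inf = float("inf")
    # cost[i][j] = best accumulated cost aligning ref[:i] with perf[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(ref[i - 1], perf[j - 1])
            # Extend the cheapest of the three allowed predecessor cells.
            cost[i][j] = d + min(cost[i - 1][j],      # perf frame held
                                 cost[i][j - 1],      # ref frame held
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]


def mean_absolute_error(predicted, actual):
    """MAE between predicted and reference scores (e.g. on a 0-100 scale)."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)


# Toy data: the "performance" repeats one frame of the "reference",
# so DTW can absorb the timing difference at zero cost.
reference = [[0.0], [1.0], [2.0]]
performance = [[0.0], [1.0], [1.0], [2.0]]
alignment_cost = dtw_distance(reference, performance)   # 0.0
score_error = mean_absolute_error([80, 90], [84, 87])   # 3.5
```

In a modularized pipeline of the kind the abstract describes, an alignment cost like this would be one input to a score regressor; the toy scores above are made up and are not drawn from YCU-PPE-III.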

Published In

IEEE/ACM Transactions on Audio Speech and Language Processing

DOI

10.1109/TASLP.2021.3061267

EISSN

2329-9304

ISSN

2329-9290

Publication Date

January 1, 2021

Volume

29

Start / End Page

1119 / 1133
 

Citation

APA: Wang, W., Pan, J., Yi, H., Song, Z., & Li, M. (2021). Audio-Based Piano Performance Evaluation for Beginners with Convolutional Neural Network and Attention Mechanism. IEEE/ACM Transactions on Audio Speech and Language Processing, 29, 1119–1133. https://doi.org/10.1109/TASLP.2021.3061267
Chicago: Wang, W., J. Pan, H. Yi, Z. Song, and M. Li. “Audio-Based Piano Performance Evaluation for Beginners with Convolutional Neural Network and Attention Mechanism.” IEEE/ACM Transactions on Audio Speech and Language Processing 29 (January 1, 2021): 1119–33. https://doi.org/10.1109/TASLP.2021.3061267.
ICMJE: Wang W, Pan J, Yi H, Song Z, Li M. Audio-Based Piano Performance Evaluation for Beginners with Convolutional Neural Network and Attention Mechanism. IEEE/ACM Transactions on Audio Speech and Language Processing. 2021 Jan 1;29:1119–33.
MLA: Wang, W., et al. “Audio-Based Piano Performance Evaluation for Beginners with Convolutional Neural Network and Attention Mechanism.” IEEE/ACM Transactions on Audio Speech and Language Processing, vol. 29, Jan. 2021, pp. 1119–33. Scopus, doi:10.1109/TASLP.2021.3061267.
NLM: Wang W, Pan J, Yi H, Song Z, Li M. Audio-Based Piano Performance Evaluation for Beginners with Convolutional Neural Network and Attention Mechanism. IEEE/ACM Transactions on Audio Speech and Language Processing. 2021 Jan 1;29:1119–1133.
