Music structural segmentation by combining harmonic and timbral information
We propose a novel model for music structural segmentation that combines harmonic and timbral information. We use two-level clustering with splitting initialization and random turbulence to produce segment labels separately from chroma and MFCC features. We then construct a score matrix to combine the segment labels from both aspects. Finally, Nonnegative Matrix Factorization (NMF) and maximum likelihood estimation are applied to extract the final segment labels. By comparing sparseness, our method automatically determines the number of segment types in a given song. On 180 Beatles songs, the algorithm achieves a pairwise F-measure of 0.63 without relying on hand-crafted musicological rules. We show that our model can be readily combined with more sophisticated structural segmentation algorithms and extended to probabilistic models.
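The sketch below illustrates the label-combination stage in Python, under stated assumptions rather than as the paper's actual implementation: the score matrix is assumed to be a one-hot (chroma label, MFCC label)-pair-by-frame matrix, the number of segment types is assumed to be chosen by comparing Hoyer sparseness of the NMF activations, and all function names, parameter values, and the use of scikit-learn's NMF are illustrative.

```python
# Minimal sketch (assumptions noted above, not the authors' implementation):
# combine per-frame segment labels obtained separately from chroma and MFCC
# clustering, factorize the resulting score matrix with NMF, and read off
# final labels by maximum activation.
import numpy as np
from sklearn.decomposition import NMF


def score_matrix(chroma_labels, mfcc_labels, n_chroma, n_mfcc):
    """One-hot (label-pair x frame) matrix: the row index encodes the
    (chroma label, MFCC label) pair observed at each frame."""
    T = len(chroma_labels)
    S = np.zeros((n_chroma * n_mfcc, T))
    rows = np.asarray(chroma_labels) * n_mfcc + np.asarray(mfcc_labels)
    S[rows, np.arange(T)] = 1.0
    return S


def hoyer_sparseness(H):
    """Average Hoyer sparseness of the activation columns (1 = maximally sparse)."""
    n = H.shape[0]
    l1 = np.abs(H).sum(axis=0)
    l2 = np.sqrt((H ** 2).sum(axis=0)) + 1e-12
    return np.mean((np.sqrt(n) - l1 / l2) / (np.sqrt(n) - 1))


def combine_labels(chroma_labels, mfcc_labels, n_chroma, n_mfcc,
                   candidate_ks=range(2, 9)):
    """Pick the number of segment types by comparing activation sparseness,
    then assign each frame to its most active NMF component."""
    S = score_matrix(chroma_labels, mfcc_labels, n_chroma, n_mfcc)
    best = None
    for k in candidate_ks:
        if k >= min(S.shape):               # keep the factorization well-posed
            break
        nmf = NMF(n_components=k, init="nndsvda", max_iter=500, random_state=0)
        nmf.fit_transform(S)
        H = nmf.components_                 # shape: (k, n_frames)
        s = hoyer_sparseness(H)
        if best is None or s > best[0]:
            best = (s, k, H)
    _, k, H = best
    return k, H.argmax(axis=0)              # final per-frame segment labels


if __name__ == "__main__":
    # Toy example: 12 frames labeled by two hypothetical front-end clusterings.
    chroma = [0, 0, 0, 1, 1, 1, 0, 0, 2, 2, 2, 2]
    mfcc   = [0, 0, 1, 1, 1, 1, 0, 0, 2, 2, 2, 2]
    k, labels = combine_labels(chroma, mfcc, n_chroma=3, n_mfcc=3)
    print("estimated segment types:", k)
    print("combined labels:", labels)
```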