An iterative framework for unsupervised learning in PLDA-based speaker verification
We present an iterative, unsupervised learning approach for speaker verification. Probabilistic Linear Discriminant Analysis (PLDA) is widely used as a supervised backend in conventional speaker verification, but it requires fully labeled training data, which are often difficult to obtain in practice. To recover the speaker labels of unlabeled training data automatically, we propose using Affinity Propagation (AP), a clustering method that takes pairwise data similarities as input, to generate the labels for PLDA modeling. We further propose an iterative refinement strategy that incrementally updates the similarity input of the AP clustering with the PLDA scoring outputs of the previous iteration. Moreover, we evaluate different PLDA scoring methods for the multiple-enrollment task and show that generalized hypothesis testing achieves the best results. Experiments were conducted on the NIST SRE 2010 and the 2014 i-vector challenge databases. The results show that the proposed iterative, unsupervised PLDA model learning approach outperforms the cosine similarity baseline by 35% relative.
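The refinement loop described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: all function names are hypothetical, the fixed iteration count is an assumption, and the clustering and backend steps are deliberately simple stand-ins so that the example runs with the standard library only. In a real system the `cluster` callable would be AP on a precomputed similarity matrix (for example scikit-learn's `AffinityPropagation(affinity="precomputed")`), and `train_and_score` would train a PLDA model on the current pseudo-labels and return pairwise PLDA scores.

```python
import math
from typing import Callable, List, Sequence

Matrix = List[List[float]]
Vectors = Sequence[Sequence[float]]


def cosine_matrix(vectors: Vectors) -> Matrix:
    """Pairwise cosine similarity; serves as the initial similarity input."""
    norms = [math.sqrt(sum(x * x for x in v)) for v in vectors]
    return [[sum(a * b for a, b in zip(u, v)) / (nu * nv)
             for v, nv in zip(vectors, norms)]
            for u, nu in zip(vectors, norms)]


def iterative_label_refinement(
    vectors: Vectors,
    cluster: Callable[[Matrix], List[int]],
    train_and_score: Callable[[Vectors, List[int]], Matrix],
    n_iterations: int = 3,  # assumption: the abstract gives no stopping rule
) -> List[int]:
    """Alternate clustering and backend rescoring, as outlined in the abstract."""
    similarity = cosine_matrix(vectors)   # first pass: cosine similarities
    labels = cluster(similarity)          # pseudo speaker labels
    for _ in range(n_iterations):
        # Train the backend on the current pseudo-labels, rescore all pairs,
        # and feed the new scores back into the clustering step.
        similarity = train_and_score(vectors, labels)
        labels = cluster(similarity)
    return labels


def threshold_cluster(similarity: Matrix, threshold: float = 0.8) -> List[int]:
    """Toy stand-in for AP: connected components of the graph sim > threshold."""
    n = len(similarity)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and similarity[i][j] > threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


def centroid_rescore(vectors: Vectors, labels: List[int]) -> Matrix:
    """Toy stand-in for PLDA training and scoring: cosine between cluster centroids."""
    dim = len(vectors[0])
    centroids = {}
    for lab in set(labels):
        members = [v for v, l in zip(vectors, labels) if l == lab]
        centroids[lab] = [sum(m[d] for m in members) / len(members)
                          for d in range(dim)]
    return cosine_matrix([centroids[l] for l in labels])


if __name__ == "__main__":
    # Six toy 2-D "i-vectors" from two well-separated groups.
    toy = [[1.0, 0.0], [0.9, 0.1], [0.8, 0.3],
           [0.0, 1.0], [0.1, 0.9], [0.3, 0.8]]
    print(iterative_label_refinement(toy, threshold_cluster, centroid_rescore))
```

On the toy data the two groups receive labels 0 and 1. Passing the clustering and scoring steps as callables keeps the loop independent of any particular AP or PLDA implementation, which is the only design point the sketch is meant to convey.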