The DKU-Lenovo systems for the INTERSPEECH 2019 computational paralinguistic challenge
This paper presents our approaches to the orca activity and continuous sleepiness tasks of the INTERSPEECH 2019 ComParE Challenge. For the orca activity detection task, we extract deep embeddings with several deep convolutional neural networks and classify them with a Support Vector Machine (SVM) back end. Both STFT spectrograms and log mel-spectrograms are explored as input features. To enlarge the training set and deal with the data imbalance, we propose four kinds of data augmentation. We also investigate different fusion strategies for multi-channel input data. In addition to the official baseline system, we apply Fisher Vector (FV) encoding to various kinds of acoustic features as an alternative baseline against which to evaluate our deep embedding system. Experimental results show that our proposed methods significantly outperform the baselines, achieving an AUC of 0.948 on the orca activity evaluation data and a Spearman's correlation coefficient of 0.365 on the continuous sleepiness evaluation data.
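For readers less familiar with the two evaluation metrics reported above, the following is a minimal sketch of how AUC and Spearman's correlation coefficient can be computed from labels and system outputs. It is not the challenge's official scoring code: the function names `average_ranks`, `roc_auc`, and `spearman` are hypothetical helpers introduced here for illustration, and the toy inputs are made up rather than taken from the challenge data.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the Mann-Whitney U statistic.

    `labels` are 0 (negative) or 1 (positive); `scores` are classifier outputs
    where a higher value means "more likely positive".
    """
    ranks = average_ranks(scores)
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative example")
    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho: the Pearson correlation of the rank-transformed inputs."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


if __name__ == "__main__":
    # Toy detection example: two negatives and two positives (made-up scores).
    print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
    # Toy regression example: reference ratings vs. predictions (made-up values).
    print(spearman([1, 2, 3, 4, 5], [5, 6, 7, 8, 7]))
```

Both metrics depend only on the ordering of the system outputs, not on their absolute scale, so neither requires calibrated scores; ties are handled here by average ranks.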