Supervised Self-taught Learning: Actively transferring knowledge from unlabeled data
We consider the task of Self-taught Learning (STL) from unlabeled data. In contrast to semi-supervised learning, which requires the unlabeled data to share the same set of class labels as the labeled data, STL can transfer knowledge from unlabeled data of different types. STL follows a three-step strategy: (1) learning high-level representations from the unlabeled data alone, (2) reconstructing the labeled data in terms of these representations, and (3) building a classifier over the reconstructed labeled data. However, the high-level representations, being determined exclusively by the unlabeled data, may be inappropriate or even misleading for the subsequent classifier training. In this paper, we propose a novel Supervised Self-taught Learning (SSTL) framework that integrates the three isolated steps of STL into a single optimization problem. Benefiting from the interaction between the classifier optimization and the selection of high-level representations, the proposed model is able to choose discriminative representations that are better suited to classification. One important feature of our framework is that the final optimization can be solved iteratively with guaranteed convergence. We evaluate the framework on various data sets. The experimental results show that the proposed SSTL can outperform STL and traditional supervised learning methods in certain instances.
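For concreteness, the sketch below illustrates the baseline three-step STL pipeline described above (not the joint SSTL optimization, whose objective is not given here). The concrete choices are assumptions for illustration only: scikit-learn's sparse dictionary learning stands in for the representation-learning step, logistic regression for the final classifier, and all data and parameter values are random placeholders.

```python
# A minimal sketch of the three-step STL pipeline (assumed components,
# not the paper's SSTL formulation).
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning
from sklearn.linear_model import LogisticRegression

rng = np.random.RandomState(0)
X_unlabeled = rng.randn(500, 64)        # plentiful unlabeled data
X_labeled = rng.randn(40, 64)           # scarce labeled data
y_labeled = rng.randint(0, 2, size=40)  # binary labels (placeholder)

# Step 1: learn high-level representations (a dictionary of basis
# vectors) from the unlabeled data only.
dico = MiniBatchDictionaryLearning(n_components=32, alpha=1.0,
                                   random_state=0)
dico.fit(X_unlabeled)

# Step 2: reconstruct the labeled data via these representations,
# i.e., encode each labeled example as sparse coefficients over the bases.
codes = dico.transform(X_labeled)

# Step 3: build a classifier over the reconstructed labeled data.
clf = LogisticRegression(max_iter=1000).fit(codes, y_labeled)
print("training accuracy:", clf.score(codes, y_labeled))
```

Note how steps (1)-(3) are solved in isolation here: the dictionary is fixed before the classifier is ever trained, which is exactly the decoupling that the proposed SSTL framework removes by optimizing the representations and the classifier jointly.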