Semi-supervised learning from general unlabeled data

Publication ,  Conference
Huang, K; Xu, Z; King, I; Lyu, MR
Published in: Proceedings - IEEE International Conference on Data Mining, ICDM
December 1, 2008

We consider the problem of Semi-supervised Learning (SSL) from general unlabeled data, which may contain irrelevant samples. Within the binary setting, our model makes better use of the information in unlabeled data by formulating it as a three-class (-1, +1, 0) mixture, where class 0 represents the irrelevant data. This distinguishes our work from the traditional SSL problem, where unlabeled data are assumed to contain relevant samples only, either +1 or -1, and are thus forced into the same classes as the given labeled samples. This work also differs from another family of popular models, universum learning (universum means "irrelevant" data), in that the universum need not be specified beforehand. One significant contribution of our proposed framework is that such irrelevant samples can be automatically detected from the available unlabeled data, even though they are mixed with relevant data. This yields a general SSL framework that does not require "clean" unlabeled data. More importantly, we formulate this general learning framework as a Semidefinite Programming problem, making it solvable in polynomial time. A series of experiments demonstrates that the proposed framework can outperform traditional SSL on both synthetic and real data. ©2008 IEEE.

Published In

Proceedings - IEEE International Conference on Data Mining, ICDM

DOI

10.1109/ICDM.2008.61

ISSN

1550-4786

Publication Date

December 1, 2008

Start / End Page

273 / 282
 

Citation

APA: Huang, K., Xu, Z., King, I., & Lyu, M. R. (2008). Semi-supervised learning from general unlabeled data. In Proceedings - IEEE International Conference on Data Mining, ICDM (pp. 273–282). https://doi.org/10.1109/ICDM.2008.61
Chicago: Huang, K., Z. Xu, I. King, and M. R. Lyu. “Semi-supervised learning from general unlabeled data.” In Proceedings - IEEE International Conference on Data Mining, ICDM, 273–82, 2008. https://doi.org/10.1109/ICDM.2008.61.
ICMJE: Huang K, Xu Z, King I, Lyu MR. Semi-supervised learning from general unlabeled data. In: Proceedings - IEEE International Conference on Data Mining, ICDM. 2008. p. 273–82.
MLA: Huang, K., et al. “Semi-supervised learning from general unlabeled data.” Proceedings - IEEE International Conference on Data Mining, ICDM, 2008, pp. 273–82. Scopus, doi:10.1109/ICDM.2008.61.
NLM: Huang K, Xu Z, King I, Lyu MR. Semi-supervised learning from general unlabeled data. Proceedings - IEEE International Conference on Data Mining, ICDM. 2008. p. 273–282.
