A Unified Framework for Automatic Distributed Active Learning.
We propose a novel unified frameork for automated distributed active learning (AutoDAL) to address multiple challenging problems in active learning such as limited labeled data, imbalanced datasets, automatic hyperparameter selection as well as scalability to big data. First, automated graph-based semi-supervised learning is conducted by aggregating the proposed cost functions from different compute nodes and jointly optimizing hyperparameters in both the classification and query selection stages. For dense datasets, clustering-based uncertainty sampling with maximum entropy (CME) loss is applied in the optimization. For sparse and imbalanced datasets, shrinkage optimized KL-divergence regularization and local selection based active learning (SOAR) loss are further naturally adapted in AutoDAL. The optimization is efficiently resolved by iteratively executing a genetic algorithm (GA) refined with a local generating set search (GSS) and solving an integer linear programming (ILP) problem. Moreover, we propose an efficient distributed active learning algorithm which is scalable for big data. The proposed AutoDAL algorithm is applied to multiple benchmark datasets and two real-world datasets including an electrocardiogram (ECG) dataset and a credit fraud detection dataset for classification. We demonstrate that the proposed AutoDAL algorithm is capable of achieving significantly better performance compared to several state-of-the-art AutoML approaches and active learning algorithms.
Duke Scholars
Altmetric Attention Stats
Dimensions Citation Stats
Published In
DOI
EISSN
ISSN
Publication Date
Volume
Issue
Start / End Page
Related Subject Headings
- Supervised Machine Learning
- Cluster Analysis
- Artificial Intelligence & Image Processing
- Algorithms
- 4611 Machine learning
- 4603 Computer vision and multimedia computation
- 0906 Electrical and Electronic Engineering
- 0806 Information Systems
- 0801 Artificial Intelligence and Image Processing
Citation
Published In
DOI
EISSN
ISSN
Publication Date
Volume
Issue
Start / End Page
Related Subject Headings
- Supervised Machine Learning
- Cluster Analysis
- Artificial Intelligence & Image Processing
- Algorithms
- 4611 Machine learning
- 4603 Computer vision and multimedia computation
- 0906 Electrical and Electronic Engineering
- 0806 Information Systems
- 0801 Artificial Intelligence and Image Processing