Kernel-matching pursuits with arbitrary loss functions.
The purpose of this research is to develop a classifier capable of state-of-the-art performance in both computational efficiency and generalization ability while allowing the algorithm designer to choose arbitrary loss functions as appropriate for a given problem domain. This is critical in applications involving heavily imbalanced, noisy, or non-Gaussian distributed data. To achieve this goal, a kernel-matching pursuit (KMP) framework is formulated in which the objective is margin maximization rather than the standard error minimization. This approach enables excellent performance and computational savings in the presence of large, imbalanced training data sets and facilitates the development of two general algorithms. These algorithms support the use of arbitrary loss functions, allowing the algorithm designer to control the degree to which outliers are penalized and the manner in which non-Gaussian distributed data are handled. Example loss functions are provided, and algorithm performance is illustrated in two groups of experimental results. The first group demonstrates that the proposed algorithms perform equivalently to several state-of-the-art machine learning algorithms on widely published, balanced data. The second group illustrates superior performance by the proposed algorithms on imbalanced, non-Gaussian data, achieved by employing loss functions appropriate for the data characteristics and problem domain.
Stack, JR; Dobeck, GJ; Liao, X; Carin, L