Kernel-matching pursuits with arbitrary loss functions.


Journal Article

The purpose of this research is to develop a classifier capable of state-of-the-art performance in both computational efficiency and generalization ability while allowing the algorithm designer to choose arbitrary loss functions as appropriate for a given problem domain. This is critical in applications involving heavily imbalanced, noisy, or non-Gaussian distributed data. To achieve this goal, a kernel-matching pursuit (KMP) framework is formulated where the objective is margin maximization rather than the standard error minimization. This approach enables excellent performance and computational savings in the presence of large, imbalanced training data sets and facilitates the development of two general algorithms. These algorithms support the use of arbitrary loss functions, allowing the algorithm designer to control the degree to which outliers are penalized and the manner in which non-Gaussian distributed data are handled. Example loss functions are provided, and algorithm performance is illustrated in two groups of experimental results. The first group demonstrates that the proposed algorithms perform equivalently to several state-of-the-art machine learning algorithms on well-published, balanced data. The second group of results illustrates superior performance by the proposed algorithms on imbalanced, non-Gaussian data, achieved by employing loss functions appropriate for the data characteristics and problem domain.
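To make the idea of a pluggable loss concrete, here is a minimal sketch of a generic greedy kernel-matching pursuit in which the loss is passed in as a function. It is not the article's algorithm: the abstract describes a margin-maximization formulation whose details are not given here, whereas this sketch uses plain gradient-correlation selection. The Gaussian kernel, the ternary line search over a fixed interval, and all names (`kmp_fit`, `kmp_predict`, `squared_loss`, `logistic_loss`) are assumptions chosen for illustration.

```python
import math

def gaussian_kernel(a, b, width=1.0):
    """Gaussian (RBF) kernel between two points given as sequences of floats."""
    sq = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq / (2.0 * width ** 2))

# Each loss returns (value, derivative with respect to the model output f).
def squared_loss(y, f):
    return (y - f) ** 2, -2.0 * (y - f)

def logistic_loss(y, f):
    m = y * f  # margin; labels are assumed to be -1 or +1
    if m > 0:
        e = math.exp(-m)
        return math.log1p(e), -y * e / (1.0 + e)
    e = math.exp(m)
    return -m + math.log1p(e), -y / (1.0 + e)

def ternary_min(g, lo, hi, iters=60):
    """Approximate minimizer of a convex one-dimensional function on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if g(m1) < g(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)

def kmp_fit(X, y, loss, n_terms=5, width=1.0):
    """Greedily add kernel basis functions centered on training points."""
    n = len(X)
    # K[j][i] is basis function j (centered at X[j]) evaluated at X[i].
    K = [[gaussian_kernel(X[j], X[i], width) for i in range(n)] for j in range(n)]
    f = [0.0] * n  # current model output at each training point
    model = []     # list of (center, weight) pairs
    for _ in range(n_terms):
        # Negative gradient of the loss with respect to the current outputs.
        r = [-loss(y[i], f[i])[1] for i in range(n)]

        def score(j):
            dot = sum(k * ri for k, ri in zip(K[j], r))
            norm = math.sqrt(sum(k * k for k in K[j]))
            return abs(dot) / norm

        # Pick the basis function most correlated with the negative gradient.
        j = max(range(n), key=score)
        col = K[j]
        # Line search for the weight; the interval [-10, 10] is an arbitrary bound.
        alpha = ternary_min(
            lambda a: sum(loss(y[i], f[i] + a * col[i])[0] for i in range(n)),
            -10.0, 10.0)
        f = [f[i] + alpha * col[i] for i in range(n)]
        model.append((X[j], alpha))
    return model

def kmp_predict(model, x, width=1.0):
    """Real-valued output; its sign is the predicted class."""
    return sum(alpha * gaussian_kernel(c, x, width) for c, alpha in model)
```

Swapping `squared_loss` for `logistic_loss` changes how strongly points far from the decision boundary influence each greedy step, which is the kind of control over outlier penalties the abstract refers to. A class-weighted loss for imbalanced data could be supplied the same way.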

Cited Authors

  • Stack, JR; Dobeck, GJ; Liao, X; Carin, L

Published Date

  • March 2009

Published In

  • IEEE Transactions on Neural Networks

Volume / Issue

  • 20 / 3

Start / End Page

  • 395 - 405

PubMed ID

  • 19179248

PubMed Central ID

  • 19179248

Electronic International Standard Serial Number (EISSN)

  • 1941-0093

International Standard Serial Number (ISSN)

  • 1045-9227

Digital Object Identifier (DOI)

  • 10.1109/tnn.2008.2008337


Language

  • eng