Cost-Sensitive Learning for Medical Insurance Fraud Detection With Temporal Information
Fraudulent activities within the U.S. healthcare system cost billions of dollars each year and harm the wellbeing of many qualifying beneficiaries. An effective fraud detection method has become imperative to protect the welfare of the general public. In this article, we study fraud detection in the current year's Medicare claims data, drawing on temporal information from previous years. We group the data into temporal trajectories of the key covariates and base our feature engineering on these trajectories. For effective feature engineering on the temporal data, we propose functional principal component analysis (FPCA) to analyze the trajectories of the temporal covariates, and distributional FPCA to extract features from the empirical probability density curves of the covariates. Moreover, we introduce a cost-sensitive learning framework for analyzing the Medicare database that allows for asymmetric losses across the cells of the confusion matrix, so that the classification rule reflects the realistic tradeoff between the fixed cost and the fraud cost. Class imbalance in the database is handled by random undersampling. Our results indicate that the trained classifier predicts reasonably well and that accounting for the financial cost yields a substantial percentage of cost savings.
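The two classification ingredients named in the abstract, random undersampling and a cost-sensitive decision rule, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names are hypothetical, and the cost model is an assumption made here for illustration: reviewing a claim incurs a fixed cost, and a fraudulent claim that goes unreviewed incurs its fraud cost. Under that assumption, a claim is worth reviewing when its expected fraud loss exceeds the fixed cost.

```python
import random


def random_undersample(rows, labels, seed=0):
    """Keep every minority-class row plus an equal-sized random sample
    of the majority class (labels are 0 = legitimate, 1 = fraud)."""
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    kept = sorted(minority + rng.sample(majority, len(minority)))
    return [rows[i] for i in kept], [labels[i] for i in kept]


def flag_for_review(fraud_probability, fraud_cost, fixed_cost):
    """Assumed cost model: reviewing costs `fixed_cost`; skipping a
    fraudulent claim costs `fraud_cost`. Flag the claim when the
    expected loss from skipping exceeds the cost of reviewing."""
    return fraud_probability * fraud_cost > fixed_cost


def total_cost(labels, flags, fraud_costs, fixed_cost):
    """Total cost of a set of decisions under the same assumed model."""
    cost = 0.0
    for y, flagged, fraud_cost in zip(labels, flags, fraud_costs):
        if flagged:
            cost += fixed_cost   # every reviewed claim pays the fixed cost
        elif y == 1:
            cost += fraud_cost   # missed fraud pays the fraud cost
    return cost
```

With this rule the effective probability threshold is `fixed_cost / fraud_cost`, so it varies with the amount at stake instead of being fixed at 0.5. A claim with a 10% fraud probability is flagged when the fraud cost is 5000 and the fixed cost is 100, but not when the fraud cost is 500.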
Duke Scholars
Published In
DOI
EISSN
ISSN
Publication Date
Volume
Issue
Start / End Page
Related Subject Headings
- Information Systems
- 46 Information and computing sciences
- 08 Information and Computing Sciences