Sensitivity and specificity can change in opposite directions when new predictive markers are added to risk models.
When comparing prediction models, it is essential to estimate the magnitude of change in performance rather than rely solely on statistical significance. In this paper we investigate measures that estimate change in classification performance, assuming 2-group classification based on a single risk threshold. We study the value of a new biomarker when added to a baseline risk prediction model. First, simulated data are used to investigate the change in sensitivity and specificity (ΔSe and ΔSp). Second, the influence of ΔSe and ΔSp on the net reclassification improvement (NRI; sum of ΔSe and ΔSp) and on decision-analytic measures (net benefit or relative utility) is studied. We assume normal distributions for the predictors and assume correctly specified models such that the extended model has a dominating receiver operating characteristic curve relative to the baseline model. Remarkably, we observe that even when a strong marker is added it is possible that either sensitivity (for thresholds below the event rate) or specificity (for thresholds above the event rate) decreases. In these cases, decision-analytic measures provide more modest support for improved classification than NRI, even though all measures confirm that adding the marker improved classification accuracy. Our results underscore the necessity of reporting ΔSe and ΔSp separately. When a single summary is desired, decision-analytic measures allow for a simple incorporation of the misclassification costs.
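The setting described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' code. It assumes two independent markers that are normally distributed with unit variance in both outcome groups, so the correctly specified risk model is available in closed form and no model fitting is needed. The prevalence (0.10), the mean shifts (`mu_base`, `mu_new`), the threshold (0.02, below the event rate) and the function name `simulate` are all hypothetical choices for illustration. Net benefit uses the standard definition TP/n − FP/n × t/(1 − t), which is not spelled out in the abstract.

```python
import numpy as np


def simulate(n=200_000, prevalence=0.10, mu_base=1.0, mu_new=1.0,
             threshold=0.02, seed=0):
    """Compare a baseline and an extended risk model at one risk threshold.

    Illustrative assumptions (not taken from the paper): both markers are
    N(0, 1) in non-events and N(mu, 1) in events, and are independent.
    """
    rng = np.random.default_rng(seed)
    y = rng.random(n) < prevalence              # True = event
    x_base = rng.normal(loc=mu_base * y, scale=1.0)
    x_new = rng.normal(loc=mu_new * y, scale=1.0)

    # Under equal-variance normality the true log-odds are linear in the marker:
    # logit(risk) = logit(prevalence) + mu * x - mu^2 / 2
    prior_logit = np.log(prevalence / (1 - prevalence))
    logit_base = prior_logit + mu_base * x_base - mu_base**2 / 2
    logit_ext = logit_base + mu_new * x_new - mu_new**2 / 2

    # Classify as high risk when predicted risk exceeds the threshold
    cut = np.log(threshold / (1 - threshold))
    odds = threshold / (1 - threshold)          # weight of a false positive

    def performance(logit):
        high_risk = logit > cut
        se = high_risk[y].mean()                # sensitivity
        sp = (~high_risk[~y]).mean()            # specificity
        tp = np.sum(high_risk & y)
        fp = np.sum(high_risk & ~y)
        nb = tp / n - fp / n * odds             # net benefit
        return se, sp, nb

    se0, sp0, nb0 = performance(logit_base)
    se1, sp1, nb1 = performance(logit_ext)
    return {
        "delta_se": se1 - se0,
        "delta_sp": sp1 - sp0,
        "nri": (se1 - se0) + (sp1 - sp0),       # two-category NRI at one threshold
        "delta_nb": nb1 - nb0,
    }


result = simulate()
for name, value in result.items():
    print(f"{name}: {value:+.4f}")
```

With these assumed settings, sensitivity drops slightly while specificity rises, so the NRI and the change in net benefit are both positive. The net benefit gain is much smaller in absolute terms because false positives are weighted by the odds of the threshold. Raising `threshold` above the prevalence should reverse the pattern, with specificity decreasing instead.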
Van Calster, B; Steyerberg, EW; D'Agostino, RB; Pencina, MJ