To evaluate the reliability of manual annotation for quantifying corneal anatomical and microbial keratitis (MK) morphological features on slit-lamp photography (SLP) images.
Prospectively enrolled patients with MK underwent SLP at their initial encounter at 2 academic eye hospitals. Patients who presented with an epithelial defect (ED) were eligible for analysis. Two physicians independently annotated the following features on SLP images: ED, corneal limbus (L), pupil (P), stromal infiltrate (SI), white blood cell (WBC) infiltration at the SI edge, and hypopyon (H). Intraclass correlation coefficients (ICCs) were used to assess reliability; Dice similarity coefficients (DSCs) were used to assess the area overlap between annotations.
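As an illustration of the two statistics named above, the following is a minimal sketch, not the study's analysis code. It assumes each annotation has been rasterized to a binary mask (nested lists of 0/1) and that the annotated area of a feature is the number of mask pixels. The abstract does not state which ICC form was used, so the two-way random-effects, absolute-agreement, single-measure form, ICC(2,1), is an assumption here; the function names `dice_coefficient` and `icc_2_1` are hypothetical.

```python
from typing import List, Sequence


def dice_coefficient(mask_a: Sequence[Sequence[int]],
                     mask_b: Sequence[Sequence[int]]) -> float:
    """DSC = 2|A intersect B| / (|A| + |B|) for two same-shaped binary masks."""
    if len(mask_a) != len(mask_b) or any(
        len(row_a) != len(row_b) for row_a, row_b in zip(mask_a, mask_b)
    ):
        raise ValueError("masks must have the same shape")
    a = [bool(v) for row in mask_a for v in row]
    b = [bool(v) for row in mask_b for v in row]
    overlap = sum(1 for x, y in zip(a, b) if x and y)
    total = sum(a) + sum(b)
    # Assumption: two empty masks are treated as perfect agreement.
    return 1.0 if total == 0 else 2.0 * overlap / total


def icc_2_1(ratings: Sequence[Sequence[float]]) -> float:
    """ICC(2,1) from an n-subjects by k-raters table of measurements.

    Assumed form: two-way random effects, absolute agreement, single measure.
    """
    n, k = len(ratings), len(ratings[0])
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need a complete table with n >= 2 rows and k >= 2 columns")
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means: List[float] = [sum(row) / k for row in ratings]
    col_means: List[float] = [sum(row[j] for row in ratings) / n for j in range(k)]
    # Mean squares from the two-way ANOVA decomposition.
    ms_rows = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    ms_cols = n * sum((m - grand) ** 2 for m in col_means) / (k - 1)
    ss_err = sum(
        (ratings[i][j] - row_means[i] - col_means[j] + grand) ** 2
        for i in range(n) for j in range(k)
    )
    ms_err = ss_err / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


if __name__ == "__main__":
    # Toy masks for one feature drawn by two graders (illustrative values only).
    grader_1 = [[1, 1, 0], [1, 1, 0], [0, 0, 0]]
    grader_2 = [[1, 1, 0], [1, 0, 0], [0, 0, 0]]
    print("DSC:", dice_coefficient(grader_1, grader_2))

    # Toy area table: one row per image, one column per grader.
    areas = [[120.0, 118.0], [45.0, 50.0], [300.0, 290.0], [80.0, 85.0]]
    print("ICC(2,1):", icc_2_1(areas))
```

In this sketch the DSC compares where two annotations fall on the image, whereas the ICC compares only the resulting area values, so the two statistics can diverge for the same feature.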
Seventy-five patients with MK and an ED underwent SLP. DSCs indicated good to fair annotation overlap between graders (L = 0.97, P = 0.80, ED = 0.94, SI = 0.82, H = 0.82, WBC = 0.83) and between repeat annotations by the same grader (L = 0.97, P = 0.81, ED = 0.94, SI = 0.85, H = 0.84, WBC = 0.82). ICCs showed good intergrader (L = 0.98, P = 0.78, ED = 1.00, SI = 0.67, H = 0.97, WBC = 0.86) and intragrader (L = 0.99, P = 0.92, ED = 0.99, SI = 0.94, H = 0.99, WBC = 0.92) reliability. When reliability statistics for annotated SI area were recalculated in the subset of cases in which both graders agreed on the presence or absence of WBC infiltration, intergrader ICC improved to 0.91 and intergrader DSC to 0.86; intragrader ICC remained the same, whereas intragrader DSC improved to 0.87.
These results indicate that area quantification by manual annotation is useful in the evaluation of MK. However, variability is intrinsic to the task, so annotation protocols need to be optimized. Future directions may include using multiple annotators per image or automated annotation software.