Automated characterization of perceptual quality of clinical chest radiographs: validation and calibration to observer preference.
PURPOSE: The authors previously proposed an image-based technique [Y. Lin et al., Med. Phys. 39, 7019-7031 (2012)] to assess the perceptual quality of clinical chest radiographs. Here, an observer study was designed and conducted to validate the output of the program against rankings by expert radiologists and to establish the ranges of output values that reflect acceptable image appearance, so that the program output can be used for image quality optimization and tracking.
METHODS: Using an IRB-approved protocol, 2500 clinical chest radiographs (PA/AP) were collected from our clinical operation. The images were processed through our perceptual quality assessment program to measure their appearance in terms of ten metrics of perceptual image quality: lung gray level, lung detail, lung noise, rib-lung contrast, rib sharpness, mediastinum detail, mediastinum noise, mediastinum alignment, subdiaphragm-lung contrast, and subdiaphragm area. From the results, 18 images were selected for each targeted appearance attribute/metric such that the images presented a relatively constant appearance with respect to all metrics except the targeted one. The images were then incorporated into a graphical user interface, which displayed them in three panels of six in random order. Using a DICOM-calibrated diagnostic display workstation under low ambient lighting, each of five participating attending chest radiologists was asked to spatially order the images based only on the targeted appearance attribute, regardless of the other qualities. Once the images were ordered, the observer also indicated the range of image appearances that he/she considered clinically acceptable. The observer data were analyzed in terms of interobserver variability and the correlations between the observer and algorithmic rankings. An observer-averaged acceptable image appearance was also statistically derived for each quality attribute from the collected individual acceptable ranges.
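The ranking analysis described in METHODS can be illustrated with a minimal sketch. It assumes, hypothetically, that each observer's ordering is stored as a list of rank positions, that the averaged observer ranking is the per-image mean of those positions, and that interobserver agreement is summarized as the mean pairwise Kendall's tau (tau-a, no ties). The abstract does not state these details, and the rankings below are invented for illustration, not study data.

```python
from itertools import combinations
from math import sqrt


def pearson_r(x, y):
    """Linear (Pearson) correlation coefficient between two sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def kendall_tau(x, y):
    """Kendall's tau-a for two rankings of the same items (assumes no ties)."""
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical example: six images ranked by the algorithm and by three observers.
algorithmic_rank = [1, 2, 3, 4, 5, 6]
observer_ranks = [
    [1, 2, 3, 4, 5, 6],
    [1, 3, 2, 4, 5, 6],
    [2, 1, 3, 4, 6, 5],
]

# Averaged observer ranking: mean rank position of each image across observers.
mean_observer_rank = [sum(col) / len(col) for col in zip(*observer_ranks)]

# Correlation of the averaged observer ranking with the algorithmic ranking.
r = pearson_r(mean_observer_rank, algorithmic_rank)

# Interobserver concordance: mean Kendall's tau over all observer pairs.
pairwise_tau = [kendall_tau(a, b) for a, b in combinations(observer_ranks, 2)]
mean_tau = sum(pairwise_tau) / len(pairwise_tau)

print(f"R = {r:.3f}, mean pairwise tau = {mean_tau:.3f}")
```

In the study itself this computation would be repeated per quality attribute over the 18 selected images and five observers.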
RESULTS: The observer study indicated that, for each image quality attribute, the averaged observer ranking strongly correlated with the algorithmic ranking (linear correlation coefficient R ≥ 0.92), with the highest correlation (R = 1) for lung gray level and the lowest (R = 0.92) for mediastinum noise. The observers' rankings were strongly concordant (Kendall's tau > 0.84). The observers also generally indicated similar tolerance and preference levels in terms of acceptable ranges, as 85% of the values were close to the overall tolerance or preference levels, with differences smaller than 0.15.
CONCLUSIONS: The observer study indicates that the previously proposed technique provides a robust reflection of perceptual image quality in clinical images. The results established the range of algorithmic outputs for each metric that can be used to quantitatively assess and qualify the appearance quality of clinical chest radiographs.
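As a rough illustration of how an observer-averaged acceptable range might be derived and then used for quality tracking, the sketch below averages the lower and upper bounds indicated by each observer and tests a metric value against the result. The abstract says only that the range was "statistically derived", so the simple mean used here is an assumption, and the function names and numeric values are hypothetical.

```python
from statistics import mean


def averaged_acceptable_range(ranges):
    """Average per-observer (low, high) bounds into one consensus range.

    Assumption: a plain arithmetic mean of the bounds; the study's actual
    statistical derivation is not specified in the abstract.
    """
    lows, highs = zip(*ranges)
    return mean(lows), mean(highs)


def is_acceptable(value, acceptable_range):
    """Return True if a metric value falls inside the consensus range."""
    low, high = acceptable_range
    return low <= value <= high


# Hypothetical acceptable ranges from five observers for one metric
# (arbitrary units; not data from the study).
observer_ranges = [
    (0.30, 0.70),
    (0.25, 0.65),
    (0.35, 0.75),
    (0.30, 0.72),
    (0.28, 0.68),
]

consensus = averaged_acceptable_range(observer_ranges)

# Flag a new image whose algorithmic output falls outside the consensus range.
for metric_value in (0.50, 0.90):
    status = "acceptable" if is_acceptable(metric_value, consensus) else "out of range"
    print(f"{metric_value:.2f}: {status}")
```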
Related Subject Headings
- Radiography, Thoracic
- Quality Control
- Observer Variation
- Nuclear Medicine & Medical Imaging
- Humans
- Calibration
- Automation
- 5105 Medical and biological physics
- 4003 Biomedical engineering
- 1112 Oncology and Carcinogenesis