Evaluation of unified harmonization of CT images across multiple tasks: A step towards AI generalizability.
BACKGROUND: In medical imaging, harmonization is pivotal for mitigating variability stemming from diverse imaging devices and protocols. Systematic harmonization helps ensure data consistency for artificial intelligence (AI) models and quantitative analyses, ultimately enabling more generalizable model performance. Virtual imaging trials (VITs) simulate diverse imaging conditions in silico, offering a unique opportunity to assess the impact of such variability on the performance of AI models and quantitative analyses, and to harmonize across these sources of variability.
PURPOSE: To investigate the performance of a physics-informed harmonizer of lung CT exams across multiple imaging tasks, using a VIT platform.
METHODS: We undertook a multi-objective evaluation comprising four task-based scenarios: lung structure segmentation, emphysema quantification, emphysema differential across time, and lung nodule quantification. Three simulated datasets were generated by virtually scanning XCAT phantoms: 160 images for lung structure segmentation, 504 images for emphysema quantification and differential, and 1,044 images for lung nodule quantification. A physics-informed deep neural network served as the unified harmonization model for all tasks. For AI-based segmentation, the data were split 70/30 into training and testing sets. Outcome metrics were the Dice coefficient and the 95th-percentile Hausdorff distance (HD95) for segmentation, the mean difference and standard deviation from Bland-Altman plots for emphysema quantification, the longitudinal harmonization metric (LHM) for longitudinal emphysema differential, and the intraclass correlation coefficient (ICC) of radiomics features and their distribution for lung nodule quantification.
RESULTS: The harmonization algorithm yielded better metrics (i.e., Dice score, HD95, mean difference, LHM, number of high ICCs), with marked improvements in image quantification. Relative to pre-harmonization, Dice scores improved by 1.0% and 11.0% and HD95 decreased by 5.2% and 32.0% for lung parenchyma and pulmonary vessel segmentation, respectively. Harmonization decreased variability by 2.5% to 50.0% in biomarkers and radiomics features for emphysema quantification. It also lowered LHMs by 12.2% to 82.4% in emphysema differential and increased the proportion of radiomics features with high ICC (ICC ≥ 0.75) by 4.6% and 19.7% for lung nodule quantification across scanner and kernel variabilities, respectively.
CONCLUSIONS: The results indicate that the VIT methodology can play a significant role in harmonizing and aligning variabilities across multiple task-based scenarios, underscoring the feasibility of synthetic dataset curation via VITs for model development. The study can also serve as a benchmark for the development of effective harmonizers.
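As a rough illustration of three of the outcome metrics named in the abstract, the sketch below computes a Dice coefficient for binary masks, the Bland-Altman mean difference and standard deviation for paired measurements, and the proportion of features with ICC at or above 0.75. It is a minimal sketch using NumPy, not the study's implementation: the function names, the toy inputs, and the choice of sample standard deviation (ddof=1) are all assumptions made for illustration.

```python
import numpy as np


def dice_coefficient(pred, ref):
    """Dice overlap between two binary masks (1.0 means identical masks)."""
    pred = np.asarray(pred, dtype=bool)
    ref = np.asarray(ref, dtype=bool)
    total = pred.sum() + ref.sum()
    if total == 0:
        # Assumption: two empty masks are treated as perfect agreement.
        return 1.0
    return float(2.0 * np.logical_and(pred, ref).sum() / total)


def bland_altman(a, b):
    """Mean difference (bias) and sample SD of the paired differences a - b."""
    diff = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    return float(diff.mean()), float(diff.std(ddof=1))


def high_icc_fraction(icc_values, threshold=0.75):
    """Fraction of features whose ICC meets the threshold (0.75 in the abstract)."""
    icc = np.asarray(icc_values, dtype=float)
    return float((icc >= threshold).mean())


if __name__ == "__main__":
    # Toy inputs only; these are not data from the study.
    print(dice_coefficient([[1, 1], [0, 0]], [[1, 0], [0, 0]]))
    print(bland_altman([1.0, 2.0, 3.0], [0.0, 2.0, 5.0]))
    print(high_icc_fraction([0.9, 0.75, 0.5, 0.7]))
```

Comparing each quantity before and after harmonization against the same reference would give the kind of pre/post differences the results report; HD95, LHM, and the ICC computation itself are left out because the abstract does not define them in enough detail to reproduce.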
Duke Scholars
Related Subject Headings
- Tomography, X-Ray Computed
- Phantoms, Imaging
- Nuclear Medicine & Medical Imaging
- Lung
- Image Processing, Computer-Assisted
- Humans
- Artificial Intelligence
- 5105 Medical and biological physics
- 4003 Biomedical engineering
- 1112 Oncology and Carcinogenesis