
Interpretable and Performant Multimodal Nasopharyngeal Carcinoma GTV Segmentation with Clinical Priors Guided 3D-Gaussian-Prompted Diffusion Model (3DGS-PDM)

Publication, Journal Article
Zhu, J; Ma, Z; Ren, G; Cai, J
Published in: Cancers
November 1, 2025

Background: Gross tumor volume (GTV) segmentation of nasopharyngeal carcinoma (NPC) largely determines the precision of image-guided radiation therapy (IGRT) for NPC. Compared with other cancers, clinical delineation of NPC is especially challenging because of its capricious infiltration of the rich adjacent tissues and bones, and identifying its ambiguous tumor boundary routinely requires multimodal information from CT and MRI series. However, conventional deep learning-based multimodal segmentation methods suffer from limited prediction accuracy and frequently perform no better than, or even worse than, single-modality segmentation models. This limited multimodal performance indicates defective information extraction and integration from the input channels. This study aims to develop a 3D-Gaussian-Prompted Diffusion Model (3DGS-PDM) for more clinically targeted information extraction and more effective multimodal information integration, thereby facilitating more accurate and clinically interpretable GTV segmentation for NPC. Methods: We propose 3DGS-PDM, which performs NPC tumor contouring from multimodal clinical priors through a guided stepwise process. The proposed model contains two modules: a Gaussian Initialization Module that uses a 3D-Gaussian-Splatting technique to distill 3D-Gaussian representations based on clinical priors from CT, MRI-t2, and MRI-t1-contrast-enhanced-fat-suppression (MRI-t1-cefs), respectively, and a Diffusion Segmentation Module that generates the tumor segmentation step by step from the fused 3D-Gaussian prompts. We retrospectively collected paired CT and MRI series with clinical GTV annotations for 600 NPC patients from four hospitals and divided the dataset into 480 training volumes and 120 testing volumes.
Results: Our proposed method achieves a mean Dice similarity coefficient (DSC) of 84.29 ± 7.33, a mean average symmetric surface distance (ASSD) of 1.31 ± 0.63, and a 95th percentile Hausdorff distance (HD95) of 4.76 ± 1.98 on primary NPC tumor (GTVp) segmentation, and a DSC of 79.25 ± 10.01, an ASSD of 1.19 ± 0.72, and an HD95 of 4.76 ± 1.71 on metastatic NPC tumor (GTVnd) segmentation. Comparative experiments further demonstrate that our method significantly improves multimodal segmentation performance on NPC tumors, outperforming five other state-of-the-art methods. Visual evaluation of the segmentation prediction process and a three-step ablation study on input channels further demonstrate the interpretability of the proposed method. Conclusions: This study proposes a performant and interpretable multimodal segmentation method for the GTV of NPC, contributing greatly to improved precision in NPC treatment.
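
For readers unfamiliar with the DSC reported in the Results, the following is a minimal sketch of how a Dice similarity coefficient can be computed between a predicted mask and a reference mask. It is not the authors' evaluation code, which is not shown in this record. The function name `dice_coefficient`, the flat 0/1 list representation of voxels, the toy labels, and the convention of returning 1.0 for two empty masks are all assumptions made for illustration. The final line scales the value by 100 on the assumption that the reported DSC figures are on a percentage scale.

```python
def dice_coefficient(pred, truth):
    """Dice similarity coefficient between two binary masks.

    Both masks are flat sequences of 0/1 voxel labels of equal length
    (a hypothetical representation chosen to keep the sketch dependency-free).
    """
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    # Voxels labelled as tumor in both masks.
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    # Total tumor voxels across the two masks.
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * intersection / total


# Toy example with 8 voxels; the labels are made up for illustration.
predicted = [0, 1, 1, 1, 0, 0, 1, 0]
reference = [0, 1, 1, 0, 0, 1, 1, 0]
dsc = dice_coefficient(predicted, reference)
print(f"DSC = {100 * dsc:.2f}")  # prints "DSC = 75.00"
```

In the toy example each mask has four tumor voxels and three of them overlap, so the DSC is 2 × 3 / (4 + 4) = 0.75. Surface-based metrics such as ASSD and HD95 additionally depend on voxel spacing and on how mask surfaces are extracted, so they are not sketched here.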

Duke Scholars

Published In

Cancers

DOI

EISSN

2072-6694

Publication Date

November 1, 2025

Volume

17

Issue

22

Related Subject Headings

  • 3211 Oncology and carcinogenesis
  • 1112 Oncology and Carcinogenesis
 
