An automatic laryngoscopic image segmentation system based on SAM prompt engineering: from glottis annotation to vocal fold segmentation.

Publication, Journal Article
Zhang, Y; Song, Y; Liu, J; Li, M
Published in: Frontiers in molecular biosciences
January 2025

Laryngeal high-speed video (HSV) is a widely used technique for diagnosing laryngeal diseases. Among various analytical approaches, segmentation of glottis regions has proven effective in evaluating vocal fold vibration patterns and detecting related disorders. However, the specific task of vocal fold segmentation remains underexplored in the literature.

In this study, we propose a novel automatic vocal fold segmentation system that relies solely on glottis information. The system leverages prompt engineering techniques tailored for the Segment Anything Model (SAM). Specifically, vocal fold-related features are extracted from U-Net-generated glottis masks, which are enhanced via brightness contrast adjustment and morphological closing. A coarse bounding box of the laryngeal region is also produced using the YOLO-v5 model. These components are integrated to form a bounding box prompt. Furthermore, a point prompt is derived by identifying local extrema in the first derivative of grayscale intensity along lines intersecting the glottis, offering additional guidance on vocal fold locations.

Experimental evaluation demonstrates that our method, which does not require labeled vocal fold training data, achieves competitive segmentation performance. The proposed approach reaches a Dice Coefficient of 0.91, which is comparable to fully supervised methods.

Our results suggest that it is feasible to achieve accurate vocal fold segmentation using only glottis-based prompts and without supervised vocal fold annotations. Extracted features on the resulting masks further validate the effectiveness of the proposed system. To encourage further research, we release our code at: https://github.com/yucongzh/Laryngoscopic-Image-Segmentation-Toolkit.
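
Two quantities named in the abstract can be illustrated with a short sketch: the Dice coefficient used to score a predicted mask against a reference mask, and the search for local extrema in the first derivative of a grayscale intensity profile. This is not the authors' code (their implementation is in the repository linked above). The function names, the nested-list mask format, and the assumption that intensities have already been sampled along a line crossing the glottis are all hypothetical choices made for illustration.

```python
def dice_coefficient(pred, target):
    """Dice = 2 * |A and B| / (|A| + |B|) for two binary masks (nested lists)."""
    a = [bool(v) for row in pred for v in row]
    b = [bool(v) for row in target for v in row]
    if len(a) != len(b):
        raise ValueError("masks must have the same number of pixels")
    total = sum(a) + sum(b)
    if total == 0:
        return 1.0  # assumed convention: two empty masks count as a perfect match
    overlap = sum(x and y for x, y in zip(a, b))
    return 2.0 * overlap / total


def first_derivative(profile):
    """Central differences in the interior, one-sided differences at both ends."""
    n = len(profile)
    if n < 2:
        raise ValueError("profile needs at least two samples")
    d = [0.0] * n
    d[0] = float(profile[1] - profile[0])
    d[-1] = float(profile[-1] - profile[-2])
    for i in range(1, n - 1):
        d[i] = (profile[i + 1] - profile[i - 1]) / 2.0
    return d


def derivative_extrema(profile):
    """Indices where the first derivative has a strict local maximum or minimum.

    `profile` is a hypothetical 1-D list of grayscale values sampled along a
    line that intersects the glottis. Flat stretches of the derivative are
    ignored by this simple test.
    """
    d = first_derivative(profile)
    return [
        i
        for i in range(1, len(d) - 1)
        if (d[i] - d[i - 1]) * (d[i + 1] - d[i]) < 0
    ]


# Usage: a dark-to-bright step has its steepest rise at index 3.
edge_indices = derivative_extrema([10, 10, 10, 50, 90, 90, 90])
score = dice_coefficient([[1, 1], [0, 0]], [[1, 0], [0, 0]])
```

In this toy profile the derivative peaks where the intensity changes fastest, which is the kind of location a point prompt could be placed near. How the actual system samples its lines and selects among candidate extrema is not specified in the abstract.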

Published In

Frontiers in molecular biosciences

DOI

10.3389/fmolb.2025.1616271

EISSN

2296-889X

ISSN

2296-889X

Publication Date

January 2025

Volume

12

Start / End Page

1616271

Related Subject Headings

  • 3205 Medical biochemistry and metabolomics
  • 3101 Biochemistry and cell biology
 

Citation

APA: Zhang, Y., Song, Y., Liu, J., & Li, M. (2025). An automatic laryngoscopic image segmentation system based on SAM prompt engineering: from glottis annotation to vocal fold segmentation. Frontiers in Molecular Biosciences, 12, 1616271. https://doi.org/10.3389/fmolb.2025.1616271
Chicago: Zhang, Yucong, Yuchen Song, Juan Liu, and Ming Li. “An automatic laryngoscopic image segmentation system based on SAM prompt engineering: from glottis annotation to vocal fold segmentation.” Frontiers in Molecular Biosciences 12 (January 2025): 1616271. https://doi.org/10.3389/fmolb.2025.1616271.
MLA: Zhang, Yucong, et al. “An automatic laryngoscopic image segmentation system based on SAM prompt engineering: from glottis annotation to vocal fold segmentation.” Frontiers in Molecular Biosciences, vol. 12, Jan. 2025, p. 1616271. Epmc, doi:10.3389/fmolb.2025.1616271.
