Unsupervised clustering using multiple correspondence analysis reveals clinically-relevant demographic variables across multiple gastrointestinal cancers

Publication, Journal Article
Kramer, RJ; Rhodin, KE; Therien, A; Raman, V; Eckhoff, A; Thompson, C; Tong, BC; Blazer, DG; Lidsky, ME; D'Amico, T; Nussbaum, DP
Published in: Surgical Oncology Insight
March 1, 2024

Objective: Patients with gastrointestinal malignancies represent a heterogeneous population, even among those with similar stage and treatment pathways. Here, we used dimensionality reduction in the National Cancer Database (NCDB) to inform unsupervised clustering of patients with three gastrointestinal malignancies and examined outcomes among these computationally derived groups.

Methods: The NCDB was queried for three cohorts of patients receiving multimodal therapy: stage II/III esophageal cancer, stage II/III gastric cancer, and stage III colon cancer. Multiple correspondence analysis (MCA), a dimensionality reduction technique well-suited for categorical variables such as demographic data in the NCDB, was performed on each cohort with variables including demographic and tumor characteristics. Principal components were analyzed to derive clusters. Outcomes for each cluster were compared using Kaplan-Meier survival methods.

Results: For esophageal (n = 11,399), gastric (n = 2,033), and colon (n = 72,057) cancer, the same four variables were identified as highly representative: income quartile, education quartile, age quartile, and insurance type. Survival analysis demonstrated significant differences in overall survival between clusters in esophageal (p < 0.0001) and colon (p < 0.0001) cancer, but not gastric cancer (p = 0.56). Clusters defined by high income, high education, younger age, and private insurance fared better.

Conclusions: Using MCA, we identified combinations of four demographic variables among NCDB patients with stage II/III esophageal cancer, stage II/III gastric cancer, and stage III colon cancer. These groupings had significantly different survival outcomes in colon and esophageal cancer. This work serves as proof-of-concept for the utility of unsupervised clustering for outcomes research in surgical malignancies and identifies at-risk populations.
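The Kaplan-Meier comparison named in the Methods can be illustrated with a minimal sketch. The code below is not the authors' analysis: the function name `kaplan_meier`, the cluster labels, and the follow-up values are all hypothetical toy data, not NCDB records. It only shows how a product-limit survival curve is computed per cluster from follow-up times and censoring flags.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Kaplan-Meier estimate of S(t) at each distinct event time.

    times  -- follow-up time per patient (any consistent unit, e.g. months)
    events -- 1 if death was observed at that time, 0 if censored
    Returns a list of (time, survival_probability) pairs.
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    leaving = Counter(times)  # deaths and censorings both leave the risk set
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]
    return curve


# Hypothetical toy follow-up data (months) for two made-up clusters.
toy_clusters = {
    "cluster_a": ([6, 12, 12, 24, 36], [1, 0, 1, 0, 1]),
    "cluster_b": ([3, 6, 9, 12, 18], [1, 1, 1, 0, 1]),
}

for name, (times, events) in toy_clusters.items():
    steps = ", ".join(f"S({t})={s:.2f}" for t, s in kaplan_meier(times, events))
    print(f"{name}: {steps}")
```

The abstract does not state which test produced the reported p-values, so this sketch stops at the curves themselves. In practice a survival analysis library would likely be used in place of a hand-written estimator.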

Published In: Surgical Oncology Insight
DOI: 10.1016/j.soi.2024.100009
EISSN: 2950-2470
Publication Date: March 1, 2024
Volume: 1
Issue: 1

Citation

APA: Kramer, R. J., Rhodin, K. E., Therien, A., Raman, V., Eckhoff, A., Thompson, C., … Nussbaum, D. P. (2024). Unsupervised clustering using multiple correspondence analysis reveals clinically-relevant demographic variables across multiple gastrointestinal cancers. Surgical Oncology Insight, 1(1). https://doi.org/10.1016/j.soi.2024.100009

Chicago: Kramer, R. J., K. E. Rhodin, A. Therien, V. Raman, A. Eckhoff, C. Thompson, B. C. Tong, et al. “Unsupervised clustering using multiple correspondence analysis reveals clinically-relevant demographic variables across multiple gastrointestinal cancers.” Surgical Oncology Insight 1, no. 1 (March 1, 2024). https://doi.org/10.1016/j.soi.2024.100009.

ICMJE: Kramer RJ, Rhodin KE, Therien A, Raman V, Eckhoff A, Thompson C, et al. Unsupervised clustering using multiple correspondence analysis reveals clinically-relevant demographic variables across multiple gastrointestinal cancers. Surgical Oncology Insight. 2024 Mar 1;1(1).

MLA: Kramer, R. J., et al. “Unsupervised clustering using multiple correspondence analysis reveals clinically-relevant demographic variables across multiple gastrointestinal cancers.” Surgical Oncology Insight, vol. 1, no. 1, Mar. 2024. Scopus, doi:10.1016/j.soi.2024.100009.

NLM: Kramer RJ, Rhodin KE, Therien A, Raman V, Eckhoff A, Thompson C, Tong BC, Blazer DG, Lidsky ME, D’Amico T, Nussbaum DP. Unsupervised clustering using multiple correspondence analysis reveals clinically-relevant demographic variables across multiple gastrointestinal cancers. Surgical Oncology Insight. 2024 Mar 1;1(1).
