Forty-two Million Ways to Describe Pain: Topic Modeling of 200,000 PubMed Pain-Related Abstracts Using Natural Language Processing and Deep Learning-Based Text Generation.

Publication, Journal Article
Tighe, PJ; Sannapaneni, B; Fillingim, RB; Doyle, C; Kent, M; Shickel, B; Rashidi, P
Published in: Pain Med
November 1, 2020

OBJECTIVE: Recent efforts to update the definitions and taxonomic structure of concepts related to pain have revealed opportunities to better quantify topics of existing pain research subject areas.

METHODS: Here, we apply basic natural language processing (NLP) analyses on a corpus of >200,000 abstracts published on PubMed under the medical subject heading (MeSH) of "pain" to quantify the topics, content, and themes on pain-related research dating back to the 1940s.

RESULTS: The most common stemmed terms included "pain" (601,122 occurrences), "patient" (508,064 occurrences), and "studi-" (208,839 occurrences). Contrarily, terms with the highest term frequency-inverse document frequency included "tmd" (6.21), "qol" (6.01), and "endometriosis" (5.94). Using the vector-embedded model of term definitions available via the "word2vec" technique, the most similar terms to "pain" included "discomfort," "symptom," and "pain-related." For the term "acute," the most similar terms in the word2vec vector space included "nonspecific," "vaso-occlusive," and "subacute"; for the term "chronic," the most similar terms included "persistent," "longstanding," and "long-standing." Topic modeling via Latent Dirichlet analysis identified peak coherence (0.49) at 40 topics. Network analysis of these topic models identified three topics that were outliers from the core cluster, two of which pertained to women's health and obstetrics and were closely connected to one another, yet considered distant from the third outlier pertaining to age. A deep learning-based gated recurrent units abstract generation model successfully synthesized several unique abstracts with varying levels of believability, with special attention and some confusion at lower temperatures to the roles of placebo in randomized controlled trials.

CONCLUSIONS: Quantitative NLP models of published abstracts pertaining to pain may point to trends and gaps within pain research communities.
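The contrast in the results between the most frequent terms and the terms with the highest term frequency-inverse document frequency (TF-IDF) can be illustrated with a minimal sketch. This is not the authors' pipeline: the three toy "abstracts", the whitespace tokenizer, the function names, and the choice of `log(N / df)` as the inverse document frequency are all assumptions made for illustration, and the scores it produces are not comparable to the values reported in the abstract.

```python
import math
from collections import Counter

# Toy stand-ins for abstracts (hypothetical; not data from the study).
docs = [
    "pain patient study pain",
    "patient pain tmd tmd",
    "pain study patient",
]

def tokenize(text):
    """Lowercase and split on whitespace (no stemming in this sketch)."""
    return text.lower().split()

def term_counts(documents):
    """Total occurrences of each term across the whole corpus."""
    counts = Counter()
    for doc in documents:
        counts.update(tokenize(doc))
    return counts

def idf(documents):
    """Inverse document frequency, assumed here to be log(N / df)."""
    n = len(documents)
    df = Counter()
    for doc in documents:
        df.update(set(tokenize(doc)))  # count each term once per document
    return {term: math.log(n / d) for term, d in df.items()}

def max_tfidf(documents):
    """Highest per-document TF-IDF score observed for each term."""
    weights = idf(documents)
    best = {}
    for doc in documents:
        tf = Counter(tokenize(doc))
        for term, count in tf.items():
            score = count * weights[term]
            best[term] = max(best.get(term, 0.0), score)
    return best

counts = term_counts(docs)
scores = max_tfidf(docs)

print(counts.most_common(2))        # terms that dominate raw counts
print(max(scores, key=scores.get))  # term that is most distinctive
```

In this toy corpus "pain" has the highest raw count but a TF-IDF score of zero, because it appears in every document, while the rarer "tmd" scores highest. That is the same qualitative pattern the results describe.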

Published In

Pain Med

DOI

10.1093/pm/pnaa061

EISSN

1526-4637

Publication Date

November 1, 2020

Volume

21

Issue

11

Start / End Page

3133 / 3160

Location

England

Related Subject Headings

  • Publications
  • PubMed
  • Pain
  • Natural Language Processing
  • Humans
  • Female
  • Deep Learning
  • Anesthesiology
  • 5203 Clinical and health psychology
  • 4203 Health services and systems
 

Citation

Tighe, P. J., Sannapaneni, B., Fillingim, R. B., Doyle, C., Kent, M., Shickel, B., & Rashidi, P. (2020). Forty-two Million Ways to Describe Pain: Topic Modeling of 200,000 PubMed Pain-Related Abstracts Using Natural Language Processing and Deep Learning-Based Text Generation. Pain Med, 21(11), 3133–3160. https://doi.org/10.1093/pm/pnaa061
Tighe, Patrick J., Bharadwaj Sannapaneni, Roger B. Fillingim, Charlie Doyle, Michael Kent, Ben Shickel, and Parisa Rashidi. “Forty-two Million Ways to Describe Pain: Topic Modeling of 200,000 PubMed Pain-Related Abstracts Using Natural Language Processing and Deep Learning-Based Text Generation.” Pain Med 21, no. 11 (November 1, 2020): 3133–60. https://doi.org/10.1093/pm/pnaa061.
Tighe, Patrick J., et al. “Forty-two Million Ways to Describe Pain: Topic Modeling of 200,000 PubMed Pain-Related Abstracts Using Natural Language Processing and Deep Learning-Based Text Generation.” Pain Med, vol. 21, no. 11, Nov. 2020, pp. 3133–60. Pubmed, doi:10.1093/pm/pnaa061.
Tighe PJ, Sannapaneni B, Fillingim RB, Doyle C, Kent M, Shickel B, Rashidi P. Forty-two Million Ways to Describe Pain: Topic Modeling of 200,000 PubMed Pain-Related Abstracts Using Natural Language Processing and Deep Learning-Based Text Generation. Pain Med. 2020 Nov 1;21(11):3133–3160.