Using supervised machine learning classifiers to estimate likelihood of participating in clinical trials of a de-identified version of ResearchMatch

Publication, Journal Article
Vazquez, J; Abdelrahman, S; Byrne, LM; Russell, M; Harris, P; Facelli, JC
Published in: Journal of Clinical and Translational Science
January 1, 2021

Introduction: Lack of participation in clinical trials (CTs) is a major barrier to the evaluation of new pharmaceuticals and devices. Here we report the results of the analysis of a dataset from ResearchMatch, an online clinical registry, using supervised machine learning approaches and a deep learning approach to discover characteristics of individuals more likely to show an interest in participating in CTs.

Methods: We trained six supervised machine learning classifiers (Logistic Regression (LR), Decision Tree (DT), Gaussian Naïve Bayes (GNB), K-Nearest Neighbor Classifier (KNC), AdaBoost Classifier (ABC), and Random Forest Classifier (RFC)), as well as a deep learning method, a Convolutional Neural Network (CNN), using a dataset of 841,377 instances and 20 features, including demographic data, geographic constraints, medical conditions and ResearchMatch visit history. Our outcome variable was each participant's response (‘yes’ or ‘no’) indicating interest when presented with an invitation to a specific clinical trial opportunity. Furthermore, we created four subsets from this dataset based on top self-reported medical conditions and gender, which were separately analysed.

Results: The deep learning model outperformed the machine learning classifiers, achieving an area under the curve (AUC) of 0.8105.

Conclusions: The results provide sufficient evidence of meaningful correlations between the predictor variables and the outcome variable in the datasets analysed using the supervised machine learning classifiers. These approaches show promise in identifying individuals who may be more likely to participate when offered an opportunity for a clinical trial.
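The abstract reports model quality as an area under the curve (AUC) but does not say how it was computed. As a minimal sketch, assuming the figure refers to the ROC AUC for the binary ‘yes’/‘no’ outcome, the metric can be read as the probability that a randomly chosen ‘yes’ response receives a higher model score than a randomly chosen ‘no’ response. The function name `roc_auc` and the toy labels and scores below are hypothetical and are not taken from the ResearchMatch dataset.

```python
def roc_auc(labels, scores):
    """ROC AUC via pairwise comparison of positive and negative scores.

    labels: iterable of 1 ('yes') or 0 ('no').
    scores: model scores, higher meaning more likely 'yes'.
    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, for illustration only.
labels = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.4, 0.7, 0.3, 0.2, 0.6, 0.8, 0.1]
auc = roc_auc(labels, scores)
print(auc)  # 14 of 16 positive/negative pairs are ranked correctly: 0.875
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect ranking. The pairwise loop is quadratic in the number of instances, so on a dataset of the size described a rank-based or library implementation would be the practical choice.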

Published In

Journal of Clinical and Translational Science

DOI

10.1017/cts.2020.535

EISSN

2059-8661

Publication Date

January 1, 2021

Volume

5

Issue

1
 

Citation

APA: Vazquez, J., Abdelrahman, S., Byrne, L. M., Russell, M., Harris, P., & Facelli, J. C. (2021). Using supervised machine learning classifiers to estimate likelihood of participating in clinical trials of a de-identified version of ResearchMatch. Journal of Clinical and Translational Science, 5(1). https://doi.org/10.1017/cts.2020.535
Chicago: Vazquez, J., S. Abdelrahman, L. M. Byrne, M. Russell, P. Harris, and J. C. Facelli. “Using supervised machine learning classifiers to estimate likelihood of participating in clinical trials of a de-identified version of ResearchMatch.” Journal of Clinical and Translational Science 5, no. 1 (January 1, 2021). https://doi.org/10.1017/cts.2020.535.
ICMJE: Vazquez J, Abdelrahman S, Byrne LM, Russell M, Harris P, Facelli JC. Using supervised machine learning classifiers to estimate likelihood of participating in clinical trials of a de-identified version of ResearchMatch. Journal of Clinical and Translational Science. 2021 Jan 1;5(1).
MLA: Vazquez, J., et al. “Using supervised machine learning classifiers to estimate likelihood of participating in clinical trials of a de-identified version of ResearchMatch.” Journal of Clinical and Translational Science, vol. 5, no. 1, Jan. 2021. Scopus, doi:10.1017/cts.2020.535.
NLM: Vazquez J, Abdelrahman S, Byrne LM, Russell M, Harris P, Facelli JC. Using supervised machine learning classifiers to estimate likelihood of participating in clinical trials of a de-identified version of ResearchMatch. Journal of Clinical and Translational Science. 2021 Jan 1;5(1).