Demographic reporting in biosignal datasets: a comprehensive analysis of the PhysioNet open access database.

Publication, Journal Article
Jiang, S; Ashar, P; Shandhi, MMH; Dunn, J
Published in: The Lancet. Digital health
November 2024

The PhysioNet open access database (PND) is one of the world's largest and most comprehensive repositories of biosignal data and is widely used by researchers to develop, train, and validate algorithms. To contextualise the results of such algorithms, understanding the underlying demographic distribution of the data is crucial, specifically the race, ethnicity, sex or gender, and age of study participants. We sought to understand the underlying reporting patterns and characteristics of the demographic data of the datasets available on PND. Of the 181 unique datasets present in the PND as of July 6, 2023, 175 involved human participants, with less than 7% of studies reporting on all four of the key demographic variables. Furthermore, we found a higher rate of reporting sex or gender and age than race and ethnicity. In the studies that did include participant sex or gender, the samples were mostly male. Additionally, we found that most studies were done in North America, particularly in the USA. These imbalances and poor reporting of representation raise concerns regarding potential embedded biases in the algorithms that rely on these datasets. They also underscore the need for universal and comprehensive reporting practices to ensure equitable development and deployment of artificial intelligence and machine learning tools in medicine.

Published In

The Lancet. Digital health

DOI

10.1016/s2589-7500(24)00170-5

EISSN

2589-7500

ISSN

2589-7500

Publication Date

November 2024

Volume

6

Issue

11

Start / End Page

e871 / e878

Related Subject Headings

  • Racial Groups
  • Male
  • Humans
  • Female
  • Ethnicity
  • Demography
  • Databases, Factual
  • Algorithms
  • Adult
  • Access to Information
 

Citation

APA
Jiang, S., Ashar, P., Shandhi, M. M. H., & Dunn, J. (2024). Demographic reporting in biosignal datasets: a comprehensive analysis of the PhysioNet open access database. The Lancet. Digital Health, 6(11), e871–e878. https://doi.org/10.1016/s2589-7500(24)00170-5

Chicago
Jiang, Sarah, Perisa Ashar, Md Mobashir Hasan Shandhi, and Jessilyn Dunn. “Demographic reporting in biosignal datasets: a comprehensive analysis of the PhysioNet open access database.” The Lancet. Digital Health 6, no. 11 (November 2024): e871–78. https://doi.org/10.1016/s2589-7500(24)00170-5.

ICMJE
Jiang S, Ashar P, Shandhi MMH, Dunn J. Demographic reporting in biosignal datasets: a comprehensive analysis of the PhysioNet open access database. The Lancet Digital health. 2024 Nov;6(11):e871–8.

MLA
Jiang, Sarah, et al. “Demographic reporting in biosignal datasets: a comprehensive analysis of the PhysioNet open access database.” The Lancet. Digital Health, vol. 6, no. 11, Nov. 2024, pp. e871–78. Epmc, doi:10.1016/s2589-7500(24)00170-5.

NLM
Jiang S, Ashar P, Shandhi MMH, Dunn J. Demographic reporting in biosignal datasets: a comprehensive analysis of the PhysioNet open access database. The Lancet Digital health. 2024 Nov;6(11):e871–e878.