Development and validation of an ensemble machine learning framework for detection of all-cause advanced hepatic fibrosis: a retrospective cohort study.

Publication, Journal Article
Sarvestany, SS; Kwong, JC; Azhie, A; Dong, V; Cerocchi, O; Ali, AF; Karnam, RS; Kuriry, H; Shengir, M; Candido, E; Duchen, R; Sebastiani, G ...
Published in: Lancet Digit Health
March 2022

BACKGROUND: Cirrhosis is the result of advanced scarring (or fibrosis) of the liver, and is often diagnosed once decompensation with associated complications has occurred. Current non-invasive tests to detect advanced liver fibrosis have limited performance, with many indeterminate classifications. We aimed to identify patients with advanced liver fibrosis of all causes using machine learning algorithms (MLAs).
METHODS: In this retrospective study of routinely collected laboratory, clinical, and demographic data, we trained six MLAs (support vector machine, random forest classifier, gradient boosting classifier, logistic regression, artificial neural network, and an ensemble of all these algorithms) to detect advanced fibrosis using 1703 liver biopsies from patients seen at the Toronto Liver Clinic (TLC) between Jan 1, 2000, and Dec 20, 2014. Performance was validated using five datasets derived from patient data provided by the TLC (n=104 patients with a biopsy sample taken between March 24, 2014, and Dec 31, 2017) and McGill University Health Centre (MUHC; n=404). Patients with decompensated cirrhosis were excluded. Performance was benchmarked against aspartate aminotransferase-to-platelet ratio index (APRI), fibrosis-4 index (FIB-4), non-alcoholic fatty liver disease fibrosis score (NFS), transient elastography, and an independent panel of five hepatology experts (MB, GS, HK, KP, and RSK). MLA performance was evaluated using the area under the receiver operating characteristic curve (AUROC) and the percentage of determinate classifications.
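The two simplest benchmark indices named in METHODS have closed-form definitions, and they make the idea of a determinate classification concrete. The Python sketch below uses the standard published APRI and FIB-4 formulas. The dual cutoffs (1.45 and 3.25 for FIB-4) are commonly cited values used here purely for illustration; the abstract does not state which thresholds the study applied. The function names and the toy patient values are hypothetical.

```python
import math

def apri(ast_u_l, ast_uln_u_l, platelets_10e9_l):
    """APRI = (AST / upper limit of normal AST) / platelets (10^9/L) * 100."""
    return (ast_u_l / ast_uln_u_l) / platelets_10e9_l * 100.0

def fib4(age_years, ast_u_l, alt_u_l, platelets_10e9_l):
    """FIB-4 = (age * AST) / (platelets (10^9/L) * sqrt(ALT))."""
    return (age_years * ast_u_l) / (platelets_10e9_l * math.sqrt(alt_u_l))

def classify(score, low_cutoff, high_cutoff):
    """Dual-cutoff rule: scores between the two cutoffs are indeterminate."""
    if score < low_cutoff:
        return "rule-out"
    if score > high_cutoff:
        return "rule-in"
    return "indeterminate"

def percent_determinate(labels):
    """Share of patients (in %) who received a rule-in or rule-out result."""
    determinate = sum(1 for label in labels if label != "indeterminate")
    return 100.0 * determinate / len(labels)

# Hypothetical patients: (age, AST U/L, ALT U/L, platelets 10^9/L).
patients = [(55, 80, 64, 150), (50, 40, 36, 200), (30, 20, 25, 250)]

# Assumed illustrative FIB-4 cutoffs; not taken from the abstract.
FIB4_LOW, FIB4_HIGH = 1.45, 3.25

labels = [classify(fib4(*p), FIB4_LOW, FIB4_HIGH) for p in patients]
print(labels, round(percent_determinate(labels), 1))
```

A model that always outputs a binary prediction has, by construction, no indeterminate band, which is why the percentage of determinate classifications is reported alongside AUROC.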
FINDINGS: The best MLA was an ensemble algorithm of support vector machine, random forest classifier, gradient boosting classifier, logistic regression, and neural network algorithms, which achieved 100% determinate classifications (95% CI 100·0-100·0), an AUROC score of 0·870 (95% CI 0·797-0·931) on the TLC validation set (fibrosis stages F0 and F1 vs F4), and an AUROC of 0·716 (95% CI 0·664-0·766) on the MUHC validation set (fibrosis stages F0, F1, and F2 vs F3 and F4). The ensemble MLA outperformed all routinely used biomarkers and achieved comparable performance to hepatologists as measured by AUROC and percentage of indeterminate classifications in both the TLC validation dataset (APRI AUROC score 0·719 [95% CI 0·611-0·820], 83·7% determinate [95% CI 76·0-90·4]; FIB-4 AUROC score 0·825 [95% CI 0·730-0·912], 72·1% determinate [95% CI 63·5-80·8]) and the MUHC validation dataset (APRI AUROC score 0·618 [95% CI 0·548-0·691], 75·5% determinate [95% CI 71·5-79·2]; FIB-4 AUROC score 0·717 [95% CI 0·652-0·776], 75·5% determinate [95% CI 71·3-79·7]), and achieved only slightly lower AUROC than transient elastography (0·773 [95% CI 0·699-0·834] vs 0·826 [95% CI 0·758-0·889]).
INTERPRETATION: We have shown that an ensemble MLA outperforms non-imaging-based methods in detecting advanced fibrosis across different causes of liver disease. Our MLA was superior to APRI, FIB-4, and NFS with no indeterminate classifications, while achieving performance comparable to an independent panel of experts. MLAs using routinely collected data could identify patients at high risk of advanced hepatic fibrosis and cirrhosis among patients with chronic liver disease, allowing intervention before onset of decompensation.
FUNDING: Toronto General Hospital Foundation.
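Every headline figure in the abstract is an AUROC paired with a 95% CI. As a minimal sketch of how such a pair can be obtained, the Python below computes AUROC from pairwise comparisons and a percentile bootstrap interval. The abstract does not say how the study derived its intervals, so the bootstrap here is an assumption, and the function names and toy data are hypothetical.

```python
import random

def auroc(y_true, y_score):
    """Probability that a random positive scores above a random negative
    (ties count as one half); equivalent to the area under the ROC curve."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_ci(y_true, y_score, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI: resample patients with replacement,
    recompute AUROC each time, and take the alpha/2 and 1-alpha/2 quantiles."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        labels = [y_true[i] for i in idx]
        if len(set(labels)) < 2:
            continue  # skip resamples that contain only one class
        stats.append(auroc(labels, [y_score[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper

# Toy data: 1 = advanced fibrosis on biopsy, score = model probability.
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
y_score = [0.10, 0.30, 0.35, 0.60, 0.40, 0.70, 0.80, 0.90]

point = auroc(y_true, y_score)
lower, upper = bootstrap_ci(y_true, y_score)
print(point, lower, upper)
```

With small validation sets the interval is wide, which is consistent with the broader CIs reported for the TLC set (n=104) than for the MUHC set (n=404).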


Published In

Lancet Digit Health

DOI

10.1016/S2589-7500(21)00270-3

EISSN

2589-7500

Publication Date

March 2022

Volume

4

Issue

3

Start / End Page

e188 / e199

Location

England

Related Subject Headings

  • Retrospective Studies
  • Machine Learning
  • Liver Cirrhosis
  • Humans
  • Fibrosis
  • Aspartate Aminotransferases
  • 4203 Health services and systems
 

Citation

APA: Sarvestany, S. S., Kwong, J. C., Azhie, A., Dong, V., Cerocchi, O., Ali, A. F., … Bhat, M. (2022). Development and validation of an ensemble machine learning framework for detection of all-cause advanced hepatic fibrosis: a retrospective cohort study. Lancet Digit Health, 4(3), e188–e199. https://doi.org/10.1016/S2589-7500(21)00270-3
Chicago: Sarvestany, Soren Sabet, Jeffrey C. Kwong, Amirhossein Azhie, Victor Dong, Orlando Cerocchi, Ahmed Fuad Ali, Ravikiran S. Karnam, et al. “Development and validation of an ensemble machine learning framework for detection of all-cause advanced hepatic fibrosis: a retrospective cohort study.” Lancet Digit Health 4, no. 3 (March 2022): e188–99. https://doi.org/10.1016/S2589-7500(21)00270-3.
ICMJE: Sarvestany SS, Kwong JC, Azhie A, Dong V, Cerocchi O, Ali AF, et al. Development and validation of an ensemble machine learning framework for detection of all-cause advanced hepatic fibrosis: a retrospective cohort study. Lancet Digit Health. 2022 Mar;4(3):e188–99.
MLA: Sarvestany, Soren Sabet, et al. “Development and validation of an ensemble machine learning framework for detection of all-cause advanced hepatic fibrosis: a retrospective cohort study.” Lancet Digit Health, vol. 4, no. 3, Mar. 2022, pp. e188–99. PubMed, doi:10.1016/S2589-7500(21)00270-3.
NLM: Sarvestany SS, Kwong JC, Azhie A, Dong V, Cerocchi O, Ali AF, Karnam RS, Kuriry H, Shengir M, Candido E, Duchen R, Sebastiani G, Patel K, Goldenberg A, Bhat M. Development and validation of an ensemble machine learning framework for detection of all-cause advanced hepatic fibrosis: a retrospective cohort study. Lancet Digit Health. 2022 Mar;4(3):e188–e199.