Interobserver Reproducibility of the PI-RADS Version 2 Lexicon: A Multicenter Study of Six Experienced Prostate Radiologists.

Journal Article (Journal Article;Multicenter Study)

Purpose: To determine the interobserver reproducibility of the Prostate Imaging Reporting and Data System (PI-RADS) version 2 lexicon.

Materials and Methods: This retrospective HIPAA-compliant study was institutional review board-approved. Six radiologists from six separate institutions, all experienced in prostate magnetic resonance (MR) imaging, assessed prostate MR imaging examinations performed at a single center by using the PI-RADS lexicon. Readers were provided screen captures that denoted the location of one specific lesion per case. Analysis entailed two sessions (40 and 80 examinations per session) and an intersession training period for individualized feedback and group discussion. Percent agreement (fraction of pairwise reader combinations with concordant readings) was compared between sessions. κ coefficients were computed.

Results: No substantial difference in interobserver agreement was observed between sessions, and the sessions were subsequently pooled. Agreement for PI-RADS score of 4 or greater was 0.593 in peripheral zone (PZ) and 0.509 in transition zone (TZ). In PZ, reproducibility was moderate to substantial for features related to diffusion-weighted imaging (κ = 0.535-0.619); fair to moderate for features related to dynamic contrast material-enhanced (DCE) imaging (κ = 0.266-0.439); and fair for definite extraprostatic extension on T2-weighted images (κ = 0.289). In TZ, reproducibility for features related to lesion texture and margins on T2-weighted images ranged from 0.136 (moderately hypointense) to 0.529 (encapsulation). Among 63 lesions that underwent targeted biopsy, classification as PI-RADS score of 4 or greater by a majority of readers yielded tumor with a Gleason score of 3+4 or greater in 45.9% (17 of 37), without missing any tumor with a Gleason score of 3+4 or greater.

Conclusion: Experienced radiologists achieved moderate reproducibility for PI-RADS version 2, and neither required nor benefitted from a training session. Agreement tended to be better in PZ than TZ, although it was weak for DCE in PZ. The findings may help guide future PI-RADS lexicon updates.

© RSNA, 2016. Online supplemental material is available for this article.
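
The abstract defines percent agreement as the fraction of pairwise reader combinations with concordant readings, and it classifies a lesion by the majority of readers. A minimal Python sketch of both ideas follows. It is illustrative only and is not the authors' analysis code: the function names, the binary encoding (1 for PI-RADS score of 4 or greater, 0 otherwise), the pooling of reader pairs across all cases, and the toy data are assumptions made for this example.

```python
from itertools import combinations


def pairwise_percent_agreement(readings):
    """Fraction of reader pairs, pooled over all cases, whose calls match.

    `readings` is a list of cases; each case is a list holding one call per
    reader. Pooling pairs across cases is an assumption of this sketch.
    """
    concordant = 0
    total = 0
    for case in readings:
        # Every unordered pair of readers for this case (15 pairs for 6 readers).
        for a, b in combinations(case, 2):
            total += 1
            if a == b:
                concordant += 1
    return concordant / total


def majority_positive(case):
    """True when strictly more than half of the readers made a positive call."""
    return sum(case) > len(case) / 2


# Hypothetical toy data: two cases, six readers each (not study data).
toy_readings = [
    [1, 1, 1, 1, 1, 1],  # all six readers agree
    [1, 1, 1, 0, 0, 0],  # readers split three versus three
]
agreement = pairwise_percent_agreement(toy_readings)
```

With six readers there are 15 reader pairs per case, so a three-versus-three split leaves only 6 of 15 pairs concordant. A strict majority is used here, so an even split does not count as a positive classification; the abstract does not state how ties were handled.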

Cited Authors

  • Rosenkrantz, AB; Ginocchio, LA; Cornfeld, D; Froemming, AT; Gupta, RT; Turkbey, B; Westphalen, AC; Babb, JS; Margolis, DJ

Published Date

  • September 2016

Published In

  • Radiology

Volume / Issue

  • 280 / 3

Start / End Page

  • 793 - 804

PubMed ID

  • 27035179

Pubmed Central ID

  • PMC5006735

Electronic International Standard Serial Number (EISSN)

  • 1527-1315

Digital Object Identifier (DOI)

  • 10.1148/radiol.2016152542


Language

  • eng

Conference Location

  • United States