Establishing Inter- and Intrarater Reliability for High-Stakes Testing Using Simulation.

Journal Article

This article describes a standardized training method for establishing the inter- and intrarater reliability of a group of raters for high-stakes testing. Simulation is increasingly used for high-stakes testing, but research into developing inter- and intrarater reliability among raters has been lacking. Eleven raters were trained using a standardized methodology. Raters scored 28 student videos over a six-week period, then rescored all videos over a two-day period to establish both intra- and interrater reliability. One rater demonstrated poor intrarater reliability; a second rater failed all students. Kappa statistics improved from the moderate to the substantial agreement range when the two outlier raters' scores were excluded. Some faculty may, for different reasons, be unsuitable for inclusion in high-stakes testing evaluations. All faculty are content experts, but not all are expert evaluators.

Cited Authors

  • Kardong-Edgren, S; Oermann, MH; Rizzolo, MA; Odom-Maryon, T

Published Date

  • March 2017

Published In

Volume / Issue

  • 38 / 2

Start / End Page

  • 63 - 68

PubMed ID

  • 29194298

International Standard Serial Number (ISSN)

  • 1536-5026

Digital Object Identifier (DOI)

  • 10.1097/01.nep.0000000000000114


Language

  • eng