Establishing Inter- and Intrarater Reliability for High-Stakes Testing Using Simulation.
This article reports one approach to developing a standardized training method for establishing the inter- and intrarater reliability of a group of raters for high-stakes testing. Simulation is used increasingly for high-stakes testing, but without research into the development of inter- and intrarater reliability for raters. Eleven raters were trained using a standardized methodology. Raters scored 28 student videos over a six-week period. Raters then rescored all videos over a two-day period to establish both intra- and interrater reliability. One rater demonstrated poor intrarater reliability; a second rater failed all students. Kappa statistics improved from the moderate to the substantial agreement range when the two outlier raters' scores were excluded. There may be faculty who, for different reasons, should not be included in high-stakes testing evaluations. All faculty are content experts, but not all are expert evaluators.
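To make the kappa statistic mentioned in the abstract concrete, the following is a minimal sketch of Cohen's kappa applied to one rater's first and second scoring of the same videos (an intrarater check). The abstract does not say which kappa variant or software the authors used, so the choice of Cohen's kappa, the function names, the pass/fail labels, and the ten sample scores are all illustrative assumptions, not the study's method or data. The agreement bands follow the commonly cited Landis and Koch labels, which is where terms such as "moderate" and "substantial" usually come from.

```python
import math
from collections import Counter


def cohen_kappa(first, second):
    """Cohen's kappa for two equal-length lists of categorical ratings."""
    if len(first) != len(second) or not first:
        raise ValueError("need two non-empty rating lists of equal length")
    n = len(first)
    # Proportion of items on which the two scoring passes agree.
    observed = sum(a == b for a, b in zip(first, second)) / n
    # Agreement expected by chance, from each pass's category frequencies.
    counts_1, counts_2 = Counter(first), Counter(second)
    expected = sum(counts_1[k] * counts_2[k] for k in counts_1) / (n * n)
    if expected == 1.0:
        # Every rating in both passes is the same single category
        # (for example, a rater who fails everyone): kappa is undefined.
        return math.nan
    return (observed - expected) / (1 - expected)


def landis_koch_label(kappa):
    """Map a kappa value to the commonly cited Landis and Koch bands."""
    if math.isnan(kappa):
        return "undefined"
    if kappa < 0:
        return "poor"
    for upper, label in [(0.20, "slight"), (0.40, "fair"), (0.60, "moderate"),
                         (0.80, "substantial"), (1.00, "almost perfect")]:
        if kappa <= upper:
            return label
    return "almost perfect"


# Hypothetical scores for one rater on ten videos (NOT data from the study).
first_pass = ["pass"] * 6 + ["fail"] * 4
second_pass = ["pass"] * 5 + ["fail"] * 5

kappa = cohen_kappa(first_pass, second_pass)
print(round(kappa, 2), landis_koch_label(round(kappa, 2)))
```

A rater who assigns the same score to every student on both passes produces an undefined kappa in this sketch, which is one reason such a rater's scores can distort a group-level agreement statistic.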
Related Subject Headings
- Videotape Recording
- Simulation Training
- Reproducibility of Results
- Nursing
- Humans
- Faculty, Nursing
- Educational Measurement
- Education, Nursing
- 4205 Nursing
- 1302 Curriculum and Pedagogy