Severin classification system for evaluation of the results of operative treatment of congenital dislocation of the hip. A study of intraobserver and interobserver reliability.
The Severin classification system is frequently used to evaluate the radiographic results of operations performed for the treatment of congenital dislocation of the hip. However, to our knowledge, the reliability of this classification scheme has not been established. Ideally, a classification system should be validated before it is used to promote therapeutic guidelines or to compare results of treatment; the purpose of the present study was to establish the intraobserver and interobserver reliability of the Severin classification system. Four blinded raters and the operating surgeon independently used the Severin system to evaluate the most recent radiographs of thirty-seven children (fifty-six hips) who had been managed, an average of nine years previously, with a medial open reduction for congenital dislocation of the hip. Three of the raters evaluated the same radiographs again under similar testing circumstances eight weeks later. Ten paired interobserver comparisons (six between the blinded raters and four between the blinded raters and the operating surgeon) and three intraobserver comparisons were then analyzed with the Cohen kappa coefficient (kappa). The average kappa coefficient for the six pairwise comparisons between the four blinded raters was 0.15 (range, -0.05 to 0.42) when all Severin classes were analyzed independently. The average kappa coefficient for the four pairwise comparisons between the blinded raters and the operating surgeon was even lower (0.02). The kappa coefficients for the three intraobserver comparisons were 0.20, 0.38, and 0.44 (average, 0.34). Kappa analysis demonstrated variable and low levels of agreement when the Severin system was used to rate the results of operations performed for the treatment of congenital dislocation of the hip. We believe that the unadjusted kappa coefficient should indicate excellent agreement (kappa > 0.75) for all comparisons if this system is to be used for the evaluation of clinical results.
The unacceptably low levels of intraobserver and interobserver reliability call into question the clinical conclusions of reports in which the Severin system has been used as the basis of proof.
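For readers unfamiliar with the statistic, the following is a minimal sketch of how an unweighted Cohen kappa coefficient is computed for one pair of raters. It is not the authors' analysis code; the function name `cohen_kappa` and the example ratings are hypothetical and are not data from the study. It also shows why four blinded raters yield six pairwise comparisons.

```python
from collections import Counter
from itertools import combinations


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen kappa for two raters who classified the same items."""
    if len(ratings_a) != len(ratings_b) or len(ratings_a) == 0:
        raise ValueError("both raters must rate the same non-empty set of items")
    n = len(ratings_a)

    # Observed agreement: fraction of items given the same class by both raters.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Chance agreement: sum over classes of the product of each rater's
    # marginal proportion for that class.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)

    if expected == 1.0:
        # Both raters used one identical class for every item; kappa is undefined.
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1.0 - expected)


# Hypothetical ratings (class labels 1 to 3) for eight items, for illustration only.
rater_1 = [1, 1, 2, 2, 3, 3, 1, 2]
rater_2 = [1, 2, 2, 2, 3, 1, 1, 3]
kappa = cohen_kappa(rater_1, rater_2)

# Four raters give C(4, 2) = 6 distinct rater pairs.
blinded_pairs = list(combinations(["A", "B", "C", "D"], 2))
```

In this made-up example the raters agree on five of eight items (0.625), chance agreement is 22/64, and kappa is 3/7 (about 0.43), which would still fall well below the 0.75 threshold named in the abstract.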
Ward, WT; Vogt, M; Grudziak, JS; Tümer, Y; Cook, PC; Fitch, RD