Summary


EXAMINATION OF SCORING RELIABILITY ACCORDING TO GENERALIZABILITY THEORY IN CHECKLIST, ANALYTIC RUBRIC AND RATING SCALES
The aim of this research is to examine inter-rater reliability in the context of generalizability (G) theory when the same performance tasks are rated by different raters using a checklist, a rating scale, and an analytic rubric. To this end, a checklist, a rating scale, and an analytic rubric were prepared to rate the story-writing skills of fifth-grade students. Six stories selected from those written by fifth-grade primary school students were rated by 45 different raters with the three scoring keys at intervals of 10-15 days. From the 45 raters participating in the study, 100 samples each were drawn with 2, 3, 5, and 10 raters. For the 400 samples obtained, inter-rater reliability was calculated according to G theory, and for the 100 samples in each condition, the median and standard error of the reliability estimates were computed. Examination of the medians showed that they increased as the number of raters and the number of score categories increased, except for the reliability estimates obtained when 5 raters used the checklist; the standard errors decreased as the number of raters increased, with the lowest standard error values obtained in the case of 10 raters. The reliability estimate reached its highest value when the number of raters was 5 and the number of categories was 2.
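The resampling procedure described above can be sketched in code. The following is a minimal illustration, assuming a fully crossed stories x raters (p x r) design and a relative G coefficient estimated from random-effects ANOVA mean squares; the function name `g_coefficient` and the simulated score matrix `full` are illustrative stand-ins, not the study's actual data or software.

```python
import numpy as np

def g_coefficient(scores):
    """Relative G coefficient for a fully crossed persons-x-raters design.

    scores: 2D array of shape (n_persons, n_raters).
    Variance components are estimated from the expected mean squares of a
    two-way random-effects ANOVA without replication.
    """
    n_p, n_r = scores.shape
    grand = scores.mean()
    ss_p = n_r * ((scores.mean(axis=1) - grand) ** 2).sum()   # persons
    ss_r = n_p * ((scores.mean(axis=0) - grand) ** 2).sum()   # raters
    ss_pr = ((scores - grand) ** 2).sum() - ss_p - ss_r       # interaction/residual
    ms_p = ss_p / (n_p - 1)
    ms_pr = ss_pr / ((n_p - 1) * (n_r - 1))
    var_pr = ms_pr                              # sigma^2(pr,e)
    var_p = max((ms_p - ms_pr) / n_r, 0.0)      # sigma^2(p), truncated at zero
    # Relative G coefficient: universe-score variance over itself plus the
    # relative error variance (interaction/residual divided by n_raters).
    return var_p / (var_p + var_pr / n_r)

rng = np.random.default_rng(42)
# Hypothetical scores standing in for the study's design: 6 stories scored
# by 45 raters (a story effect plus rater noise; purely illustrative).
full = rng.normal(0.0, 3.0, size=(6, 1)) + rng.normal(10.0, 2.0, size=(6, 45))

estimates = []
for _ in range(100):                                  # 100 samples per condition
    cols = rng.choice(45, size=5, replace=False)      # e.g. the 5-rater condition
    estimates.append(g_coefficient(full[:, cols]))
median_g = np.median(estimates)
se_g = np.std(estimates, ddof=1)   # spread of the 100 estimates
```

Repeating the same loop with sample sizes 2, 3, and 10 would reproduce the 400-sample design summarized above, with one median and standard error per condition.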

Keywords
Generalizability theory, inter-rater reliability, checklist, rating scale, analytic rubric.
