Observation is an important part of much pedagogical research and is the basis of supervision and inspection. In the socialist countries, however, pedagogy has so far paid little attention to the extent to which errors affect the results of observation, or to the question of which statistical methods should be used to verify the reliability of standardized observation.
In this summary study, which is based on statistical works published abroad, the author suggests a way of solving this interesting methodological problem. First, definitions are given of the basic concepts (standardized observation, observational technique, rating scale, sign system, category system). Then three equivalent definitions of the coefficient of reliability are given. The reliability of observation is found to depend on the observer himself, on the observational technique chosen, and on the circumstances of observation. The study also discusses how many times a teacher should be observed for the results obtained to be reliable.
A detailed analysis is made of statistical methods suitable for measuring the reliability of rating scales and of sign and category systems (including the conditions for use and the recommended values for each of the coefficients). For rating scales, the recommended methods are Kendall’s coefficient of agreement W and Taylor’s coefficient of agreement (in the case of ordinal scales) or the intraclass correlation coefficient (in the case of interval scales). For sign and category systems, the recommended methods (depending on various conditions specified in Table 2) are the coefficient of marginal agreement, Flanders’s modification of coefficient π, Cohen’s coefficient kappa, and Light’s modification of coefficient kappa. A brief explanation is also given of the principle of Cronbach’s theory of generalizability.
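As an illustration of one of the coefficients named above, the following is a minimal sketch of Cohen’s coefficient kappa for two observers who code the same sequence of classroom events with a category system. It uses the standard textbook formula (observed agreement corrected for chance agreement derived from each observer’s marginal frequencies) and is not taken from the study itself. The function name `cohen_kappa`, the category labels, and the data are hypothetical.

```python
from collections import Counter


def cohen_kappa(codes_a, codes_b):
    """Cohen's kappa for two observers coding the same events (hypothetical helper)."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("both observers must code the same non-empty set of events")
    n = len(codes_a)

    # Observed agreement: share of events placed in the same category by both observers.
    p_o = sum(a == b for a, b in zip(codes_a, codes_b)) / n

    # Chance agreement, computed from each observer's marginal category frequencies.
    freq_a = Counter(codes_a)
    freq_b = Counter(codes_b)
    p_e = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)

    if p_e == 1.0:
        raise ValueError("kappa is undefined when both observers use one identical category")
    return (p_o - p_e) / (1 - p_e)


# Hypothetical codings of ten classroom events (illustrative data only).
observer_1 = ["lecture", "question", "lecture", "praise", "question",
              "lecture", "lecture", "praise", "question", "lecture"]
observer_2 = ["lecture", "question", "lecture", "praise", "lecture",
              "lecture", "question", "praise", "question", "lecture"]

kappa = cohen_kappa(observer_1, observer_2)
print(round(kappa, 3))  # about 0.677: 80 % raw agreement, 38 % expected by chance
```

In this example the two observers agree on 8 of 10 events, yet kappa is noticeably lower than 0.8 because part of that agreement would be expected by chance alone. Whether kappa or one of the other coefficients is appropriate depends on the conditions the study specifies.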
In conclusion, users are reminded that there is no "best" or "universal" statistical method for calculating the reliability of observation; the method to choose is the one most suitable for the case in question.