Page 98 - Balancing between the present and the past
Chapter 4
The results of our D-study are shown in Figure 3. A scoring design in which one observer evaluates one lesson taught by a teacher yields Φ = .59 (poor reliability). This value increases to Φ = .72 when one observer evaluates two lessons taught by the same teacher. Because we are interested in using the instrument for research purposes and formative evaluations, the optimal scoring design would use either two observers who each evaluate two different lessons taught by the same teacher (Φ = .83) or three observers who each evaluate the same lesson taught by a teacher (Φ = .80).
Figure 3. Results of the D-study
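To make explicit why Φ rises as lessons or observers are added, the index of dependability can be sketched in its general generalizability-theory form. The expression below assumes a fully crossed teacher × lesson × observer design with random facets; that design, and the subscripts t (teacher), l (lesson), and o (observer), are illustrative assumptions and are not taken from the study, and no variance components reported by the study are reproduced here.

```latex
% Index of dependability for an assumed crossed t x l x o random design.
% \sigma^2_t is the teacher (universe-score) variance; the remaining
% terms together form the absolute error variance \sigma^2_\Delta.
\[
\Phi = \frac{\sigma^2_{t}}{\sigma^2_{t} + \sigma^2_{\Delta}},
\qquad
\sigma^2_{\Delta} =
  \frac{\sigma^2_{l}}{n_l}
+ \frac{\sigma^2_{o}}{n_o}
+ \frac{\sigma^2_{tl}}{n_l}
+ \frac{\sigma^2_{to}}{n_o}
+ \frac{\sigma^2_{lo}}{n_l n_o}
+ \frac{\sigma^2_{tlo,e}}{n_l n_o}
\]
```

Here n_l and n_o denote the number of lessons and observers per teacher in the scoring design. Increasing either number shrinks the error terms divided by it, so the absolute error variance falls and Φ moves closer to 1.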
4.6 Conclusion and discussion
The aim of the present study was to develop a reliable observation instrument and scoring design to assess how history teachers promote historical contextualization in classrooms. This study resulted in the FAT-HC observation instrument. Using