Page 96 - Balancing between the present and the past

                                Chapter 4
lessons, and o represents the number of observers. To determine how many observers and lessons a scoring design needs to achieve acceptable reliability, we conducted a D-study using the results of the earlier G-study that estimated the reliability of our instrument.
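The logic of such a D-study can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it assumes a fully crossed teacher x lesson x observer design and the standard relative generalizability coefficient from generalizability theory, which may differ from the formula used in this chapter (for example, if lessons are nested within teachers). The function names and all variance components below are hypothetical placeholders, not the values reported in Table 14.

```python
from itertools import product


def g_coefficient(var_t, var_tl, var_to, var_tlo_e, n_lessons, n_observers):
    """Relative G coefficient for an assumed crossed t x l x o design.

    var_t      variance between teachers (the object of measurement)
    var_tl     teacher-by-lesson interaction variance
    var_to     teacher-by-observer interaction variance
    var_tlo_e  three-way interaction confounded with residual error
    """
    relative_error = (
        var_tl / n_lessons
        + var_to / n_observers
        + var_tlo_e / (n_lessons * n_observers)
    )
    return var_t / (var_t + relative_error)


def d_study(components, max_lessons=4, max_observers=4):
    """Return the coefficient for every (lessons, observers) combination."""
    grid = product(range(1, max_lessons + 1), range(1, max_observers + 1))
    return {
        (n_l, n_o): g_coefficient(**components, n_lessons=n_l, n_observers=n_o)
        for n_l, n_o in grid
    }


# Hypothetical variance components, for illustration only.
components = {"var_t": 0.50, "var_tl": 0.10, "var_to": 0.05, "var_tlo_e": 0.20}
results = d_study(components)

for (n_l, n_o), coef in sorted(results.items()):
    print(f"lessons={n_l} observers={n_o} coefficient={coef:.2f}")
```

Reading the resulting grid shows which combinations of lessons and observers reach a chosen reliability threshold, which is the decision a D-study is meant to support.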
4.5 Results
4.5.1 The instrument’s dimensionality
Based on our theoretical framework, we consider our instrument to be one-dimensional because all items should measure teachers’ ability to promote historical contextualization. An initial data analysis indicated that five items (“the teacher asks evaluative questions,” “the teacher uses classroom discussion,” “the teacher uses group work,” “the teacher compares phenomena with the present,” and “the students compare phenomena with the present”) correlated weakly (< .30) with the other items. These five items also had a standard deviation above 1.00 and were excluded from further data analysis, resulting in a total of 40 items in the final version of the FAT-HC observation instrument (see Appendix D).
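A screening step of this kind can be sketched as follows. The sketch assumes that "correlation with the other items" is operationalized as a corrected item-total (item-rest) correlation, which the text does not specify; the function names, the item labels, and the toy ratings are all hypothetical. An item with constant ratings would cause a division by zero in this simplified version.

```python
from statistics import mean, stdev


def pearson(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def screen_items(scores, min_r=0.30):
    """Report item-rest correlation and SD for each item.

    scores maps an item label to its ratings, one rating per observation.
    An item is flagged when its correlation with the summed ratings of
    all other items falls below min_r.
    """
    report = {}
    for name, ratings in scores.items():
        others = [vals for other, vals in scores.items() if other != name]
        rest = [sum(column) for column in zip(*others)]
        r = pearson(ratings, rest)
        report[name] = {
            "item_rest_r": r,
            "sd": stdev(ratings),
            "flag": r < min_r,
        }
    return report


# Toy ratings (hypothetical): four items scored on four observations.
toy_scores = {
    "item_a": [1, 2, 3, 4],
    "item_b": [2, 2, 3, 4],
    "item_c": [1, 3, 3, 4],
    "item_d": [3, 1, 4, 2],
}
report = screen_items(toy_scores)

for name, stats in report.items():
    print(name, round(stats["item_rest_r"], 2), round(stats["sd"], 2), stats["flag"])
```

In this toy data only `item_d` falls below the .30 threshold; the standard deviation is reported alongside so that both criteria mentioned above can be inspected together.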
To further explore the instrument’s dimensionality, we conducted a G-study at the item level with seven facets in a crossed design, using the data collected from the five observers who each evaluated two lessons taught by each of five teachers (50 observations in total). If our instrument is, in fact, one-dimensional, the item facet should explain the largest share of the overall variance, and the other facets (including the interaction effects) should explain smaller shares (e.g., Brennan, 2001; Shavelson & Webb, 1991). As shown in Table 13, the item facet was responsible for most of the variance (47.25%), indicating that our instrument is one-dimensional with regard to observing how history teachers promote historical contextualization in classrooms.
4.5.2 The instrument’s reliability
To determine the reliability of our instrument, a new G-study was conducted using the same data set (50 observations). The analysis was conducted on the final version of our observation instrument, which consisted of 40 items (see Appendix D). Table 14 displays the results of this G-study and presents the variance decomposition to assess the instrument’s reliability. A reliable instrument should have a high proportion of the variance explained by differences between the observed teachers and a low proportion of the variance explained by lessons and observers.