Interesting but seemingly rather depressing study on differentiated instruction based on data

I found this study by Faber et al. via this tweet by Paul Bruno, who highlighted a one-sentence summary:

https://twitter.com/MrPABruno/status/898549534623539204

The study itself is very interesting and relevant, and more complicated than that one sentence suggests. One personal frustration, though: every single time the authors used the abbreviation DI, I had to remind myself that it does not refer to Direct Instruction but, in this case, stands for differentiated instruction.

What is this study about?

In this study, the relationship between differentiated instruction, as an element of data-based decision making, and student achievement was examined. Classroom observations (n = 144) were used to measure teachers’ differentiated instruction practices and to predict the mathematical achievement of 2nd- and 5th-grade students (n = 953). The analysis of classroom observation data was based on a combination of generalizability theory and item response theory, and student achievement effects were determined by means of multilevel analysis. No significant positive effects were found for differentiated instruction practices. Furthermore, findings showed that students in low-ability groups profited less from differentiated instruction than students in average or high-ability groups. Nevertheless, the findings, data collection, and data-analysis procedures of this study contribute to the study of classroom observation and the measurement of differentiated instruction.

This insight makes things even worse: low-ability groups profited less from differentiated instruction than students in average or high-ability groups.

Let’s dig a bit deeper. Some important definitions used in the study:

Data-based decision making (DBDM): “Teachers, principals, and administrators systematically collecting and analyzing data to guide a range of decisions to help improve the success of students and schools” (Ikemoto and Marsh, 2007)

On differentiated instruction:

First, DI is planned, and instructional decisions should be based on the analysis of student data. Second, what makes DI observable in the classroom is the variation in learning goals, instruction content, instruction time, assignments, and learning materials aimed at addressing varying learning needs. In the present study, we tested whether these DI characteristics explain student achievement.

And now for the results:

…the findings of the generalizability study showed that, even though most variance was explained by differences between teachers, there was much variability between the lessons of the same teacher. These observation time effects were also found in a study by Praetorius, Pauli, Reusser, Rakoczy, and Klieme (2014), and such findings indicate that more research is needed on how valid and representative teacher observation scores can be obtained. Furthermore, our findings indicated that students from different ability groups do not profit from DI to the same extent. This finding is in line with previous research: Ability grouping can have a negative impact on the achievement of students in low-ability groups, ability grouping is effective for students in average-ability groups, and ability grouping has no impact on the achievement of students in high-ability groups (Lou et al., 1996; Saleh et al., 2005). In future research, it would be worth investigating whether lower teacher expectations, less stimulating learning materials, and a lack of self-regulation skills among low-performing students (Campbell, 2014; Hong et al., 2012; Nomi, 2009; Wiliam & Bartholomew, 2004) could explain the negative impact of DI on the achievement of students in low-ability groups. Furthermore, we expected that students taught by teachers who differentiate their instruction more, or by teachers who plan DI more, have higher student achievement levels. No such positive effects were found. A reverse causality between DI and student achievement (i.e., DI practices are executed more in classrooms with many low-performing students and a very diverse student population) might be an explanation for this finding (De Neve & Devos, 2016; Nomi, 2009). Another explanation might be the impact of DI on noncognitive outcomes such as students’ feelings of competence (Carver & Scheier, 1990). 
Especially for students in low-ability groups, there might have been an impact on noncognitive outcomes, and consequently on student achievement. Also, these findings may suggest that planning differentiation strategies in advance should always be combined with responsive ad hoc classroom differentiation practices. It may be that a balance between preplanned instruction and responsive teaching is most effective (Sawyer, 2004). In future studies, such effects should be studied to explain better how DBDM affects student achievement.

This makes the result a bit less surprising. As noted in this paragraph, we already know there are possible issues with ability grouping. At the same time, this study does give a lot of food for thought when looking at differentiation.

A small note: like every study, this one has its limitations. Two of those limitations were, for me, a bit surprising at first:

the relationship between DBDM and DI was not examined. Based on the DBDM literature, it was assumed that DBDM could result in more data-based DI practices in classrooms and that, if this is the case, student achievement would consequently improve. If DBDM does not result in more data-based DI practices, then DI does not explain (potential) student achievement growth. So, our findings would have contributed more to our understanding of how DBDM influences achievement, if the relationship between DBDM and DI could also have been examined.

In addition to this, the Focus intervention was based on the DBDM literature and not on the DI literature. As a result, some effective DI practices unfortunately were not included in the intervention.

So does this all mean that we shouldn’t differentiate in education? No, but it does suggest being careful not to get your hopes up too high when talking about data as a basis for differentiation. It still depends on what you do with that data. Or: more data doesn’t necessarily make something work that didn’t work that well before (cf. ability grouping).