I confess: when I first read the title of this article (“A national megastudy shows that email nudges to elementary school teachers boost student math achievement”) and the accompanying press release, alarm bells immediately started ringing. Big words, big numbers, big promises. And one well-known name in the author list, Angela Duckworth, has previously been associated with claims that later required substantial nuance. That is not a reason to dismiss the study outright, but it is a reason to read it very carefully. So let us judge the research on its own merits. And if you sense a certain irritation in what follows, that is no accident.
What is this study actually about? More than 140,000 American teachers who use the online platform Zearn Math received one email per week for four weeks. There were fifteen different versions of these emails, plus a control group that received a standard reminder. Some messages referenced concrete data about a teacher’s own class; others remained generic. The researchers did not examine test scores or mathematical understanding, but focused on a single outcome: how many online lessons students completed on the platform during those four weeks. The most effective email increased this number by an average of 0.09 lessons per month. This difference was statistically significant.
Let us begin with what the study does well. Methodologically, this is impressive work. More than 140,000 teachers and nearly three million students were randomly assigned to fifteen different email interventions, following a preregistered analysis plan. This is not a small laboratory experiment, but large-scale field research in real schools, with real teachers and real students. That deserves credit. The authors are also transparent about technical problems: for more than 11% of teachers, email delivery failed. They explicitly correct for multiple testing and for the so-called winner’s curse, the statistical phenomenon whereby the best-performing of many conditions almost always looks somewhat better than it truly is. This is careful and honest research.
But then comes the key question: what is the effect?
The best-performing intervention, in which teachers received a weekly email with a link to a personalised dashboard showing their students’ progress, led to an average increase of 0.09 lessons over four weeks. That corresponds to about a five per cent increase, or 3.3 per cent after statistical correction. In absolute terms, this means a change from roughly 1.78 to 1.87 lessons in four weeks. This is what is presented as “boosting student achievement”.
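A quick sanity check, using only the 0.09-lesson gain and the 1.78-lesson baseline quoted above (not any figures beyond what is reported here):

$$\frac{0.09}{1.78} \approx 0.051 \approx 5\%, \qquad 1.78 + 0.09 \approx 1.87 \text{ lessons per four weeks.}$$

“About five per cent” and “less than a tenth of a lesson” are, in other words, two descriptions of the same marginal gain.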
And this is precisely where the big words become problematic. The result is statistically significant, but substantively extremely small. We are talking about less than a tenth of an extra online lesson per month. The effect size is d = 0.02. In most practical conversations, this would be rounded down to zero. The authors themselves acknowledge this and describe the effect as “surprisingly small”.
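To see why d = 0.02 is so small, recall that Cohen’s d expresses the raw difference in units of the outcome’s standard deviation. Taking the 0.09-lesson gain at face value (the paper’s actual standard deviation is not quoted here, so this is a rough back-of-the-envelope inference rather than a reported figure):

$$d = \frac{\bar{x}_{\text{treatment}} - \bar{x}_{\text{control}}}{s} = 0.02 \;\Rightarrow\; s \approx \frac{0.09}{0.02} = 4.5 \text{ lessons.}$$

In other words, the nudge shifts students by roughly one fiftieth of the typical spread in lesson completion.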
What the study really shows is how difficult it is to change teachers’ behaviour through light-touch interventions. That, in itself, is an important message. Researchers, platform staff, and teachers alike predicted effects that were 30 times larger than those actually observed. This may be one of the most interesting findings of the entire study. Our intuitions about what will work are systematically too optimistic.
There is also a second crucial question: what exactly was measured? Not mathematics achievement in any conventional sense, but the number of completed lessons on one specific online platform. More activity on Zearn Math does not automatically translate into deeper understanding, let alone durable learning gains. The authors explicitly acknowledge the distinction between performance and learning. Yet the title still suggests that this is about “achievement”. That is, at the very least, debatable.
Moreover, this intervention barely touches teaching itself. It consists of emails encouraging teachers to log into a dashboard more often. The causal chain is long: email, login, attention, and a slight increase in platform use by students. This is not a pedagogical innovation, but an administrative nudge. It is difficult to see why anyone would expect such a mechanism to have a large educational impact.
One finding is worth noting. Emails that contained personalised information about a teacher’s own class worked slightly better than generic messages. When teachers receive concrete, relevant data, they respond a little more. This fits well with what we already know about feedback and motivation: specificity matters. But again, the size of the effect remains modest, roughly a 2% increase in completed lessons.
What also stands out is how strongly scale and meaning diverge. With such a huge sample, even a tiny effect becomes statistically significant. But from a policy and pedagogical perspective, the question remains: is this worth it? Do we really want to invest in systems that send millions of emails in order to generate, on average, 0.06 extra lessons per student per month?
Interestingly, the authors themselves are much more cautious in their discussion than the title suggests. They speak of small effects, the need for further research, possible fade-out over time, and the fact that there is no miracle solution. The problem lies less in the analysis than in the framing. “Boost student math achievement” sounds as if something substantial has been achieved, while in reality, this is a marginal shift in platform usage.
This raises a broader question about how we think about educational improvement. Studies like this risk reinforcing the idea that we can address complex learning problems through behavioural micro-interventions. As if a cleverly worded email matters more than curriculum design, instructional quality, teacher education or time for preparation. Ironically, this study suggests the opposite. Even with enormous scale and careful design, the gains remain very small.
If you read the article without a marketing lens, something valuable remains. Not that emails improve mathematics education, but that education does not lend itself easily to improvement through emails. That is a far more honest and interesting conclusion than the title implies. You just have to work a bit to extract it.
My main objection, therefore, is not to the study itself, but to how it is presented. This is solid, large-scale and transparent research about a very small effect. The title frames it as a breakthrough. That is unfortunate. Anyone who truly believes in evidence-informed education should also believe in the precision of language. Saying what something is, rather than what we wish it were.