There is no shortage of opinions about AI in education at the moment. New tools, new promises, new concerns. And increasingly, new studies. This past weekend in Tokyo I also had many conversations about the subject (thank you Carl, Barb, and the infamous many others).
But if you take a step back, something odd appears. We keep asking the same question: Does it work? That sounds reasonable. It’s also the wrong question. Or at least: it’s a question that sits at the wrong level.
We’ve seen a similar argument before: that much of the debate around ChatGPT in education is confused at its core, because we treat the tool as if it were the intervention itself (What if we’re asking the wrong question about ChatGPT in education?).
Take the growing body of research on AI in classrooms. Much of it still focuses on outcomes. Test scores, task performance, and sometimes motivation. Useful, but often limited. Effects are small, context-dependent, and not always where you would expect them to be. In that sense, they resemble many other educational interventions.
But when you look a bit closer, a different line of work starts to emerge. Not asking whether AI “works”, but what it actually does to the work of teachers. A recent paper I discussed earlier moves exactly in that direction: instead of comparing outcomes, it examines how tasks shift, what gets automated, and what becomes more central in the process (What does AI do to teachers’ work?).
That shift in perspective matters. Once you start looking at tasks rather than outcomes, the conversation changes. It becomes less about replacing teachers and more about reconfiguring what teaching actually consists of.
At the same time, a parallel discussion is taking place outside academia. One that is often louder and less nuanced. Concerns about cheating. About students outsourcing their thinking. Or, on the other side, strong claims that not using AI will soon be irresponsible, and that we should urgently teach every student to use these tools.
Both sides tend to oversimplify.
We have seen similar dynamics before. In a recent post on AI-generated study advice, I argued that even when the tips look correct on the surface, they can miss something essential: learning is not just about what you do, but how and when you do it (Google published AI study tips… That’s exactly the problem).
The same applies at a larger scale. Whether it is AI colleges or more radical visions of fully automated learning environments, the risk is not only that they might not work. It is that they reduce education to something narrower than it actually is.
And this is where things become slightly uncomfortable. Because even without AI, we already tend to misunderstand how learning works. We often assume that more engaging explanations automatically lead to better learning. But multimedia, for instance, does not simply “add” learning. It helps under specific conditions, and can just as easily overload learners if used poorly (Check also: Multimedia works. But not always the way we think).
Trust me: even without AI, learning is already less intuitive than we think. Add AI to that mix, and the risk is not just that we get things wrong. It is that we get them wrong faster and with more confidence.
There is another layer to this as well. One that tends to get even less attention. Not every “good” intervention is good for everyone. Something as simple as reading at home can widen gaps between students, precisely because not everyone has the same access to time, resources or support. What helps everyone on average may advantage the already advantaged even more (How something good can increase inequality).
And the same logic applies to schools and systems. Differences in outcomes are rarely explained by a single factor, and almost never without context.
So when we introduce AI, the question is not just whether it works.
It is: for whom does it work? Under which conditions? And what does it change in the process?
There is a final piece that often gets lost in these discussions. Learning is not a perfectly stable process. Performance fluctuates. Motivation shifts. Context matters more than we would like to admit.
And while not everything has to be fun, the experience of learning still matters in ways that are hard to reduce to efficiency or output (Not everything has to be fun. But joy in learning still matters).
Which brings us back to AI. Maybe the real question is not whether students should use it or whether teachers should allow it. Maybe the more useful question is whether we understand learning well enough to decide when it helps, when it doesn’t, and what we might lose along the way.
Because if we don’t, AI will not fix that.
It will simply scale our misunderstandings.