AI in education: what 800 studies do (and don’t) tell us

There is currently no shortage of opinions on AI in education. What remains scarce, however, is solid evidence. That makes the recent report, *The Evidence Base on AI in K-12: A 2026 Review*, interesting. Not because it brings spectacular conclusions, but precisely because it does not. It exposes just how thin the real knowledge base still is, while simultaneously showing which patterns are beginning to emerge.

Let us start with the most uncomfortable point. There are now more than 800 studies on AI in education, but only about twenty provide strong causal evidence. This means that in most cases, we do not know what AI *really* causes, only what is associated with it. And even those associations deserve caution: many studies are short, take place in artificial settings, and measure only immediate effects.

At the same time, even with those limitations, a fairly consistent pattern emerges. When students have access to AI, they perform better. This applies to mathematics, programming, and writing. So, in the moment, it works. But as soon as that support is removed, the picture becomes much less clear. Sometimes something sticks, but usually not. The difference between performing with a tool and learning without it becomes painfully clear here.

That comes as no surprise when viewed from the perspective of what we already know about learning. AI reduces cognitive load. Tasks feel easier, and students experience less friction and often more motivation. However, that very friction is sometimes necessary for learning. What Bjork once called “desirable difficulties” partially disappear when a system takes over the thinking. Easier is not automatically better.

Things become more interesting when you look at how the AI is designed. Tools that simply provide answers appear less effective than systems that steer, ask questions, or provide step-by-step guidance. In other words, AI as a cheat sheet works differently from AI as a tutor. This aligns nicely with classic insights such as the zone of proximal development. Support works best when it provides just enough, and no more.

For teachers, the story is different. There, the effects are remarkably more consistent. AI can save time, especially in lesson preparation and feedback. In one study, it amounted to about half an hour less work per week, without loss of quality. That may sound modest, but in an overloaded profession it is no small thing.

In addition, there are indications that AI can enhance instructional quality. Systems that provide real-time suggestions or analyse feedback help teachers ask more targeted questions or respond better to students. Notably, the benefits are often greater for less experienced teachers. This opens up interesting perspectives regarding professional development and inequality between schools.

But caution is needed here as well. Saving time does not automatically mean less work. In some studies, teachers simply use the freed-up time to provide more, or more in-depth, feedback. In that case, efficiency does not translate into fewer hours, but into a different use of those hours.

What remains strikingly, though perhaps understandably, absent from the research so far are long-term effects. We know little about how AI affects deep learning, metacognition, or independence. We know even less about its effects on well-being, social development, or inequality. Yet these are precisely the questions that arise most frequently in the debate.

In my opinion, this is where the report’s most important message lies. Not whether AI works or not, but that the answer will always depend on how, when, and for whom. The current evidence is too narrow to draw major conclusions, but broad enough to puncture several illusions.