What AI does better than teachers. And especially what it doesn’t.

One of the things teachers often hope for when it comes to AI in education is that it might finally reduce the time they spend providing feedback on writing assignments. Good feedback is one of the most powerful ways to promote learning, yet it also remains one of the most time-consuming parts of a teacher’s job.

A new systematic review of 34 intervention studies from 19 countries suggests that this promise is not entirely unfounded. Large language models, most commonly versions of ChatGPT in the reviewed studies, are surprisingly good at providing feedback on student writing. At the same time, the review makes it very clear where their limits lie.

What does AI excel at?

Virtually all of the systems reviewed provide rapid feedback on grammar, spelling, word choice, style and structure. Grammarly, for example, also helped me with some of the wording while writing this post. As a result, students can revise their work more quickly and are more likely to go through multiple rounds of revision. Several studies also report higher levels of engagement and improvements in what researchers call feedback literacy: students become better at understanding and using feedback on their writing. That strikes me as a meaningful gain.

The findings on writing quality are also cautiously encouraging. Across several interventions, students produced better texts after using AI-generated feedback.

When does AI fall short?

Quite simply, once feedback becomes more complex. The review found that human teachers still outperform AI in higher-order aspects of writing. Think about the quality of an argument, the coherence of a line of reasoning, deciding which issues deserve attention first, or tailoring feedback to an individual student’s needs and learning process.

Frankly, that seems entirely logical. A language model primarily sees a text. A teacher, hopefully, also sees the student behind the text.

Interestingly, the authors interpret their findings through the well-known feedback model developed by Hattie and Timperley. AI performs particularly well at answering the question “How am I doing?” (feed back): identifying mistakes, highlighting weaknesses and suggesting improvements.

It struggles much more with the question “What should I do next?” (feed forward) and with feedback aimed at self-regulation, that is, helping students become better at improving their own work. Those remain areas where teachers currently have a clear advantage.

Better together?

Perhaps the most interesting conclusion is that the strongest results do not come from students simply using ChatGPT on their own. They come from integrating AI feedback into good teaching. Studies in which teachers discussed AI-generated feedback, helped students interpret it, or combined it with their own comments generally reported better learning outcomes than studies where students worked independently with AI.

There is, however, an important caveat. Almost all of the studies included in this review were conducted in higher education. Evidence from primary and secondary education remains scarce. Many of the studies were also relatively small and used a wide range of research designs. The findings are therefore promising rather than definitive, especially given how quickly the technology itself continues to evolve.

Even so, I read this as good news going into the summer holidays. AI is unlikely to replace teachers any time soon, certainly not when it comes to providing meaningful feedback. If anything, it could strengthen their work.

Let AI handle the first round of feedback on spelling, grammar and structure. That leaves teachers with more time for the things that are far harder to automate: helping students think more deeply, build stronger arguments, make better choices and become more independent learners.
