Are Student Evaluations of Teaching Still Biased Against Women?

When we wrote More Urban Myths About Learning and Education, one of the chapters discussed whether male and female teachers are judged differently. We knew then that female teachers often receive less recognition than their male colleagues. This is not because they teach less effectively, but because the lens through which students (or pupils) evaluate their teachers is rarely neutral.

A new review study by Edgar Valencia and colleagues (2025) looks precisely at that question: What Have We Learned About the Instructor’s Gender Effect on Student Evaluation of Teaching?

The researchers analysed 21 experimental and quasi-experimental studies published since 2000. These studies examined whether student evaluations of teaching (the well-known SET scores) are affected by the instructor’s gender. The answer is nuanced but clear: yes, they usually are, and mainly to the disadvantage of women.

That finding itself isn’t new. We’ve known for some time that female lecturers tend to receive slightly lower evaluations than their male counterparts, even when their teaching performance is identical. What makes this review valuable is that the authors didn’t just ask whether bias exists, but why.

What students (unconsciously) expect

The review identifies three main explanations.

  • The first involves gender stereotypes: students often expect men to be competent and brilliant, and women to be warm and caring. When teachers don’t match those expectations, their ratings suffer.
  • The second is the expectancy violation theory: when teachers behave differently from what is considered “appropriate” for their gender role, they are penalised.
  • The third, role congruity theory, suggests that male strictness is perceived as professional, while female strictness is interpreted as cold or unfriendly.

Across the 21 studies, seven found strong evidence of gender bias, eleven found partial evidence, and three found no significant difference. Yet wherever an effect appeared, it pointed in the same direction: female instructors scored lower on average. The differences were small but systematic. In an academic system where these scores often influence promotion and career progression, small differences can add up.

Of the three explanations, the stereotype account proved the most consistent and best supported. Most studies show that expectations around competence and brilliance systematically favour men. The other two theories help explain when and why the bias becomes stronger: it appears more pronounced, for instance, in disciplines that value authority or abstract reasoning.

The numbers don’t lie, do they?

Valencia and colleagues also pay close attention to the quality of the research designs. Measuring bias is tricky. Some of the strongest experiments use fictitious names or voices to isolate the effect, but those studies inevitably lack realism. Even so, the overall pattern is robust enough to support one clear conclusion: the evaluations we use to measure teaching quality “objectively” are not always objective at all.

Anyone who wants fair assessments of teaching quality needs to look beyond the numbers. This is not only because numbers rarely capture what matters most about learning, but also because they sometimes reveal more about our biases than about the teachers themselves.
