
In the fairly recent past, teaching evaluations have become obligatory in many universities and departments in Germany. I have also heard that these evaluations are sometimes used in formal ways to decide on salary raises or even hiring. To me it is pretty clear that a purely formal use of teaching evaluations is not a good idea, but my question goes in a different direction. From time to time I read in newspapers and other media that a side effect of these teaching evaluations is that

"the solicitous professor who aims at good teaching evaluations usually lowers the level in class and gives good grades in general."

I myself have never done it that way, and I have the feeling that an "easy" course in which everyone scores good grades but does not learn very much will not be appreciated by the students.

My question is:

Do you know of any evidence for the claim that teaching evaluations encourage professors to lower standards and give better grades?

Note that I am not trying to argue that the formal use of teaching evaluations to decide about hiring or salary is a good thing; I am just not sure that "lowering standards and pampering students" is a major side effect of teaching evaluations.

  • 7
    Very nice question. Not an answer, just an observation: students tend not to be sheep; they value being challenged, while at the same time preferring to be treated in a tough but fair way. At least at the places where I was fortunate to study or work.
    – walkmanyi
    Commented Apr 4, 2013 at 21:34
  • 1
    @walkmanyi Good point! A recent German source is this Spiegel article spiegel.de/unispiegel/studium/… where one reads (roughly translated): "Moreover, proof of teaching ability is required, which is provided through the dubious instrument of student surveys. The strategically clever mid-level academic therefore tends toward infotainment and a lowering of standards."
    – Dirk
    Commented Apr 4, 2013 at 21:40
  • 2
    I think I have read similar claims in the "Forschung und Lehre" magazine of the DHV (www.forschung-und-lehre.de) and will post them if I find anything.
    – Dirk
    Commented Apr 4, 2013 at 21:42
  • 2
    +1 for asking for actual evidence. I only know personal examples of the described behaviour, as well as counterexamples. It is difficult to link these to teaching evaluations, though, because I have known university rankings to trigger similar behaviour. In my experience, students also have a very keen perception of who is in which category. So do colleagues.
    – cbeleites
    Commented Apr 6, 2013 at 13:25
  • 1
    @BenCrowell Interesting to hear! Where I am, however, all these websites play no role. I have taught courses for about ten years now and do not have any evaluations online anywhere.
    – Dirk
    Commented Jun 9, 2014 at 20:22

4 Answers


Jacob and Levitt have an article in the Quarterly Journal of Economics that looks at teacher cheating in public schools, driven by compensation tied to class performance. They find that teachers will do things to help their students get higher grades if it affects their compensation.

Rotten Apples: An Investigation of the Prevalence and Predictors of Teacher Cheating. Quarterly Journal of Economics. 2003

An article by Nelson and Lynch looks at the relationship between grade inflation and teaching evaluations, suggesting that professors buy better teaching evaluations with grades.

Grade Inflation, Real Income, Simultaneity, and Teaching Evaluations. The Journal of Economic Education. 1984.


It depends on what they are evaluating, and how.

I studied at a university in a mess of a country that was recovering from a period of war. The educational system was not just depressingly dated, it was also falling apart at the seams. Enthusiasts were trying to reform the system, and one of the bigger pushes in the right direction was achieved through course evaluations. This evaluation had questions such as these:

  • How often does the lecturer show up for class?
  • Does each lesson have a clear topic?
  • Is it clear which parts of the printed course materials are covered in which lecture?
  • Were all the exam questions linked to some printed course material?
  • Does the lecturer answer students' questions?
  • Is the lecturer available to students at any point outside the lectures?
  • Does the lecturer use e-mail to correspond with students?
  • Do you feel that the lecturer treated you unfairly at some point? How so?
  • Do you feel that the lecturer engages in any problematic behaviors during class? Please describe.
  • Did the lecturer ask you for any favors in return for a higher grade?
  • What are, in your opinion, the good aspects of this course?
  • What are the bad aspects?

...etc.

There were more questions - many were about lecturing style, for example; these are just off the top of my head. Now, this evaluation made lecturers begin to come to class, made them finally pick textbooks, forced them to pick a topic for every lesson (rather than just rambling on), and forced them to tell students which part of the book corresponds to which lecture so that students could read the materials in parallel. It also rapidly cut down on truly problematic behaviors such as smoking in class. Furthermore, it helped lecturers improve their performance by providing feedback on the strong and weak points of the course, at least as students saw them. Here, I think the evaluations very clearly helped improve standards in class, especially in truly problematic departments. The reason they helped was twofold: (1) there was a lot of room for improvement, and (2) the questions were well thought out, i.e. each question was linked to a particular goal in the educational reform.

I've also studied at a wonderful, well-organized university where most of these questions would be completely ridiculous. There, the evaluations had questions such as:

  • How many hours per week did you study for this course?
  • How important would you say this course is for your overall academic development?
  • Would you say this course was easy, just right, or difficult in terms of content?
  • Do you think the lecturers evaluate students' knowledge fairly?

...etc.

I honestly have no clue what is gained by such an evaluation, and I hope nobody's salary depends on it. With the right (i.e. wrong) questions, I'm sure you could lower teaching standards by giving a financial incentive to score well. The question, then, boils down to what the evaluation sheets look like. To the best of my knowledge, these are not standardized across universities, so the results may vary a lot.

  • 16
    I hope nobody's salary depends on it — Sigh. Not only individual salaries, but entire department budgets.
    – JeffE
    Commented Apr 5, 2013 at 21:16
  • Thanks for the interesting story and for sharing your insight, although I was more interested in concrete examples in which somebody really lowered standards or turned to infotainment to gain better evaluations...
    – Dirk
    Commented Apr 6, 2013 at 15:43
  • 1
    @Dirk - I hope somebody will share that sort of data as well. I just wanted to defend the concept of evaluations, because I think it's a good idea gone bad. My impression is that it goes bad when the main goal is to measure some vague concept of student satisfaction instead of course quality. There, the student is seen as a customer, and the teaching process as an economic exchange. While this is partly true, it completely ignores other aspects of education (social, cultural, knowledge as a goal in itself, etc).
    – Ana
    Commented Apr 7, 2013 at 12:45

Grade inflation has been an issue in the US since the mid-1970s, so welcome to the club. See endgradeinflation.org. None of the attempts to curb it have been successful so far; the practice of student evaluations is deeply rooted in US colleges and cannot be easily changed.

The uphill battle against grade inflation has been spearheaded by the University of North Carolina at Chapel Hill, one of the top five large public US universities. They put a rather extensive research effort into figuring out the patterns of grade inflation. The cause, as you observed, is what economists call a market failure: the self-interested actions of the individual players lead to outcomes that are worse for everybody. The employers of the graduates, and the graduate programs they apply to, suffer the most, as they cannot distinguish good students from bad ones. Organizations and student societies that rely solely on GPA (grade point average) discover great differences between disciplines: the humanities end of the spectrum has been hit the hardest by grade inflation, while engineering and the sciences, which have more specific assessment and evaluation criteria, tend to produce lower grades. The opening page of this 2000 report provides a specific figure to answer your question: roughly a 15% increase in student evaluations is associated with a 1 standard deviation increase in the course average grade. That standard deviation was 0.4 on the American scale, which runs from 0 to 4; at the time the report was written, the average GPA at UNC was 3.18.

In the mid-2000s, UNC came up with the idea of an effective grade, called the achievement index. In very simplistic terms, it normalizes each class to have the same GPA. Each student is mapped onto the percentile implied by his grade in a given class, relative to the distribution of grades in that class; the percentiles across all the classes a student took are then aggregated; and the student's ultimate achievement GPA is reported based on a normative judgement of what the university wants the average GPA and the range of grades to be. The idea is grounded in item-response theory, or can alternatively be explained in Bayesian terms (as a maximum a posteriori estimate of student ability). As you can imagine, this caused student unrest of a kind UNC had not seen since the civil rights movement of the 1960s (o tempora, o mores... how petty the motives are these days), so the faculty chickened out and ruled against it.
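
To make the normalization concrete, here is a minimal Python sketch of the percentile idea. The grade data, the midrank handling of ties, and the normal-curve mapping back onto a GPA scale are my own illustrative assumptions; UNC's actual proposal was built on an item-response-theory model, not this simple recipe.

    from statistics import NormalDist

    # Hypothetical raw grades per class, on the American 0-4 scale.
    grades = {
        "easy_class": {"alice": 4.0, "bob": 3.7, "carol": 3.7, "dave": 3.3},
        "hard_class": {"alice": 3.0, "bob": 2.3, "erin": 3.3, "frank": 2.0},
    }

    def percentile_in_class(student, class_grades):
        # Midrank percentile of the student's grade within one class.
        g = class_grades[student]
        all_grades = list(class_grades.values())
        below = sum(1 for x in all_grades if x < g)
        ties = sum(1 for x in all_grades if x == g)  # includes the student
        return (below + 0.5 * ties) / len(all_grades)

    def achievement_gpa(student, all_classes, target_mean=3.0, target_sd=0.4):
        # Average the student's per-class percentiles, then map the result
        # onto the normative grade distribution the university wants to see.
        pcts = [percentile_in_class(student, g)
                for g in all_classes.values() if student in g]
        avg_pct = sum(pcts) / len(pcts)
        # Clamp away from 0 and 1, then invert the normal CDF to place
        # the student on the desired GPA curve.
        z = NormalDist().inv_cdf(min(max(avg_pct, 0.01), 0.99))
        return max(0.0, min(4.0, target_mean + target_sd * z))

    for s in ("alice", "bob"):
        print(s, round(achievement_gpa(s, grades), 2))

On this toy data, alice places high in both the easy and the harshly graded class and comes out with an achievement GPA of about 3.3, while bob, who sits mid-pack everywhere, ends up slightly below the 3.0 target mean despite a nominal GPA of exactly 3.0; rewarding the former pattern over the latter is the point of the index.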

Still, UNC has found a way to put grades into context by augmenting the transcript with the average GPA of the other students who took that particular class, the student's percentile within each class, and the "schedule point average", i.e. the average GPA of all the students in the classes that the student took. The above link shows a clear picture of somebody who had a nominal GPA of 3.6, way above the classmates' average GPA of 3.0, consistently performing above the median (7 grades above the median, 5 at the median, 0 below), versus somebody who was only able to achieve a GPA of 2.5 in easier classes with an average GPA of 3.2 (1 grade above the median, 3 at the median, 9 below).
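
Continuing the sketch above (it reuses the `grades` dict), the transcript-context numbers could be computed along these lines; the arithmetic behind "schedule point average" is my reading of the description, not UNC's published formula.

    from statistics import mean, median

    def transcript_context(student, all_classes):
        taken = [g for g in all_classes.values() if student in g]
        # "Schedule point average": mean grade of ALL students enrolled
        # in the classes this student took.
        spa = mean(x for g in taken for x in g.values())
        above = sum(1 for g in taken if g[student] > median(g.values()))
        at = sum(1 for g in taken if g[student] == median(g.values()))
        below = len(taken) - above - at
        return spa, (above, at, below)

    spa, (above, at, below) = transcript_context("alice", grades)
    print(f"SPA {spa:.2f}; {above} above / {at} at / {below} below median")

A reader of the augmented transcript can then set the student's nominal GPA against the schedule point average and the median counts, exactly as in the 3.6-vs-3.0 and 2.5-vs-3.2 examples above.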

The dramatic timeline (if you know how to read between the lines... I grew up in the Soviet Union and have this unfortunate skill) of UNC's attempts to deal with grade inflation is available here. Some other institutions are likely to use these or similar ideas, including another high-profile public school, Berkeley. (The administrators' claim that the university's computer system cannot handle the additional evaluation method is ridiculous; I could do these numbers on my laptop.)

  • 4
    The endgradeinflation.org site is pretty awful. For example, their front page seriously misrepresents the evidence presented in the book by Arum and Roksa; the page claims that "45% of undergraduates do not improve their academic skills during the first two years of college. By graduation, 36% have not learned anything." This is total nonsense and is not what A&R's evidence shows. The A&R book relies strongly on a standardized test of critical thinking skills; it isn't a test of "academic skills" or whether students "learned anything."
    – user1482
    Commented Jun 9, 2014 at 19:53

I can offer only personal experience on this topic. However, I can say that I read the literature on course evaluations rather extensively in preparation for a past application for promotion. What I found was that there are passionate people on both sides of this debate. Some think course evaluations are the best thing since ice cream, while others believe they are responsible for grade inflation and an overall lowering of standards. Based on my own experience, I tend to side with the latter group. I have been at the same institution for almost thirty years, and early in my career I enjoyed very good course evaluations. After about ten to fifteen years, I noticed that my evaluation scores began to erode. So, I started making it a bit easier for the students to get good grades, though nothing I felt uncomfortable with. My evaluation scores shot up noticeably.

In recent years, the quality of our incoming students has slipped, and so have my evaluation scores once again. But this time, I do not feel I can make any more concessions to the students, at least not if I want to retain the integrity of my course (and myself). At this point I have a good deal of job security, so I can hold my ground, even though students and administrators probably wish I wouldn't. Someone in a less secure position could face a serious moral or ethical dilemma in this situation. It is easy to see how grade inflation can happen.

In the U.S., we face the same problem with standardized testing. So much is at stake with these tests for high school students and teachers that the whole process has devolved into teaching to the test as opposed to teaching for understanding. In my opinion, it will take a vocal effort by major public institutions, and even private ones, to make any headway against teaching evaluations at the college level. That is not to say that professors and teachers should not be held accountable for the conduct of their courses. Evaluation is necessary. The devil is in the details of finding the best way to do it. I don't think the current way is the right way.

According to some writers, peer evaluations are an even worse tool than student evaluations. StasK has written a great answer. Pay particular attention to the remark that administrators don't believe the university computing system could handle the load. Administrators are quick to cite some technical limitation as to why they cannot do something. They seem to forget that they are talking to an audience that contains experts who know that their arguments don't hold water.

