I don’t have the time to write a five-thousand-word essay on the use of student evaluations at Williams. So, why don’t the rest of you do it!? We have many faculty readers with firsthand experience, both with respect to their own evaluations and with the use of evaluations in promotion and tenure decisions. Please, educate us. I will start with some quotes and commentary. (Related topics include grade inflation.)

Ken Thomas ’93 asks:

For statisticians and all: given the sample size, what’s the p factor (statistical relevance or reliability) for student evaluation forms? Would any quantitative social scientist– much less “hard” scientist– accept them as having any “value” at all?

Yes. Although I have never worked with these data myself, I have read some of the literature and talked with those who do. There is a great deal of “reliability” in the data, meaning that professors who have good ratings this year will tend to have good ratings next year. Imagine that you ranked the 200 faculty at Williams by their average student evaluation from 2004–2008. Compare that ranking with one created from the data for 2009–2013. The two sets of ranks would be highly correlated. (I don’t have a good sense of the magnitude, but I would be surprised if the correlation were anything less than 0.5, and it is probably more like 0.8.) I think that this is especially true at the tails of the distribution. Look at the 20 professors with the highest ratings in the past. They will be highly rated going forward as well.
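For readers who want to see what the ranking comparison described above would look like in practice, here is a rough sketch in Python. Everything in it is an assumption for illustration: I have no access to the actual evaluation data, so the scores are simulated as a persistent "quality" term plus period-specific noise, and the function names (`simulate`, `spearman`, `top_k_overlap`) and the noise level are made up. The point is only to show how a rank correlation and the overlap among the top 20 would be computed, not to estimate the real numbers.

```python
import random


def ranks(values):
    """Return 1-based ranks (1 = highest value); ties are broken by position."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    result = [0] * len(values)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result


def pearson(x, y):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the two rankings."""
    return pearson(ranks(x), ranks(y))


def top_k_overlap(x, y, k):
    """Count how many of the top-k items in x are also in the top k of y."""
    rank_x, rank_y = ranks(x), ranks(y)
    top_x = {i for i, r in enumerate(rank_x) if r <= k}
    top_y = {i for i, r in enumerate(rank_y) if r <= k}
    return len(top_x & top_y)


def simulate(n_faculty=200, noise_sd=0.5, seed=0):
    """Simulated (NOT real) average ratings for two periods.

    Each professor gets a fixed hypothetical 'quality'; each period adds
    independent noise. noise_sd is an arbitrary assumption.
    """
    rng = random.Random(seed)
    quality = [rng.gauss(0, 1) for _ in range(n_faculty)]
    period_1 = [q + rng.gauss(0, noise_sd) for q in quality]
    period_2 = [q + rng.gauss(0, noise_sd) for q in quality]
    return period_1, period_2


if __name__ == "__main__":
    early, late = simulate()
    print("rank correlation:", round(spearman(early, late), 2))
    print("top-20 overlap:", top_k_overlap(early, late, 20))
```

With real data, `early` and `late` would simply be each professor's average evaluation in the two periods, in the same order.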

Professor James McAllister writes:

Just a very brief note about the general issue of grades and student evaluations since the general level of ignorance about this issue is high (I have absolutely no info or knowledge about any of the particular cases being discussed in the NY Times article). This idea that someone can get tenure and great student evaluations by giving out high grades is totally false. At Williams all students are asked to give their “expected grade” and indicate the level of difficulty of the class and how hard they worked in the class. A professor who received glowing recommendations from students who all expected to receive A’s in the class would be very suspect and told to toughen up standards. A professor who received bad teaching evaluations but who had students who all expected to receive C’s and felt they were overworked would be praised for his/her high standards and would be cut much slack for his/her grading practices. The idea that grades/workload are not considered in relation to teaching scores is simply a myth. More importantly, it is well established that Williams students do not reward professors who give out easy grades and assign little or no work; those are exactly the profs who often do badly on their SCS forms.

One final comment; to borrow what Winston Churchill once said of democracy, student teaching evaluations are the worst form of evaluation except for all of the others. Should assessments of teaching be made by one’s colleagues who never set foot in your classroom (or visit once), or by the hundreds of students who take these classes over a period of 5-6 years? I would rather be judged by hundreds of students than a handful of colleagues who are far more likely–repeat far more likely–to be guided by political ideology and petty considerations than students who have no major stake in the matter.

Many thanks to Professor McAllister for taking the time to educate the rest of us on this important topic. All of the above is true, but I would emphasize a few points:

1) “This idea that someone can get tenure and great student evaluations by giving out high grades is totally false.” is a bit of a straw Eph. No one believes that. Student evaluations depend on many things besides grades. (Junior professors should see Laura for some good advice on how to improve their evaluations.) The experiment we want to run is: Have Professor McAllister teach two sections of the same class with students randomly assigned. In one, he centers the grade distribution around C+. In the other, he centers it around A-. Does he get the same student evaluations in both classes? I doubt it!
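If such an experiment were ever run, one simple way to compare the two sections would be a permutation test on the difference in mean evaluation scores. The sketch below is only an illustration under stated assumptions: the function name `permutation_test` is mine, the scores in the usage example are invented on a hypothetical 1-to-7 scale, and a real analysis would surely need more care (more sections, more instructors, and so on).

```python
import random


def mean(values):
    return sum(values) / len(values)


def permutation_test(section_a, section_b, n_permutations=10000, seed=0):
    """Two-sided permutation test for a difference in mean scores.

    Returns (observed difference in means, approximate p-value).
    Because students were randomly assigned, shuffling the section labels
    shows what differences would arise by chance alone.
    """
    rng = random.Random(seed)
    observed = mean(section_a) - mean(section_b)
    pooled = list(section_a) + list(section_b)
    n_a = len(section_a)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if abs(diff) >= abs(observed):
            at_least_as_extreme += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    p_value = (at_least_as_extreme + 1) / (n_permutations + 1)
    return observed, p_value


if __name__ == "__main__":
    # Invented scores on a hypothetical 1-to-7 scale, NOT real data.
    easy_grading_section = [6, 7, 6, 5, 7, 6, 6, 7, 5, 6]
    hard_grading_section = [5, 6, 4, 5, 6, 5, 4, 6, 5, 5]
    diff, p = permutation_test(easy_grading_section, hard_grading_section)
    print("difference in means:", diff)
    print("approximate p-value:", p)
```

A small p-value would suggest that the gap between the sections is larger than random assignment alone would usually produce, which is the causal question at issue here.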

We want to know the causal effect of grades (and other factors) on student evaluations, holding all else constant. Harry Brighouse notes:

Read Valen Johnson’s book Grade Inflation. It has a couple of chapters on evaluations, and all sorts of tips on how to get better ones. Unfortunately, these tips will not make you a better teacher, because evaluations don’t measure that. One thing that people frequently do is give the students candy on the day of the evaluation. This has a significant effect. When I tell students this, they are shocked.

They shouldn’t be.

2) “The idea that grades/workload are not considered in relation to teaching scores is simply a myth.” Good. As always, I have a great deal of faith in the Williams faculty on issues like this. They are smart, thoughtful and experienced educators. I am certain that they use student evaluations in a balanced way, taking account of all the relevant information.

3) It would be nice if student evaluation data were made public for tenured faculty at Williams. (I can understand all sorts of arguments for why doing so is unfair for untenured faculty.) For starters, what possible harm is there in making the numeric data public? The more information that students have, the better their course choices will be. Seeing written comments would also be useful. (I could imagine giving professors/departments the right to remove mean/hurtful/unfair comments. In that case, we would just know that a student’s comment had been deleted.) Honest/brave/popular professors would make all their comments public. A good way to start this process would be for someone like Professor McAllister to make all his student evaluations public. Would the College allow him to do that if he requested it?

Other comments welcome.

