How do student evaluations survive?

by John Q on August 4, 2019

Among the few replicable findings from research on higher education, one of the most notable is that student evaluations of teaching are both useless as measures of the extent to which students have learned anything and systematically biased against women and people of color. As this story says, reliance on these measures could lead to lawsuits.

But why hasn’t this already happened? The facts have been known for years, and potential cases arise every time these evaluations are used in hiring or promotion: arguably every time the data is collected. And student evaluations are particularly popular in the US, where litigation is the national sport. Yet no lawsuits have taken place, AFAICT.

Maybe the zeitgeist is changing. I was going to write this post before seeing the linked article, which turned up in my Google search. Any lawyers or potential litigants want to comment?

{ 40 comments }

1 Matt 08.04.19 at 10:57 am: I am sympathetic, but would add, first, that the low response rate to many of these evaluations makes them especially problematic. At my current university, response rates are regularly around 30%, so it is pretty difficult to get a good idea of what the students thought of the courses, even if we leave the other issues aside.
Second, I’d challenge the idea that the surveys are “particularly popular in the US”, at least as opposed to, say, Australia. Two of my colleagues here are finishing an article arguing that the use of student evaluations violates state and federal Australian anti-discrimination law (for reasons like those noted above), and their study shows that the use of such evaluations for hiring, retention, and promotion is extremely common in Australia, at least as common as in the US. Additionally, the federal government in Australia runs its own student evaluation system, the Student Experience Survey, which is used as part of a scheme for deciding how funds will be distributed to Australian universities. So, such surveys are stitched into the fabric of Australian higher ed from top to bottom. It would be good if some people would bring legal challenges here, too!
2 Murali 08.04.19 at 12:08 pm: Student evaluations are also popular in the UK and in Singapore. I wouldn’t say that they are popular in every university, but they seem par for the course in all anglophone universities.
3 Quite Likely 08.04.19 at 1:09 pm: Not sure if this is exactly what you’re talking about, but as a student I found reading past student evaluations to be the most useful tool in picking new classes.
4 harry b 08.04.19 at 1:56 pm: Cynical take: they survive exactly because they are useless. Faculty are determined to resist any serious evaluation of their teaching, and it suits them to have a mechanism that they can endlessly complain about. It would be quite easy for the professional associations to get rid of them: develop measures of learning and propose that they be used alongside observational evaluations (and have serious training for the evaluators in the use of observational protocols, etc). But… then we’d actually be talking about teaching and learning.
5 hix 08.04.19 at 1:56 pm: Permanent evaluation is a national pastime in the US as well; it’s going to stay, and it has already infested the rest of the world. Think of all those 360 feedbacks. It’s part of making an obedient workforce that puts on the desired public appearance, which is probably all that counts in the sectors where those permanent evaluations are most popular anyway.
One anecdotal impression is that the young Profs “like” those permanent evaluations as well. By “like” I mean that they got used to permanent 360 feedback in their consultancy or similar former jobs, so they already feel the course evaluation forms provide far too little feedback. Not that this results in a particularly healthy attitude towards the evaluation forms. It’s just that they consider permanent evaluation by everyone normal.
6 Cervantes 08.04.19 at 2:57 pm: What you don’t mention in the post, but which is clearly stated in the NYT article and very well known to all of us, is that student evaluations principally reward instructors who assign little work and grade easily. They are in fact a major driver of grade inflation.
7 lioness 08.04.19 at 3:01 pm: Very good points. Although not the same point, I’ve been wondering about this from a different angle. Student evaluations are not only a mechanism for discrimination from institutions, but they are also a formal and approved way for students (especially male, especially white) to harass female teachers and teachers of color. In a way that if it were to happen outside the formal student evaluation process would almost certainly not be tolerated (I hope!).

Just yesterday, my partner received an email from her employer (she was teaching in a summer language/history program in the classics) with a copy of the teaching evaluations for the summer. The email came with a warning — there are several examples of outright abusive language about her teaching (did I mention she’s a highly successful and committed teacher, one who just happens to teach about patriarchy and white supremacy in the ancient world (as well as in the current classics community)?). That is, this is more than scoring lower on likert scales about teaching effectiveness. This is about being called names and abused, and on top of it all, it still is necessary to read the evaluations so that you can ‘respond’ to the criticisms and include it in your self-evaluation when up for another job or reappointment (and as an adjunct, she’s always looking for the next job…).

The employer, while “sensitive” enough to warn her in advance about what’s coming, nevertheless doesn’t question the broader effects of these evaluations. They want the benefits of doing some diversity work (and in the world of classics, a white woman teacher is sadly a form of diversity!), but do nothing to protect them from this abuse.

Probably not going to rise to the level of a legal challenge, but I just wanted to add, because it’s really problematic how this abuse has been normalized and accepted.
8 Tamara Piety 08.04.19 at 6:04 pm: I think they survive for the same reason that eyewitness testimony, forensics testimony, testimony about future dangerousness, etc. continues to be accepted in the courts, in many cases decades after we have discovered it is deeply flawed: because it purports to “solve” a problem or offer evidence about something that we very much want or need evidence about but lack good alternatives. If we were to abandon these, and many other troubling metrics, we would have to confront just how little we know, or would have to engage in what we know are more nakedly subjective peer assessments. For a variety of reasons, social pressure and because faculty will have to live with each other after the assessment, peer assessments tend not to be very demanding in many places. But I have seen the student teaching evaluations end up being a huge issue for women of color and women generally, or anyone who might actually have other problems that are not so easily articulated or are more controversial. In contrast, to many people student teaching evaluations seem neutral and unbiased. They aren’t (for the reasons explored in the links), but they may be more unbiased, or biased in a less problematic way, than if assessment was purely by peers. It is a real problem. One day someone will sue. The question is, by that time will the courts be more hostile to employment discrimination cases than they already are? It is possible, given all the new appointments under this administration. Also, the capacity of the courts to flat out ignore evidence is demonstrated by the examples with which I started. In particular, criminal cases are replete with questionable practices (like using prior convictions to impeach a defendant for his character for truthfulness, but supposedly NOT for his propensity to commit those types of offenses….uh, right. Jurors cannot engage in these sorts of mental gymnastics and we have also known THAT for years so….)
9 Rohan Maitzen 08.04.19 at 6:05 pm: This ruling against their use in tenure and promotion decisions is a step in the right direction:

https://www.universityaffairs.ca/news/news-article/arbitration-decision-on-student-evaluations-of-teaching-applauded-by-faculty/
10 Stentor 08.04.19 at 8:35 pm: Cynical answer: The ability to collect quantifiable data pertaining to student learning makes them indispensable to a bureaucracy that can’t function without such data, and thus makes their use organizationally mandatory and impervious to evidence about said data’s validity. The only way to dethrone student evals is to come up with an alternative way to assess student learning that can be as easily implemented and collected.

Optimistic (?) answer: Student evals are *votes*, not *measurements*. That is, the idea that evals are (poorly) measuring objective facts about student learning is a misconception of their role. Instead, evals function to give students a vote in personnel decisions, which greatly affect their lives but are officially made by colleagues and administrators. Students need not cast their votes based on the criteria we educators think they should be evaluating faculty on. Evidence of bias in eval results is evidence of bias in student voting patterns, which may be morally culpable, but isn’t a form of factual incorrectness.
11 Harry 08.04.19 at 8:59 pm: “They are in fact a major driver of grade inflation”

Are they? If so, you’d expect increases in average grades to be more pronounced in teaching-focused institutions where they have more impact on decisions about hiring/tenure/etc than in research-focused institutions where they have close-to-none. The reverse is the case.
12 Harry 08.04.19 at 9:06 pm: Notice that the objection that they are biased against women and racial minorities is very different from the objection that they bear no information. If we can measure the extent of the bias, then we can control for it when interpreting the evals. If they provide no information, then we should not use them at all.

Do they really provide no information? I’ve seen no study (and I’ve read a lot of them) that contradicts my conjecture that low scores are a good warning sign that something is going wrong with someone’s teaching. And I’ve seen no studies at all that consider the qualitative information evals provide. I have gone through 6 years worth of written comments for several different faculty members, and, honestly, I don’t believe they are providing no information. At minimum, in one case, I learned that the teacher in question either didn’t read his own evals or really had no interest in improving as a teacher.
13 HR 08.04.19 at 11:26 pm: Dean perspective here: the qualitative sections of student evaluations are full of information (from the student perspective) and over time they give a fairly consistent (if sometimes biased) picture of how students see a faculty member. Certain terms and phrases (“passionate” “disorganized” “sticks to the syllabus” “changes dates without telling us” “always rushed” “always stays after class”) are present year after year. The problem is what to do with this information. Many chairs do nothing, particularly with long-term lecturers, who don’t have a formal evaluation process.

Perhaps this discussion is wholly about numbers, in which case I agree. I don’t look at the numbers. But student comments over time are worth reading and worth listening to.
14 Collin Street 08.04.19 at 11:54 pm: If we can measure the extent of the bias, then we can control for it when interpreting the evals.

If we could measure the extent of the bias, then that would mean that we had a less-biased measure of [whatever it is that student evaluations evaluate] and we could — should — use that measure instead and skip the evaluations entirely. Systemic distortion can only be corrected for if it is known to be uniform, which basically is never the case for anything sociological.

At the end of the day, science isn’t the only way of knowing. If you’ve got enough data it’s the best option, but if you don’t then it’s outperformed by narrative methods. And, well, humanities academics are kind of defined by their working in fields that are data-short to the point that science methods don’t work 100% or sometimes at all.
15 hix 08.05.19 at 12:01 am: My baseline expectation is that all evaluations are biased/unfair and will get more so the higher the stakes are. So lowering the stakes is always a good start. I do have some specific complaints about course evaluations from a student’s point of view. The Prof-friendly one is that I indeed hold our (student) ability to evaluate a course correctly in particularly low regard.

Unfortunately, sometimes courses have obvious structural problems that most students are able to see. But then what to do about it? Those are best addressed in a written comment, not just on the quantitative scale part. But in a smaller course a written comment is not really anonymous. Even if it is anonymous, that comment affects the general mood during grading, which always comes after the course evaluation.

There definitely is a tactical understanding among students not to rock the boat in evaluation forms: do the quantitative part with a strong positive bias and avoid meaningful written comments altogether. That frankly seems the right decision, not just in the hope of getting better grades but also for teaching quality, since there is always a risk of a problem case doubling down on a bad thing based on feedback, while the chances for improvement tend to be limited when one knows the Prof has been making the same mistakes for at least 10 years.

So why evaluate every damn course every term? Maybe evaluate just the more important ones every two terms? The Profs interested in feedback will get enough information from that and everyone has a lot less work.

Now to the really serious problem: The course evaluations are not necessarily just for internal consumption. They might also play a role in the certification process or public perception of a degree program/university; we definitely had as much suggested to us. So addressing serious problems in particular would just be self-punishing. At worst one ends up with a non-certified degree. Again, that problem with the high stakes.
16 J-D 08.05.19 at 12:06 am: harry b

… It would be quite easy for the professional associations to get rid of them: develop measures of learning and propose that they be used alongside observational evaluations (and have serious training for the evaluators in the use of observational protocols, etc). …

But then what would they do in the afternoon?
17 Matt 08.05.19 at 1:31 am: On the “qualitative” part of evaluations – I have found them useful for myself on a number of occasions, and have, I hope, improved my teaching in light of them, and no doubt the repeated themes are often important, but it’s surely important to note that there’s lots of trouble here, too. For one, this is the area where inappropriate and biased remarks about appearance, dress, etc. usually show up. Second, they are often not that reliable, in different ways. I have had (multiple) students say that I knew a topic very well, and make similar comments, in classes I was teaching for the first time, didn’t know well, and where I was literally one set of readings ahead of the students. So, these remarks seem to show the importance of demeanor to positive evaluation, more than actual skill or knowledge. I have also had a case where a student mis-heard, and grossly misunderstood, a comment in a discussion about diversity between and within institutions, and then wrote that I was prejudiced against Christians. I’m sure others get worse. So, while the qualitative part can be pretty useful, it still has many of the same problems and has some of its own (and also suffers from an even lower response rate, in my experience).
18 Andrew Murphie 08.05.19 at 1:59 am: https://www.tonybates.ca/2018/05/11/11025/ In fact I think in some studies there’s occasionally been negative correlation between positive evaluations and student learning. Also a lot of this would apply to many other metrics when it comes to potential legal action. Citations might be one of those, although a different kettle of fish, but still problematic.
19 HR 08.05.19 at 2:29 am: @Matt ” So, these remarks seem to show the importance of demeanor to positive evaluation, more than actual skill or knowledge. ”

Demeanor is as important to teaching as “actual skill or knowledge”! Teaching is a profession and one expects professional demeanor in a teacher as much as a lawyer or doctor. It is odd to me that many faculty members don’t seem to get this. Certainly student evaluations take professional demeanor into consideration and why not? Teaching is a transaction and if a faculty member doesn’t take care to respect students–maybe dress up, be prepared, speak clearly, listen, follow the syllabus, return papers in a timely manner–the evaluations will reflect the lack.
20 JanieM 08.05.19 at 3:24 am: in the US, where litigation is the national sport.

If only our national sport were that bloodless.
21 nastywoman 08.05.19 at 7:57 am: Everything gets ”evaluated”… or should we say ”rated” right now.
Restaurants – hotels, cruise-lines… doctors – or even whole countries –
-(that’s how we decided to reside where we reside)
And there is a case here – where a restaurant is suing google – because google had the nerve to come up with a chart how long a guest has to wait in this restaurant – and as most Prof’s (supposedly?) are pretty ”pünktlich” – probably that’s why none ever got sued?
22 Matt 08.05.19 at 11:42 am: HR – I don’t disagree with most of what you’re saying, but that’s not what I meant by “demeanor”. Rather, what I had in mind is that students very often confuse _acting like you know the subject matter well_ with actually knowing it well. I know, because in classes I’ve taught where I didn’t know it well, I’ve still had a number of students write that I knew it very well! That’s because I’m fairly good, at this point, at looking confident and saying things like, “We’ll come back to that point in a few classes” (even if I’m only semi-sure it’s true!) But insofar as student evaluations are not good at telling “acts confident in front of the class” from “knows the material well”, and so the two are often confused, this is a problem. (And, if certain groups of people have a harder time with bluffing, or just _seem_ less confident to students, they will be punished for this, even if they know the material as well.)
(I also think that it’s pretty unclear that teacher dress has any clear relevance for respect for students, at least beyond some extreme outlying behavior. I was slightly on the casual side in US law schools, and slightly on the formal side [though not much] for Australian ones, but see no real relation here.)
23 David Hilbert 08.05.19 at 3:03 pm: For my sins, I have been deeply involved in administering the undergraduate evaluations for my department for most of 20 years. I’ve done this not because I think they provide fine-grained information about the quality of teaching but because the university mandates them and by paying attention we get a little bit of control over the questions and how they are used. It’s been horrifying to me since graduate school that both faculty and administrators are willing to attempt to interpret relatively small differences in scores as containing useful information about differences in teaching quality. It’s absurd in so many different ways and so obviously so that it does invite the kind of speculation about motivation that you see in the comments. Their use in the P&T process is an outrage.

I do think they can contain useful information for instructors. When I look at my own, if one of the sub scores is very different from the others or there is a theme in the comments then this is something worth reflecting on. But reflecting and not necessarily acting. As far as using them in a comparative way, if I see a class with very low scores it’s a reason to try to find out more about why the students were unhappy. But the usual range of variation (3-5 on our 5 point scale) seems completely uninformative to me and even low scores are a screening tool not something on which to base a conclusion.
24 Matt_L 08.05.19 at 3:19 pm: The institution where I teach asks its faculty, as part of the union contract, to conduct regular evaluations of our teaching and to report those measures to their colleagues and the administration as part of our professional development planning and reporting process. The evaluations play a role in tenure and promotion, but it’s a very loose one. That said, the University I work at does _not_ have a standardized teaching evaluation form that everyone is required to use. That means the faculty develop their own individual evaluation strategies and as a result there are a wide range of approaches. Some of them are of the “going through the motions” or a “cover your ass” variety, while others actually solicit useful feedback, but are of limited use due to their amateur nature.

Like assessment, student teaching evaluations suffer from a basic flaw. They are both forms of social science research and experimentation that are designed by well-meaning amateurs: People who do not know how to set a research question, design a survey, and create a good sample. Even the student teaching evaluation forms designed by experts from a university’s department of institutional research are flawed because they are supposed to apply to all teaching across the entire university which means they are divorced from the specific disciplinary context of the class.

I have been teaching at my current institution for fifteen years. I have designed my own forms with some quantitative measures and a lot of qualitative questions. I have also had peers run mid semester course evaluations for me using focus groups. These evaluations, coupled with some self reflection have helped me become a better teacher. If you ask good questions students will give you good answers. Students know when they have had an experience where they learned something, they also know when they have not learned something. The students are also reliable in terms of telling you in relative terms how well the reading went, how much they studied and what assignments they found valuable. I also broadly agree with what HR said at #13. If students tell you semester after semester that you are disorganized, or that one specific assignment was confusing, you know that you have to fix something.

I think one of the reasons why I have found teaching evaluations effective is that I do tell students that I have made changes to the class based on student feedback (and I have, it’s not a line). I still get some crazy answers once in a while (like Matt #17) but they are outweighed by students who give thoughtful responses. Another reason the evaluations work for me is that I spend class time on them. I give students fifteen or twenty minutes to work on the evaluations in class after I leave the room. I used to have a 90% response rate when I used paper forms. But those forms had to be collated and tallied by myself or our office admin to make the information useful. So we switched to an online survey which I give the students time to complete in class. That response rate varies a lot, from 60-80% depending on the class and semester.

One last comment, to hix at #15: I don’t see my teaching evaluations until the start of the next semester, well after grades have been submitted. The schools where I have worked as a prof or TA wait to give back the teaching evaluation results until grades have been turned in. In that sense the system is usually fair, but other people might have had other experiences.
25 Trader Joe 08.05.19 at 6:27 pm: I’ve experienced these evaluations both receiving and filling-in.
I tend to agree with Harry’s comments that while it’s possible some recurring commentary can have information content, on the whole they aren’t really a precision instrument designed to generate reliable data.

The first problem is usually how they are administered – generally on the last class setting or concurrent with final exams. Students are busy at those times of the year and have no incentive to spend more than about three minutes on the project – less if they can get away with it. Even when provided 15 minutes of class time say, to do a decent job, most will still spend 3 minutes and use the other 12 to look at their phone or do something else.

Equally, while there will always be a handful of students that might take them seriously (say 25%), most responses are going to fall into two categories: 1) those with an axe to grind for whatever reason, fair or not (very likely the source of the racial/gender bias); and

2) those who feel some vague obligation to do the eval but don’t really want to rock the boat, so they fill in lots of 4 out of 5 scores to indicate they didn’t hate it, but aren’t some sort of teacher’s pet, and put vague platitudes like “knows the material well” (no, really? a freaking university professor that knows what he’s teaching?) and equally vague negatives like “poor time management” or “didn’t provide enough time for projects”, which is usually more a projection of the student’s own weakness than the professor’s.

That’s not to say there are never nuggets of insight nor that sometimes you can get 50/50 kids all say something useful like “speak slower” or something…but on the whole the students have little incentive to provide a lot of detail and the profs have too many ways to disregard it. They exist as a butt covering tool. If you’re looking for a reason to sack/reprimand/demote, there will be data that you can extract to make the case, if you’re looking to promote/reward usually that too.
26 Sashas 08.05.19 at 6:28 pm: @Collin Street #14: Surely it’s not as hopeless as that. We could have a prohibitively expensive measure (e.g. a research study) which we could use to estimate the bias in student evaluations, but which we couldn’t run every semester on every class. Systemic distortion cannot be perfectly eliminated given unknown non-uniform factors, but even allowing for the gender of the professor as a non-uniform modifier of student evaluations, we can bring the means of the distributions together. This would be a significant improvement over throwing up our hands and calling change impossible. It should also be feasible to detect and correct outliers on an individual basis. I would expect non-uniform effects from professor to professor, but relatively stable effects from year to year, such that once an outlier is detected we could continue to adjust in a fair manner.

I’m all for narrative methods (I think–not 100% sure I know what you mean), and pitting them against the scientific method as if these things are opposites or even incompatible is counterproductive.
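
As a rough illustration of the "bring the means of the distributions together" idea in #26, here is a minimal Python sketch. It rests on a strong assumption that is not established anywhere in this thread: that any gap between group means is entirely bias rather than a real difference in teaching. The function names, group labels, and scores are all hypothetical, invented purely for illustration.

```python
from statistics import mean


def estimate_group_offsets(scores_by_group):
    """Offset of each group's mean from the average of the group means.

    ASSUMPTION (hypothetical): the whole gap between groups is bias.
    """
    group_means = {g: mean(s) for g, s in scores_by_group.items()}
    grand_mean = mean(list(group_means.values()))
    return {g: m - grand_mean for g, m in group_means.items()}


def adjust(score, group, offsets):
    """Shift one raw score by its group's estimated offset."""
    return score - offsets[group]


# Invented scores on a 5-point scale; not real data.
raw = {
    "group_a": [4.2, 4.0, 4.4],
    "group_b": [3.8, 3.6, 4.0],
}
offsets = estimate_group_offsets(raw)
adjusted = {g: [adjust(s, g, offsets) for s in scores] for g, scores in raw.items()}
```

After the shift both groups have the same mean, but the spread within each group is untouched. This only addresses a uniform offset, so the objection in #14 about non-uniform distortion still stands.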
27 Kris 08.05.19 at 7:40 pm: The evals are not to assess your teaching. They are there to threaten you in some small way. “If the students are upset with you enough to give you low scores, that means the ‘customers’ are unhappy, and we can’t have that.” All academia follows one rule: don’t rock the boat where a problem could be created. Just get in and out and say your little lecture. Let everything else be run as a business by the admin who claim to be academics who, of course, are executives.
28 CJColucci 08.05.19 at 7:55 pm: I’d be interested in any insight into how much, if at all, student evaluations influence renewal or tenure decisions. How seriously are they taken?
29 Yon Yonson 08.05.19 at 8:15 pm: Pardon my anonymity, but I wanted to speak freely.

At my institution, student surveys – both the National Student Survey and our very own internal survey – are administered electronically, as a series of agree/disagree statements with Likert-scale boxes to tick; I discovered recently that the cursor is pre-positioned in the ‘Agree’ column on each row.

As well as general comments on facilities and the programme as a whole, there’s one question about each individual unit, “This unit was taught well” or WTTE. The answers to this question are converted to a numerical variable – 1 = Strongly Disagree to 5 = Strongly Agree – and the mean is calculated.

Yes, we take an arithmetical average of ordinal data. (And our department has won awards for the quality of its quants teaching.)

The number everyone is aiming for is 4. (At least, it was this year. A couple of years ago it was 3.8.) If your unit has one more Neither Agree Nor Disagree than Strongly Agree – or, heaven forfend, if you’ve got a Disagree that’s not outweighed by at least two Strongly Agrees – then you’re in trouble, buster, and you will be asked to specify in what way(s) you will be improving your unit and its delivery so as to stop this happening again. If, on the other hand, your overall score is 4.2 or 4.1 or even 4.0, congratulations, you’re in the clear; you can read the students’ comments if you’re interested, or you can just forget about the whole thing for another year.
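
The arithmetic behind that threshold can be checked with a short Python sketch. The 1-to-5 coding and the target of 4 come from the description above; the function names and the response sets are hypothetical, made up only to show how a single response moves the mean across the line.

```python
from statistics import mean

# Coding described above: 1 = Strongly Disagree ... 5 = Strongly Agree.
SCORES = {
    "Strongly Disagree": 1,
    "Disagree": 2,
    "Neither Agree Nor Disagree": 3,
    "Agree": 4,
    "Strongly Agree": 5,
}
TARGET = 4.0  # the target figure mentioned above


def unit_mean(responses):
    """Arithmetic mean of the coded responses (the contested calculation)."""
    return mean([SCORES[r] for r in responses])


def meets_target(responses, target=TARGET):
    """True if the unit's mean score reaches the target."""
    return unit_mean(responses) >= target


# Hypothetical response sets, invented for illustration only.
all_agree = ["Agree"] * 10
extra_neutral = ["Agree"] * 8 + ["Neither Agree Nor Disagree"] * 2 + ["Strongly Agree"]
disagree_one_sa = ["Agree"] * 8 + ["Disagree", "Strongly Agree"]
disagree_two_sa = ["Agree"] * 7 + ["Disagree"] + ["Strongly Agree"] * 2
```

With "Agree" as the baseline, each "Neither" counts as minus one, each "Disagree" as minus two, and each "Strongly Agree" as plus one, so the mean reaches 4 only when those deviations sum to zero or more. In these sets, `all_agree` and `disagree_two_sa` meet the target while `extra_neutral` and `disagree_one_sa` fall just short.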

Completion rates are positively correlated with ‘good’ scores, presumably because if you nag the students who aren’t bothered about the survey into doing it they’re more likely just to click through and give you a 4. Having said that, there was one year when I got the best number in the department, and a personalised letter from the Dean to congratulate me; I’d forgotten all about the survey, and the only people who’d completed it were four students who’d done it off their own bat because they liked the unit.

Student feedback can be really helpful, but it’s not quantifiable and it doesn’t allow staff to be ranked against one another. Obviously you can’t calculate a mean on ordinals and call it meaningful – and there’s so little difference between the kind of feedback scores that would result in a 3.9 and those that would result in a 4.1 that holding the first teacher to account and giving the second a pat on the head is absurd. But that’s the regime we work in.

The effect of some student feedback on my work has been good, but the effect of the numerical scoring system is baleful. I hate the feeling that I’m competing in a popularity contest, not least because it’s one I don’t feel I can possibly win (I’m not naturally outgoing, or young, and I have colleagues who are both). I constantly worry about ‘losing’ the students in a lecture, or boring them in seminars – all of them or any one of them (they’ve all got one vote, after all) – which naturally makes me self-conscious, which (ironically) is bad for my teaching. I want above all to challenge my students, take them into complex or confusing areas and inspire them to think for themselves, but I’m afraid that’s not the road to popularity – or at least not universal popularity, which is what I need to aim for.

Oh well, there’s always retirement to look forward to.
30 ph 08.06.19 at 4:44 am: Great topic. @29 rings true. Assessment is relatively easy and becomes increasingly complex (and meaningless) when desired outcomes have not been clearly established, and/or considered. My own approach is to ask for self-assessment from students. We then measure their week-to-week ability to set and meet short-term and long-term goals, goals which are negotiated with desired outcomes based initially on individual skill levels at the beginning of the term. If the department sets its own goals – x number of visits to an e-learning site monitored by AI, for example, that’s a fixed variable, which we discuss and factor in to self-assessments.

Administrators exercise great care in the design of assessment instruments. Were universities to ask students whether they’d prefer smaller classes, fewer expensive textbooks, more time spent with highly-qualified instructors willing and even keen to engage students as people, and less expensive equipment – chairs, buildings, etc. with a much more demanding but ultimately useful university experience as the result, I expect a great many would say yes. That’s certainly my own experience. I create and modify quantitative and qualitative instruments designed expressly to reveal how effectively I use time in class vis-a-vis student need, outside class assignments, my own weaknesses, and ways to improve my own classes. Interviews throughout the term help. If we’re not doing that, maybe we just don’t care? (I recognize the impossibility of this with large lecture classes.)

I hold very deeply to the notion that almost all administrators and full-time faculty are willing to do everything possible to further their own careers, often at the expense of learners. Assessments stay because they allow those in power to keep their own power; the entire edifice is corrupt – one propped up on the exploitation of adjunct faculty and by burdening young people with six-figure student debt.

I ensure my own students learn that much before the end of the first term. Given that reality, I encourage them to learn how to learn independently as quickly as possible. Once they understand that their relationship with the university essentially involves being bilked for four years, or more, in return for a piece of paper, and perhaps some expertise, they generally start acting like adults. Especially, if they realize that once the university has their money it no longer has any incentive to ensure students (or faculty) actually succeed. From that point forward learners monitor their own progress each week, consult peers and interested experts, and do all they can to improve themselves without putting their own money into the pockets of the corrupt and the self-interested.

Once students understand they’ve a limited period of time to acquire the skills to make themselves competitive in the workplace, they generally set about doing so. Most are able to understand that the university is focused primarily on preparing the next set of marks, sucking money out of alums, and convincing the gullible that the entire experience is essential to the moral and intellectual success of all.

Great topic, but I fear the cavalry isn’t coming.
31 Colin Danby 08.06.19 at 6:01 am: Just to chime in with a number of points above: on the one hand, there’s something to be said for giving every student an opportunity to speak about how the class went for them, and occasionally you learn really useful things from evals, though almost always from the written parts. On the other hand, just like with any data, thoughtful interpretation is needed, and some student comments are channels for racist and sexist abuse, as noted in #7 above. Institutions need to think seriously about this, and not just leave it to the recipients of abuse to deal with it.

The question in #28 is good and there’s a lot of variation. You can see in a number of comments above and in the McCulloch-Lovell and Decatur statements in the NYT forum a lot about how to use evals well — as one kind of evidence in the context of other evidence. This requires that tenure committees and so forth do their work thoughtfully. But it requires real discipline not to take the numbers as self-evident, not to use a glance at numbers as a substitute for more careful reading and thinking. At some institutions and units, candidates for tenure are routinely required to make a summary table of their numerical eval scores and include that prominently in their packets — a barbaric practice.
32 otto 08.06.19 at 9:31 am: These evaluations do sometimes reveal when work is graded/returned very late, for example, or when instructors are cancelling a lot of classes. It’s often hard for others in the Department to know these things without some sort of feedback system.
33 Andrew Murphie 08.06.19 at 2:18 pm: I guess what I find sad in all this is how accommodating academics can be of regressive systems. It is true, as I think Stefano Harney writes somewhere, that the university systems’ largest gift to the management world (aside from the ongoing funding of a gazillion consultants) has been the intensity of its well developed systems of mutual surveillance (on which evaluations and much more build). That’s really something on which to reflect for a while and I think we now miss exactly how out-moded and destructive much of this can be. Yet it’s also true I think many academics are far too pleasant, and well, sweet and well-intentioned, when it comes to evaluations and so much more. Which of course has its positive side and one doesn’t want to suggest that academics should abandon their good intentions of course, towards good teaching, towards student voice, and so forth. However, to maintain such good intentions within a regime which militates so strongly against the voices of all the people in the classroom, and controls so much of things in some weird and poorly conceived behavioural-cognitivist-managerial experiment (not often directed from outside, often but not only by consultancies earning their money exactly from such things) helps very little, and does much harm. It doesn’t help academics. It doesn’t help the people taking the courses. It doesn’t help management, really, not good management (call me old fashioned) in the end, although it does help current managers who can introduce/enforce something regressive, claim credit for it and get promoted/move on, and then leave the mess for others to deal with .. and said current managers will never, really, have to learn to really manage, beyond the application of a system of basic formulae and operations nearly always based on “what everyone else is doing”.
So on the one hand, wouldn’t it be great if there was a little more collective, actual resistance to such things, especially I would add among senior academics. On the other hand, wouldn’t it be really great to further empower academics and the people in our classes by giving everyone a real voice in how things work (outside of regressive procedure and metrics). So, for example, although I think it’s obvious that academics have much to offer people in the classes in terms of expertise (otherwise why are they there), and indeed that the people in classes actually want to engage with this expertise, there is no real reason that some of the program/classes should not be responsive to direct feedback from people taking the courses in terms of programs/course design. That would give “students voice”. Etc. In sum, less culture of mutual surveillance, especially as AI, big data, and well, basic metrics, poorly applied, all together are only going to make this much worse .. and more actual participation and working together, across the board, to make things better, not only in the system, but in the world at large. I’m an expert on technical change. I’m a semi-expert on social change. I’m also gathering expertise in climate change (which I teach, and I’ve taught in universities for the most part, at all levels, but also at schools at all levels, for a short period, for nearly 40 years). This is at least a triple whammy when it comes to extremely large problems hurtling towards us at much greater speed than I think most people expected. Yet our institutions are forcing us to expend enormous – and I do mean enormous – amounts of time dealing with silly if not harmful managerial systems and processes when we could be working together to, well, embarrassing and clichéd but true, save the world. This, in the end, is my main objection to so many of these kinds of processes. :)
34 Marc 08.06.19 at 2:21 pm: I collect narrative student evaluations at the end of every course and find them remarkably useful. We also have the automated fill-in-the-box variant – which actually tracks well with the narrative one if you can figure out how to get the students to actually fill it in. You can get close to 100% narrative feedback if you do it right. Online evaluations are far less reliable and have low response rates.

At least in my department there is no correlation between good evaluations and “easy grading” (yes, we can track that…an instructor with unusually high or low marks for a large survey class will stand out.) There is a strong correlation between clear expectations (as stated by the instructor) and good evaluations – the students really dislike it when grades are perceived as arbitrary, or if the class is chaotic and confusing.

I do also have to note that my peer teaching evaluations of others in my department correlate very well with their automated survey evaluations. The people who dislike them the most are, unsurprisingly, people who don’t get good marks. So I’m more than a little leery of removing one of the only tools that we have to gauge student opinion. As with all of these methods, there is enough statistical slop in pure numbers that you should only take pretty extreme excursions seriously – but sending in peer observers if you get a yellow flag is pretty reasonable in my book.
35 Harry 08.06.19 at 6:53 pm: We have moved to online evaluations. When we did so we cut the number of questions we ask dramatically, and increased the scope for narrative comments. I introduce the evals in class in the penultimate week, and give them the first 20 minutes of class to do them. I insist on explaining their purpose which is, in my case, only to help me improve my instruction (I have tenure. If they were really really awful they might influence my pay but since we rarely give raises and they just won’t be really really awful they won’t). I’ve found I get the same response rate as when we did them in class, and much more feedback, which is much easier to read and internalize. Like others on here, I get a lot of feedback that is genuinely useful, and a little that is not. But I’m a grown up, and can make my own judgments. (I don’t know who the student was who said of my political philosophy course “I don’t know what this course was about but we didn’t read any political philosophy”, but I suspect it was one of the political science majors…).
36 TheSophist 08.06.19 at 10:13 pm: Harry, (#35)
Ooh, do tell… what were some of the texts that this student decided were not political philosophy? (And what, one wonders, might the student have been expecting: “the philosophy of the Democrats is x, that of the Republicans, y”?)
37 Harry 08.07.19 at 2:30 am: I’ve no idea. That was all that was said. I think that they were expecting that we study the ideas of great political thinkers, rather than actually learning how to do political philosophy. But… really that’s just made up in my head, all I got was that one sentence (and some other stuff about it being a good course in other respects).
38 Alan White 08.07.19 at 2:53 am: I taught at the grad-school-to-Full Prof level for 40 years mostly at a teaching institution and grew to detest institutional student evaluations to the extent that I did not look at mine at all for the last 15 years or so. One reason is that they were so obviously geared to institutional goals of satisfying the stats needs for accreditation, institutional promotion, and justification for funding that they were useless to me. So instead I just used informal methods during the course of the semester to ask students as part of assignments, what was working, what was not, what questions did they have I was not addressing, and so on. In other words I tried to communicate directly with them, in class, on assignments, in personal ways in after class q&a as I could catch them to see how I could make things better. There’s no substitute for showing them that you care about them and their learning to improve your own teaching. No forms or summaries can substitute for that real involvement, which is really what teaching and learning is all about–including the learning how to teach that any committed instructor should yearn for.
39 SusanC 08.07.19 at 11:50 am: [blockquote] “knows the material well” (no, really? a freaking university professor that knows what he’s teaching?)[/blockquote]

Well, you do see examples ranging from the lecturer being clearly out of his depth for the course he’s supposed to be teaching, through just parroting the textbook, through to researchers who use the course material in their own research and clearly understand it well.

I agree with the OP’s opinion on the problematic nature of student feedback, but the lecturer’s apparent grasp of the material they’re trying to teach probably deserves to get a mention. (And no, I wouldn’t take it as a given that just because they’re a prof they know what they’re talking about).

A confounding factor – mentioned upthread – is the ability to sound convincing even when you don’t actually know what you’re talking about…
40 Yon Yonson 08.07.19 at 7:24 pm: their purpose which is, in my case, only to help me improve the my instruction (I have tenure

On this side of the pond, of course, everybody has tenure – everybody in every job in every profession; termination at will just isn’t a thing. At least, it isn’t a thing if you’ve got a permanent contract, and if you’ve been in post for a year – but that still covers quite a lot of people, including little academic ol’ me.

The only fly in the ointment is that our institution, like a lot of British universities, has recently introduced a “disciplinary and competence” policy under which you can be put on track for a final warning for… well, displaying what management consider to be incompetence or indiscipline (a category which, incidentally, specifically includes ‘insubordination’). Job security gone bye bye. I asked my union rep what they were going to do about this; they said, probably nothing, because they were already in dispute with management when it was brought in.

The purpose of the surveys isn’t to improve our teaching, it’s to keep us on edge. They motivate us to keep our students happy by holding up the prospect of ending up as the one at the bottom of the list, the one who might be in trouble – and that’s not an entirely empty threat.

Comments on this entry are closed.
