Grading Medical Students (and More on Grade Inflation)

by Harry on December 5, 2008

The UW School of Medicine and Public Health has just adopted a new grading policy; for first year students it has gotten rid of public letter grades, replacing them with a Satisfactory/Unsatisfactory division. You can read a bit about it here. In fact, the students do get assigned letter grades, but these do not appear on their transcript. The Wisconsin Association of Scholars asked me to participate in an event at the School this week, where we would simultaneously discuss the new policy (the Dean of the School was on the panel) and launch Grade Inflation: Academic Standards in Higher Education edited by my colleague Lester Hunt, who also spoke (adapting part of his really excellent summary afterword to the book). I adapted a bit of my chapter for the book in my talk (it also overlaps with this post announcing the book — the repetition isn’t too extensive though), but also said what I think about the new policy. I thought I’d post the talk partly because the panel was not so well attended, largely because it was one of those suddenly very cold Wisconsin evenings (my father-in-law, fresh from Iraq, did attend, but slept right through my talk!). The text is below the fold. I should add that my sense is that the W.A.S. set up the event partly because some people were very skeptical about the policy; I think that once people had heard the Dean all were convinced that it was sensible.

First some autobiography. When Lester invited me to speak at the conference from which the book emerged I was convinced that grade inflation was real, and that it was a problem. I admit that I still think it is real, but I’m afraid that now I’ve looked at the evidence, I feel that my belief, while genuine, doesn’t really have much of a basis in the evidence. I have also come to doubt that, even if it is real, it is much of a problem. I’ll talk a bit about the paltriness of the evidence, then explain why I don’t think it’s much of a problem. This is because, given the proper purposes of grading, it doesn’t matter much if grades inflate. Having talked about the purposes of grading I’ll explain why, in the light of those purposes, the new grading policy of the School of Medicine and Public Health seems sensible to me.

Now, there is no doubt that grades have increased; within many institutions the median grade is higher than it was 40 years ago, and there are also circumstantial reasons for expecting that grades would have been inflated; in particular the widespread introduction of student evaluations of teaching, scores on which we know are sensitive to the grades students receive (as well as numerous other factors irrelevant to the quality of teaching).

But, if grade inflation is supposed to be analogous with price inflation, rising grades do not, in themselves, constitute inflation. Rather, grade inflation occurs when grades rise relative to the quality of the academic performance of the students. And because we do not keep records of the quality of work students have done, we do not know whether grades have risen relative to the quality of that work. Perhaps, in any given university, the students are more talented, or harder working, or better prepared on entry, or better taught, or all four.

Could the students really be more talented? Well, think about the Ivy League schools, which while most of them still practice affirmative action for the children of their alumni, do it much less than they used to. It is hard to imagine, for example, even a legacy student as weak as now-President George W Bush was at the time gaining admission to Harvard or Yale or any other elite college today, as he did (to Yale, admittedly) in 1964. Nor do most top universities practice affirmative action in favour of men, as they used to (at least, not as much as they used to). In fact, the talent pool from which they can draw has expanded massively, because for a while they admitted women on an equal basis with men, and even now they only put a thumb on the scale for men so that the sex ratios aren’t too severely skewed. The “Gentleman’s C” which both the 2004 Presidential candidates were awarded by Yale in the 1960s is reputedly a thing of the past, but that is because of the absence of the gentlemen who were awarded them.

Could the students really be better prepared? Here are some possible reasons why they might be. In the past 40 years the mean number of children born into upper middle and upper class families has declined, enabling those families to invest more in each child; the women who are now eligible for admission have been socialized to be ambitious in academic and career terms over that time. These are also reasons for finding it plausible that students are harder working, at least at elite (top private, and top public) institutions, than they used to be.

It is not even inconceivable that they are better taught, even if it is self-serving for me to say so. The academic job market is much more open and much more competitive than it was 40 years ago, when people routinely got jobs where their friends were without open searches. (Could productivity itself have increased? I was at a meeting recently where a former President of a rather well-known university said “It is hard to believe that there is any area of collective human activity in which productivity could not have increased over the past hundred years”, but he said so with a twinkle in his eye that suggested that he suspects that even though productivity could have increased in higher education, it probably hasn’t).

Having been startled by the lack of evidence of grade inflation, I turned to the purposes of grading. I think there is a tendency to think that grades are there to reward, or signal, individual merit, and excellent achievement. Here’s Harvard History professor, Harvey Mansfield:

Grade inflation compresses all grades at the top, making it difficult to discriminate the best from the very good, the very good from the good, the good from the mediocre. Surely a teacher wants to mark the few best students with a grade that distinguishes them from all the rest in the top quarter, but at Harvard that’s not possible.

I now think that is just a wrongheaded view about what grades are for. For two reasons. First, in nearly 20 years of teaching in research universities I regularly—in just about every class—come across students who are smarter than I am and more promising than I was at their age, but there have only been 4 or 5 students whose work placed them unambiguously well above the rest of the top quarter, and only one whose work stunned me. Reserving an A (or A+ or A++) for them takes grades too seriously. How could the one stunning student know that he was being rewarded with a stunning grade? And why should he care? The student in question, I know, would have found the very idea of reserving a grade for him absurd, laughable, arrogant, and vain. A professor can reward, or ‘mark’, those students’ work much more effectively with verbal or written praise, or with a request to meet to discuss the paper, or with frank admiration of a thought in the public forum of the classroom. Only a student unhealthily obsessed with their grades would be more motivated by a special grade than by alternative forms of recognition. I have not yet come across a student whose work is extremely good and who is sufficiently grade-obsessed that adding a reserved high grade would motivate or reward them at all in the presence of any of the alternatives I have mentioned.

Second, it is not really true that high achievers are, by virtue of that, meritorious. To the extent that achievement is the product of natural talent, or fortuitous environment, which in most cases is considerable, it is not meritorious, but a matter of brute luck on the part of the achiever. I agree with political theorist Michael Sandel that one of the deep flaws of our social environment is that it sends lots of signals to high achievers that they are somehow meritorious in virtue of their achievement and need not feel humble or an obligation to turn their talents to the service of others less fortunate. Universities already participate in that culture; there is no need for the grading system to further mislead. Anyway, high achievement in a particular class is not always the result of effort in that class. The best predictor of achievement in a class is prior achievement in the subject that class teaches; some students routinely achieve at a lower level than other students because they are more intellectually ambitious, and thus (in my opinion) more academically meritorious.

So what are the legitimate purposes of grading? Different stakeholders have different interests with respect to grades, and sometimes the purposes these interests describe are in tension.

* Grades inform students about the quality of their own performance. Students want to know, and have a legitimate interest in knowing, whether their performance conforms to standards it is reasonable to expect from them at the current stage in their intellectual development. This is especially important in a system like the US higher education system, in which students are simultaneously pursuing studies in several disparate classes and disciplines, and in which they therefore need information on which to base their decisions about budgeting their time and effort from week to week.

* Grades are pedagogical tools for eliciting better performance from the student. This purpose is highly individualized. So we might encounter a student, Celia, who lacks self-confidence, and is easily discouraged. Receiving higher grades might encourage her to put more effort in and thus raise her performance (and thereby achieve what really matters, which is learning more). Another student, Betty, might, by contrast, have an unjustified surfeit of self-esteem; she might be coasting, and a slightly lower grade would perform the equivalent function of eliciting more effort and, consequently, more learning.

* Grades inform future employers, vocational schools, and graduate schools, about the quality of the student. One thing employers are interested in is how well the applicant has applied herself to demanding tasks; another is how capable she is of performing those tasks well. Grades aggregate this information for the employer, and thus help him to sort applicants. (Notice that the information grades give employers is crude; it is impossible to disaggregate the effort from the talent, and it is difficult to compare grades across disciplines and, even more so, between institutions.)

The stakeholders with respect to these purposes are the student, and the prospective employer. I don’t believe that faculty members themselves have any legitimate interest with respect to grading, which is one reason that I am not well-disposed to the institution of teacher-assigned grades, which gives faculty an unhealthy sense of themselves as gatekeepers.

But there is another stakeholder that has a legitimate interest in the way that the grading system is designed, and that is the institution itself. So the fourth purpose of grading is to further the legitimate mission of the institution within which the grading system is used. So, for any institution, you have to ask: what is our legitimate mission? And what system of grading will be optimal in pursuit of that mission?

So what is the mission of the UW Madison School of Medicine and Public Health? Well, let’s think about it. It is a vocational program established by the State and subsidized by its taxpayers, many of whom have never directly enjoyed the subsidy of higher education, still less vocational education, that we reserve for young people who have done very well in school and who we expect to do very well in the labour market, and it is regulated by a democratically elected legislature.

Its mission is to participate optimally in producing doctors who will serve the population of the State (in the first place) and of the world (in the second place). It therefore needs to teach a wide array of knowledge and technical skills that students do not already have. Grades have a legitimate role in that mission, and I guess that is why they have been retained past the first year. But it also has to combat the individualist tendencies that schools, colleges, and families tend to elicit and reward in our society. In particular, schooling and higher education both tend to encourage academically promising children to focus on their own success, rather than on cooperating effectively with others, and developing the traits which enable them to interact empathetically with people who are very unlike them. And they also tend to emphasize the importance of career success understood in terms of achieving high status and high incomes.

But almost all doctors once they are actually working have to work collaboratively with others, some of whom are quite unlike them, and most of those who have regular interactions with patients in order to do their job well need to be able to communicate empathically with them. Furthermore, if the school is going to do its share of producing the numbers of general practitioners that our society needs, then until health care reforms reconfigure the status order and reward profile of the profession in a more sensible way, it has to counteract the tendency to be self-seeking in terms of status and financial reward.

I presume that the shift to pass/fail grades in the first year is motivated in part by this concern to fit with mission; counteracting the tendency of students in the first year to develop competitive habits which leave some people who would otherwise develop further and faster behind, and diminishing the effectiveness of early high achievers. It is not just legitimate, but required, to modify your grading system better to pursue the mission of the institution. Now, as far as I can tell, it is just a conjecture that it will have the desired effects; the school is working on a hunch. It’s also worth mentioning, and worth remembering when people obsess about grades and grade inflation, that the ethos of a school has more influence over the kinds of choices students make and dispositions they develop than any small change in something as concrete as the grading system. Anyway, I hope that there are systems in place to study and monitor the effects of the change in policy, and I hope that it works.



Paul 12.05.08 at 3:09 pm

Oh the machinations of academe…:-)


Righteous Bubba 12.05.08 at 3:55 pm

Grades inform future employers, vocational schools, and graduate schools, about the quality of the student.

This category should also contain professional associations, which, around the world, often demand a measure of performance in certain areas for licensure. Failure to prove sufficient achievement in area X may mean a Wisconsin student may have to take a test or retake a course depending on the student’s eventual residence.


PG 12.05.08 at 4:20 pm

Righteous Bubba,

Which professional associations measure performance by a course grade? Every one of which I know has its own tests — bar exams, boards, etc. — to decide whether the candidate knows enough. Some of these require that the student have graduated from an accredited school in order to qualify to take the test, but graduation is sufficient and grades irrelevant. I will join the NY bar next month and they never asked for my transcript.


Matt L 12.05.08 at 4:56 pm

A timely post… thanks Harry!!! I am just going through and updating the grade book in anticipation of finals week…

Yes, it’s good to be worried about grade inflation, but mainly because we need to think about the social purposes of grades. The first reason you mentioned, score keeping for students, is probably the strongest social purpose. The students want to rank themselves and they do see grading as a matter of merit. I went to UC Santa Cruz when they still had narrative grading and students tried to compare themselves to one another based on their written evals!

I think the pedagogical purpose of grading is much weaker than teachers assume. Students readily forget bad grades and believe that they generally merit good, or at least passing, grades. I know it’s heretical to say this, but I think a good grading rubric can be a more effective pedagogical tool than a letter grade or even extensive written comments on an assignment. I think rubrics could also be used to highlight a student’s improvement (or decline) over time.

I am not so sure employers actually find grades that helpful. I worked for five years between undergraduate and graduate school and I never had an employer ask about my GPA. The only people who asked for transcripts were the graduate programs I applied to and grant giving institutions. Again, this seems to be more about the students-turned-academicians rating themselves, rather than any meaningful measure of merit.


john theibault 12.05.08 at 4:58 pm

On behalf of historians everywhere, I would just like to note that Harvey Mansfield is in the Government department, not the History department, at Harvard.


Righteous Bubba 12.05.08 at 5:43 pm

Which professional associations measure performance by a course grade?

I’m thinking mainly of health professions; dunno anything about law. Nursing, for instance, often has requirements that need to be met by particular college-level coursework. For medical doctors I am blissfully unaware of state requirements, but I know that international moves may be accompanied by very close scrutiny of each and every educational document and the grades may be a factor in future training.


Steve LaBonne 12.05.08 at 6:41 pm

During my professorial days I always had a bad conscience about grading and a sense that nobody was really willing to argue openly and honestly about what its purpose (if any) was. Assigning grades is what I miss least about academia.


J. Michael Neal 12.05.08 at 7:19 pm

A bigger problem than grade inflation is watering down coursework. I’m currently taking classes at the University of Minnesota’s Business School, for a professional certificate in accounting, and there has been only one class in the entire program that I consider to be at all rigorous. I’ve found this to be a particular problem with the business school, and, talking with others who have gone elsewhere, I think that it’s a lot more widespread than just my institution.

Business schools have gone far overboard with the idea of students as customers, and end up catering to their interest in getting through classes easily. It is a school policy. I’ve talked with a couple of professors who have expressed frustration that they can’t give difficult (or essay) exams any longer, but, between the hassle from administration and the endless complaining by students, they just don’t have the time to do so. I blame the administration for the latter as well; if they provided enough backup to instructors in their grades, student complaints wouldn’t be such a problem.

The fact that Carlson mandates that professors give a median grade of B+ in its upper division courses is a symptom of the problem, not the actual problem.


PG 12.05.08 at 10:59 pm

Nursing, for instance, often has requirements that need to be met by particular college-level coursework.

Yes, but this has nothing to do with grades as conventionally understood — as with the lawyer requirement to graduate from law school, this is a matter of pass/fail. Unless students are inflated from failing grades to passing ones (something I doubt), grade inflation would be irrelevant to such licensing.


Righteous Bubba 12.05.08 at 11:03 pm

Yes, but this has nothing to do with grades as conventionally understood—as with the lawyer requirement to graduate from law school, this is a matter of pass/fail.

Not so. If you get grade whatever in whatever course you may be exempted from a requirement. Quite a money and time saver.


LFC 12.05.08 at 11:14 pm

Re the parenthetical remark about productivity in higher ed.: what does “productivity” mean here?


Righteous Bubba 12.05.08 at 11:42 pm

PG, I believe I am full of it as far as nursing goes: please excuse my faulty brain.


HP 12.06.08 at 1:46 am

I say we take the currency analogy and run with it. I recommend a program of devaluation and decimalization — the New C. The New C is declared by fiat to be equal in value to the old A. We then assign this new unit a value of 100 in the New GPA. That is, a 4.0 GPA (OGS — Old Grading System) becomes, in the new system, 100C.

So, my old GPA (in 1985) of 3.79, would convert to 79C under the New C system. All grades would be expressed in terms of C (although much like the guinea or the farthing, older units would doubtless be used colloquially for many years). The Old B would be written 200C, the Old A 300C, etc. Old D is now 0C. A negative C is required to fail. Rather than competing for As and Bs as under the current inflationary grading regime, students would now compete for the more finely graduated decimal values of the New C, thus slowing the rate of inevitable grade re-inflation. Problem solved.

Of course, this change would need to be accompanied by a massive marketing and public education campaign, but when has academia ever lacked for funding?


Bloix 12.06.08 at 2:10 am

Unless things have changed since I was a TA (which is as far as I got) grading is an extraordinarily arbitrary exercise. There is no quality control at all – no rubric or protocol, no training, no second look, no standard distribution curve.

Do you know whether the grade you give is influenced by the place of the bluebook in the stack (do you give higher grades to those on top? On the bottom?) Do you have a colleague look at every fifth paper to make sure you are grading in line with others in your department? Do you train your graduate students in techniques of accurate, fair, consistent and transparent grading? Do you follow the academic literature on grading? (Is there an academic literature on grading?)

If you don’t, then your grading may not be much better than taking the papers and tossing them down a flight of stairs. You may believe it is, but you don’t know one way or the other.


florentine 12.06.08 at 2:28 am

Maybe I’m not representative, or not representative of students today, but when I was an undergrad in the late 1960s and a grad student in the 70s, getting high grades left me feeling empty, and I didn’t attend my own initiation into Phi Beta Kappa. Yet on the few occasions when I received a grade I knew to be extraordinary (an A+ from someone who was on probation for failing a large proportion of his students and who was giving mostly Cs to the other students in the honors seminar I was taking; an H+ from someone who had only given a grade that high twice before in institutional memory; a series of “1”s from someone who was grading everything on a scale from 1 to 3—we had no idea what the numbers meant except that 1 was better than 3—and was giving everyone else 2s and 3s), the effect was overwhelming, much more powerful than nearly any of the things that were ever said about my performance. People get numb to grades, so it takes something extraordinary to get around that.


praisegod barebones 12.06.08 at 6:41 am

‘Unless things have changed since I was a TA (which is as far as I got) grading is an extraordinarily arbitrary exercise. There is no quality control at all – no rubric or protocol, no training, no second look, no standard distribution curve.’

Things might have changed, but when I was in grad school in the UK, there was quite a lot of training, second marking etc. All finals papers were double marked, and problematic cases sent to an external examiner (ie someone from the discipline, but in a different institution.) Papers marked by TAs were sampled by the course leader to check the level; detailed rubrics were standard.

It costs money of course; and if grade inflation isn’t a problem, maybe some of this money is ill-spent.

I think Harry’s point that one of the primary consumers of an institution’s grades is the institution is an important one. I think something else important follows: since different institutions – and even different departments – have different goals, we shouldn’t expect there to be one correct approach to grading.

If that’s the case, maybe transcripts should also be accompanied by rubrics, explaining what (say) a 3.6 GPA means in a particular institution. That said I rather suspect that no-one would pay much attention to them, so it might just be a waste of time and effort. But I’d be interested to know what Harry thinks of the idea.


agm 12.06.08 at 5:20 pm

I’m glad to be in a tech field. It’s just not that hard to write a rubric for using specific principles without which you cannot understand a problem and solve it. If you present a grading system and explain why you choose to weight things the way you did from the start, things work out ok, and it signals intent to assess fairly. There are definitely tricky grading issues (such as assessing implicit use of a concept when assigning points for explicit use on the grading rubric and solutions, which I email out for every assignment), but it works better than what I see described above — either you used Newton’s laws right or you didn’t, either energy was conserved in your solution or it wasn’t.

As another plus, no need for fitting students to a curve since I’m evaluating based on correctness, not relative performance, and bias shifts are easy to handle if I’ve assigned something unwarrantedly difficult or labyrinthine; everyone’s grade gets adjusted in a uniform manner.

All of which is good when considering that these people want to transfer to 4-year schools and study engineering, and thus need grades that signal their understanding to themselves and to admissions officers at those schools.


rosmar 12.07.08 at 12:33 am

“I don’t believe that faculty members themselves have any legitimate interest with respect to grading, which is one reason that I am not well-disposed to the institution of teacher-assigned grades, which gives faculty an unhealthy sense of themselves as gatekeepers.”

If teachers shouldn’t assign grades, who should assign them?

As a teacher, a) I hate grading, and b) I take grading very seriously because so many students take their grades seriously. In my department we take steps to make grades as consistent as possible across the department–agreeing on learning outcomes, how to assess them, sharing our rubrics, etc.

I guess my c) is that, as much as I hate grading, I do like seeing with my own eyes what material and skills the students have grasped and what they are still struggling with. It helps me to focus my class periods sometimes, and definitely helps when individual students come to office hours.


Eli Rabett 12.07.08 at 4:06 am

MIT grades freshmen on a U/S scale. They appear to do ok.


PG 12.07.08 at 9:28 pm

Righteous Bubba,

No problem — I just figured you must be thinking of a system very different from the ones of which I have any knowledge (U.S., UK, Canada, India).


Righteous Bubba 12.07.08 at 10:18 pm

No problem—I just figured you must be thinking of a system very different from the ones of which I have any knowledge (U.S., UK, Canada, India).

I actually made the dumber mistake of misreading certain state nursing continuing ed requirements as requirements for initial licensure.


Sam 12.08.08 at 1:52 pm

This is my own tangent, and for context, I work with undergraduates… the grade-centered interaction that drives me the battiest is when talented students who have been getting high marks for mediocre work can’t handle when someone calls them on it. Even when the message ultimately is, “you can do great work!” (Hear that said with a hopeful, encouraging tone… almost idiotically out of touch with the students’ anxiety, confusion, and panic.)

Also, I am very thankful for the professor I had in undergrad (1998) who pitched a fit in front of an intro level class for the sense of entitlement the students had and their disrespectful behavior. It was a “rock for jocks” class that took care of students’ lab science requirement and was reputed to be the easiest option. Students signed in and left, didn’t bother to show up, or talked or played on the computer during class. The teacher said, “Your parents paying for a 40,000 dollar a year education does not entitle you to an A in this class… so don’t come to me with your excuses when you skip half this class and can’t pass the course.” As a senior (taking an intro-level major requirement), I was really happy to see someone say that out loud. A professor (not one of the TAs) was passionate enough and connected enough to care about their class and our learning… and the general level of respect for education. That is what I took from that not-so-graceful display of frustration.

I’m not sure how grades are taken at a population level, but I discuss them in classes a lot. I don’t assume that I have the same ideas about their utility and meaning as my students do. So I tend to make the discussions and challenges transparent and seek students’ feedback and input on the syllabus and evaluated projects. For a process-centered course I had them evaluate and grade themselves. They loved the idea, but ultimately found it terrifying to have to take that responsibility, even when I provided a rubric. The students who took that class talk about how much they learned by having to think about their own work and development without having it validated by someone else. (Note to teachers: Yes, I had to let a few people grade themselves higher than I would like, but many students graded themselves lower than I would have. It was a class of 15.)


Drew 12.09.08 at 3:17 pm

At my degree-granting institution, a large state university, I taught upper-division major courses as a postdoc. The student population there is typical of large state universities–middle to upper middle class. One of my TAs had worked the quarter before with one of the department’s recent hires, from an Ivy League school. She shared with me the fact that the recent hire never gave anyone a grade lower than B-. (For all I know, this may be standard at all Ivy League schools.) Now that I’ve moved on to a position at a regional university, with a much more working-class to lower-middle-class population, the story continues to trouble me. Why, really, should I grade my less-privileged students in the C range, if their better-placed peers are guaranteed As and Bs by virtue of admission? I still give Cs and Ds, but like I say, I do so with a troubled conscience. My current students are already socio-economically behind their peers at better schools; am I adding to their disadvantage by grading harder than my peers grade at better universities? To take one example, my students will already have a harder time getting into a good grad school; I have no idea if employers look at GPAs or not.

Another issue, that I don’t see raised here: do non-tenure-track instructors grade higher or lower than their tenure-track peers? The problem, with student evals in the mix, is that we’re all motivated to grade higher. I know when I was on the job market, I constantly worried about evals, and I still worry a little as an assistant professor.
