Mean and Regressive

by Henry on September 28, 2010

I just finished reading Justin Fox’s The Myth of the Rational Market (yes: two years late – I know), and came across this story about Daniel Kahneman which I didn’t know, and which illustrates one of those points that is ex post obvious, but ex ante rather brilliant.

The only point Daniel Kahneman was trying to get across was that praise works better than punishment. The Israeli Air Force flight instructors to whom the Hebrew University psychologist delivered his speech that day in Jerusalem in the mid-1960’s were dubious. One veteran instructor retorted:

On many occasions I have praised flight cadets for clean execution of some aerobatic maneuver, and in general when they try it again they do worse. On the other hand, I have often screamed at cadets for bad execution, and in general they do better. So please don’t tell us that reinforcement works and punishment does not, because the opposite is the case.

As a man trained in statistics, Kahneman saw that of course a student who had just brilliantly executed a maneuver (and was thus praised for it) was less likely to perform better the next time around than a student who had just screwed up. Abnormally good or bad performance is just that – abnormal, which means it is unlikely to be immediately repeated. But Kahneman could also see how the instructor had come to his conclusion that punishment worked. “Because we tend to reward others when they do well and punish them when they do badly, and because there is regression to the mean,” he later lamented, “it is part of the human condition that we are statistically punished for rewarding others and rewarded for punishing them.”
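The effect Kahneman describes can be reproduced with a small simulation. The sketch below is not from the book. It assumes a hypothetical model in which each cadet's score is a fixed skill plus independent noise, feedback has no effect whatsoever, and the thresholds for "praise" and "screaming" (plus or minus 1.5) are arbitrary choices.

```python
import random
import statistics

def simulate_feedback(n_cadets=2000, noise_sd=1.0, seed=0):
    """Return (mean change after praise, mean change after scolding).

    Hypothetical model: score = stable skill + independent noise.
    Feedback is recorded but never influences the next attempt.
    """
    rng = random.Random(seed)
    after_praise, after_scolding = [], []
    for _ in range(n_cadets):
        skill = rng.gauss(0.0, 1.0)                # stable ability
        first = skill + rng.gauss(0.0, noise_sd)   # attempt that gets feedback
        second = skill + rng.gauss(0.0, noise_sd)  # next attempt
        if first > 1.5:        # unusually clean execution: praised
            after_praise.append(second - first)
        elif first < -1.5:     # unusually bad execution: screamed at
            after_scolding.append(second - first)
    return statistics.mean(after_praise), statistics.mean(after_scolding)

praise_change, scolding_change = simulate_feedback()
print(f"average change after praise:   {praise_change:+.2f}")
print(f"average change after scolding: {scolding_change:+.2f}")
```

Under these assumptions the praised cadets get worse on average and the scolded ones get better, which is exactly what the instructor observed, even though feedback does nothing in the model.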

{ 50 comments }

1

Joseph Heath 09.28.10 at 5:48 pm

I came across a very clear statement of this point (that regression to the mean leads us to overestimate the efficacy of punishment) more than 20 years ago in “The Art of Raising a Puppy” by the Monks of New Skete. I was surprised to learn that this was common knowledge among dog trainers, but had somehow escaped the attention of social engineers. Anyhow, Kahneman mentioned it in his Nobel lecture in 2002 (where I believe that quote is from), which did a lot to correct the situation.

2

Bloix 09.28.10 at 6:04 pm

There is regression to the mean for events that occur randomly. Presumably flight training involves people actually learning how to fly, such that performance improves in a more or less non-random way.

3

Bruce Wilder 09.28.10 at 6:13 pm

W. Edwards Deming, the statistical quality guru, was often at pains to make similar points: that managers often create standards of performance and systems of reward/punishment predicated on them, in complete ignorance of the stochastic nature of the process they were tasked to control, and the statistical implications, and the inevitable implications of such misguided schemes for morale.

4

Jeremy 09.28.10 at 6:18 pm

Interesting. There is a lot of nonsense being written against the use of reinforcement that contradicts well established learning theory. I suggest people consult a standard textbook such as the excellent Introduction to Learning and Behavior by Powell, Symbaluk, and Honey.

5

Jacob T. Levy 09.28.10 at 6:30 pm

… not having read Kahneman’s Nobel lecture, my reaction’s pretty much like Henry’s. Wow.

6

Aulus Gellius 09.28.10 at 6:36 pm

Bloix: we’re talking about praising/punishing flight cadets, not for a general evaluation of their performance over time, but in regard to “clean execution of some aerobatic maneuver” or “bad execution [of same]” on one particular occasion. If you give your best performance ever on one particular occasion, the chances are that the next try won’t go as well (unless the effect of practice on your ability works much faster than it does for most skills).

7

Billikin 09.28.10 at 6:52 pm

When I was fresh out of college I taught English as a Second Language (ESL). My students were not novices, but were quite shy about speaking in English. My first job, I figured, was to get them talking, so I rewarded anything they said, even if it was incorrect English. After a few classes they were speaking freely. At that point I began offering corrections. They clammed up. Too soon, I thought, and I went back to rewarding everything. After a few more classes I noticed that they were improving anyway, correcting their own errors, despite the fact that I had “reinforced” them. :) (OC, I was not their only source or model of correct English.) Verrry interesting. In fact, in my two years of teaching ESL, I never again felt the need to offer corrections. (OC, I answered questions.) The students learned, anyway. ;)

8

Ben Hyde 09.28.10 at 6:54 pm

This is the way I teach that lesson:

“Let’s train a pile of pennies to behave. Good pennies come up heads. Flip the pennies ten times. Divide them into two piles, the good ones and the bad ones. Now punish the bad ones. Beat them with a stick. Repeat the experiment. Notice how their behavior improves! Now reward the good ones. Kiss them. Repeat the experiment. Notice how they appear to be slacking off! Clearly negative reinforcement works and positive reinforcement doesn’t.”
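The penny lesson is easy to run as a sketch in code. The cutoffs are an assumption added here, since the comment does not say how the piles are divided: a penny counts as "good" with 7 or more heads in ten flips and "bad" with 3 or fewer.

```python
import random

def heads_in_ten(rng):
    """Number of heads in ten fair flips of one penny."""
    return sum(rng.random() < 0.5 for _ in range(10))

def train_pennies(n_pennies=10000, seed=1):
    """Return average heads (good before, good after, bad before, bad after)."""
    rng = random.Random(seed)
    first = [heads_in_ten(rng) for _ in range(n_pennies)]
    # "Kissing" and "beating" happen here; neither changes a fair coin.
    second = [heads_in_ten(rng) for _ in range(n_pennies)]
    good = [i for i, h in enumerate(first) if h >= 7]   # assumed cutoff
    bad = [i for i, h in enumerate(first) if h <= 3]    # assumed cutoff

    def average(indices, scores):
        return sum(scores[i] for i in indices) / len(indices)

    return (average(good, first), average(good, second),
            average(bad, first), average(bad, second))

good_before, good_after, bad_before, bad_after = train_pennies()
print(f"good pennies: {good_before:.2f} -> {good_after:.2f} heads")
print(f"bad pennies:  {bad_before:.2f} -> {bad_after:.2f} heads")
```

Both piles drift back toward five heads on the second round, so the rewarded pennies appear to slack off and the punished ones appear to improve.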

9

Barry 09.28.10 at 6:57 pm

Bloix 09.28.10 at 6:04 pm

“There is regression to the mean for events that occur randomly. Presumably flight training involves people actually learning how to fly, such that performance improves in a more or less non-random way.”

Read what Henry excerpted, particularly the word ‘abnormally’. And meditate upon the fact that things can fluctuate randomly around an increasing trendline.

10

Lemuel Pitkin 09.28.10 at 7:22 pm

“There is regression to the mean for events that occur randomly. Presumably flight training involves people actually learning how to fly, such that performance improves in a more or less non-random way.”

True. But unless improvement is both rapid and steady, or N is very small, the variance between successive performances is still going to be mostly mean-reverting noise.
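A rough sketch of this point, under assumed numbers that are not from the thread: each attempt improves the underlying skill by 0.05 while the attempt-to-attempt noise has a standard deviation of 1, so learning is real but slow relative to the noise.

```python
import random

def next_attempt_stats(n_cadets=500, n_attempts=40, slope=0.05,
                       noise_sd=1.0, seed=2):
    """Return (share worse after a good attempt, share better after a bad one).

    Hypothetical model: score at attempt t = slope * t + noise, i.e. steady
    learning with random fluctuation around the rising trendline.
    """
    rng = random.Random(seed)
    worse_after_good = good_total = 0
    better_after_bad = bad_total = 0
    for _ in range(n_cadets):
        scores = [slope * t + rng.gauss(0.0, noise_sd)
                  for t in range(n_attempts)]
        for t in range(n_attempts - 1):
            deviation = scores[t] - slope * t   # distance from the trendline
            if deviation > noise_sd:            # abnormally good attempt
                good_total += 1
                worse_after_good += scores[t + 1] < scores[t]
            elif deviation < -noise_sd:         # abnormally bad attempt
                bad_total += 1
                better_after_bad += scores[t + 1] > scores[t]
    return worse_after_good / good_total, better_after_bad / bad_total

worse, better = next_attempt_stats()
print(f"worse after an abnormally good attempt: {worse:.0%}")
print(f"better after an abnormally bad attempt: {better:.0%}")
```

Even though every cadet in this model improves steadily, most abnormally good attempts are followed by a worse one and most abnormally bad attempts by a better one.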

11

Barry 09.28.10 at 7:29 pm

Jim Henley (or Thoreau) at Unqualified Offerings made this point, on a post which even Google can’t find (my paraphrase) – in short, if negative reinforcement doesn’t work, the commonly offered advice is to do more of it. If positive reinforcement doesn’t work, the commonly offered advice is to scorn it, and insist on switching to negative reinforcement.

12

Marshall 09.28.10 at 7:44 pm

So, is the lesson here that neither positive reinforcement nor negative reinforcement is effective, since their performance–or behavior–will move to a fixed long-term equilibrium, anyway? Why bother praising an abnormally brilliant maneuver, since the pilot’s subsequent attempts will just regress to their mean performance. Similarly, for a child with chronic behavioral problems, why bother praising a single good deed, since they’ll return to an equilibrium of misbehavior soon after.

13

piglet 09.28.10 at 7:53 pm

The most obvious contemporary example that comes to mind is the debate about school improvement through teacher firing.

14

Harry 09.28.10 at 8:09 pm

Piglet’s thought was exactly mine on reading this. The difference being that the people who talk about firing teachers almost never actually do it.

15

Phillip.W 09.28.10 at 8:12 pm

So when g-d punishes you for stupidity it’s called moral hazard, but when your boss does it’s called an irrational response?

16

pdf23ds 09.28.10 at 8:45 pm

Even if teacher firing didn’t work as an incentive, it could still conceivably work as a way to get rid of bad teachers. (Which assumes there are better teachers waiting to replace them, of course.)

17

piglet 09.28.10 at 8:56 pm

I was in part referring to the article in today’s NYT, “4,100 Students Prove ‘Small Is Better’ Rule Wrong”: “Brockton never fired large numbers of teachers, in contrast with current federal policy, which encourages failing schools to consider replacing at least half of all teachers to reinvigorate instruction.” So much for praise versus punishment.

I was also thinking of two particular features of current education debate:
1. The focus of much attention on abnormally good or bad teachers, oblivious to the fact that most students are and always will be educated by average teachers; and
2. The concept of teacher evaluation based on student performance, which clearly has a large stochastic component.

18

Bloix 09.28.10 at 8:57 pm

Yes, I see the point: even if performance improves steadily, anything other than straight-line progress will involve some degree of variation sufficient for the effect to manifest itself. Thank you, Aulus Gellius, Barry, and Lemuel.

19

Barry 09.28.10 at 9:27 pm

You’re welcome, Bloix.

20

Barry 09.28.10 at 9:28 pm

pdf23ds 09.28.10 at 8:45 pm

“Even if teacher firing didn’t work as an incentive, it could still conceivably work as a way to get rid of bad teachers. (Which assumes there are better teachers waiting to replace them, of course.)”

A big problem – I live in Michigan. There’s quite a waiting list for Ann Arbor teaching slots; I imagine that the list would be much shorter for Detroit slots.

21

Tim Wilkinson 09.28.10 at 9:59 pm

I must (unprovably) say I got the point well before the reveal (and without paying much attention to the point of the post title). Also noted that

1. The underlying reason why the example is bad and Kahneman’s point is good is that this appears to be a case of exercising skill rather than making deliberate decisions to behave one way or another, so the usefulness of incentives is doubtful (beyond focussing the mind or something, I suppose). But in fact the motivation to perform well is unlikely to be a problem, and the improvement is brought about not so much by mere conditioning as by collaborative, deliberate acquisition of skill.

2. Punishment is never in any given single case going to be an alternative to praise (unless you are engaged in brainwashing rather than training.) The relevant comparison is between iterated policies of using only praise (when appropriate) or only punishment (when appropriate). Or both, like the instructor. Or some more complex mix, I suppose.

3. The problem with praise is that trying to reinforce negative behaviour (e.g. not barking) using only (withdrawal of) praise is pretty hard to pull off. It helps very much if you can actually explain what is being praised or punished. And then you may not need to do any punishing (see point 2).

22

Tim Wilkinson 09.28.10 at 10:00 pm

(see point 1), I meant.

23

Matt 09.28.10 at 11:14 pm

This does all leave out the fact that, from the perspective of one in power, punishing can be much more fun.

(I should say, though, that I at least partly mean that as a joke. I rarely enjoy punishing, in the rare case when I’m in a situation to do it, and do enjoy giving praise. I don’t think that’s unusual, but wonder if most people would expect that, before they thought about their own experience for a bit.)

24

Zora 09.28.10 at 11:27 pm

A fine book called Don’t Shoot the Dog popularizes behavioral training. It specifically discusses training a dog not to bark. You reward the dog for silence/not barking.

The author didn’t go into it in any great detail, as I recall. (I don’t have a copy of the book; I keep giving mine away.) I would imagine that you would give a cue for silence, then reward the dog only if it refrained from barking for X minutes. Give another cue to end the training run. Gradually increase the number of minutes required to earn the reward.

Such an approach requires effort and deliberation and is much less viscerally satisfying to our primate instincts than simply beating the dog.

25

Doctor Memory 09.29.10 at 1:45 am

As always, it’s entertaining to watch economists discover with great fanfare what their benighted colleagues over in the behavioral sciences have known for approximately fifty years now.

Just wait until they discover Festinger. It’s gonna blow their damn minds.

26

Joseph Heath 09.29.10 at 1:55 am

This brings back memories. I once wrote a newspaper column on this topic, using the term “negative reinforcement” as a synonym for “punishment,” the way a couple of commentators have above. I got a very short letter from a psychology professor, pointing out that I was using the term incorrectly. Negative reinforcement is (roughly) when you train someone to do something by rewarding them with removal of a negative stimulus. Punishment is the opposite of reinforcement; it’s when you discourage rather than encourage the target behavior.

I thought that was interesting.

27

Eric L. 09.29.10 at 2:19 am

Just a footnote here. Don’t forget Tversky. See Kahneman, D. & Tversky, A. (1973). “On the psychology of prediction,” Psychological Review, 80, 237-251.

28

Maurice Meilleur 09.29.10 at 3:16 am

Doctor M, it’s even more entertaining to watch political scientists cribbing off the economists. In the political scientists’ defense, though, Kahneman and Tversky (1973) have been showing up in APSR bibliographies for years.

29

Aulus Gellius 09.29.10 at 4:09 am

At some point when I was grading undergraduate papers (which hasn’t been for a while), I had the blinding realization that of course pointing out things students do right is more helpful than pointing out what they do wrong, because it gives them more information: what they need to know is “how do you write a good paper,” and any particular example of not-good writing still leaves them with an infinite variety of equally bad options to choose from, whereas an example of good writing actually gives them a particular thing to do. It was an amazing revelation, and really changed the way I teach entirely.

30

Phillip.W 09.29.10 at 4:15 am

I hold no brief for moral hazard as a concept. It gives me the creeps. But there’s been no real response to my comment. An individualized sense of fear is held to have a primary role in the machinery of society, but human interaction itself is supposed to be ruled by milk and kindness. There’s a disconnect.

So maybe there’s a positive aspect to the social that militates against moral hazard – and against individualism itself. Maybe social activity acts as a form of positive reinforcement. And maybe that is what should be seen as primary, or even natural.

31

burritoboy 09.29.10 at 5:21 am

However, the praising / punishing itself may have more important functions than actually increasing subordinate performance. Examples:

It may be a signaling device to other managers (see, I do this management stuff, I’m a good manager).

It may be a signal of status (I am powerful and high-status: I can reward and punish).

It may be an organizational rite or ritual.

I’m sure everybody can think of more.

32

Billikin 09.29.10 at 7:39 am

pdf23ds: “Even if teacher firing didn’t work as an incentive, it could still conceivably work as a way to get rid of bad teachers. (Which assumes there are better teachers waiting to replace them, of course.)”

It strikes me as curious how proponents of free markets do not seem to offer what seems to me to be an obvious market solution for the perceived poor quality of teachers: increase teacher pay to attract better teachers. Once upon a time you could get good teachers for cheap because career opportunities for smart women were limited. Not any more. If you want teachers as good as our parents and grandparents had, you gotta pay them more (in real terms, of course). :)

33

Billikin 09.29.10 at 7:50 am

BTW, it is difficult to teach a negative. Often the best thing to do is to teach a competing behavior. For example, when dogs bark that is rarely all they are doing. They are often displaying other aggressive behavior. Teaching the dog to sit or to lie down may do the trick. :)

34

Henri Vieuxtemps 09.29.10 at 8:26 am

Perhaps some managers and trainers view all good performance (even exceptionally good) as normal, and anything below the norm as an anomaly. From this perspective praising is simply unnecessary, perhaps even harmful: doing your job well is not a reason for celebration.

35

Guido Nius 09.29.10 at 9:02 am

Kahneman & Tversky should get praised much more.

The interesting Kahnemanian observation here is not about learners but about teachers: “it is part of the human condition that we are statistically punished for rewarding others and rewarded for punishing them.”

People in power are perceived as good rulers when they behave in a powerful way because that is what we notice. It is inherently difficult for all of us to make ‘the right calculation’ so we have a tendency to want decisiveness and abhor wavering. This is true on both sides of any debate, & thinking some of us have transcended that limitation (the human condition) is very misguiding; it’s like thinking you are way past buying lottery tickets or insurance.

36

ajay 09.29.10 at 10:34 am

“So when g-d punishes you for stupidity it’s called moral hazard, but when your boss does it’s called an irrational response?”

I suspect that you are misunderstanding the concept of moral hazard.

37

g 09.29.10 at 11:06 am

ajay@36, I don’t think he’s misunderstanding the concept but I find his wording very unclear. Here’s a verbose paraphrase of what I think he meant: “As a society, we tend to think that when stupid actions have bad consequences we shouldn’t do too much to mitigate them for fear of creating a bad hazard; in other words, we allow the universe to punish people for doing dumb things. On the other hand, we are steadfastly opposed to having *other people* impose punishments for doing dumb things, preferring to praise good performance. Isn’t that inconsistent?”

For the avoidance of doubt, I agree with approximately 0% of that. But I think it’s what Phillip meant.

38

Tim Wilkinson 09.29.10 at 1:08 pm

Zora @24: Such an approach requires effort and deliberation and is much less viscerally satisfying to our primate [sic] instincts than simply beating the dog.

Well a bit of a tangent, but I have a dog whom I’ve trained to do everything necessary to stop him getting into danger or trouble (basically, coming immediately when called, and always staying on the pavement unless I initiate the procedure for crossing the road), and one or two things for our mutual convenience (e.g. dropping the ball so I can throw it again – he has conflicting desires that a. the ball should be thrown, b. he should keep hold of his prize).

But training him not to bark, particularly when left outside a shop etc., has eluded me, even though I am willing to administer punishment to the extent of expressing displeasure by saying ‘no’ in a firm and even somewhat harsh tone of voice (thus satisfying my atavistic desire to tell off small animals). I’m sure it would be possible to do it, but it hasn’t been enough of a priority for me to take any more time and effort than I already have to do so.

A large part of the problem is the hyper-Wittgensteinian one of getting him to understand what the rule is I actually want him to follow (he wants to please, which helps). The other main part is that he only problematically barks when I’m not actually present.

Again, this is about discrete (binary) outcomes: bark/not-bark, which is rather different from the case used for illustration, which appears to be about the continuous variable of how well the pilot flew, with a threshold between praiseworthy and punishable which in this case seems to move based on the last measured performance.

The structure of reward and punishment in this kind of case may differ from that often envisaged by schemes of fair praise and blame, in which the former attaches to supererogatory actions, the latter to blameworthy ones, and there is plenty of neutral middle ground.

If you are treating people as objects of conditioning and you want ever-increasing results, you would probably tend to concentrate on punishing pretty much everything, while perhaps praising a few examples of relative excellence (a moving standard). That may go some way to explaining the revolting (I would say intolerable) way sales teams are run in my experience. Shout at everyone for not selling enough, reward the top seller only, regardless of absolute quantities, what is reasonable, etc.

39

JP Stormcrow 09.29.10 at 1:22 pm

An interesting example of this principle that I recall reading about involved tryouts and subsequent performance levels for “select” teams in a sport (gymnastics I think, am hazy on the details). Coaches had a stock narrative that certain athletes despite “having the skills” just could not hack it on the larger stage at bigger competitions. Might be true for some, of course, but does not take account of the fact that the selection process itself will intrinsically yield these results. Of course multiple tryout or scouting sessions mitigate the effect. (And I guess the reverse effect would be athletes who perform better “now that the pressure is off”.)
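The selection effect can be sketched as follows. This is an illustration under assumed numbers rather than a reconstruction of the study being recalled: skill and day-to-day noise both have standard deviation 1, and the team takes the top 10% of a single tryout.

```python
import random
import statistics

def tryout_then_competition(n_athletes=5000, top_fraction=0.1, seed=3):
    """Return (mean tryout score, mean competition score) of the selected team.

    Hypothetical model: every score is stable skill plus fresh noise, and
    nobody "chokes"; the big stage is no different from the tryout.
    """
    rng = random.Random(seed)
    athletes = []
    for _ in range(n_athletes):
        skill = rng.gauss(0.0, 1.0)
        tryout = skill + rng.gauss(0.0, 1.0)
        competition = skill + rng.gauss(0.0, 1.0)
        athletes.append((tryout, competition))
    athletes.sort(reverse=True)                        # best tryout first
    selected = athletes[: int(n_athletes * top_fraction)]
    tryout_mean = statistics.mean([t for t, _ in selected])
    competition_mean = statistics.mean([c for _, c in selected])
    return tryout_mean, competition_mean

tryout_mean, competition_mean = tryout_then_competition()
print(f"selected team at tryout:      {tryout_mean:.2f}")
print(f"selected team at competition: {competition_mean:.2f}")
```

The selected athletes still beat the population average of zero at the competition, but as a group they fall short of their own tryout scores, because being picked required some luck on the day as well as skill.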

40

Tim Wilkinson 09.29.10 at 1:51 pm

BTW didn’t see Billikin @33: when dogs bark that is rarely all they are doing. They are often displaying other aggressive behavior

Mine is pretty clearly saying ‘come back (partly because I don’t like being left here tied up, but mostly because I am worried that you may have gone for good)’. In small shops where I can reasonably easily keep an eye on him, I leave him off the lead.

In the case of shops etc. that he is used to me going into and coming back out, if he can see me going into the entrance rather than disappearing round a corner, he doesn’t bark, as he doesn’t if he can look into the shop to check I’m still in there.

(Just as an off-topic bonus, possibly my favourite piece of philosophical writing, from by far the greatest philosopher I’ve ever read, is Hume, Treatise, On the Reason of Animals.

no truth appears to me more evident, than that beasts are endow’d with thought and reason as well as men.

The final part almost poetic:

To consider the matter aright, reason is nothing but a wonderful and unintelligible instinct in our souls, which carries us along a certain train of ideas, and endows them with particular qualities, according to their particular situations and relations. This instinct, it is true, arises from past observation and experience; but can any one give the ultimate reason, why past experience and observation produces such an effect, any more than why nature alone shou’d produce it? Nature may certainly produce whatever can arise from habit: Nay, habit is nothing but one of the principles of nature, and derives all its force from that origin.)

41

ER 09.29.10 at 2:22 pm

As Alfie Kohn points out (in http://www.alfiekohn.org/books/pbr.htm for adults, http://www.alfiekohn.org/parenting/gj.htm for children), praise and punishments are two sides of the same coin. Neither one is “better” or “more effective.”

42

Sebastian 09.29.10 at 3:15 pm

“The most obvious contemporary example that comes to mind is the debate about school improvement through teacher firing.”

This isn’t a very good contemporary example because teachers get fired so very rarely that school improvement tied to teacher firing (whether by correlation or causation) is not observed.

“Brockton never fired large numbers of teachers, in contrast with current federal policy, which encourages failing schools to consider replacing at least half of all teachers to reinvigorate instruction.” So much for praise versus punishment.

This reporting is crap. Except for incredibly useless interpretations of ‘consider’, there isn’t a school in the country that can be said to have considered replacing *at least half of all teachers* to reinvigorate instruction.

43

Mark 09.29.10 at 4:10 pm

The really important point is that this is the explanation of the Sports Illustrated jinx — the fact that after appearing on the cover of Sports Illustrated athletes tend to have poor performances. Since they are placed on the cover of SI because they have had unusually great performances, having an off-day afterward is just regression to the mean.

44

piglet 09.29.10 at 5:02 pm

“I had the blinding realization that of course pointing out things students do right is more helpful than pointing out what they do wrong”

That’s an interesting point but it leaves me wondering, would you underline the words spelled correctly instead of the mistakes? More to the point, I see how positive feedback can easily be neglected when you are trained to find and correct mistakes in a large number of student papers. But I also hold the principle of learning from mistakes in high regard and when I am at the receiving end what I prefer is that the feedback be as specific as possible, be it positive or negative. What is useless is a low grade without specific directions that help the student to improve. Come to think of it, a high grade without specific directions that help the student to improve is also pretty useless. As to the question of punishment, if a student is not intrinsically motivated to improve then I suspect that low grades won’t do much to motivate the student either.

Ideally learning shouldn’t be linked to punishment at all. That doesn’t mean however that mistakes shouldn’t be pointed out. That is an indispensable part of any learning experience and avoiding it on account of not making the student feel bad wouldn’t do anybody a favor. What is needed is an atmosphere in which mistakes can be openly discussed without fear of stigma and punishment. This is true in academic settings as well as in the professional world, where I have often found that coworkers become defensive when a mistake is pointed out to them, instead of being able to acknowledge that we all make mistakes and what counts is the willingness to learn from them.

45

Billikin 09.29.10 at 5:46 pm

Tim Wilkinson: “But training him not to bark, particularly when left outside a shop etc., has eluded me, even though I am willing to administer punishment to the extent of expressing displeasure by saying ‘no’ in a firm and even somewhat harsh tone of voice (thus satisfying my atavistic desire to tell off small animals). I’m sure it would be possible to do it, but it hasn’t been enough of a priority for me to take any more time and effort than I already have to do so.”

Not to play dog trainer. ;) But remember the point about competing behavior? Here is a thought. When you leave your dog alone in public and you do not want him to bark, how about putting one of those chewable toys in his mouth? Let him chew on that while you are gone. :) (If he barked while chewing, that might make a funny video. ;)) (BTW, putting the toy in his mouth might serve as reassurance that you are coming back. Then if you are kidnapped, he might bark. ;))

46

Tim Wilkinson 09.29.10 at 6:12 pm

Tried leaving him with his tennis ball, which he likes chewing (once punctured) and stripping the ‘fur’ from. He ignores it until I come back. Maybe if I could arrange for a cat or a rat…Sam-I-Am belongs on a different thread though.

We’re verging on a hijack here, anyway. Thanks though; I’ll give the competing behaviour thing some thought as priorities permit. The barking is not a big problem really, just a useful illustration of reinforcing a negative.

47

JP Stormcrow 09.29.10 at 11:57 pm

Mark@43: The really important point is that this is the explanation of the Sports Illustrated jinx

There is a grand unifying theory of disappointment lurking in this–sports, dating, sex, restaurants, life in general–anything where people are preferentially drawn to prior positive results either experienced or reported. The smaller the sample, the greater the likelihood of disappointment. Reviewer in the paper says “best Thai food in the metroplex”? Don’t count on it.
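The sample-size claim can be checked with a toy model. Everything in it is an assumption made for illustration: 200 restaurants whose true quality and per-visit noise both have standard deviation 1, and a reviewer who crowns whichever place has the highest average rating after a given number of visits.

```python
import random

def disappointment(n_visits, n_places=200, n_trials=100, seed=4):
    """Average gap between the top-rated place's rating and its true quality."""
    rng = random.Random(seed)
    total_gap = 0.0
    for _ in range(n_trials):
        best_observed, best_true = float("-inf"), 0.0
        for _ in range(n_places):
            true_quality = rng.gauss(0.0, 1.0)
            # Each visit is the true quality plus that evening's luck.
            visits = [true_quality + rng.gauss(0.0, 1.0)
                      for _ in range(n_visits)]
            observed = sum(visits) / n_visits
            if observed > best_observed:
                best_observed, best_true = observed, true_quality
        total_gap += best_observed - best_true
    return total_gap / n_trials

for n in (1, 5, 25):
    print(f"{n:>2} visits per place: average disappointment {disappointment(n):.2f}")
```

In this model the "best" place always looks better than it really is, and the gap shrinks as the number of visits behind each rating grows.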

48

a reader 10.01.10 at 4:10 pm

49

RationalCenter 10.02.10 at 3:22 pm

Read Daniel Pink’s book “Drive.” (As an introduction, find his TED Talk on YouTube.) He lays out a very comprehensive review of the research on personal performance, all of which consistently demonstrates that for most tasks (anything that requires any sort of thinking or problem solving), rewards or punishments are counterproductive. He doesn’t argue that they are always ineffective, but that they are far, far less effective than we believe they are – much like the flight instructor in the anecdote.

50

Siobhan 10.04.10 at 6:42 am

I know a lot of statistics gurus will be upset over me saying this but I think that it really depends on the individual. For example, as university rowers we are dished out instructions constantly. When I am praised there is an immediate change in my behaviour; suddenly I could row for hours on end. Whereas the rest of my crew seem to be driven by anger and tear the river apart, presumably because the coach told them to watch their posture – might as well tell them their lives are meaningless and that they are a burden on society.
