Estimation

by Chris Bertram on August 27, 2004

I managed a mere 39 per cent on “Chris Lightfoot’s estimation quiz”:http://roughly.beasts.org/ I’m sorry to say. Instructive and entertaining it is though. (Hat-tip “Dave Weeden”:http://backword.me.uk/ ).

{ 43 comments }

1

chris 08.27.04 at 11:34 am

39 (me too) may be better than it looks. Does anybody really know how many carrier bags are used in Aus, and if so, why?

Got the size of the HoC wrong though. Deep shame.

2

LowLife 08.27.04 at 11:41 am

I don’t exactly know what carrier bags are (grocery bags?). And I think I should get points if I guess English History dates within a hundred years (Would it have made a difference if I had widened my MOE?) And who is Tony Benn and should I feel stupid for not knowing? A disappointing 34%

3

LowLife 08.27.04 at 11:51 am

My HoC guess of 600 +/- somenumber got me some points though I have no reason to know that. My best guess, though, was the 9 points I got guessing the distance from Edinburgh to Cardiff since I had never heard of Cardiff and certainly didn’t know where it was. In this my ignorance provided a better result than Lightfoot’s residence in those two fair cities. The sad thing is that I got about 20% of my points from dumb guesses.

4

Simstim 08.27.04 at 11:59 am

Surely everyone knows there’s 50 states in the US?!? Or that the union of England and Scotland was in 1707? 51% here.

5

John Quiggin 08.27.04 at 12:09 pm

I got 49 %. I won’t complain about the UK bias, since I bombed out completely on the only Australian question (about grocery bags).

But I think the scoring pattern is itself an example of poor estimation. My central estimate was right to within a factor of two on all but a couple of questions, and my ranges were pretty good measures of my uncertainty. I’d say this counts as pretty good estimation, and I assume the same holds true for others, but everyone gets what would normally be regarded as either a fail or a bare pass.

6

Doug 08.27.04 at 12:16 pm

Is 20 percent of the UK adult population really functionally illiterate?

I had guessed Pride and Prejudice at roughly twice its actual length. Reminds me of the review of the movie The Hours. “Hours? It seemed like days.”

Otherwise 44 – not bad for an American taking such an Anglocentric quiz…

7

Chris Lightfoot 08.27.04 at 1:11 pm

I should say that I’m not very happy with the scoring of the quiz. I spent ages searching for a mathematically attractive scoring algorithm which produces reasonable results. Unfortunately I didn’t find one. (Suggestions appreciated.) So the way the scores are computed is a bit ad-hoc.

The quiz is rather anglocentric. Sorry. You do get more points for realistic margins of error; that’s the point! For the historical dates, the quiz is more lenient for dates further from the present day.

I don’t, by the way, live in either Edinburgh or Cardiff. I picked those two cities because they’re fairly well-known and neither of them is London. Because the UK is so centralised, I’d expect people to know the distance from London to either Edinburgh or Cardiff much better than they know the distance between the two latter cities.

8

Ken C. 08.27.04 at 1:13 pm

I was helped by hoping that Tony Benn was a misspelling of Tony Bennett, since they were born a year apart.

I don’t object to questions that reveal my ignorance of geography, literature, and history, but a question depending on the Australian meaning of “carrier bag” is just silly. Would a poor swagman use one as a tucker-bag?

The scoring presumably is larger when the exact answer is in the answer interval, and smaller when the relative width of the interval is large. It would be nice to know.

I assume the “Wisdom of Crowds” would reveal something here.

9

Jason 08.27.04 at 1:57 pm

I got 35%, but all bar 2 of the answers were in my windows (Eiffel tower and UK GDP, I stupidly missed a factor of 10 in each). I was a little miffed to get 0 points for my 500000 +/- 500000 words in Pride and Prejudice and a few others.

A number of answers there look to me to be things where the log would be a much better thing to estimate.

Plenty of people in the US don’t know there are 50 states, let alone foreigners. I know I had 1200+/-700 for the union of Scotland and England (now, if I’d been asked for the number of sheep in NZ, or when the treaty of Waitangi was signed, I’d be fine).

Still fun though.

10

Shai 08.27.04 at 2:25 pm

i got 10 points on the number of stars in the galaxy, but was off an order of magnitude on the length of the river nile and number of carrier bags in australia. so sad

then i guessed 1100 +- 200 for the union of scotland and england, and pretty much bombed every other UK question other than latitude

but I did get the magna carta, un, all the science questions right (7+), so I can live in pleasant ignorance, not knowing what I’d receive if the questions were not so UK centric

11

Shai 08.27.04 at 2:29 pm

“but a question depending on the Australian meaning of “carrier bag” is just silly”

maybe that’s why. I assumed it was identical to “carry on luggage”

12

Steve Carr 08.27.04 at 2:33 pm

I can’t believe the UK has 650 MPs but only 12,000 gas stations. That’s a remarkable contrast to the US.

13

DC 08.27.04 at 2:53 pm

Pah,33%.

14

andrew cooke 08.27.04 at 3:21 pm

the trouble with the scoring is that if you have no idea what a number is, you are often estimating the magnitude. in such cases, it makes more sense to describe the error in the log of the value (or the ratio). so you might let the user enter either a linear error (as you have now) or a factor (within a factor of 2, say).

also, your scoring doesn’t seem to account for the error estimates entered. or maybe i don’t understand what you’re trying to score. if i were you, i would not try to score each answer, but instead calculate the normalised error of each estimate. then score how well that is distributed.

for example, if i estimate x+/-dx and the real value is X, then my normalised error is (x-X)/dx (use logs if ratios are specified). now, if i’m good at estimating, the (x-X)/dx for all the answers, taken as a population, should have zero mean and unit error. you can give a score based on some test for that.
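The normalised-error idea above can be sketched in a few lines. This is only an illustration, not the quiz's scoring: the function names and the estimates are made up, and only the true values (659 MPs, 1707, 1925, 396.9 tonnes) come from this thread.

```python
import math
from statistics import mean, pstdev


def normalised_errors(answers):
    """Return (x - X) / dx for each (estimate x, stated error dx, true value X)."""
    return [(x - X) / dx for x, dx, X in answers]


def calibration_summary(answers):
    """Mean and population standard deviation of the normalised errors.

    For a well-calibrated estimator these should be close to 0 and 1.
    """
    z = normalised_errors(answers)
    return mean(z), pstdev(z)


# Hypothetical estimates and error bars; the true values are quoted in the thread.
answers = [
    (600, 100, 659),      # number of MPs
    (1700, 50, 1707),     # union of England and Scotland
    (1925, 10, 1925),     # Tony Benn's birth year
    (400, 200, 396.9),    # weight of the Boeing in tonnes
]

z_mean, z_spread = calibration_summary(answers)
print(f"mean = {z_mean:.2f}, spread = {z_spread:.2f}")
```

A spread well below 1, as these made-up answers would give, suggests error bars that are wider than they need to be; a spread well above 1 suggests overconfidence.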

15

dsquared 08.27.04 at 3:22 pm

I maintain that although there are 659 MPs in the House of Commons, only about two hundred of them were elected in 2001 – the rest were reelected.

16

andrew cooke 08.27.04 at 3:23 pm

wow, doesn’t posting take a long time? anyways, forgot to add, you probably want the score to be based on the log of the probability that the distribution is as described, otherwise we’re all going to be scoring near-zero.

17

Ralph Hitchens 08.27.04 at 3:46 pm

40%, which I suppose isn’t too bad for a Yank.

18

Keith M Ellis 08.27.04 at 4:03 pm

Gosh, I thought my 23% was good, given that I’m a USAian and all but the most general UK-centric questions gave me much trouble.

But compared to the scores above, I’m pathetic. Darnit. A couple I got wrong but really shouldn’t have. I know, dammit, Earth to Moon is approx a quarter-million miles, but didn’t remember to think about it that way and so estimated half that. And I typed in 5,000,000 for words in Austen when I meant 500,000; still wrong, but not, you know, insanely wrong. :)

19

Anita Hendersen 08.27.04 at 4:04 pm

45%. I didn’t understand the “carrier bag” either and thought it referred to mailed packages.

And 20% of the UK population is functionally illiterate? I guessed 5%. I thought you Europeans were always lording it over us Americans with how much better you are at providing basic education.

20

Andy 08.27.04 at 4:06 pm

>>Surely everyone knows there’s 50 states in the US?!?

Actually it could be construed as a trick question. Only 46 of the states are actually “States” – Massachusetts, Pennsylvania, Virginia, and Kentucky are officially “Commonwealths” – the safest answer would have been 50 +/- 4.

21

asg 08.27.04 at 4:12 pm

33% here. Totally whiffed on the Nile (guessed 800 miles, kicked myself afterwards since the Mississippi is that long and the Nile leaves it in the dust). Also whiffed on the Eiffel Tower (good lord, I had no idea it was THAT big!)

22

Sebastian Holsclaw 08.27.04 at 4:50 pm

31%, I completely miffed many of the UK-centric questions and I wasn’t sure what a carrier bag is.

There really should be an ‘I have no idea what you are talking about’ category for a couple of questions. I can’t estimate things that aren’t even in my realm of knowledge at all. (Though it did give me some points for my random-wild-ass-guess on Tony Benn.)

23

Chris Lightfoot 08.27.04 at 4:55 pm

The scoring accounts for the uncertainty stated in two ways: firstly, you score more if your estimate ± the uncertainty includes the true answer; secondly, if there is an uncertainty in the answer, your score improves as you get closer to that error. (The latter criterion is there to punish contestants who claim to know the answers better than they are actually known in reality….)

I probably agree that it would be better to base the error on the log of the value (I avoided this for simplicity), and the idea of testing whether the population of errors has zero mean and unit variance is a good one. I’m not certain how you incorporate information from the real error in the answer in that model, though.

24

dave heasman 08.27.04 at 5:18 pm

I got 52% and was low because of values I thought I knew, and consequently put a small +/- figure on – like I’ve always “known” that light takes 7.5 minutes to reach Earth from the Sun, London’s 50.5 degrees north. Zero points each. Bah.

25

billyfrombelfast 08.27.04 at 5:48 pm

I had never heard of Cardiff

Wow! Poor Wales!

26

Neel Krishnaswami 08.27.04 at 5:51 pm

Chris: this is how I estimated the number of carrier bags used in Australia.

Population of Australia: 20 million.
Each person buys something needing a bag 6 days a week, or 300 times a year.
300 * 20 million = 6000 million.
Add a suitable fudge factor for humility, and there you go.
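The arithmetic above, written out as a rough sketch. The fudge factor of 2 is a hypothetical choice for illustration; the comment does not say what factor was used.

```python
# Fermi estimate of carrier bags used per year in Australia,
# using the figures from the comment above.
population = 20_000_000          # people
bag_days_per_year = 300          # 6 days a week is about 312; rounded to 300 as in the comment

bags_per_year = population * bag_days_per_year

# Hypothetical "fudge factor for humility": treat the estimate as good
# to within a factor of 2 in either direction.
fudge = 2
low, high = bags_per_year // fudge, bags_per_year * fudge

print(f"central estimate: {bags_per_year:,}")   # 6,000,000,000
print(f"plausible range:  {low:,} to {high:,}")
```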

I got the gas station question wrong, though, because I guessed that cars can go about 10 days between refuelings, and was off by a factor of 2 as a result. That’s what I get for not owning a car. :)

27

Kieran Healy 08.27.04 at 6:01 pm

48 percent for me. Oh well.

28

andrew cooke 08.27.04 at 6:10 pm

I’m not certain how you incorporate information from the real error in the answer in that model, though.

hi. if both values have error estimates then you normalize by the geometric mean. so if i estimate x+/-dx for a value “known” to be X+/-dX then the normalised difference is (x-X)/sqrt(dx*dx+dX*dX) (it’s all classical stats assuming normal distributions etc etc; i’m not sure how you’d go about combining a log-based answer with a linear target, though – probably move the linear target to logs and work there).
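A minimal sketch of the formula as written in this comment, (x-X)/sqrt(dx*dx+dX*dX). The function name is made up, and the sketch assumes independent, roughly normal errors, as the comment does.

```python
import math


def normalised_difference(x, dx, X, dX=0.0):
    """Difference between estimate x +/- dx and reference X +/- dX,
    scaled by the two errors combined as sqrt(dx^2 + dX^2)."""
    combined = math.hypot(dx, dX)   # sqrt(dx*dx + dX*dX)
    if combined == 0:
        raise ValueError("at least one of dx, dX must be non-zero")
    return (x - X) / combined


# With an exact reference value (dX = 0) this reduces to (x - X) / dx.
print(normalised_difference(600, 100, 659))      # hypothetical guess at the 659 MPs
# With an uncertain reference, the combined error is larger than either one alone.
print(normalised_difference(10, 3, 5, 4))        # made-up numbers: 5 / sqrt(9 + 16)
```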

i agree that adding factors complicates the user interface. what you might do, although it’s hardly intuitive to the user, is switch to factors if the error is more than half the value. for example, say one of my answers was 500+/-1000. you might take that as 500 within a factor of 3 (using the upper bound, since the lower bound is -ve…).
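One way the switch-to-factors rule might look, as a sketch only. The function name is hypothetical and the half-the-value threshold is the one suggested in the comment, not anything the quiz actually does.

```python
def as_factor(value, error):
    """Reinterpret value +/- error as "value within a factor of f".

    Returns f when the error exceeds half the value, using the upper
    bound (value + error) because the lower bound may be negative.
    Returns None when the linear error should be kept as entered.
    """
    if value <= 0:
        raise ValueError("a factor only makes sense for a positive value")
    if error > value / 2:
        return (value + error) / value
    return None


factor = as_factor(500, 1000)
print(factor)                          # 3.0, as in the 500 +/- 1000 example
print(500 / factor, 500 * factor)      # implied range on a ratio scale
print(as_factor(600, 100))             # None: error is small, keep it linear
```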

(but we all know the only thing that really matters is getting noticed, provoking debate etc; these are just nerdy details ;o)

29

Paula 08.27.04 at 6:25 pm

35%. Appalling. I do, however, congratulate myself on knowing what a carrier bag is.

30

Chris Lightfoot 08.27.04 at 6:35 pm

Thanks — yes, that makes sense. Perhaps I should do version 2 of the quiz….

(You mean “root mean square” not geometric mean, btw; the geometric mean is sqrt(dx*dX), and so is zero in the case of an exact answer, which is not desirable.)

Interesting user-interface suggestion. I think the hypothetical version 2 could just ask for a percentage error.

31

andrew cooke 08.27.04 at 7:23 pm

yeah, sorry.

32

Dan Simon 08.27.04 at 8:02 pm

What makes the quiz unsatisfying, I think, is that it attempts to combine two completely different skills:

1) Accumulating general quantitative knowledge, and applying it to answering quantitative questions about the physical world; and

2) Estimating the reliability of one’s own knowledge and/or reasoning skills.

These are two completely different abilities, and it’s unclear how they’re weighted in the quiz score. For example, it appears that if you grossly overestimate an answer, but use a 100% error margin to indicate a recognized complete lack of knowledge, then you get zero–just as if you were supremely confident in your wrong answer. Conversely, if you’re accidentally dead on, but overestimate your error, you’re hardly penalized, if at all. (That happened to me a couple of times.)

On the other hand, the test did make me think carefully about how to construct a good estimate. I started off by picking my best estimate and then an error bound, but as I progressed through the questions, I shifted to a strategy of thinking in terms of the extreme ranges of what I thought plausible, and using that as my guide. I think it improved my score, although I was occasionally insufficiently confident in my estimates.

Oh, yes–for what it’s worth, I got 57%.

33

Xavier 08.27.04 at 9:04 pm

This would be a lot more interesting if it kept track of the answers people give. I would be curious to see what the range of guesses are on some of these questions.

34

jam 08.27.04 at 9:28 pm

46%. Which I thought was pretty lousy ’til I read these comments.

The only one I’d really quibble on was the birth of Christ question. The answer Chris gave privileges the Matthian story over the Lucan. Since it’s likely both are pure invention, I don’t see the basis for his preference.

35

Cryptic Ned 08.27.04 at 9:34 pm

42%, without knowing who Tony Benn is, and missing the astronomical questions by at least one order of magnitude, and having my entire knowledge of British history drawn from “Quicksilver” and “God’s Secretaries”. Accuracy on the geographical and populational questions, and a lucky guess on the number of UK counties, helped me out.

Grrr…the answer is 396.9 tonnes…I say “400, plus/minus 400”, and get 0 points.

36

Cryptic Ned 08.27.04 at 9:36 pm

“Conversely, if you’re accidentally dead on, but overestimate your error, you’re hardly penalized, if at all. (That happened to me a couple of times.)”

I don’t know how that could have happened to you, when I guessed “400 +/- 400” for the weight of the Boeing, which was off by less than 1%, and got 0 points.

37

Dan Simon 08.27.04 at 11:02 pm

I think the problem may be that reasonable error ranges are highly dependent on the question. For example, I seem to recall doing very well by answering “50 +/- 5” degrees for the latitude of London–even though ten degrees is a huge range for anyone who knows that London is neither in the arctic nor the tropics. In other questions, of course, accuracy within 10% would be pretty impressive.

38

vivian 08.28.04 at 2:38 am

From the brief blurb “how well you know what you don’t know” I thought that some of the scoring would be on how well you choose margins of error-in-guessing. That is, no matter how large the error, or random the guess, if the respondent is at least aware of how wide to make the confidence interval, that would count relatively more than being right with the guess. Not sure exactly how that would be done, but it ain’t done here. For the Tony Benn question I guessed 1925+/-10 and the answer was 1925 exactly, and you took off a point, as though you think I thought his mother’s labor lasted about ten years when it was mere hours. In other words, your margins appear to be inherent in our current ability to measure the quantity, not in the test-taker’s awareness of the vagueness of their recollections.

Orders of magnitude also should count for more, esp. with the numbers that aren’t dates. Getting within a factor of ten is good enough for a Friday night not cheating, while getting within a factor of two is pretty good for a casually informed guess. Getting within one’s own margin of error, close to the center of it, is good regardless of one’s knowledge of the facts, but narrower ranges are good too. Perhaps these are best brought out with different dimensions of scoring – or maybe you don’t really care about how well we estimate our own uncertainties, but then you might consider changing the cover blurb.

I’m pleased with 36% though, as a merkin.

39

plover 08.28.04 at 9:12 am

50% for me. 3 of those percent are directly attributable to Glenda Jackson, who happened to mention the number of MPs in a radio interview a few days ago.

Being a Yank and thus not having a clue who Tony Benn might be, but deciding that the name couldn’t go too far back I thought 1800±200 quite reasonable, but I got 0 points. However, my wild guess of 2000±1000 for the UK GDP got 3 points even though the actual answer isn’t even in the range. Do you suppose the scoring system assumes you know that Tony Benn is a recent figure? (And maybe also that he’s an adult — I set my range to include the idea that he might be a child.)

The quiz now says that carrier bag means shopping bag, which I assume implies it’s been altered since most people here took it.

40

Tracy 08.28.04 at 12:37 pm

I was puzzled by working out the error bands for the UK asylum seekers benefit. I vaguely recalled reading that it was about £40 a week, so that was my best guess. But, when it came to error bands, obviously asylum benefits can vary upwards without limit and without breaking the laws of arithmetic, but if the UK started charging asylum seekers for staying in the country it’d be called a tax or a fee, not a benefit. Which makes for a lop-sided error band, but you can’t specify that in the question. So I gave a wide band and got a 0, despite being very close to the right answer.

Anyway, 37%. Not helped by me thinking of things in metric and then forgetting to convert. (700 k is roughly the distance between major NZ cities, it would have been a better guess if I’d turned that into miles)

Incidentally, the Magna Carta was signed several times by various kings of England. To be pedantic, the question should be specified as when it was first signed.

Tracy

41

M Kochin 08.28.04 at 7:45 pm

58% Would have been better if I hadn’t read Tony Blair for Tony Benn :>(

http://roughly.beasts.org/scripts/quiz?_eq_web_session=a37a1095bdd1b918

42

Joshua W. Burton 08.29.04 at 6:51 am

_“Conversely, if you’re accidentally dead on, but overestimate your error, you’re hardly penalized, if at all. (That happened to me a couple of times.)”_

_I don’t know how that could have happened to you, when I guessed “400 +/- 400” for the weight of the Boeing, which was off by less than 1%, and got 0 points._

I answered 4 +/- 130 BCE for Jesus’s birth, fully expecting to be penalized for being a smartass, but got full marks. On the other hand, 800 +/- 700 billion stars (another “too correct for the quiz” answer I was proud of) scored nil. In general, it’s awkward to mix intrinsic errors with measurement uncertainties, since they are statistically independent and therefore add in quadrature. Perhaps questions like these that don’t have real answers (within an order of magnitude) should be stricken from the next iteration.

43

David Margolies 08.30.04 at 8:02 pm

45%. I would have done much better (perhaps 9 more points) if I said the English Civil War started in 1641+-1 rather than 1641 exactly (answer 1642).

Comments on this entry are closed.