August 27, 2004

Estimation

Posted by Chris

I managed a mere 39 per cent on Chris Lightfoot’s estimation quiz, I’m sorry to say. Instructive and entertaining it is, though. (Hat-tip Dave Weeden.)

Posted on August 27, 2004 11:09 AM UTC
Comments

39 (me too) may be better than it looks. Does anybody really know how many carrier bags are used in Aus, and if so, why?

Got the size of the HoC wrong though. Deep shame.

Posted by chris · August 27, 2004 11:34 AM

I don’t exactly know what carrier bags are (grocery bags?). And I think I should get points if I guess English History dates within a hundred years (Would it have made a difference if I had widened my MOE?) And who is Tony Benn and should I feel stupid for not knowing? A disappointing 34%

Posted by LowLife · August 27, 2004 11:41 AM

My HoC guess of 600 +/- somenumber got me some points though I have no reason to know that. My best guess, though, was the 9 points I got guessing the distance from Edinburgh to Cardiff since I had never heard of Cardiff and certainly didn’t know where it was. In this my ignorance provided a better result than Lightfoot’s residence in those two fair cities. The sad thing is that I got about 20% of my points from dumb guesses.

Posted by LowLife · August 27, 2004 11:51 AM

Surely everyone knows there’s 50 states in the US?!? Or that the union of England and Scotland was in 1707? 51% here.

Posted by Simstim · August 27, 2004 11:59 AM

I got 49 %. I won’t complain about the UK bias, since I bombed out completely on the only Australian question (about grocery bags).

But I think the scoring pattern is itself an example of poor estimation. My central estimate was right to within a factor of two on all but a couple of questions, and my ranges were pretty good measures of my uncertainty. I’d say this counts as pretty good estimation, and I assume the same holds true for others, but everyone gets what would normally be regarded as either a fail or a bare pass.

Posted by John Quiggin · August 27, 2004 12:09 PM

Is 20 percent of the UK adult population really functionally illiterate?

I had guessed Pride and Prejudice at roughly twice its actual length. Reminds me of the review of the movie The Hours. “Hours? It seemed like days.”

Otherwise 44 - not bad for an American taking such an Anglocentric quiz…

Posted by Doug · August 27, 2004 12:16 PM

I should say that I’m not very happy with the scoring of the quiz. I spent ages searching for a mathematically attractive scoring algorithm which produces reasonable results. Unfortunately I didn’t find one. (Suggestions appreciated.) So the way the scores are computed is a bit ad-hoc.

The quiz is rather anglocentric. Sorry. You do get more points for realistic margins of error; that’s the point! For the historical dates, the quiz is more lenient for dates further from the present day.

I don’t, by the way, live in either Edinburgh or Cardiff. I picked those two cities because they’re fairly well-known and neither of them is London. Because the UK is so centralised, I’d expect people to know the distance from London to either Edinburgh or Cardiff much better than they know the distance between the two latter cities.

Posted by Chris Lightfoot · August 27, 2004 01:11 PM

I was helped by hoping that Tony Benn was a misspelling of Tony Bennett, since they were born a year apart.

I don’t object to questions that reveal my ignorance of geography, literature, and history, but a question depending on the Australian meaning of “carrier bag” is just silly. Would a poor swagman use one as a tucker-bag?

The scoring presumably is larger when the exact answer is in the answer interval, and smaller when the relative width of the interval is large. It would be nice to know.

I assume the “Wisdom of Crowds” would reveal something here.

Posted by Ken C. · August 27, 2004 01:13 PM

I got 35%, but all bar 2 of the answers were in my windows (Eiffel tower and UK GDP, I stupidly missed a factor of 10 in each). I was a little miffed to get 0 points for my 500000 +/- 500000 words in Pride and Prejudice and a few others.

A number of answers there look to me to be things where the log would be a much better thing to estimate.

Plenty of people in the US don’t know there are 50 states, let alone foreigners. I know I had 1200+/-700 for the union of Scotland and England (now, if I’d been asked for the number of sheep in NZ, or when the treaty of Waitangi was signed, I’d be fine).

Still fun though.

Posted by Jason · August 27, 2004 01:57 PM

i got 10 points on the number of stars in the galaxy, but was off an order of magnitude on the length of the river nile and number of carrier bags in australia. so sad

then i guessed 1100 +- 200 for the union of scotland and england, and pretty much bombed every other UK question other than latitude

but I did get the magna carta, un, all the science questions right (7+), so I can live in pleasant ignorance, not knowing what I’d receive if the questions were not so UK centric

Posted by Shai · August 27, 2004 02:25 PM

“but a question depending on the Australian meaning of “carrier bag” is just silly”

maybe that’s why. I assumed it was identical to “carry on luggage”

Posted by Shai · August 27, 2004 02:29 PM

I can’t believe the UK has 650 MPs but only 12,000 gas stations. That’s a remarkable contrast to the US.

Posted by Steve Carr · August 27, 2004 02:33 PM

Pah, 33%.

Posted by DC · August 27, 2004 02:53 PM

the trouble with the scoring is that if you have no idea what a number is, you are often estimating the magnitude. in such cases, it makes more sense to describe the error in the log of the value (or the ratio). so you might let the user enter either a linear error (as you have now) or a factor (within a factor of 2, say).

also, your scoring doesn’t seem to account for the error estimates entered. or maybe i don’t understand what you’re trying to score. if i were you, i would not try to score each answer, but instead calculate the normalised error of each estimate. then score how well that is distributed.

for example, if i estimate x+/-dx and the real value is X, then my normalised error is (x-X)/dx (use logs if ratios are specified). now, if i’m good at estimating, the (x-X)/dx for all the answers, taken as a population, should have zero mean and unit error. you can give a score based on some test for that.
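A minimal sketch of the check described above, in Python. The function names are made up for illustration, and the four answers are pooled from different commenters in this thread purely to show the arithmetic; a real test would use one person's full set of answers, and this is not how the quiz itself scores.

```python
from statistics import mean, pstdev

def normalised_errors(answers):
    """answers: iterable of (estimate x, stated error dx, true value X)."""
    return [(x - X) / dx for x, dx, X in answers]

def calibration_summary(answers):
    """Mean and standard deviation of the normalised errors.

    For a well-calibrated estimator these should be near 0 and 1.
    """
    z = normalised_errors(answers)
    return mean(z), pstdev(z)

# Figures quoted elsewhere in this thread (pooled across commenters).
answers = [
    (400, 400, 396.9),   # weight of the Boeing in tonnes
    (1925, 10, 1925),    # Tony Benn's year of birth
    (1200, 700, 1707),   # union of England and Scotland
    (1100, 200, 1707),   # union of England and Scotland, another guess
]

z = normalised_errors(answers)
m, s = calibration_summary(answers)
print(z)
print(f"mean = {m:.3f}, spread = {s:.3f}")
```

A mean far from zero suggests a systematic bias in the guesses; a spread well above one suggests the stated errors were too narrow, and well below one that they were too wide.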

Posted by andrew cooke · August 27, 2004 03:21 PM

I maintain that although there are 659 MPs in the House of Commons, only about two hundred of them were elected in 2001 - the rest were reelected.

Posted by dsquared · August 27, 2004 03:22 PM

wow, doesn’t posting take a long time? anyways, forgot to add, you probably want the score to be based on the log of the probability that the distribution is as described, otherwise we’re all going to be scoring near-zero.

Posted by andrew cooke · August 27, 2004 03:23 PM

40%, which I suppose isn’t too bad for a Yank.

Posted by Ralph Hitchens · August 27, 2004 03:46 PM

Gosh, I thought my 23% was good, given that I’m a USAian and all but the most general UK-centric questions gave me much trouble.

But compared to the scores above, I’m pathetic. Darnit. A couple I got wrong but really shouldn’t have. I know, dammit, Earth to Moon is approx. a quarter-million miles, but didn’t remember to think about it that way and so estimated half that. And I typed in 5,000,000 for words in Austen when I meant 500,000; still wrong, but not, you know, insanely wrong. :)

Posted by Keith M Ellis · August 27, 2004 04:03 PM

45%. I didn’t understand the “carrier bag” either and thought it referred to mailed packages.

And 20% of the UK population is functionally illiterate? I guessed 5%. I thought you Europeans were always lording it over us Americans with how much better you are at providing basic education.

Posted by Anita Hendersen · August 27, 2004 04:04 PM

>>Surely everyone knows there’s 50 states in the US?!?

Actually it could be construed as a trick question. Only 46 of the states are actually “States” - Massachusetts, Pennsylvania, Virginia, and Kentucky are officially “Commonwealths” - the safest answer would have been 50 +/- 4.

Posted by Andy · August 27, 2004 04:06 PM

33% here. Totally whiffed on the Nile (guessed 800 miles, kicked myself afterwards since the Mississippi is that long and the Nile leaves it in the dust). Also whiffed on the Eiffel Tower (good lord, I had no idea it was THAT big!)

Posted by asg · August 27, 2004 04:12 PM

31%, I completely miffed many of the UK-centric questions and I wasn’t sure what a carrier bag is.

There really should be an ‘I have no idea what you are talking about’ category for a couple of questions. I can’t estimate things that aren’t even in my realm of knowledge at all. (Though it did give me some points for my random-wild-ass-guess on Tony Benn.)

Posted by Sebastian Holsclaw · August 27, 2004 04:50 PM

The scoring accounts for the uncertainty stated in two ways: firstly, you score more if your estimate ± the uncertainty includes the true answer; secondly, if there is an uncertainty in the answer, your score improves as you get closer to that error. (The latter criterion is there to punish contestants who claim to know the answers better than they are actually known in reality….)

I probably agree that it would be better to base the error on the log of the value (I avoided this for simplicity), and the idea of testing whether the population of errors has zero mean and unit variance is a good one. I’m not certain how you incorporate information from the real error in the answer in that model, though.

Posted by Chris Lightfoot · August 27, 2004 04:55 PM

I got 52% and was low because of values I thought I knew, and consequently put a small +/- figure on - like I’ve always “known” that light takes 7.5 minutes to reach Earth from the Sun, London’s 50.5 degrees north. Zero points each. Bah.

Posted by dave heasman · August 27, 2004 05:18 PM

I had never heard of Cardiff

Wow! Poor Wales!

Posted by billyfrombelfast · August 27, 2004 05:48 PM

Chris: this is how I estimated the number of carrier bags used in Australia.

Population of Australia: 20 million.
Each person buys something needing a bag 6 days a week, or 300 times a year.
300 * 20 million = 6000 million.
Add a suitable fudge factor for humility, and there you go.
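
The arithmetic above, written out as a short Python sketch. The variable names are illustrative, and the inputs are the rough assumptions from the comment (one bag per shopping trip, about 300 trips a year), not measured figures.

```python
population = 20_000_000          # people in Australia, as assumed above
bags_per_person_per_year = 300   # about 6 shopping days a week (6 * 52 = 312)

bags_per_year = population * bags_per_person_per_year
print(f"{bags_per_year:,} bags per year")  # 6,000,000,000 bags per year
```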

I got the gas station question wrong, though, because I guessed that cars can go about 10 days between refuelings, and was off by a factor of 2 as a result. That’s what I get for not owning a car. :)

Posted by Neel Krishnaswami · August 27, 2004 05:51 PM

48 percent for me. Oh well.

Posted by Kieran Healy · August 27, 2004 06:01 PM

I’m not certain how you incorporate information from the real error in the answer in that model, though.

hi. if both values have error estimates then you normalize by the geometric mean. so if i estimate x+/-dx for a value “known” to be X+/-dX then the normalised difference is (x-X)/sqrt(dx*dx+dX*dX) (it’s all classical stats assuming normal distributions etc etc; i’m not sure how you’d go about combining a log-based answer with a linear target, though - probably move the linear target to logs and work there).
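
A small Python sketch of the normalised difference written out above, using the expression sqrt(dx*dx+dX*dX), i.e. the two errors added in quadrature. The function name is hypothetical, and the sketch assumes independent, roughly normal errors as the comment does.

```python
import math

def normalised_difference(x, dx, X, dX=0.0):
    """(x - X) scaled by the two uncertainties added in quadrature.

    x, dx: the estimate and its stated error
    X, dX: the reference value and its own uncertainty (0 if exact)
    """
    scale = math.sqrt(dx * dx + dX * dX)
    if scale == 0:
        raise ValueError("at least one of dx, dX must be non-zero")
    return (x - X) / scale

# With an exact reference value this reduces to (x - X) / dx.
print(normalised_difference(400, 400, 396.9))
# With both errors present, e.g. dx = 3 and dX = 4, the scale is 5.
print(normalised_difference(5, 3, 1, 4))
```

Because the scale only vanishes when both errors are zero, an exactly known answer (dX = 0) still gives a usable result.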

i agree that adding factors complicates the user interface. what you might do, although it’s hardly intuitive to the user, is switch to factors if the error is more than half the value. for example, say one of my answers was 500+/-1000. you might take that as 500 within a factor of 3 (using the upper bound, since the lower bound is -ve…).
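
One way the switch-to-factors rule might look in Python. This is only a sketch: the function name and the choice to return None for "keep the linear error" are assumptions made for illustration.

```python
def as_factor(x, dx):
    """Reinterpret x +/- dx as "x within a factor of f" when dx > x / 2.

    Returns the factor f = (x + dx) / x, taken from the upper bound,
    or None when the linear error should be kept as entered.
    """
    if x <= 0:
        raise ValueError("a factor only makes sense for a positive estimate")
    if dx <= x / 2:
        return None
    return (x + dx) / x

print(as_factor(500, 1000))  # 3.0, the example from the comment
print(as_factor(50, 5))      # None: error is small enough to stay linear
```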

(but we all know the only thing that really matters is getting noticed, provoking debate etc; these are just nerdy details ;o)

Posted by andrew cooke · August 27, 2004 06:10 PM

35%. Appalling. I do, however, congratulate myself on knowing what a carrier bag is.

Posted by Paula · August 27, 2004 06:25 PM

Thanks — yes, that makes sense. Perhaps I should do version 2 of the quiz….

(You mean “root mean square”, not geometric mean, btw; the geometric mean is sqrt(dx*dX), and so is zero in the case of an exact answer, which is not desirable.)

Interesting user-interface suggestion. I think the hypothetical version 2 could just ask for a percentage error.

Posted by Chris Lightfoot · August 27, 2004 06:35 PM

yeah, sorry.

Posted by andrew cooke · August 27, 2004 07:23 PM

What makes the quiz unsatisfying, I think, is that it attempts to combine two completely different skills:

1) Accumulating general quantitative knowledge, and applying it to answering quantitative questions about the physical world; and

2) Estimating the reliability of one’s own knowledge and/or reasoning skills.

These are two completely different abilities, and it’s unclear how they’re weighted in the quiz score. For example, it appears that if you grossly overestimate an answer, but use a 100% error margin to indicate a recognized complete lack of knowledge, then you get zero—just as if you were supremely confident in your wrong answer. Conversely, if you’re accidentally dead on, but overestimate your error, you’re hardly penalized, if at all. (That happened to me a couple of times.)

On the other hand, the test did make me think carefully about how to construct a good estimate. I started off by picking my best estimate and then an error bound, but as I progressed through the questions, I shifted to a strategy of thinking in terms of the extreme ranges of what I thought plausible, and using that as my guide. I think it improved my score, although I was occasionally insufficiently confident in my estimates.

Oh, yes—for what it’s worth, I got 57%.

Posted by Dan Simon · August 27, 2004 08:02 PM

This would be a lot more interesting if it kept track of the answers people give. I would be curious to see what the range of guesses is on some of these questions.

Posted by Xavier · August 27, 2004 09:04 PM

46%. Which I thought was pretty lousy ‘til I read these comments.

The only one I’d really quibble on was the birth of Christ question. The answer Chris gave privileges the Matthean story over the Lucan. Since it’s likely both are pure invention, I don’t see the basis for his preference.

Posted by jam · August 27, 2004 09:28 PM

42%, without knowing who Tony Benn is, and missing the astronomical questions by at least one order of magnitude, and having my entire knowledge of British history drawn from “Quicksilver” and “God’s Secretaries”. Accuracy on the geographical and populational questions, and a lucky guess on the number of UK counties, helped me out.

Grrr…the answer is 396.9 tonnes…I say “400, plus/minus 400”, and get 0 points.

Posted by Cryptic Ned · August 27, 2004 09:34 PM

“Conversely, if you’re accidentally dead on, but overestimate your error, you’re hardly penalized, if at all. (That happened to me a couple of times.)”

I don’t know how that could have happened to you, when I guessed “400 +/- 400” for the weight of the Boeing, which was off by less than 1%, and got 0 points.

Posted by Cryptic Ned · August 27, 2004 09:36 PM

I think the problem may be that reasonable error ranges are highly dependent on the question. For example, I seem to recall doing very well by answering “50 +/- 5” degrees for the latitude of London—even though ten degrees is a huge range for anyone who knows that London is neither in the arctic nor the tropics. In other questions, of course, accuracy within 10% would be pretty impressive.

Posted by Dan Simon · August 27, 2004 11:02 PM

From the brief blurb “how well you know what you don’t know” I thought that some of the scoring would be on how well you choose margins of error-in-guessing. That is, no matter how large the error, or random the guess, if the respondent is at least aware of how wide to make the confidence interval, that would count relatively more than being right with the guess. Not sure exactly how that would be done, but it ain’t done here. For the Tony Benn question I guessed 1925+/-10 and the answer was 1925 exactly, and you took off a point, as though you think I thought his mother’s labor lasted about ten years when it was mere hours. In other words, your margins appear to be inherent in our current ability to measure the quantity, not in the test-taker’s awareness of the vagueness of their recollections.

Orders of magnitude also should count for more, esp. with the numbers that aren’t dates. Getting within a factor of ten is good enough for a Friday night not cheating, while getting within a factor of two is pretty good for a casually informed guess. Getting within one’s own margin of error, close to the center of it, is good regardless of one’s knowledge of the facts, but narrower ranges are good too. Perhaps these are best brought out with different dimensions of scoring - or maybe you don’t really care about how well we estimate our own uncertainties, but then you might consider changing the cover blurb.

I’m pleased with 36% though, as a merkin.

Posted by vivian · August 28, 2004 02:38 AM

50% for me. 3 of those percent are directly attributable to Glenda Jackson, who happened to mention the number of MPs in a radio interview a few days ago.

Being a Yank and thus not having a clue who Tony Benn might be, but deciding that the name couldn’t go too far back I thought 1800±200 quite reasonable, but I got 0 points. However, my wild guess of 2000±1000 for the UK GDP got 3 points even though the actual answer isn’t even in the range. Do you suppose the scoring system assumes you know that Tony Benn is a recent figure? (And maybe also that he’s an adult — I set my range to include the idea that he might be a child.)

The quiz now says that carrier bag means shopping bag, which I assume implies it’s been altered since most people here took it.

Posted by plover · August 28, 2004 09:12 AM

I was puzzled by working out the error bands for the UK asylum seekers benefit. I vaguely recalled reading that it was about £40 a week, so that was my best guess. But, when it came to error bands, obviously asylum benefits can vary upwards without limit and without breaking the laws of arithmetic, but if the UK started charging asylum seekers for staying in the country it’d be called a tax or a fee, not a benefit. Which makes for a lop-sided error band, but you can’t specify that in the question. So I gave a wide band and got a 0, despite being very close to the right answer.

Anyway, 37%. Not helped by me thinking of things in metric and then forgetting to convert. (700 k is roughly the distance between major NZ cities, it would have been a better guess if I’d turned that into miles)

Incidentally, the Magna Carta was signed several times by various kings of England. To be pedantic, the question should be specified as when it was first signed.

Tracy

Posted by Tracy · August 28, 2004 12:37 PM

58% Would have been better if I hadn’t read Tony Blair for Tony Benn :>(

http://roughly.beasts.org/scripts/quiz?_eq_web_session=a37a1095bdd1b918

Posted by M Kochin · August 28, 2004 07:45 PM

“Conversely, if you’re accidentally dead on, but overestimate your error, you’re hardly penalized, if at all. (That happened to me a couple of times.)”

I don’t know how that could have happened to you, when I guessed “400 +/- 400” for the weight of the Boeing, which was off by less than 1%, and got 0 points.

I answered 4 +/- 130 BCE for Jesus’s birth, fully expecting to be penalized for being a smartass, but got full marks. On the other hand, 800 +/- 700 billion stars (another “too correct for the quiz” answer I was proud of) scored nil. In general, it’s awkward to mix intrinsic errors with measurement uncertainties, since they are statistically independent and therefore add in quadrature. Perhaps questions like these that don’t have real answers (within an order of magnitude) should be stricken from the next iteration.

Posted by Joshua W. Burton · August 29, 2004 06:51 AM

45%. I would have done much better (perhaps 9 more points) if I said the English Civil War started in 1641 +/- 1 rather than 1641 exactly (answer 1642).

Posted by David Margolies · August 30, 2004 08:02 PM
Followups

→ A good guess? Who knows?.
Excerpt: Via Crooked Timber comes a fascinating Estimation Quiz, which measures both esoteric knowledge (warning: pretty UK-centric!) and our ability to gauge the precision of our knowledge - as well as our bizarre misconceptions. I'm not embarrassed by my own ... Read more at Wax Banks
→ RE: Estimation.
Excerpt: Read more at verns blog
→ Attention ETS.
Excerpt: Chris Lightfoot's Estimation Quiz is the most worthwhile online quiz I've ever taken. The task is to make educated guesses about obscure magnitudes and quantities. For example, the test might ask you to estimate the number of goldfish in Sweden, ... Read more at Majikthise
→ The weirdness of crowds.
Excerpt: So, many thanks to the thousands of people who have now completed my Estimation Quiz. Special thanks to Michael Williams, who posted a link to del.icio.us, Dave Weeden, Chris Bertram of Crooked Timber, Nick Barlow, Chris Brooke, and many others for lin... Read more at Chris Lightfoot's web log

This discussion has been closed. Thanks to everyone who contributed.