Pixels and Pies

by Kieran Healy on August 1, 2007

Via John Gruber I see Anil Dash wondering about the trend toward “square blocks of color … being used to represent percentage-based statistics instead of the traditional pie chart.” Like this.

Squares

I’d seen the one on the left, from a New York Times story about beliefs in the afterlife, and wondered about it, too. The white block in the middle of the Times graphic presumably represents “Don’t Knows” but it is not labeled. This is especially odd in the context of belief in the afterlife, as agnosticism is a recognized point of view and so not equivalent to “Don’t know” answers on other survey questions.

The main problem with this style of presentation is that it uses two dimensions to display unidimensional data. As the graphic on the right, especially, makes clear, the layout of the subcomponents of the graph is arbitrary. Maybe laying out responses on a line is impractical in a newspaper column. This is one reason pie charts are popular, but their problems are well known. (Word to the wise: don’t use them.)

Mosaic plots superficially resemble the ones pictured here, and they are sometimes used to very good effect. But the whole point of a mosaic plot is that it visually represents several categorical variables at once. It’s a picture of an n x m table, in other words, where the sizes of the blocks reflect the cell values in the table. Here’s an example. Even here you have to be careful interpreting the results. The boxes above take this kind of picture and use it with only one variable, which doesn’t make any sense at all.
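To make the contrast concrete, here is a minimal Python sketch of how a mosaic plot turns a two-way table into tiles. The function name `mosaic_tiles` and the counts are invented for illustration (they are not from the Times survey), and the layout rule is one common convention, not the only one.

```python
def mosaic_tiles(table):
    """Return {(row, col): (x, y, width, height)} tiles inside the unit square.

    `table` maps row labels to {column label: count}. Column widths follow the
    column totals; within a column, tile heights follow each row's share.
    Assumes every column has at least one observation.
    """
    rows = list(table)
    cols = list(next(iter(table.values())))
    col_totals = {c: sum(table[r][c] for r in rows) for c in cols}
    grand_total = sum(col_totals.values())

    tiles = {}
    x = 0.0
    for c in cols:
        width = col_totals[c] / grand_total
        y = 0.0
        for r in rows:
            height = table[r][c] / col_totals[c]
            tiles[(r, c)] = (x, y, width, height)
            y += height
        x += width
    return tiles


# Hypothetical counts: belief in an afterlife by age group.
example = {
    "believe": {"under 40": 30, "40 and over": 50},
    "do not believe": {"under 40": 10, "40 and over": 10},
}
tiles = mosaic_tiles(example)
for key, (x, y, w, h) in tiles.items():
    print(key, f"x={x:.2f} y={y:.2f} w={w:.2f} h={h:.2f} area={w * h:.2f}")
```

Each tile's area equals its cell's share of the total, and both its width and its height carry information. Feed the same function a table with a single column and every tile has width 1: the picture collapses to a stacked bar, and the second dimension has nothing left to show.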

{ 38 comments }

1

Joshua W. Burton 08.01.07 at 4:20 am

Er, golden greenish tunnel of light, leading to a brilliant white portal at the vanishing point, hello?

The NYT graphic is after Thurber, not Tufte, and should be so judged.

2

David Wright 08.01.07 at 4:32 am

Okay, I’ll bite. What’s wrong with using a pie chart to represent a distribution across a small number of categories?

Certainly for a scientific paper a graphical representation of such simple data is unnecessary, but for a newspaper article I don’t see the problem.

3

Dan Simon 08.01.07 at 4:57 am

How hard can it be to come up with faithful, compelling graphic representations of data? They manage to do it all the time, over at The Onion.

4

dsquared 08.01.07 at 6:48 am

I agree with David Wright. A pie chart is a perfectly honest chart and the taboo against them is just another example of the way in which self-appointed design gurus get their arbitrary pronouncements taken on far too uncritically. I think the square-blob thing is a consequence of someone somewhere having taken on the fatwa against pie charts.

5

pete 08.01.07 at 8:12 am

I found this example pretty convincing.

6

bad Jim 08.01.07 at 9:01 am

This may not be the most durable link to my conception of a cake chart, um, pie chart, but as Lowell George said, “Thanks! I’ll eat it here!”

Neon Park, cover for Little Feat, “Sailin’ Shoes”

7

Stuart 08.01.07 at 9:04 am

Pie charts can be useful to show certain things though.

8

dsquared 08.01.07 at 11:58 am

I utterly disagree with the analysis linked in 5 above. The pie chart conveys the correct impression, which is that all five segments are roughly the same size. The bar chart (particularly sorted by size) leads you to believe that there is something important about the black bar.

9

Kieran Healy 08.01.07 at 11:58 am

Sorry, dsquared, you’re wrong. Relative areas are just much harder to judge than relative length or relative position along a scale. And that’s why it’s not a good idea to use pie charts.

10

Kieran Healy 08.01.07 at 12:06 pm

“The pie chart conveys the correct impression, which is that all five segments are roughly the same size. The bar chart (particularly sorted by size) leads you to believe that there is something important about the black bar.”

On this point both the bar chart and the pie chart have a problem. Because the example is a completely abstract set of numbers, you don’t know whether all five segments or all five lines really are “about the same size.” There are many cases where the difference between 15 and 23 might be very large or obvious or a big chunk of the observed variation. In other cases it would be a trivial difference. The solution is not to use a pie chart but to use an appropriate baseline for the scale rather than automatically stretching it to zero.

For that reason I don’t use barcharts much either. I prefer dotplots, which use a dot to show the value and don’t bother coloring in from zero. Here’s an example with a lot of cases. Here’s a decent set of comparisons between barcharts and dotplots of the same data, with and without zero baselines.
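As a rough illustration of the baseline point, here is a small text-only dot plot in Python. The function `dotplot`, its parameters, and the five values are all made up for this sketch; it assumes `lo` and `hi`, when given, bracket the data.

```python
def dotplot(values, lo=None, hi=None, width=40):
    """Return text lines with one dot per value, placed on a shared scale.

    `lo` and `hi` set the ends of the scale. By default they are the data
    minimum and maximum, so the scale is not forced to start at zero.
    """
    lo = min(values.values()) if lo is None else lo
    hi = max(values.values()) if hi is None else hi
    span = (hi - lo) or 1  # avoid dividing by zero when all values are equal
    label_width = max(len(name) for name in values)
    lines = []
    # Sort from largest to smallest so the rank order is readable at a glance.
    for name, v in sorted(values.items(), key=lambda kv: -kv[1]):
        pos = round((v - lo) / span * (width - 1))
        row = " " * pos + "o" + " " * (width - 1 - pos)
        lines.append(f"{name:<{label_width}} |{row}| {v}")
    return lines


# Hypothetical values spanning the 15-to-23 range discussed above.
data = {"A": 23, "B": 20, "C": 19, "D": 17, "E": 15}
print("\n".join(dotplot(data)))        # scale runs from 15 to 23
print()
print("\n".join(dotplot(data, lo=0)))  # same data on a zero-based scale
```

On the first scale the dots spread across the full width; on the zero-based scale they bunch up toward the right. Neither is automatically correct, which is the point: the baseline is a choice about what differences matter for the data at hand.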

11

Barry 08.01.07 at 1:08 pm

Thanks, Kieran – that’s a nice set of comparison graphs.

12

Ginger Yellow 08.01.07 at 1:38 pm

I can sort of see the argument about relative area with pie charts, but surely that applies all the more to the newfangled ones highlighted in the post. Sure you can count up the squares, but at that point you might as well just read the raw numbers. For me the pie chart has (in most cases) a good balance between aesthetic appeal and ease of interpretation. When there are a lot of small data points and a few big ones then it’s less suitable.

And all the above should be read with the understanding that charts perform different functions in different settings – the same data will be (and often should be) presented differently in the NYT and in an academic paper.

13

Kieran Healy 08.01.07 at 1:48 pm

“I can sort of see the argument about relative area with pie charts, but surely that applies all the more to the newfangled ones highlighted in the post”

Yeah, my point is that the new ones are no good. They’re like pie charts with arbitrarily shaped segments.

14

thag 08.01.07 at 2:26 pm

“They’re like pie charts with arbitrarily shaped segments.”

This is worse than a soulless robot pie!

This is an abomination pie!

god I miss fafblog.

15

Dan Simon 08.01.07 at 2:33 pm

“Because the example is a completely abstract set of numbers, you don’t know whether all five segments or all five lines really are ‘about the same size.’”

But aren’t pie charts usually used to represent, not a completely abstract set of numbers, but rather a set of segments of a whole? In that case, it would seem (to me, at least), the relative percentages corresponding to the segments of the pie are fairly well conveyed by this pie chart (and pie charts in general). In contrast, both bar charts and dot plots are very poor at conveying the relationship between any particular segment and the total.

16

Ginger Yellow 08.01.07 at 2:36 pm

One thing to say in the new format’s favour is that you could make a snazzy animated chart with the pieces falling down Tetris-style.

17

Kieran Healy 08.01.07 at 2:42 pm

That’s a reasonable point, Dan. I’d still say there are other ways to represent proportions that let the reader judge more accurately at a glance what the relative quantities are, again because of the length/area thing.

18

Joshua W. Burton 08.01.07 at 3:33 pm

Argh.

Again, the NYT “Tetris” plot is a sight gag. If you keep discussing it deadpan, you’re only giving its sophomoric creator further license to laugh maniacally at her own supposed subtlety, and perpetrate another on us at the next opportunity. Please, stop feeding the troll.

19

Antti Nannimus 08.01.07 at 3:57 pm

Hi,

Anyone who thinks it’s hard to detect which slice of pie is the largest never lived in my family.

Have a nice day,
Antti

20

dsquared 08.01.07 at 4:24 pm

on the other hand, neither the bars nor the dots convey the information that they add up to 100%.

21

Kenny Easwaran 08.01.07 at 4:47 pm

The first place I prominently noticed those tetris plots was in electoral vote maps of the country. There it’s nice, because the shape and location helps you identify the state, and the squares make it clear just how important or not each one is in the election.

22

dsquared 08.01.07 at 4:48 pm

“Relative areas are just much harder to judge than relative length or relative position along a scale.”

I also don’t think this is true as an unqualified statement. Consider the difference between “exactly half”, “a little bit more than half” and “a little bit less than half”, for example.

23

Kieran Healy 08.01.07 at 5:23 pm

“The first place I prominently noticed those tetris plots was in electoral vote maps of the country. There it’s nice, because the shape and location helps you identify the state, and the squares make it clear just how important or not each one is in the election.”

Yes, those are cartograms: in that case the shape is not arbitrary because it approximates the geography, just with the relevant quantities taken into account.

“I also don’t think this is true as an unqualified statement. Consider the difference between ‘exactly half’, ‘a little bit more than half’ and ‘a little bit less than half’, for example.”

I think the claim here is that viewers will tend to be more accurate when asked to judge 1-D lengths in this way (is this 25 percent of the total? 40 percent?) than when answering the same question about 2-D areas.

24

eudoxis 08.01.07 at 5:40 pm

Pie charts suffer more from problems in angle judgements and color perception than area perception. Area perception is difficult when the shapes differ, say a square vs. a circle or maps of different countries, not so much when the areas are alike. Area may be presented with clarity in, for example, a shaded bar chart. But graphs in newspapers are rarely forthright, so why should the type of graph matter?

Fidelity, btw, has a mosaic type presentation of market activity that is handy. Rectangles and their size represent companies and market share while number of shares traded is presented in color saturation. Deceiving, maybe, in that it draws attention to actively traded shares, but effective as a graphical presentation, especially with hyperlinks.

25

c.l. ball 08.01.07 at 8:29 pm

While a pie chart may distort visual understanding and a bar chart can be clunky, it is easier to list percentages or counts in them than in dot plots. E.g., a public opinion poll asks whether people strongly or slightly agree or disagree. If the percentages are in the pie wedge (e.g., 38%) it is easier to graphically and numerically see whether more agree or disagree in general with a pie or bar than with a dot chart.

26

leederick 08.01.07 at 8:56 pm

These pixel-clumps have other advantages over pie charts: they make better use of space – as in there’s more diagram to blank page – and straight lines look better than curves when pixelated.

Labelling is likely easier too. And I suppose they’re easier to knock together than pie charts when you’re working to a deadline in Paintshop Pro, which is probably why they’re turning up.

I’m not sure Kieran’s point about arbitrariness is all that strong. The layout of the subcomponents of a pie chart is arbitrary. The layout of cartograms is arbitrary: there isn’t a unique approximation to geography.

You have a lot more options with these things, but that doesn’t mean your choice is going to be arbitrary. The data aren’t unidimensional. Some categories are more similar than others, and you can represent that by their locations in 2D space. Basically making them cartograms in category space rather than geographical space. With pie charts and stacked bars you can only do this in 1D space.

27

Nick Caldwell 08.02.07 at 9:48 am

Wow, I sure hope New York Times graphics staff have something better to hand than Paintshop Pro!

Adobe Illustrator has built-in support for pie chart creation, so I’d say it’s actually more time-intensive to build a chart like the ones displayed here.

28

John Quiggin 08.02.07 at 11:30 am

My big objection to pie charts is that they are less useful than the data they illustrate. Take party vote shares as an example. With n=2, the numbers are easy to understand, small differences matter and the pie chart is just a blurry representation of the numbers.

With n>=4, the numbers can be a bit tricky, but the pie chart is even worse. For example, suppose the parties are ordered A,B,C,D and you want to judge whether A+C is a majority. With the numbers you have to do some arithmetic, but with only the chart it’s just about impossible, assuming the combined share is close to 50 per cent.

The ideal case for a pie chart is n=3, when each pair of segments is continuous, so you can easily assess combined shares and so on. In this case only, the pie might be better than the data.
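A quick way to check the adjacency claim is to enumerate which pairs of slices sit next to each other around the circle. This is only a sketch: the function names are made up, and slices are numbered 0 to n-1 in the order they are drawn, so with parties A, B, C, D the pair (0, 2) stands for A+C.

```python
from itertools import combinations


def adjacent_pairs(n):
    """Pairs of slice indices that touch each other around a pie of n slices."""
    return {tuple(sorted((i, (i + 1) % n))) for i in range(n)}


def split_pairs(n):
    """Pairs whose combined share cannot be read off as one unbroken wedge."""
    return sorted(set(combinations(range(n), 2)) - adjacent_pairs(n))


for n in (3, 4, 5):
    print(n, split_pairs(n))
```

With three slices the list is empty, since every pair forms one unbroken wedge. With four, the pairs (0, 2) and (1, 3) are split across the pie, which is the A+C case described above, and the number of split pairs keeps growing with n.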

29

Kieran Healy 08.02.07 at 12:25 pm

Coincidentally, Andrew Gelman points to a doozy of a pie chart today, which provides a counterexample to the idea that pie chart segments logically form proportions of a whole.

30

c.l. ball 08.02.07 at 5:53 pm

31

Robert 08.02.07 at 6:31 pm

32

Dan Simon 08.03.07 at 4:58 am

“My big objection to pie charts is that they are less useful than the data they illustrate.”

That’s true in general of graphic representations of data, insofar as they’re less precise than the data itself.

“Take party vote shares as an example.”

Why not population statistics by year? They’re both roughly equally poor choices of data to represent using a pie chart.

Look, pie charts are reasonably good at doing two things simultaneously: giving a rough idea of the relative sizes of a collection of numbers, and giving a rough idea of the size of each number compared with the total. Sometimes that’s useful, and lots of times it isn’t.

For example, if the collection of numbers is really small–as in the electoral result example–then a graphic representation isn’t of much use in the first place. And if the numbers are too close together, then a representation that emphasizes their slight differences–such as a bar graph–works much better.

But if you want to represent, say, a budget–in which both the rough relative sizes of pairs of individual components, and the rough relative size of each component compared to the whole, are of interest, then a pie chart seems to me like a fine choice. And–wouldn’t you know it–that’s the kind of thing that pie charts have traditionally been used for.

Now, Kieran, you suggested that even in these cases, you’d rather use something else instead. I’m curious–what would you use? I can imagine, for instance, a rectangle subdivided into segments, as individual bars often are in a bar graph. But I don’t expect it would be any easier to visually compare nearly-equal subsegments of this rectangle than it would nearly-equal pieces of a pie chart. Am I wrong?

33

Robert 08.03.07 at 8:12 am

John Quiggin wrote:

“The ideal case for a pie chart is n=3, when each pair of segments is continuous, so you can easily assess combined shares and so on. In this case only, the pie might be better than the data.”

In this case, n=3 (viz., %insured publicly, %insured privately, %uninsured) for 50 states. Even here, I don’t think pie charts would work very well.

34

Ares Burger 08.03.07 at 10:03 am

Oh good. I’ve often thought that information presented graphically in newspapers should only be accessible to a certain class of autistic.

35

derek 08.04.07 at 10:40 am

dan simon, yes, you’re wrong. The work of researchers like William Cleveland over the past forty years has shown that people (as tested by their ability to actually respond to questions about the data they’re shown) really do visually compare segments of a pie more poorly than areas of a rectangle. There are devices that work even better than areas of a rectangle, but, for what it’s worth, that is the relative position in the hierarchy of pies and rectangles. Pies are the worst.

This isn’t stuff that was arbitrarily made up one afternoon as a hoax to fool the rest of you. The professionals really do know what they’re talking about.

36

Dan Simon 08.05.07 at 4:26 am

Derek–is that irrespective of the way the rectangles are laid out? I was speculating that, say, comparing the sub-rectangles at opposite ends of a segmented rectangle might be even harder than comparing non-adjacent segments of a pie graph.

37

Jonathan Goldberg 08.05.07 at 2:09 pm

Dan:

You’re getting a bit detailed for a comment thread; each answer will only lead to new questions. Derek is right: this is an area in which a lot of painstakingly detailed research has been done. Read Cleveland or Robbins or Tufte (who is my hero, in no small part because his prose is so beautiful). If you present numeric data visually you will communicate more clearly and more elegantly by taking their recommendations to heart.

38

Oskar Shapley 08.05.07 at 9:11 pm

It’s not two dimensions. It’s one dimension, known as “area”.
