Outliers

by Kieran Healy on July 14, 2007

By now you’ve probably all seen this ridiculous graphic from today’s WSJ, which purports to show that the Laffer curve is somehow related to the data points on the figure. Brad DeLong, Kevin Drum, Matt Yglesias, Mark Thoma and Max Sawicky have all rightly had a good old laugh at it, because it’s spectacularly dishonest and stupid. I just want to make a point about so-called outlying cases, like Norway.

In discussion threads about this kind of thing, you’ll find people saying stuff like, “I want to see a line showing x, y or z”, or “I want to know what happens when you …”, and very often they’ll add “excluding outliers like Norway from the analysis.” Now, it’s true that in this plot Norway is very unlike the other countries. It’s also true that if you run regressions with data like this and don’t look at any plots while you do it, then you will probably be misled by your coefficients, because some observations (like Norway) may have too much leverage or influence in the calculations. In this sense it’s important to take “outliers” into consideration.
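For readers who want to see what "leverage" and "influence" mean in practice, here is a minimal sketch in Python. The numbers are made up for illustration and are NOT the WSJ or OECD data, and the function name `leverage_and_cooks` is hypothetical. It computes the hat values (leverage) and Cook's distances (one common measure of influence) for a simple straight-line fit.

```python
import numpy as np

def leverage_and_cooks(x, y):
    """Hat values and Cook's distances for an OLS fit of y = a + b*x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones_like(x), x])    # design matrix with intercept
    n, p = X.shape
    # Hat matrix H = X (X'X)^-1 X'; its diagonal gives each point's leverage.
    H = X @ np.linalg.inv(X.T @ X) @ X.T
    h = np.diag(H)
    resid = y - H @ y                            # ordinary residuals
    s2 = resid @ resid / (n - p)                 # residual variance estimate
    cooks = (resid**2 / (p * s2)) * h / (1 - h) ** 2
    return h, cooks

# Illustrative, invented values: tax rate (%) and corporate revenue (% of GDP).
rate = np.array([12.0, 20, 25, 28, 28, 30, 33, 35])
revenue = np.array([3.0, 2.5, 3.2, 2.8, 9.5, 3.5, 3.0, 2.6])

h, cooks = leverage_and_cooks(rate, revenue)
print("highest leverage at index", int(np.argmax(h)))
print("highest Cook's distance at index", int(np.argmax(cooks)))
```

On these invented numbers the two measures pick out different points: the lowest-rate observation has the most leverage, while the high-revenue observation in the middle of the rate range has the largest Cook's distance. That is the reason to look at the plot and the diagnostics together rather than at coefficients alone.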

But when your data set consists of just 18 or 25 advanced industrial democracies and your goal is to assess the empirical support for some alleged economic law, then you should be careful about tossing around the concept of “outlier.” In an important sense, Norway isn’t an outlier at all. It’s a real country, with a government and an economy and everything. Clearly they are doing something up there in the fjords to push the observed value up to the top of the graph. Maybe you don’t know what that is, but you shouldn’t just label it an outlying case and throw it away, at least not without re-specifying the scope of your question.

Dropping outlying observations in regressions used to be standard practice and is still pretty common. In his post on the topic, Max estimates a regression with the data, and he throws out “those annoying communist outliers at the top,” Norway and Luxembourg first (mostly to throw a bone to his opponents, I think). But as he notes, if you include them there’s no significant linear relationship between the corporate tax rate and corporate revenue as a percentage of GDP. And this is the substantive issue. Cross-national data—especially when confined to OECD countries—often show surprisingly weak or non-existent evidence of supposedly strong theoretical trade-offs. That’s in part because there are often some annoying countries (like Norway or wherever) that cheerfully occupy the wrong place on the scatterplot, thereby making trouble for your perfectly nice generalization. Of course you can reasonably say something like “In the liberal democracies …” or “Excluding the corporatist countries …” or “Leaving aside the goddamn Scandinavians who have messed everything up again …” These days you can also use methods which incorporate information from all cases, but are resistant to letting one or two bits of data mess up your estimates. But what you really shouldn’t do—especially when the cases are in other respects quite similar, such as all being functioning, rich capitalist democracies—is label entire countries as “outliers” in order to remove them from your analysis, and then pretend that this has made them disappear from the face of the earth, too.
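As one illustration of a method that keeps every case but resists being dragged around by one or two of them, here is a small sketch of the Theil-Sen estimator, which takes the median of all pairwise slopes. This is just one resistant method picked for brevity, not necessarily the kind the post has in mind. The data are invented and the function name `theil_sen` is hypothetical.

```python
import numpy as np
from itertools import combinations

def theil_sen(x, y):
    """Theil-Sen line: slope is the median of all pairwise slopes."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slopes = [(y[j] - y[i]) / (x[j] - x[i])
              for i, j in combinations(range(len(x)), 2)
              if x[j] != x[i]]                   # skip pairs with equal x
    slope = float(np.median(slopes))
    intercept = float(np.median(y - slope * x))  # median of the offsets
    return slope, intercept

# Invented data: five points exactly on y = 0.1 * x, one pushed far above it.
x = np.array([10.0, 15, 20, 25, 30, 35])
y = 0.1 * x
y[3] = 10.0

ols_slope = float(np.polyfit(x, y, 1)[0])        # ordinary least squares
ts_slope, ts_intercept = theil_sen(x, y)
print("OLS slope:", ols_slope)
print("Theil-Sen slope:", ts_slope)
```

The odd point is still in the data set for both fits. The difference is that the least squares slope moves noticeably toward it, while the median-based slope stays at the value set by the other five points.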

{ 35 comments }

1

John Quiggin 07.14.07 at 4:59 am

Including Norway doesn’t, by itself, give you an insignificant relationship (at least I haven’t checked, but eyeballing it, I’m pretty confident). In fact the coefficient obviously has to be higher, though the standard error will also increase, and I think I could probably prove that this can’t lead to a loss of significance.

Checking on Max’s post, he says it’s insignificant only if you also allow a non-zero intercept, which is not very sensible in this case. The one sensible bit of the Laffer curve is the point that zero tax rates imply zero revenue.

2

Kieran Healy 07.14.07 at 5:05 am

Including Norway doesn’t, by itself, give you an insignificant relationship (at least I haven’t checked, but eyeballing it, I’m pretty confident).

Yeah, it’s not that that was bugging me, so much as the general call to throw out Norway that kept popping up in discussions.

The one sensible bit of the Laffer curve is the point that zero tax rates imply zero revenue.

I think that’s why the UAE was included in the countries used for the figure, as a beacon for the rest of us.

3

engels 07.14.07 at 5:24 am

I think the data better fits the Neo-Laffer curve.

4

Matt Austern 07.14.07 at 5:44 am

The deal with Norway is pretty simple, isn’t it? All it means is that they’re getting most of their government revenue from some source other than corporate income tax.

You might suspect, just by deductive logic, that Norway has chosen to have relatively high individual income tax rates and relatively low corporate income tax rates. There’s nothing illogical about that, and it doesn’t have any particularly deep lesson.

Or, of course, you could just spend a few seconds on Google and find out that Norway gets a lot of money from North Sea oil (it works out to 20% of their GDP), so not all of their government revenue comes from taxes in the first place.

Doesn’t have much implication for US tax policy, anyway.

5

Alan K. Henderson 07.14.07 at 6:17 am

Taxes aren’t the only factor that affect GDP, after all. Just ask the Heritage Foundation – it can name nine more, and will tell you that those factors are not equal in all countries.

To test the Laffer Curve, it would be better to compare time series data within a single country during a time in which tax rates fluctuate significantly but the other nine economic freedom factors remain fairly constant.

6

Tom Ames 07.14.07 at 6:31 am

This graph will be reproduced in every undergraduate statistics and graphic design textbook for eternity. It’s an instant classic!

7

Elliott Oti 07.14.07 at 6:41 am

The deal with Norway is pretty simple, isn’t it? All it means is that they’re getting most of their government revenue from some source other than corporate income tax [...] North Sea oil (it works out to 20% of their GDP), so not all of their government revenue comes from taxes in the first place.

The label on the vertical axis is tax revenue as a percentage of GDP. Not as a percentage of total government revenue.

Norway’s position is, to me, surprising because North Sea oil revenue would imply that taxes should constitute a smaller proportion of GDP, as illustrated by the United Arab Emirates’ position.

Maybe Norway’s oil resources are extracted by private corporations as opposed to the joint ventureships or oil-field leasing constructions found in other countries, leading Norwegian oil revenue to be collected in the form of taxation as opposed to direct revenue in the case of countries like the UAE.

8

John Quiggin 07.14.07 at 7:40 am

I assume from the left-axis range 0-10 per cent of GDP, it’s supposed to be corporate tax revenues, not all revenues. I imagine Norway’s outlier position has something to do with oil, but it’s not clear what.

Of course, whatever you do with Norway, it doesn’t change the fact that Hassett, WSJ and AEI are absolutely full of it. To think that even five years ago, people took AEI seriously.

9

Elliott Oti 07.14.07 at 7:53 am

I assume from the left-axis range 0-10 per cent of GDP, it’s supposed to be corporate tax revenues, not all revenues

I would have thought that was obvious given the United Arab Emirates’ location on the graph.

10

Sebastian Holsclaw 07.14.07 at 8:48 am

“Clearly they are doing something up there in the fjords to push the observed value up to the top of the graph.”

But we know what they are right? So unless you can duplicate the oil….

11

Sebastian Holsclaw 07.14.07 at 8:49 am

Which is not to say that the graph is anything but silly.

12

David 07.14.07 at 10:53 am

“I imagine Norway’s outlier position has something to do with oil, but it’s not clear what.”
A commenter on Brad DeLong’s site reports that Norway imposes a 50 percent surcharge on corporations engaged in oil and gas production. Definitely a unique approach, but in actual revenue generation it only differs from what I gather to be the UAE approach in how it’s entered on the books. It’s really a severance tax, not a corporate income tax per se. In that regard, Norway is a true outlier.

13

Bernard Yomtov 07.14.07 at 2:27 pm

But this graph concerns corporate tax rates only. Doesn’t that make it particularly silly, regardless of any regressions you choose to run? After all, countries make different decisions as to how the tax burden is to be divided among corporations, individuals, etc. So why the corporate tax level in isolation ought to tell us anything at all about the effect of tax rates on revenues is unclear.

14

lemuel pitkin 07.14.07 at 3:19 pm

Cross-national data—especially when confined to OECD countries—often show surprisingly weak or non-existent evidence of supposedly strong theoretical trade-offs.

Shouldn’t “often” here be “always”? What are the theoretical tradeoffs that do show up consistently?

(If there was one thing I learned from a couple years of econ graduate work, it’s that one should never, ever do cross-national regressions. Pure wankery.)

15

Miracle Max 07.14.07 at 3:46 pm

The vertical axis is corporate tax revenues/GDP.

I don’t know if Kevin H. drew that line on the graph. I suspect he did not. He supplied the data points, similar to what I picked up from the OECD site. I’ve posted the numbers on my own site for any interested.

16

Norwegian 07.14.07 at 4:22 pm

“Maybe Norway’s oil resources are extracted by private corporations as opposed to the joint ventureships or oil-field leasing constructions found in other countries, leading Norwegian oil revenue to be collected in the form of taxation as opposed to direct revenue in the case of countries like the UAE.”

Bingo!

Norway does have substantial direct revenue as majority shareholder in one of the biggest oil companies operating in the North Sea, StatoilHydro. But private companies are also very active, and they pay 78 % taxes (which they don’t like but fortunately haven’t been able to do anything about), accounting for a huge part of the state’s corporate tax revenues.

17

abb1 07.14.07 at 5:04 pm

If the US corporate taxes gross 10% of GDP like they do in Norway – I’m sure these AEI guys would’ve been screaming bloody murder no matter what the tax rate.

But the UAE has managed to achieve perfection! Fuck judeochristianity – go islam!

18

notsneaky 07.14.07 at 5:13 pm

Slope of the regression line (with 0 intercept) w/ Norway and Luxembourg is .1. Slope of the regression line w/o ‘em is .09. In both cases the Rsqr is just ridiculous.

Fitting a polynomial trend, w/ N and L you get
Rev = .27*t-.0051*t^2, R^2=.08

w/o N and L you get
Rev = .21*t-.0035*t^2, R^2=.07

So just looking at the point estimates I don’t think you can really call Norway and Luxembourg “outliers”.

In the linear case the coeffs are stat sig. In the polynomial case the second coeff (the one that’s supposed to make the Laffer curve “a curve”) is stat sig if you exclude N and L but not stat sig if you include’em. But the Law of Large Numbers just called and said something about “at least 30 observations”.
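As a rough check on what the quadratic fits quoted in this comment would imply, here is a short sketch. It assumes t is the tax rate in percent and Rev is revenue as a percent of GDP, which the comment does not state explicitly, and the function name is hypothetical. For a curve Rev = a*t - b*t^2 the peak sits at t = a/(2b) and the curve returns to zero at t = a/b.

```python
def quadratic_peak_and_zero(a, b):
    """For Rev = a*t - b*t**2 with a, b > 0, return (t_peak, rev_peak, t_zero)."""
    t_peak = a / (2 * b)                      # where the derivative a - 2*b*t is zero
    rev_peak = a * t_peak - b * t_peak ** 2   # algebraically a**2 / (4*b)
    t_zero = a / b                            # the non-trivial root of a*t - b*t**2
    return t_peak, rev_peak, t_zero

# Coefficients as quoted in the comment above.
with_nl = quadratic_peak_and_zero(0.27, 0.0051)      # Norway and Luxembourg included
without_nl = quadratic_peak_and_zero(0.21, 0.0035)   # Norway and Luxembourg excluded

print("with N and L:", with_nl)
print("without N and L:", without_nl)
```

Under those assumptions the fitted curves peak at rates of roughly 26 and 30 percent, and neither returns to zero revenue until the rate is above 50 percent. With R^2 values this low, none of that should be read as an estimate of anything.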

19

MattF 07.14.07 at 5:35 pm

Just to go on a teeny bit more about the statistics here… It’s OK to declare that a data point is an outlier as long as you have some set of independent tests that the point fails. “It’s an outlier because it doesn’t fit my theoretical curve” is, of course, not good enough. But “It’s an outlier because it fails these three tests…” probably is good enough.

20

dearieme 07.14.07 at 8:40 pm

What are you buggers on about? This sort of data surely doesn’t remotely satisfy any of the requirements for justifying a least squares fit. It’s just baloney. You might as well suppress Norway and the UAE and draw a bloody circle through the rest. Stop pretending to be bloody scientists, for God’s sake.

21

notsneaky 07.14.07 at 8:46 pm

dearieme, that’s exactly the point that everyone’s making.

22

John Quiggin 07.14.07 at 10:02 pm

Of course, you can’t really do anything serious with a data set like this. But this is a blog comments thread, so that isn’t a problem for us.

Given that we’ve started, it might be worth making the McCloskey point here and looking at the actual significance of the estimated coefficient rather than its statistical significance.

If you were a naive tax-gatherer, you would expect the model to be

CorpTaxRev/GDP = CorpTaxRate*CorpProf/GDP

and treat CorpProf/GDP as a constant (no Laffer effects). So, you’d expect the estimated coefficient to be equal to the average share of corporate profits in GDP which in most countries is around 0.1.

Checking at #18 and Bingo!, we have a winner!
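To spell out the arithmetic of the naive model, here is a small sketch. It takes the corporate profit share as a constant 0.1, per the comment. The tax rates are hypothetical and the function names are invented for illustration. If revenue is exactly rate times profit share, a regression through the origin of revenue on rate recovers the profit share as its slope.

```python
import numpy as np

PROFIT_SHARE = 0.10   # assumed constant corporate profits / GDP, as in the comment

def naive_revenue_share(rate_pct, profit_share=PROFIT_SHARE):
    """Corporate tax revenue as % of GDP under the no-Laffer-effects model."""
    return rate_pct * profit_share

def slope_through_origin(x, y):
    """Least squares slope for y = b*x with the intercept fixed at zero."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(x @ y / (x @ x))

rates = np.array([15.0, 25.0, 30.0, 35.0])   # hypothetical statutory rates, in percent
revenue = naive_revenue_share(rates)         # e.g. a 30% rate gives 3% of GDP

print(slope_through_origin(rates, revenue))
```

So a slope of about 0.1 in a zero-intercept fit is what this model predicts when there are no behavioural effects at all.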

23

Luis Alegria 07.14.07 at 10:08 pm

Gentlemen,

The important point here is that corporate income tax revenues seem to have very little relationship to corporate income tax rates. That, at least, should be clear.

And that is a perfectly good point for the WSJ or AEI or in fact any reasonable person to back.

24

notsneaky 07.14.07 at 10:24 pm

John, of course you’re right, but the point was not to actually estimate anything serious, but to figure out how the heck did WSJ actually get that curve.

25

engels 07.15.07 at 3:07 am

The Neo-Laffer Curve

26

engels 07.15.07 at 3:08 am

Oh well, trying to put image tags in blog comments is asking a bit much, I suppose. Still, they show up in the preview.

27

Bloix 07.15.07 at 3:29 am

If you go over to Brad DeLong’s blog, you’ll learn that the UAE, which is a federation of independent states, levies no direct taxes of any kind. Its inclusion on the graph is just a bogus way of supplying a left-hand terminus.

As for Norway, in 2001 Norway partially privatized its state-owned oil company, Statoil. Nineteen percent is currently publicly traded. Statoil pays 78% tax on profits (28% regular corporate tax and 50% oil company surcharge). Because it’s such a big part of the economy, the tax on Statoil pushes up the corporate tax/percentage of GDP ratio. (Of course, 81% of the remaining profits also belong to the state, but these aren’t considered corporate tax.)

28

c.l. ball 07.15.07 at 4:10 am

The admonition against excluding outliers is well taken. There’s a cartoon of a scientist drawing a straight line on a chalkboard through a set of data points in the 1st frame, seeing an outlier in the 2nd frame, and erasing it in the 3rd frame.

But beyond this, how would a Laffer curve be relevant to cross-national data of the kind presented here? The Laffer ‘curve’ describes diminishing returns within a given tax system. WSJ has apples and oranges on their curve. I haven’t seen the WSJ editorial, but do they really think the optimal corp. tax rate is 29% across all countries? Their own graph does not provide much support for that — the UK has roughly the same rate as Norway but gets far less corp. tax revenue as a % of GDP.

29

Matt Kuzma 07.15.07 at 9:30 am

I might be going out on a limb here, but I’d say there are basically no statistical trends you can pull from that scatterplot, aside from the obvious stuff like ‘no country has a corporate tax rate of higher than 35 percent’.

One possible reason this graph is so worthlessly messy is that it uses the wrong variable for its horizontal scale. Rather than using the regulation tax rate on corporations, it should use the empirical effective tax rate after deductions and the like. Maybe it should use average tax revenue as a fraction of a company’s net income. One way or another, any serious attempt to express tax law on a single dimension needs to take into consideration more than the prescribed tax rate on profits – like qualifying expenses and deductions for example.

30

greensmile 07.15.07 at 12:12 pm

One of the first bits of math I learned as a freshman was the derivation of the Least Squares regression calculation. I have done quite a bit with that in the years since. What is that silly line on the graph? “dishonest and stupid” are entirely deserved. I know WSJ is the bible for some business people but after this sorry inept display, who cares if that rag becomes the jewel in Merdeoch’s crown?

31

Rich B. 07.15.07 at 5:06 pm

The sad part is, it is not obvious to me that there ISN’T a curve that could reasonably be derived there — say, heading north to above Canada and then turning downward below Luxembourg.

If they had done that, they could have made their little point, and no one would have noticed, or seen their curve-fitting as terribly dishonest.

Here, they have sacrificed their point on the altar of stupidity.

32

Ben M 07.15.07 at 11:11 pm

I’d like to point out something else. Why did the WSJ not draw their Laffer curve all the way down to the x-axis, at a tax rate of about 33%? Does the theory break down at the high zero-crossing? (Surely not: the existence of both a low and a high zero-crossing is pretty much the only correct thing about it.) Were they unable to constrain that tail of the curve using the data? (Wow, such careful attention to detail!)

No: even the graph’s author must have realized that the curve would look stupid plunging to zero revenue at a 33% tax rate. So he decided to stop drawing the curve, in order to avoid drawing readers’ eyes towards the error.

I think that’s enough to establish—what’s it called? “substantial capacity”? “consciousness of guilt”?

33

Barry 07.16.07 at 1:18 pm

Aside from the many good points raised, there’s another – this is not a case of ‘running a curve to the data’. If they had fit anything resembling a smoothed curve ‘through’ the data, it would go ‘through’ the data, not around it.

This curve starts at the lower-left corner, and then heads to Norway, ignoring half the data. Then, for no apparent reason, it dives down into the midst of the data on the right-hand side of the graph.

Pure BS on a whole set of levels. But we knew that from the facts that (a) AEI made it and (b) the WSJ editorial page printed it.

34

Jeff 07.16.07 at 1:43 pm

Yes, it’s clear that the graph of the “Laffer Curve” is in no way generated from these data — its source is elsewhere. It would be fair to say the data contradict that curve, though a local polynomial regression or smoothing would show a small hump in the center, with or without Norway.

35

eudoxis 07.17.07 at 12:09 am

I look at these data and have a hard time seeing any trend. Why start a line at zero? There is no real data point that represents zero or anything near it. It’s as theoretical as the Laffer curve. I see some clustering, that’s all. It’s amazing to me that economists see anything here at all.
