In Defense of Rumsfeld

by John Q on February 10, 2004

US Secretary of Defense Donald Rumsfeld has received general derision for the following rather convoluted statement:

Reports that say that something hasn’t happened are always interesting to me, because as we know, there are known knowns; there are things we know we know. We also know there are known unknowns; that is to say we know there are some things we do not know. But there are also unknown unknowns – the ones we don’t know we don’t know

As I’m giving two papers on this general topic in the next couple of days, I feel I should come to his defense on this. Although the language may be tortured, the basic point is both valid and important.

The standard planning procedures recommended in decision theory begin with the assumption that the decisionmaker has foreseen every relevant contingency. Given this assumption, making the right decision is a simple matter of attaching probabilities (or, if you like my rank-dependent generalization of the standard model, decision weights) to each of the contingencies, attaching benefit numbers (utilities) to the contingent outcomes that will arise from a given course of action, then taking a weighted average. Whatever course of action yields the best average outcome is the right one to take. In this way, uncertainty about the future can be ‘domesticated’ and reduced to certainty equivalents.
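The weighted-average recipe can be sketched in a few lines. All the probabilities and utilities below are invented purely for illustration:

```python
# Sketch of standard expected-utility decision making: for each action,
# weight each contingency's utility by its probability and pick the
# action with the best average. All numbers here are made up.

def expected_utility(probabilities, utilities):
    """Probability-weighted average of outcome utilities."""
    return sum(p * u for p, u in zip(probabilities, utilities))

# Two contingencies the planner has foreseen, with probabilities 0.7 and 0.3.
probs = [0.7, 0.3]

# Utilities of each contingent outcome under two candidate actions.
actions = {
    "act_a": [10, -5],   # good in contingency 1, bad in contingency 2
    "act_b": [6, 4],     # moderate in both
}

best = max(actions, key=lambda a: expected_utility(probs, actions[a]))
print(best)  # act_a: 0.7*10 + 0.3*(-5) = 5.5 beats 0.7*6 + 0.3*4 = 5.4
```

The calculation only works, of course, if the two listed contingencies really are the only ones that can occur, which is exactly the assumption at issue.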

The problem is that, in reality, you can’t foresee all possible contingencies: the ‘unknown unknowns’ Rumsfeld is talking about are precisely these unforeseen contingencies. Some of the time this doesn’t matter. If the unforeseen contingencies tend to cancel each other out, then the course of action recommended by standard decision theory will usually be a pretty good one. But in many contexts, surprises are almost certain to be unpleasant. In such contexts, it’s wise to avoid actions that are optimal for the contingencies under consideration, but are likely to be derailed by anything out of the ordinary. There’s a whole literature on robust decision theory that’s relevant here.
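One simple criterion from that literature is maximin: rank actions by their worst case rather than their average, guarding against contingencies whose probabilities can’t be trusted. A minimal sketch, with invented numbers:

```python
# A minimal sketch of a robust (maximin) criterion: instead of a
# probability-weighted average, rank actions by their worst-case
# outcome. Utilities are invented for illustration.

actions = {
    "act_a": [10, -5],   # high upside, but badly exposed to surprises
    "act_b": [6, 4],     # lower upside, but never disastrous
}

def worst_case(utilities):
    return min(utilities)

robust_choice = max(actions, key=lambda a: worst_case(actions[a]))
print(robust_choice)  # act_b: its worst case (4) beats act_a's (-5)
```

Note that a maximin planner can prefer an action a straight expected-utility planner would reject, which is the sense in which robustness sacrifices optimality-on-paper for protection against the out-of-the-ordinary.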

Having defended Rumsfeld, I’d point out that the considerations he refers to provide the case for being very cautious in going to war. Experience shows that decisions to go to war, taken on the basis of careful calculation of the foreseeable consequences, have turned out badly more often than not, and disastrously badly on many occasions. The calculations of the German military leading up to World War I, including the formulation of the Schlieffen plan, provide an ideal example.

Finally, I should mention that I saw a link at the time to a post somewhere that seemed, from the one-sentence summary, to be making a similar point, but I was too busy to follow it, and can’t now locate it. Anyone who can find it for me gets a free mention in the update.

Update: At least one such post has come to my attention, at Language Log, along with a useful link to Sylvain Bromberger, who has, it seems, written extensively on the theory of ignorance. I will be keen to chase this up.



Scott Martens 02.10.04 at 9:30 am

Daniel Wolpert published a paper back in the 80’s where he proved that no computer simulation could ever predict the physical evolution of a volume of space that contained the computer itself in time to use the information. This is an application of the halting problem, and no possible new laws of physics or new form of computation, even the development of computers more powerful than Turing machines, can ever beat this result.

D^2 put me onto a paper applying this thesis to financial markets but I forget who wrote it. I’ve been wondering if anyone is going to take this result into decision theory and show that it is impossible to calculate decision weights in any context where outcome probabilities are dependent on the decisions of other agents who will know what decisions you’ve made, or something of the sort. Essentially, I think it should be possible to rigorously show that decision-making in a world of non-static probabilities and cognitive agents is always a partially blind heuristic search across an error landscape that is provably unmodellable.


mondo dentro 02.10.04 at 1:14 pm

Thanks for this post, John. I’ve been trying to explain to my friends that Rumsfeld’s epistemology, as revealed in this infamous quote, is actually quite sound. I even like his language–it reveals a lot of time spent in engineering circles: plane spoken and crude, yet sophisticated at the same time.

Linking this, as you do, to a critique of the decision to go to war is especially apt–and shows once again just how tragic that decision was. Too bad that Rummie didn’t seem to understand the implications of his own analysis.


Shai 02.10.04 at 1:15 pm

scott, you’re assuming by analogy that there’s a combinatorial explosion due to sensitive dependence on initial conditions of relevant factors or the decision procedure itself. but then there’s an open question whether the problem is indeed in that class of computational complexity and if so, is there a sufficient approximation for a model to avoid worst case time complexity?

I’m skeptical myself of some impossibility result for the behavior of cognitive agents in general, but perhaps that’s just because I’m a young student used to working with simple polynomial time algorithms.


cgo 02.10.04 at 1:28 pm

Rumsfeld is a total fraud. I don’t care if somebody sounds ‘smart’ from time to time with a silly statement or if they went to Princeton. If their overall policy is one huge contradiction then they deserve to be discredited and in fact called stupid. Rumsfeld, Cheney, Perle and Wolfowitz are in fact STUPID. Any institution of higher learning that accommodates these clowns in future years deserves to be discredited.


pblsh 02.10.04 at 1:32 pm

Actually, it’s a common aphorism found in business books and repeated often by consultants from at least the 1980s. “It’s not what you don’t know about the market that gets you, it’s what you don’t know you don’t know.”

Regis McKenna often used it, as I remember. My guess is Rumsfeld picked it up from his business experience.


Shai 02.10.04 at 1:41 pm

and the business people in turn picked it up from Clausewitz? lol


Backword Dave 02.10.04 at 1:46 pm

What Scott said.

But I think that Rumsfeld’s quotation is trite; it’s true of every situation. If he’d given examples (at least of known knowns and known unknowns), he might have said something.

However, given that the consequences of a military screw-up are more serious than those of a commercial one, wasn’t the decision to leave supplying equipment like boots, body armour, and chemical suits to the last minute (in order to save some cash) particularly unforgivable? And as for provoking and falling out with the UN, who had the best intelligence, well, it’s not the decision-making process that I recommend you teach.


GMT 02.10.04 at 2:17 pm


Scott Martens 02.10.04 at 2:24 pm


Nope – I ain’t assuming any such thing.

What’s really neat about Wolpert’s approach is that it does not depend on the computational complexity of the problem. In fact, it’s independent of both the computational power of the computer and the laws of physics themselves. Wolpert showed that no universe with laws of physics sufficiently rich to support computer simulation of those laws could contain a computer able to predict the outcome of its own calculations before reaching them. This is quite independent of any discussion of the complexity of the laws of physics or the speed of the computer. No possible set of knowable laws of physics in any possible universe can ever predict the future state of a volume of space containing a cognitive agent aware of those laws and capable of using them to make predictions.

I can’t remember the title of the paper, but he makes his case straight from the halting problem to this really very strong result. Furthermore, Wolpert showed that this was not a time complexity problem because a computer could calculate the evolution of a volume of space containing itself, but only at a slower pace than the evolution it is trying to calculate.

It’s really the same paradox as time travel. You go back in time and kill your Grandpa, so you don’t exist to go back in time to kill your Grandpa. That’s a contradiction. Instead, your Grandpa looks in his crystal ball, sees you and kills himself. As a result, you’re never born, so he couldn’t have been able to predict the future, could he?

The paper I can’t remember goes ahead and draws the same conclusions about financial markets. The whole point in doing so is to block the notion that economic laws can have predictive value because as soon as an economic agent who knows those laws and is capable of making predictions with them enters the picture, prediction becomes impossible. Economic laws can only be verified forensically, they can never be used to make certain – or even accurate probabilistic – predictions.

This strikes me as quite immediately applicable to the calculation of decision weights. I suspect one could show that accurately calculating the weights is impossible under those conditions.


Shai 02.10.04 at 2:57 pm

my point had nothing to do with wolpert. everyone knows there are problems that would require a computer outside the universe or a time longer than the heat death of the universe to solve.

your last point about financial markets seems to me to be in the same class as the efficient market hypothesis. my point was that you can’t assume a priori, for every problem modelled in a decision procedure, that if cognitive agents are involved there is unstable sensitive dependence on the decision procedure (things other than information may limit behavior, for example), or that the agents comprise a type 3 complex system and therefore there are no regularities that one can easily take advantage of. (it may be the case for any problem x, but I would hypothesize that it doesn’t have to be for every problem x) (and the decision model can be dynamical itself easily enough, no?)

I’m only working with second year maths here, so perhaps I’m missing something I’ll learn in one of my applied maths courses next year, so perhaps you can explain to me why I’m wrong if you think I am.


rea 02.10.04 at 4:01 pm

“a lot of time spent in engineering circles: plane spoken”

Evidently aeronautical engineering circles . . . :)


Michael 02.10.04 at 4:08 pm

General derision? I suppose the critics also don’t like the St. Crispin day speech at the end of Henry V. Rumsfeld’s remarks are a masterpiece both rhetorically and substantively.


Scott Martens 02.10.04 at 4:09 pm

Shai, don’t worry about your math background. Go take a look at the post from yesterday on important books – not a lot of grad school math texts in there, but a few crit lit ones. I flunked tensor calc myself, and lived to regret it.

You’re right, sort of, but look at the limitation you’ve imposed on the class of problems modellable using a decision procedure. You don’t need to assume anything as high-falutin’ as an “unstable sensitive dependence.” It is not the nature of the dependency or its complexity that matters. You just need to assume a real dependency on agents with specified minimum knowledge and computational capacity. The stability and sensitivity of the dependence is irrelevant, just so long as the agents can affect outcomes.

This covers every possible problem in the social sciences. The efficient markets hypothesis is a proof that a particular model has certain mathematical properties. The problem is that the model’s applicability to the real world is in considerable doubt. All I have to assume to draw my conclusion is that agents possessed of a model of the world have an effect on the outcome that I’m trying to predict. I don’t see any way to avoid that conclusion if we’re talking about the stock market.

My suspicion – one which I hope to one day explore when I’m not writing code for a living – is that this constitutes a novel problem class beyond type 3. Or rather, that it is a sub-class of type 3 problem which is uniquely unsolvable.

IIRC, a type 3 problem is one where the error surface (the environment the agent works in, if you will) changes in response to the agent’s decisions in a way which can’t be treated as a type 2 problem. A type 3 problem is not necessarily unsolvable but there is no generalised procedure for calculating decision weights, and when I was last in grad school the chic way to solve them was to drag out Holland’s theorem and use evolutionary programming to attack it.

The kind of case covered by this approach is knowably unsolvable even for stochastic procedures like EP. If you take it and run with the implications, it eliminates the possibility of any form of stable public policy on the long run.


apostropher 02.10.04 at 5:04 pm

Yes, it makes logical sense, but it still is funny when read aloud. It’s like he’d been getting policy briefings from Dr. Seuss.


rou 02.10.04 at 6:01 pm

He misses the most important one: ‘things we don’t know we know’. This would include being careful about intelligence from single unreliable sources. Not listening to your own intelligence experts. The failure to grasp the fact that WMDs have not been found. The failure to realize that your post-war fantasies didn’t occur. Mr. Rumsfeld doesn’t seem to believe in public information at all. The closest he comes to public knowledge is this cartoon-like caricature for the press.


Sebastian Holsclaw 02.10.04 at 6:23 pm

There seems to be some dismissal of the statement as ‘trite’. Well sure it is repeated all over the place. For instance in boxing: “It is the punch you don’t see that knocks you out.”

Nevertheless it is a hugely important idea in politics where practically everyone acts as if they have perfect information all the time (not to mention they act as if they would know what to do with perfect information if they had it).


pyromania 02.10.04 at 7:46 pm

the theoretical correctness isn’t at issue: epistemology as a response to the question “where is your homework?” is fatuous, and nothing else. rummy delving into the theory of information to cover his inability to use his eyes as sense organs is unserious in the extreme.


Bill Carone 02.11.04 at 2:36 am


“You just need to assume a real dependency on agents with specified minimum knowledge and computational capacity. The stability and sensitivity of the dependence is irrelevant, just so long as the agents can affect outcomes.”

Are you referring only to infinite cases? In other words “I think that he thinks I think that he thinks …” ad infinitum?

I wouldn’t be surprised if, sometimes, trying to assign probabilities given that someone will try to anticipate your anticipation of his anticipation of … won’t work, since many probability and mathematical calculations don’t work when explicit infinities are involved.

Does your problem still work when you take limits? In other words, only use the result of such an “infinite” case when it is the well-behaved limit of a finite case?

If the limit isn’t well-behaved, you cannot learn anything from your model; you have to find some other model of the problem, and cannot assume that the underlying problem is unsolvable.


asg 02.11.04 at 3:42 am

I believe you may be talking about a paper by one David Wolpert, not Daniel Wolpert. There is a paper entitled something along the lines of “future uncomputable” in PostScript format accessible from an ftp site linked to his homepage.


asg 02.11.04 at 3:44 am

Oh, and as for the individual who thinks the decision to go to war was “tragic”, I’m sure the average Iraqi shares his view entirely, thinking it especially tragic that quality citizens like Uday and Qusay aren’t still walking around and snatching the occasional bride from a wedding.


The Kid 02.11.04 at 4:37 am

In the dark ages as a congressman from Illinois, Rumsfeld was depicted by the national press as a dummy and the perfect match for the Ford administration’s collection of nincompoops. And even though Rummy has exhibited some talent and managerial skills in two distinct industries, many – including a few here – find his management of the Pentagon and service to the country deficient.

Yet he did manage to cancel a cold-war relic that had wide support in the Congress: the 70-ton Crusader artillery system. And he has successfully brought under control the one service – the Army – that refused to go along with the force modernization because its former leaders felt secure in the sort of political gamesmanship that kept key congressional districts well porked.

Oh, he did stumble acrossthe right folks with the right plans for rapid response in Afghanistan. I can think of no other SECDEF who’d have been as successful in bringing out the best that the services had to offer.

As for the Iraqi WMD, US intelligence may have been as deficient as that of other nations. That Foxbat found buried was certainly advertised as a surprise, being more modern and powerful than we – who’d been patrolling the no-fly zones – had any reason to expect. Or it may be that we civilians don’t know what Rummy knows.

Don’t forget that a key counterintelligence precept is to compartment information according to mission and need to know. The Bush administration is infamous for its ability to keep many secrets. Along those lines, Kay, for example, had access to the information necessary to search for WMD within Iraq, but not to the entire universe of information about Iraqi WMD. More on this in a moment.

One of the more remarkable secrets was the capture of Saddam Hussein, a fact not made public until the early hours of a Sunday. You may recall that Rummy hosted a holiday party the night before, yet none of the press or others in attendance found out about the capture until the fact was made public the next day.

Some speculate that Hussein moved WMD components out of Iraq before the war. There are rumors that Syria has some or that Syria gave passage to a couple of tractor trailers which ended up buried in the Bekaa Valley. There are reports too that Rummy has approved SOF operations in a wine-producing region of Lebanon…

I’m fascinated by a more practical omission: the three to six SCUD launchers and around twenty SCUD missiles that are AWOL. While they are not WMD, they are prohibited and haven’t turned up on anyone’s inventory. Are they buried in Iraq or were they loaned to Syria, a country that has its own SCUDs? Without the VINs it will be hard to tell, no?

Calling Rummy a dummy is easy and has a long history. But one can readily find evidence of success in his stewardship of the DOD and the conduct of military operations abroad. The leaked memo of a few years back is exactly the kind of communication an executive of a large organization would send his subordinates to stimulate their creative juices. Journalists are neither managers nor students of organizational theory, and therefore can’t provide the context the public sorely needs. Rummy’s interchanges with the press are classic examples of a competent manager providing context to those who want information but can’t assimilate it.

Finally, Rummy doesn’t need the job and doesn’t have to worry about employment in a think tank or defense contractor after his time is up. He can focus all of his energy, charm, wit, and enthusiasm on the job he’s doing.

You may consider him a rube, but never, ever play poker with him. Or his boss. You’ll wind up broke.


Scott Martens 02.11.04 at 8:41 am

Bill – that’s not really the problem. The essence of the proof is that in order to make predictions, the simulator has to simulate the effect of its own prediction, or if there are other agents of similar computational power out there, then it has to simulate the effects of their predictions. This leads to a contradiction.

There is an algorithm – I don’t know how widely it’s still used – for deciding when to buy and sell a stock. IIRC, the idea was that you buy a stock when its 50-day moving average rises above its 200-day moving average, and sell it when it falls below. Automated trading systems were built on this basis, and the effect was that no one was able to make any money using this algorithm. As a predictor of consistent gains, it failed because the gains were dependent on cognitive agents equipped with the same knowledge and cognitive abilities.
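The crossover rule described above can be sketched in a few lines. Window lengths are shortened and the prices invented so the example stays small, but the logic is the same as the 50/200-day version:

```python
# Sketch of the moving-average crossover rule: signal "buy" when the
# short average rises above the long one, "sell" when it falls below.
# Windows shortened (3/5 instead of 50/200) and prices invented.

def moving_average(prices, window):
    return sum(prices[-window:]) / window

def crossover_signal(prices, short=3, long=5):
    """Return 'buy', 'sell', or 'hold' based on the latest averages."""
    if len(prices) < long:
        return "hold"
    short_ma = moving_average(prices, short)
    long_ma = moving_average(prices, long)
    if short_ma > long_ma:
        return "buy"
    if short_ma < long_ma:
        return "sell"
    return "hold"

rising = [1, 2, 3, 4, 5, 6]    # short average leads on the way up
falling = [6, 5, 4, 3, 2, 1]   # and lags on the way down
print(crossover_signal(rising))   # buy
print(crossover_signal(falling))  # sell
```

The point of the anecdote is precisely that nothing in this code depends on what other traders do, yet its profitability does: once every agent runs the same signal, the signal stops predicting anything exploitable.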

This same problem is widely held to be the major cause of the collapse of LTCM, since its algorithmic trading strategy was available to a large number of market players. This application of the halting problem shows why any algorithmic trading strategy – which is essentially the same as a prediction of future returns – will always fail. Logically, this should extend to any sort of policy predicated on predictive modelling. I think it ought to be possible to show that any sort of modelling problem – including calculating weights in decision trees – should be unsolvable whenever it has to model the behaviour of agents who are themselves trying to find the best model to guide their behaviour and who are equipped with equally sophisticated modelling abilities.

I would think this to apply to situations like the prisoner’s dilemma. Even when a cooperative strategy develops, there are still gains for defecting. I think it might be provably impossible to calculate whether it is better to defect from such a strategy when both players know that resuming cooperation even after one side has defected is still the optimal strategy.
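The gain-from-defecting calculation can be made concrete with the standard prisoner's dilemma payoffs (T=5, R=3, P=1, S=0 are the textbook convention, not figures from the comment):

```python
# Sketch of the one-shot temptation in an iterated prisoner's dilemma,
# using the conventional payoffs T=5, R=3, P=1, S=0. Whether a single
# defection pays depends on how the other agent's model of you responds
# afterwards, which is the hard-to-calculate part.

T, R, P, S = 5, 3, 1, 0  # temptation, reward, punishment, sucker

def score(my_move, their_move):
    """Payoff to 'me' for one round; moves are 'C' or 'D'."""
    if my_move == "C":
        return R if their_move == "C" else S
    return T if their_move == "C" else P

# Three rounds of mutual cooperation...
cooperate = sum(score("C", "C") for _ in range(3))

# ...versus one defection against a tit-for-tat opponent: I earn T,
# then S while they retaliate, then R once cooperation resumes.
defect_once = score("D", "C") + score("C", "D") + score("C", "C")

print(cooperate, defect_once)  # 9 8: here the defection doesn't pay
```

Against a retaliating opponent the arithmetic comes out against defecting, but against a purely forgiving one it comes out in favour, which is why the calculation requires a model of the other agent's model of you.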


msg 02.11.04 at 6:00 pm

Your embryonic constituents, the parental egg or sperm, see each other in your grandfather(s) and your grandmother(s), then the first tingling edge of attraction and the potential being you were vibrates toward the actual one you are, as they begin the mating ritual that culminates in your parents, who do the same thing. We call ourselves into being, in a way.
Your grandfather’s crystal ball is murky, because it shows what could be, more than what is; and if he got it from a reputable dealer he was told to be very very careful with it, lest he become responsible, by oversight or mis-step, for the fate of the world.
The anthropologist’s reflective aura, the impossibility of direct observation without affect, would seem to apply. Your grandfather would view a world he was in the act of creating, by looking.

The solidity of the past is, in some views, answered by the equally solid future. There’s no room for new information, and no place for that new information to come from. It’s a closed system. The moving finger writes, and moves on. Anyone with the right lens can see it all. And an eternal being looking at it all at once would see a solid thing; time as amber.

Other views are less fatalistic. The alternate-worlds trope of pulp science fiction, branching histories created by your trespass in time past or future. We can change this moment, why not all moments? The tight-wire act of free will above an abyss of possible wrong moves.

The dead grandfather problem is set within a kind of vehicular travel-through-time scenario, where the traveller has independent movement, volition, and physical presence.
A less problematic version has consciousness alone moving through the intervening temporal landscape, between the seer and the seen.
Or even less technically difficult, perception of eventual light – wave, packet, whatever; all that requires is that events generate signals both directions in time, and the presence of someone with sight enough to see them.

Thought problems are good exercise but they don’t always account for what’s observably real, present, here and now. The past five seconds were as solid as this moment is, but where are they?

The million monkeys with a million keyboards, and an infinite amount of time, founders on the lack of being for the monkeys’ end of it. They aren’t real monkeys, with real pasts, they’re theoretical. Real monkeys are a limited set, and can only do so much, even granted eternal life; and there goes your eventual inevitable complete Shakespeare. Maybe. Or maybe not.


Shai 02.11.04 at 7:30 pm

msg that reads like derrida wrote a book report about asimov foundation


Jason McCullough 02.11.04 at 10:25 pm

The Rumsfeld quote is just fine by itself. The problem is that it was given as an answer to a “where the hell are the WMDs” question; it’s a total political dodge, not a deep rumination on epistemology.


AAB 02.12.04 at 4:16 pm

Like Jason said, Rumsfeld’s quote as applied to epistemology is not actually a big deal. He is actually missing one case: unknown knowns. His quote as applied to WMD is a political dodge.


Aramis Martinez 02.13.04 at 4:27 am

The idea is quite mind-blowing, in the same manner that quantum physics was so mind-blowing that Schrodinger and Einstein were unable to accept all the implications of their work.

Does anyone know off-hand the assumptions he used, or if they’ve been tested? I would love to find out just what the scope of this is, as well as the odds he’s right in reality as well as on paper!

Comments on this entry are closed.