On Humphreys opacity, Reverse Engineering, and Social Externalities of LLMs.

by Eric Schliesser on June 29, 2026

I start by characterizing a term, ‘Humphreys opacity’ (or, if you prefer, ‘epistemic opacity’):1 the inability of the decision-maker or responsible agent to survey, in a timely manner, the steps of a process that leads from a known input to a known desirable (or truthful, useful, beautiful, etc.) output. (For more on the origin and nature of this characterization, recall this post.) In what follows, I set aside to what extent such Humphreys opacity is the effect of features of physical reality or is merely the result of a pragmatic cost-benefit analysis.

Humphreys opacity is in the news because the ideal of a so-called ‘glassbox’ AI (that is, AI systems and machine learning models whose internal processes are fully visible, transparent, and interpretable to humans) seems so hard to achieve. In fact, Humphreys opacity is a design feature of contemporary LLMs that are rapidly being deployed in all kinds of organizations. At the moment neither end-users nor engineers can survey the steps that lead to an LLM’s output in real time. It is by no means obvious that they could do so even after the fact in all salient contexts. Interestingly enough, at the moment such Humphreys opacity also seems a feature of any (say) Opus 4.8 token (in the sense of the token/type distinction) one may be interacting with as an end-user. Such tokens lack luminosity about the inner workings of ‘their own’ underlying machinery, too. This much is familiar enough in public debate and also in the scholarly secondary literature.

As LLMs are incorporated in all kinds of social practices they (predictably) generate new sources of Humphreys opacity and will intensify at least some existing ones. On the latter (intensification), as LLMs are inserted in administrative, design, and production processes (etc.) they will actively displace surveyable steps; and as various kinds of social organizations and corporations learn to use LLMs at scale, this will, predictably, induce the redesign of processes around the advantageous use of LLMs.2 I call this an example of intensification because the bureaucratic processes that will be rejigged in light of the uptake of LLMs are, inter alia, themselves attempts at, or instruments for, managing the effects of Humphreys opacity.

As Rousseau noted (recall this post) in the Third Discourse, one reason bureaucracies are introduced and maintained at often great cost is that an executive cannot be in all places and times at once. Now it’s familiar from scholarship inspired by James C. Scott and/or Foucault that state bureaucracies make populations ‘legible.’ But bureaucracies also generate new forms of Humphreys opacity. In fact, generalizing a bit, bureaucratically-induced Humphreys opacity has meant the recruitment of skilled specialists (in the management of information/technology and people) and the development of all kinds of institutions that secure, monitor, and audit the reliability and quality control of the bureaucratic processes as well as the people staffing these, and so on. In fact, what we may call ‘the management of Humphreys opacity’ has been constitutive of the art of government since the eighteenth century. Or to restate that a bit more cautiously, one feature of skill in governance is being good at the management of forms of ignorance, including the ignorance generated or made worse by government itself.

This ratcheting up and intensification of epistemic opacity is also one of the main engines for the enormously widening scope of the state as (Nick Cowen and I emphasize) a machinery of record and, (recall) to use Tom Pink’s felicitous phrase, witnesser of truth. In both roles, the state certifies many social facts. Once certified, these facts function both as traditional public goods and as constitutive principles or conditions of many social practices. For example, emission trading practices presuppose a structure of property rights, the conceptualization and monitoring of emissions across the economy, and a system of converting units of emission to (say) tax credits/debits (and so on).

Crucially, the state also helps provide the infrastructure for (and is often a party to) the necessary adjudications of what the facts really are. The modern state often maintains an infrastructure for this alongside the bureaucracy, and, as Hannah Arendt emphasizes in her essays, this is why law and science have a special status among the many public institutions. These are the relatively costly and relatively slow-moving institutions that are made authoritative on the facts. And this is also the underlying reason why the uptake of LLMs within the law and the universities has generated controversy and even push-back.

Whatever one’s attitude toward the law is, one of the court system’s main functions in the modern state is to certify the (fallible) truth on some bone of contention (e.g., which property deed is authoritative, etc.), including ones that are themselves the effect of managing Humphreys opacity. Of course, that very legal process itself has sites of Humphreys opacity. And this helps explain why judges have been so critical of LLM-induced mistakes. Such mistakes undermine the quasi-auditing aspect of the adjudicating role of the institutions.

The upshot of my attenuated example is that institutional density and adjudication practices inevitably grow as processes that manage and generate/intensify such opacity, and this will only be accelerated by the uptake of LLMs (in light of their own thus far ineliminable Humphreys opacity features). Of course, that’s not unconstrained: diminishing marginal returns, interaction effects, political and social priorities, and rents all play a role. (As my posts on the anthropological work on civilizational collapse by Tainter have suggested, we’re not the first civilization to encounter these dynamics.)

Here’s a diagram made by Claude’s Opus 4.8 of the dynamic I have in mind (but without all the implied marginal returns and cost benefit analysis not to say political and technological pathways—if you unpack each box in this diagram you find a Russian doll scenario of more such boxes). I call this dynamic, ‘the recursive Humphreys engine:’

As an aside, part of the social problem this engine generates is that the skills needed to keep all the processes running and to deal with the inevitable failures of implementation are themselves at risk of becoming scarce or non-renewable as de-skilling occurs. This makes the timing of the contemporary assault on higher education in the US and UK especially baffling.

But, leaving that aside, the governance problem of LLMs themselves is also challenging. As I learned from Socrates (see here) in the Phaedrus, those who invent a product tend not to be very good at foreseeing and understanding the social externalities it may generate. It’s not that Silicon Valley and its intellectual enablers may be overselling the possible benefits of LLMs, although surely that is also happening, but that they do not have much incentive to think about the drawbacks.

Now, the reader may object to the last sentence of the previous paragraph by pointing to the literature on ‘existential risk’ associated with AGI. Fair enough. But the focus on existential risk occludes all kinds of less dramatic implementation challenges associated with the uptake of LLMs. And I want to close this post with a more general reflection on a feature that may make the externalities more challenging than usual. That is, my general view is that LLMs do not generate a new kind of governance problem, but they do generate risks that I gestured at by using the language of ‘intensification.’

Now that LLMs are being embedded in all kinds of processes, a familiar social externality is becoming visible. For one of the more unfortunate side-effects of processes in which LLMs are embedded is that when they fail, they also make one of the best-known strategies for learning from disaster, reverse engineering, less fruitful. For, while one can figure out how the output of LLMs contributed to an error or disaster, it is not entirely clear that one can be wholly confident what set of instructions/prompts or, more subtly, internal states caused the unexpected output of the LLM. The Humphreys opacity inside LLMs makes reverse engineering very difficult, and sometimes impossible. So, the feedback loop between error and learning becomes much more attenuated. Let me make this a bit more tangible below.

Reverse engineering as a tool for discovery is a common thread in the work of my teachers, the philosophers George E. Smith, Dan Dennett, and Bill Wimsatt. They have taught how reverse engineering reconstructs a hidden structure from visible traces or effects. Reverse engineering is very widely used in institutions dense with engineers, and it is an important tool in biology and anthropology.

So, for example, I learned from George Smith, who remained a turbine failure specialist throughout his philosophical career, that airplanes are designed in part with black boxes that record the output of many of the plane’s self-monitoring devices in order to facilitate such reverse engineering. Those are by no means decisive; often all the recoverable parts of the crashed plane also need to be reassembled painstakingly in giant hangars. Crashed planes have more than a little bit in common with giant fossils. Smith once noted that “in the case of aircraft failures, for example, even the total absence of evidence that some particular factor did not cause an airplane to crash–in other words, a total inability to eliminate that factor–can be sufficient reason to warrant doing something to safeguard against its causing a future crash.” (p. 549) As he and Jed Buchwald note in the review from which I quoted, this demanding attitude toward risk-factors is sensible in the context of the possibility of catastrophic crashes.

The governance structure of airplane safety is quite complex and evolving, even Stateside alone, and I do not mean to suggest it is the right model for LLM safety. But at the moment no airplane-black-box equivalent exists for LLMs; I sometimes wonder if self-monitoring devices at reasonable cost are even possible for these artifacts. This makes it all the more remarkable that so far no institutional infrastructure has been implemented, or is on the horizon, that can promote the management of the social externalities of the incredibly rapid uptake of LLMs. Expect turbulence ahead.


1  In reality, Humphreys opacity is just a species of epistemic opacity.
2 This need not be a feature of all automation. Sometimes, as Babbage conceptualizes it (recall), automated processes are de facto or in principle surveyable; but as Adam Smith recognized (and Paley picks up) this is not always so. (But see here for a corrective to both.)


1

Kenny Easwaran 06.29.26 at 5:08 pm

It strikes me that a lot of the Humphreys Opacity present in reliance on LLMs is of the same character as the Humphreys Opacity present in reliance on expert humans. Previous decades of computerization and automation involved making things more transparent, but there is some sense in which the LLM transition is undoing this, rather than intensifying it.

2

Eric Schliesser 06.29.26 at 10:49 pm

Hi Kenny,
First, I agree that Humphreys opacity in reliance on LLMs is of the same character as the reliance on expert humans. (That’s really a feature of my argument.) I am influenced by Millgram’s work on The Great Endarkenment and Jeffrey Friedman’s Power without Knowledge (which make arguments in the vicinity of mine).
Second, part of my argument is that when we have tried to eliminate some existing opacity through computerization and automation this generally created other forms of opacity. (There is by now a huge literature on how efforts at generating transparency basically get overwhelmed by an enormous amount of apparent noise.) Many databases were premised on transparency, and now almost nobody exists who can preserve them (or they interact in odd ways with other systems).

3

joeyjoejoe 06.30.26 at 1:15 am

“Humphreys opacity is in the news because the ideal to generate a so-called ‘glassbox’ AI — in which AI systems and machine learning models where the internal processes are fully visible, transparent, and interpretable to humans —”

Does Humphreys opacity apply to automobiles? TVs? Computers? My iphone? Google Searches? Xray or MRI machines? How my mountain bike or water pipes were manufactured? What milk pasteurization does? Exactly how my refrigerator works? Because I don’t know how any of those work. Or virtually any modern tool, or process, in contemporary existence.

But I guess ‘tech bro’s bad.’ So those don’t matter.

joe

4

J, not that one 06.30.26 at 2:56 am

It isn’t obvious to me that the concern with transparency in LLMs is the same as the concern in human experts. One concern with LLMs seems to be that most who want to use them assume no training period will be necessary, nothing like peer review will be necessary, and so on. Not to mention that we have nowhere near the background familiarity with AI that we have with humans. That raises a whole bunch of new concerns that seem specific to the technology.

Anyway, I always get a little concerned with discussions that conflate transparency to debugging with transparency to end-users. LLMs appear to be opaque in both ways, but there are plenty of cases where developers can follow the process that led to a result, where the processing can’t really be described in a way most people could grasp. (The same is true of any system involving math, too, I guess, and similar things. I wouldn’t understand how the airplane’s data explained anything, either.) I always feel like the text is always on the verge of saying software developers don’t understand that transparency is really only meant to refer to the latter.

5

Eric Schliesser 06.30.26 at 6:44 am

Humphreys opacity is explicitly relative to the responsible agent. So the answer to your implied objection (yes?) is ‘it depends on circumstances.’
I agree that when it comes to specifics it matters a lot what function and what kind of externality one is worried about.

6

D. S. Battistoli 06.30.26 at 10:38 am

An absolutely fascinating post, and an intriguing development of this line of thought!

I do think there is a distinction to be made between Humphreys’ epistemic opacity of LLMs and Rousseau’s epistemic opacity of bureaucracy.

Namely: a sovereign could dispatch an auditor ex post facto to find out why the bureaucracy produced a certain outcome. Now, there may be issues with informants’ truthfulness or recall, but in principle, what may not have been possible to know in the moment remains discoverable.

The same is not the case with AI. In creating an LLM, you provide it with training data, and it produces outputs. There exists and, as far as we know, there can exist no auditor capable of mapping how those outputs were produced, after any interval of time.

There is the famous example of Amazon’s attempt to use AI to determine the characteristics to look for in new hires by providing it training data consisting in both the CVs its staffers originally submitted and the subsequent performance of those staffers. Now, like most companies, Amazon had a bias toward hiring and promoting men, so it found, among other things, that having terms referring to womanhood on your resume was a mark of a probable underperformer, and using words like “conquered” was a mark of a probable high-achiever. Amazon tried re-weighting and re-training the model endlessly, but, despite having a good sense of what about the input data was causing the undesired outputs, it could not eliminate the undesired outputs by doing differently anything that it was in its power to change, except discontinue the program altogether (which it ultimately did).

This is because LLMs are not analogous to human beings or human societies. They are analogous to the union of the human mind and brain.

The Amazon example may complicate two of your points, Eric.

One is that companies making LLMs may understand poorly the negative externalities of using LLMs. The makers of LLMs are, in practice, among their first users. So, while in their role as inventors, they may not understand the drawbacks of their inventions, in their role as users, they may indeed. This is not to say that they are supremely trustworthy (Dr Jekyll understood before anyone else the side-effects of the serum that turned him into Mr Hyde. . . . ), but rather that they are not collapsible to the case of Socrates’ Thoth in Phaedrus.

The second, perhaps more pernicious, relates to your question of deskilling and parallel bureaucratic systems like the courts. Because as far as we know there exists no auditor, however constituted, that can check the LLM (very much unlike human-staffed bureaucracies), there is no level of skill maintenance or upskilling that can counteract it, absent perhaps maintaining a perfectly parallel alternate and human bureaucracy to which the sovereign may revert when it inexplicably performs undesirably.

7

oldster 06.30.26 at 12:07 pm

joeyjoejoe @3:
“Does Humphreys opacity apply to automobiles? TVs? Computers? My iphone? Google Searches? Xray or MRI machines? How my mountain bike or water pipes were manufactured? What milk pasteurization does? Exactly how my refrigerator works? Because I don’t know how any of those work.”

I thought the question of H’s O was equivalent to: when this system fails, can the relevant experts reconstruct the causes of the failure?

In which case, the answer to most of your questions is “no, H.O. does not apply, or not much.” When the parts of a mountain bike fail, it is generally not hard to work backwards to the source of the manufacturing defect. When a batch of milk spoils, then dairy experts work backwards to figure out why the pasteurization process failed.

When devices have more parts, then the process may require a more extended post mortem and more varieties of expertise. But Schliesser’s example of air crash investigations shows that the answer for cars and fridges is still, “no, not a lot of H.O. here either.”

Your own ignorance — my more comprehensive ignorance — is entirely irrelevant to the question.

8

Eric Schliesser 06.30.26 at 2:06 pm

Thank you for your kind comments, D.S., and for the extension of the implied framework. I agree that the model in the Phaedrus is too simple, also for the reasons you note. And I very much like your first point. On the second, I don’t want to rule out future developments that may make the internal steps of LLMs more auditable. But the drift of your argument is very much a kindred spirit to my own.
