In the Next Great Transformation, AI will not eliminate genuine expertise; rather, it will make it more valuable

by Eric Schliesser on March 2, 2026

About thirty years ago, a Stanford-educated philosopher, Paul Humphreys (1950-2022), realized that when connectionist models started to be developed within AI, a set of questions and debates about Monte Carlo simulations might become salient.* In particular, the fact that connectionist networks might be very complex, inscrutable matrices need not be an objection to their epistemic usefulness. This inscrutability of AI is known in recent scholarship as ‘the Black Box problem.’ After all, some Monte Carlo simulations were in practice also inscrutable, but this didn’t prevent physicists from using them. (There is a nice, accessible discussion by Eric Winsberg of the significance of Humphreys’ work in the philosophy of simulation here.)

In the course of his many papers on related topics, Humphreys coined a term, ‘epistemic opacity’ (or Humphreys opacity), that characterizes one of the key aspects of such inscrutability. (See also here; or here.) Such epistemic opacity — and now I paraphrase Humphreys — involves the inability of the decision-maker or responsible agent to surveil, in a timely manner, the steps of a process from a known input to a known and desirable (or truthful, useful, beautiful, etc.) output. I put it like that to make clear that this ignorance is pragmatic in character and could be modelled in terms of trade-offs between the quality or benefit of the output and the cost of surveillance. (Of course, it is possible that the opacity is not pragmatic but ontological in character.) In addition, I use the ambiguous language of ‘surveillance’ because the process can be computational, social, or natural in character.
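To make that pragmatic reading concrete, here is a minimal toy sketch of the trade-off in Python (mine, not Humphreys’; the function name and all numbers are illustrative assumptions): one accepts opacity whenever the expected loss from trusting an unsurveyed output is smaller than the cost of surveying the process.

def should_surveil(p_error, cost_of_error, cost_of_surveillance):
    # Surveil the opaque process only when the expected loss from trusting
    # its output blindly exceeds what the surveillance itself would cost.
    return p_error * cost_of_error > cost_of_surveillance

# A 2% chance of error on a high-stakes output still warrants checking...
print(should_surveil(0.02, 10_000, 50))   # True
# ...but not on a low-stakes one.
print(should_surveil(0.02, 100, 50))      # False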

I make no claim that epistemic opacity is unique to AI. Often human minds are opaque to each other in this very sense. And in other cases such opacity is characteristic of our self-knowledge. Even if one wishes to keep one’s distance from Freud and his school, it is uncontroversial that there are lots of brain processes that are inaccessible to us even though we can track their inputs and outputs.

In fact, epistemic opacity in Humphreys’ sense has been long recognized in the study of natural, psychological, and social processes. For example, for a very long time ‘sympathy’ was the term used to describe (a/the) cosmic and psychological mechanism(s) in which the process was invisible, even though the start and end of the process were visible. My interest below is not in this particular example, but I will suggest that the history of social awareness of the significance of epistemically opaque mechanisms may illuminate our discussion of the unfolding impact of AI.

In the quoted passage from the pull-quote at the top of this post, Terence Tao (a Fields medalist in mathematics) describes the very specific species of ignorance that I have been calling ‘Humphreys opacity’ (or, if you prefer, ‘epistemic opacity’). What’s neat about this particular instance is that, at the moment, Tao’s state of opacity about the process (the ‘journey’) that led to the AI proof mirrors the opacity of the machine that ‘helicoptered’ there. For now, there is no way of recovering the machine’s journey to its answer. (Presumably with time and effort some kind of reverse engineering might be possible, even if it involves an intentional stance.)

Tao’s view is that in mathematics the process of discovery is very valuable, even though that process may be slow and involve a lot of possible dead ends. We may say that during the older process of discovery, one didn’t just learn the truth, but also quite a bit about the tools of the trade that can be used to discover the truth (and how different mathematical objects and fields relate to each other). Now that AIs are starting to reach truth quickly, or, to put it more precisely, without giving us access to the underlying mathematical landscape, we encounter a trade-off between truth and (let’s call it) informativeness.

In the interview, as reported, Tao never uses the word ‘truth.’ Rather, he phrases his analysis in terms of the ‘answer’ the machines provide. It’s worth conveying how he puts it:

One very basic thing that would help the math community: When an AI gives you an answer to a question, usually it does not give you any good indication of how confident it is in this answer, or it will always say, I’m completely certain that this is true. Humans do this. Whether they are confident in something or whether they are not is very important information, and it’s okay to tentatively propose something which you’re not sure about, but it’s important to flag that you’re uncertain about it. But AI tools do not rate their own confidence accurately. And this lowers their usefulness. We would appreciate more honest AIs.

In reflecting on Tao’s comments, it’s worth distinguishing between two issues. First — and this is the topic highlighted by Tao — the AI machine excelling at mathematics does not report its own ‘confidence’ in its answer accurately. Second, even if it reported such confidence, it could still be wrong about the answer it provides (and, perhaps, be misreporting its confidence too). This is especially so with AI that is embedded in LLMs (Large Language Models). After all, there is no evidence that such AIs have eliminated hallucinations altogether, or that this is even possible (at low enough cost and time).
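As an aside, what ‘rating their own confidence accurately’ would amount to is usually cashed out as calibration: of the answers tagged with, say, 90% confidence, roughly 90% should turn out to be correct. Here is a minimal sketch in Python, on invented data (nothing below comes from Tao or the interview):

from collections import defaultdict

# (stated confidence, did the answer turn out to be correct?) -- invented data
reports = [(0.99, True), (0.99, False), (0.99, True), (0.99, False),
           (0.70, True), (0.70, True), (0.70, False), (0.70, True)]

by_confidence = defaultdict(list)
for confidence, correct in reports:
    by_confidence[confidence].append(correct)

for confidence, outcomes in sorted(by_confidence.items()):
    accuracy = sum(outcomes) / len(outcomes)
    # A positive gap means the system is overconfident at this stated level.
    print(f"stated {confidence:.0%}: right {accuracy:.0%} of the time (gap {confidence - accuracy:+.0%})")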

To be sure, the current generation of commercially available flagship LLMs (GPT-5, Claude Opus 4.5, etc.) are genuinely impressive. (And presumably the ChatGPT that solved these outstanding math puzzles, on which Tao comments, is even further ahead of the curve.) During the last month, they have finally reached the level of interesting research assistance in my own field. But, second, don’t let anyone claim they have stopped hallucinating. (If you dislike that phrase, I am happy to call it ‘ungrounded content.’) Crucially, for a lot of purposes this makes LLMs inefficient tools, because you often can’t just eyeball the errors; you really need to pay attention and double-check their output. Keep this in mind, too.

There are super-interesting issues lurking here about what it would mean to have AIs internally model or represent their own confidence. (Would they be simulating human confidence reports, as if they were Terence Tao or some much lesser mathematician, or would they develop their own approach; would they have debates about Bayes? Etc.) But that’s not my present main interest.

As readers will undoubtedly be aware, there is a persistent and increasingly vocal line of thought that AI will eliminate all knowledge work. And it is no doubt the case that the fate of junior and mid-level computer coders at the moment foreshadows a more general disruption. Let’s stipulate that AI will indeed threaten lots of white-collar work (I call this the ‘next great transformation’). And that even in the sciences it will transform discovery and how disciplines will interact with each other, as Tao suggests. (Go read the interview.) So, philosophy of science will have a busy time ahead.

My main interest is this: Tao’s comments alert us to the fact that there is a class of problems where answers supplied without surveyable information on the means or steps for finding them are themselves fragile. At the research frontier, somebody very skilled needs to check the ‘answer.’ This is why even in mathematics there is a social component to the process of justification. And, as AI eliminates all the low-hanging fruit, the difficulty and cost of checking go up as understanding of the landscape becomes very thin. Even if we could build machines to check the AI (and machines to check those, and so on), there would be a need for diagnostic tools that must themselves be maintained and repaired. Since these machines will suffer from Humphreys opacity, this challenge becomes endemic.

As the next great transformation advances, before long AI may well drive discovery and justification and, thereby, as Tao suggests, transform different sciences. We may well have to get used to playing second fiddle to AI practices. However, Tao’s remarks also suggest that genuine expertise will be at a premium as we transition to a world suffused with modern AI. This is because modern AI systematically introduces Humphreys opacity and hallucinates alongside the cutting-edge answers it provides. As the output or ‘answers’ of AI scale up, we will increasingly need the skilled judgment of humans as part of quality control. Of course, to what degree genuine expertise will be able to capture that value in our oligarchic data economy is a different question.

That’s the main point I wanted to make. But there is a second point lurking here. The institutional infrastructure of a universe full of Humphreys opacity is itself quite dense. In his (1755) Third Discourse, Rousseau notes that epistemic opacity is introduced as governments switch in scale from estate management to management of whole peoples and nations. That is, such a sovereign cannot survey the population in real time. In fact, the sovereign must introduce regular government and begin to rely on intermediaries (a bureaucracy, viceroys, tax-farmers, delegated parliaments, etc.) in order to overcome the effects of this first-order epistemic opacity. Unfortunately, the very mechanism by which epistemic opacity is tackled often introduces a different, second-order (or ‘derivative’) epistemic opacity.

This phenomenon can be illustrated by the following thought: once one diagnoses how the very social mechanism (say, a bureaucracy) one introduces to tackle first-order epistemic opacity generates higher-order epistemic opacities, one may be tempted by two strategies. First, one may introduce monitoring mechanisms that surveil the bureaucracy one has instituted. Second, one may invest in mechanisms (a census, real-time street cameras, infrared tags, a machinery of record that tracks births, deeds, etc.) that make the population being governed more legible. In both cases, these mechanisms will themselves generate further kinds of third-order (‘second derivative,’ etc.) opacities, and so on. So, for example, inspired by Rousseau and the scandals at the East India Company, Adam Smith begins to diagnose principal-agent problems. Eighteenth-century European thinkers look to China to learn how to develop an effective bureaucracy and the accompanying institutions that allow the management and governance of dispersed and heterogeneous populations.

From the middle of the eighteenth century to the present, we can discern a fairly large growth over time in institutional structures to manage epistemic opacity. Governments have organized not just records to make populations legible (as James C. Scott and Foucault might have put it) and to provide public goods that can coordinate commerce, but have also developed and maintained mints and central banks, weights and measures, and all kinds of data/information/statistics on the natural and social environments, not least the economy. This has, of course, generated opportunities for profit as well as strategic agents who may wish to undermine trust in the government’s machinery of record and measures.

This is not just a role for government. Companies both encounter and generate epistemic opacity in their own products. Sometimes the quality control to maintain the standards and homogeneity of a mass-produced product may be as costly as the manufacturing and packaging of the underlying product. And companies may well come into conflict with each other as they do so.

This natural growth in government activity in managing epistemic opacity, and in providing a legal framework for handling the resulting conflicts, is, by the way, the enduring lesson of Walter Lippmann’s (1937) The Good Society. Accurate information that is appropriately public — perhaps even, to adapt a phrase from Tom Pink, witnessed as truth (recall) — requires an enormous machinery of record and a legal infrastructure that helps adjudicate conflicts over the identity of, and property rights in, that information and the consequences of its use. Even some lawyers may survive the next great transformation.

 

*As it happens, during the last two weeks, I had the opportunity to have long talks with Katie Creel (Northeastern) and Ryan Muldoon (Buffalo) during my visits to their programs. Their views have shaped my own here. In addition, I have benefitted from Nick Cowen’s and Neil Levy’s comments on an earlier post at DigressionsNimpressions (here).

 

{ 31 comments }

1

J-D 03.02.26 at 1:51 am

As the next great transformation advances, before long AI may well drive discovery and justification and, thereby, as Tao suggests transform different sciences. We may well have to get used to playing second fiddle to AI practices.

Something like this may happen.

Or, then again, it may not happen.

It seems an estimate of the odds might be useful, if anybody can produce one.

2

David Duffy 03.02.26 at 5:31 am

It could be argued that the protein folding solutions (ie predictions of the 3D structure of a protein without having to do X-ray crystallography etc) from AlphaFold, which won a 2024 Nobel Prize, are fairly opaque. It bumped up the number of solved proteins from 200,000 to 20,000,000 in 4 years. Just as interesting are the novel predicted proteins, which the first paper called “deep network hallucinations” ie they have the same cause as the annoying ones in one’s reference list, but are a lot more useful as candidates for novel therapies.

3

Neville Morley 03.02.26 at 7:33 am

Tangential and trivial, but I don’t think the helicopter metaphor works very well; if nothing else, one might observe its initial flight pattern to develop hypotheses about its route to the destination which could then be followed up. Rather, it’s like an alien teleport machine; you set desired coordinates, step into it and disappear, reappearing somewhere else (which may or may not be where you wanted – how can you be sure?) having somehow travelled via goodness only knows.

4

engels 03.02.26 at 2:52 pm

I’m sure I’d be happier about losing my job to an AI if I learned it meant I was never a “genuine” expert to begin with.

5

Eric Schliesser 03.02.26 at 3:09 pm

Fair point, Neville.

6

somebody who remembers that essentially all economic growth in the industrialized world for the last 10 years is this slop garbage 03.02.26 at 4:17 pm

it is all well and good to speculate about whether the spicy autocomplete websites will replace all human knowledge based on having knowledge, but it will be more instructive to consider whether they will replace all human knowledge based on the people who hate human knowledge having all the money in the world to eliminate it permanently. right now there’s a school in oklahoma which fired its teachers and teaches its students over a loudspeaker, where an AI voice reads AI scripts, because the teachers were woke and probably gay and trans. this is how all education will be forever, not because it’s good or effective but because it’s desirable for those with power for it to be this way. what other power is there in the world? does it have a trillion dollars in its pocket to fight this? because the chatbot boys have a trillion dollars in their pocket, and a marvelous dream of a world where they never have to speak to a girl again.

7

J, not that one 03.02.26 at 6:35 pm

One thing I worry about is simple lack of awareness that an AI could be wrong. Recognizing this requires being aware not only that gaps in training would create hallucinations as the system attempts to fill them from faultily generalized knowledge, and also as the system overgeneralizes to the point of ignoring counterevidence, but also that the input to the system needs not to be considered absolute and final complete Truth, because that’s something we humans don’t yet have either. It would require being aware that just funneling all the facts into a computer would be unlikely to create usable generalizations even if it found all the generalizations that actually exist. It would require being aware that the data being fed into the system is linguistic data and not “facts” in the sense empirical sciences use, and that there’s a difference between how those two types of knowledge are formed. (It would also require being aware that philosophy of AI that assumes a perfect AI might not be a practical guide to whether an actual AI should be trusted.)

It would be entirely possible for a consensus to form, among people with the right kinds of power and influence to enforce that consensus, that the AI doesn’t make mistakes, and what appear to be mistakes are actually some kind of user error, misinterpretation, willful resistance to the facts. It would be entirely possible for a consensus to form that even though AI is fallible, it’s better than the alternatives and that there are benefits to lying about its capacities. Opacity might be seen as acceptable in those circumstances, given the tradeoffs.

As a practical matter, however, how to get out of the gravitational field of a limited AI, after it’s been given absolute epistemic power, would remain as a potentially existential danger.

8

Cheez Whiz 03.02.26 at 7:19 pm

AI and the society it intends to transform are in some sort of quantum uncertainty, lots of unknowns, lots of angles, lot of ins and outs to keep track of. One of the few things we know is it takes burning an insane amount of cash to keep the current uncertain system running, a textbook example of a thing that can’t go on going on until it can’t. How dependent are the AI transformation predictions on this state continuing indefinitely?

Second, it is extremely clear that the problems of fact and truth are big stumbling blocks for the transformation. Are the people who supposedly understand all this aware of how the AI companies are addressing this, if at all? Or are we still at “and then a miracle happens”?

9

somebody who looked around once during their life 03.02.26 at 10:37 pm

Cheez Whiz @ #8 characterizes the complete inability to report facts and truth by AI as a stumbling block. Very true, for a normal person, but the wealthy and powerful are not normal people, their minds do not operate in such a way. (it is left for an exercise for the reader how harvard and yale might react differently to this technology than, say, the university of california los angeles.) a critical feature of the technology is its passionate embrace by white supremacists and fascists, the most normal footsoldiers capital has on offer these days. they, like billionaires, fear the interiority that ordinary people have, and know that their power means they can get someone to tell them whatever they want, rendering the ordinary courtier’s praise of their genius (stupid) ideas meaningless and obedience to their beautiful (stupid) orders hollow. but the emperor doesnt need clothes, if grok will always say he’s dressed.

in fact, a certain degree of hallucination is desirable. after all, reality is just some stupid roadbump on your way to total sexual and racial domination. as you choke down your glue pizza and eat three rocks a day because the ai voice on your favorite grindset superpodcast orders you to, you smile, knowing SHE would never eat a pizza like this. “women don’t understand” the ai voice – female coded, of course – coos in your earbuds, then asks if you want to order more steroids to stop the vaccines – ten percent off with promo code “SAMALTMAN”.

10

J-D 03.02.26 at 11:11 pm

right now there’s a school in oklahoma which fired its teachers and teaches its students over a loudspeaker, where an AI voice reads AI scripts

Which one?

11

J-D 03.02.26 at 11:22 pm

It would also require being aware that philosophy of AI that assumes a perfect AI might not be a practical guide to whether an actual AI should be trusted.

There’s enough information available about how people do things and about how existing computer programs do things to conclude reliably that when computer programs (for example) play chess, or Go, they’re not doing it the way humans do it and that, likewise, when LLMs answer questions they’re not doing it the way humans do it. However, all the philosophical arguments I’ve ever seen that try to establish that it’s impossible in some absolute sense to program a computer to do things the way humans do them fall down because not enough is known about how humans do these things to draw such impossibility conclusions reliably. On the one hand, the limits on present understanding of how humans do things impede attempts to prove that computers can never be programmed to do things humans do in the same way that humans do them; on the other hand, those same limits are a reasonable basis for being dubious about the prospects. In short, it’s reasonable to suppose that it is in some sense possible that we’ll get there one day, but it’s equally reasonable to conclude that so far we’re nowhere close to it.

12

Jim Harrison 03.03.26 at 3:54 am

Nick Bostrom introduced the idea that an AI system set up to maximize the production of paper clips might destroy the world in its monomaniacal quest to do just that. Which got me thinking that there’s already a system dedicated to maximizing something, namely capitalism, whose raison d’être is endless accumulation. Maybe AI is best understood as the market economy developing something like consciousness, capital coming to a head like a pimple.

13

Moz of Yarramulla 03.03.26 at 7:13 am

One thing I’ve been thinking about is the various attacks on LLMs, especially attacks via ‘poisoned’ training data, and how to distinguish those from all the other inputs we feed into the LLMs. How do we tell whether a given input will be good or bad?

I don’t think we know, and I don’t think we have a very good understanding of how we might find out. We’re more at the “noxious miasma” stage of AI-medicine than even the cellular life stage. There’s a lot of intuitive knowledge, but not a lot of systematic knowledge.

You see this most publicly in the various “AI made to do something insane” articles, but for me, writing software every day, the more practical one is… actually, “the AI doing something subtly insane” is more accurate. Rather than glue on pizza, it’s a complete re-write of a test such that it no longer functions, and thus the new code that “passes the test” in fact does not.

14

Jolly Roger 03.03.26 at 10:26 am

Which one?

And how many of their students haven’t beaten each other up, vandalised the class, wandered off, talked over the recording, or just ignored whatever it was saying?

15

MisterMr 03.03.26 at 12:16 pm

@J-D 11
“However, all the philosophical arguments I’ve ever seen that try to establish that it’s impossible in some absolute sense to program a computer to do things the way humans do them fall down because not enough is known about how humans do these things to draw such impossibility conclusions reliably.”

I don’t think that it makes sense to say that it is absolutely impossible to program a computer to think exactly as a human, but this is a different argument from “present day computers think exactly as humans”, which is obviously false.
There is the question of how much “exactly as a human” is enough; for example, I don’t think anybody wants to program AIs to feel anxious or envious, however feelings do play a role in human thinking.
I think the confusion comes from the fact that people are trying to simulate only the “higher” functions of the human brain, but “intelligence” relies on the lower functions too.

16

J, not that one 03.03.26 at 4:04 pm

Moz @ 13

a complete re-write of a test such that it no longer functions and thus the new code that “passes the test” in fact does not

My feeling is generally that I can code many things, certainly in a work situation, faster than I can have a conversation with a chatbot where I gradually persuade it to do what I asked. On the other hand, if I wanted to code a simple app, and an LLM could replace an IDE, I’d find that helpful. (Maybe not over time if I was doing the same thing over and over again, but if requirements change every year or two and I don’t code apps very often.) On the other hand, if LLMs end up being a slightly significantly more sophisticated markup language, with the conversation replacing a debugging session, that’s useful but not quite general intelligence (and it sounds like we’re not there yet either).

A couple of decades ago you’d find people complaining that programmers needed more methodology, code reviews, etc., etc., all those things we put in place. And now we’re going to abandon that just because we’ve replaced the programmer with a chatbot? (An “AI” that was deliberately programmed not to be rigorous in the way those methodologies impose.) Really? Those complaints were just that people are unreliable but we’ll gullibly use the first “machine” that claims to be able to replace them? (And again, a “machine” whose claim to fame is that it acts more like a human than a machine.)

17

Aardvark Cheeselog 03.03.26 at 5:00 pm

Just some superficial reflections on OP and the thread(s):

I wonder about people who coin terms like “epistemic opacity,” how much time they have spent on the pursuit of trying to feel their way around the very opaque issue of how knowledge differs from unjustified (or inadequately-justified) belief. As phenomena that demand explanations go, things don’t come much more opaque than that. Perhaps if one finds oneself tempted to coin a neologism with the adjective “epistemic” one ought to examine closely exactly what one means.

OP cites famous mathematician observing that “AIs” (by which are meant LLMs) don’t judge their confidence reliably, and that they “would appreciate more honest AIs.” Aye, indeed, Famous Mathematiker has pointed directly at the Holy Grail: a mechanism that would make an LLM STFU when its training data is too thin to allow it to guess accurately what a correct answer would look like. I personally remain certain that this capability will not arise spontaneously in any model that is trained the way LLMs have been trained so far. This is the 2nd time just today that I have observed in a comment that the most noteworthy shortcoming of the LLM approach is that it can never, under any circumstances (other than ad-hoc rules about specific questions put in after training) answer “I’m sorry, Dave. I don’t know the answer to that.”

In this thread are a lot of people who would probably like to be taken seriously, but whose insistence on referring to LLMs with epithets like “spicy autocorrect” militates against that.

18

somebody who hadn't checked in a while 03.03.26 at 6:31 pm

Thanks to those asking for details about the Oklahoma school I described. I hadn’t actually checked in a while and was delighted to learn that the minds behind it – a charter school with only a few teachers! the ultimate expression of american education technology – were now on trial for racketeering and embezzlement. The school itself seems to have, ah, changed direction since that time. enjoy the drama and don’t worry about these guys too much. ai boys get pardons like kids on halloween get candy.

19

Alex SL 03.03.26 at 8:31 pm

During the last month, they have finally reached the level of interesting research assistance in my own field.

I am always puzzled by this, because whenever I try them, they fail me. A few months ago, I tried a presumably “agentic” system that was purpose-built for research, and its literature review function was like a clown car driven into a ditch that is also on fire. In the first attempt, it repeatedly used references to support statements that they did not actually support, and in the second, it got stuck down a rabbit hole of a small subset of the space it was meant to summarise, doing the equivalent of a student who is asked to provide a ten page overview over WW2 and then spends six pages on the sinking of the Bismarck.

The usual response is that I should shut up until I have tried the pro subscription of the currently most publicly hyped model (at the time of writing, Claude), which reminds me forcefully of religious apologists insisting that I read book after book after book written by Sophisticated Theologians and never accepting the legitimacy of being an atheist because there are still more books that could convince me to convert.

More to the point, the helicopter ride is a very bad idea for research. Science isn’t science if we cannot explain why we got a result, and science isn’t science if the results cannot be reproduced. This means that generative AI should not be part of any analysis workflow. And using it for writing manuscripts or grant proposals but then putting a human’s name onto the title page is quite simply fraud. What is more, until there are cheap robots doing all of the data collection, lab work, field work, experimentation, moving around of specimens, etc., even a hypothetical AGI will not be able to replace researchers and research technicians.

That being said, whether AI works reliably enough, or whether it can actually replace researchers, ultimately doesn’t matter. The world is full of a seemingly constantly growing number of people who do not care about aligning their worldview with reality or even about making it logically consistent, who in a year of record high temperatures point at snow and say “so much for global warming, eh?”, who take Musk seriously when he talks about Mars colonisation or hotels on the moon, who cheer war with countries they would not be able to find on a map. It is full of managers who want to hear only positive sound bites from staff whose jobs they are cutting, who they unnecessarily force back into the office, who they stuff into open plan offices despite all the studies showing those to be less productive and incubators of infectious diseases. It is full of people who take carbon capture seriously because they have seemingly never heard of the laws of thermodynamics, or who think that betting markets and deflationary currency are great ideas, or who argue that driving all of nature to extinction is fine because we can just GMO new plants and animals. The combination of self-confidence and not understanding how anything whatsoever in the universe works at all that is exhibited by most people is horrifying.

So, I am not worried that a probabilistic word generator can replace me, but conversely, I am not comforted by the idea of genuine expertise either. If I lose my job during the great AI transformation, it will not be because of that. It will be because too few people care about knowledge, expertise, and accuracy, and will accept LLM-generated word salad as good enough even where it objectively and demonstrably isn’t; they will just go lalala I can’t hear you at those who point that out.

20

both sides do it 03.03.26 at 8:53 pm

“somebody who” . . . something just clicked for me, are you . . . Atrios?

would love to know if I’m right

please drop me a line at “bothsidesdoit” gmail (it’s ironic lol) if I’m barking up the right tree, I won’t indicate publicly either way

oh right the post itself:

it highlighted the connection b/t legibility in a model and legibility to a state / institution, which I had been dancing around when thinking about these things for the last few months but hadn’t quite pinpointed. seems like it’ll be a useful spanner in some ways

thanks

21

Moz of Yarramulla 03.04.26 at 2:25 am

if LLMs end up being a slightly significantly more sophisticated markup language,

To be useful the output has to be deterministic. Otherwise “what changed” turns into a nightmare of “everything”. I suspect similar problems apply to people who use LLMs to “proofread”, where they make a pile of innocuous changes while quietly negating a core argument. Or just relabelling a graph so it’s nonsense. Your post-proofreading step ends up being “pore over the whole document with a microscope”.

22

notGoodenough 03.04.26 at 9:14 am

While it may be somewhat passé to say “the problem is less technology, but more technology under capitalism”, I confess to a sneaking suspicion that AI (by which I mean adaptive algorithms that improve through data and feedback generally, not just LLMs) would largely remain occasionally useful, occasionally employed tools if it were not for our elite overlords insisting on embedding it into every facet of modern life. This expansion, naturally, raises the spectre of a whole host of problems: the devaluation of labour; the centralisation of human endeavour into systems largely controlled and owned by a handful of tech billionaires; the widespread deployment of systems incapable of bearing responsibility for their decisions; the unprecedented ability to generate large volumes of superficially plausible misinformation or harmful content; and, not least, the exacerbation of our elites’ tendencies toward self-reinforcing delusions via access to devices capable of producing plausible-sounding justifications for their beliefs; etc.

Even setting aside all that, I think there is a lot to be concerned by. To echo others, there is a threat not that people could be replaced by AI but more that they will be replaced regardless (with the ensuing price paid by the public). The increasingly popular idea of AI “coaching” employees evokes a delightfully dystopian vision, combining the tireless surveillance of a panopticon with the worst forms of micromanagement – all packaged in a veneer of pseudo-helpfulness likely to drive one to madness. We might also imagine a speculative bubble bursting, obliterating vast investments (inevitably followed by massive public bailouts to stabilize the industry). Meanwhile, AI contributes to raising the cost of technology across the board: not just high-end gaming computers, but every inconspicuous “grey box” from dialysis machines and portable ECG/EEG devices to signaling systems and environmental monitors. The net effect is a further concentration of wealth into ever fewer hands – still, after all, isn’t that the real purpose of life under capitalism?

23

JimV 03.04.26 at 11:11 pm

Here’s a small data point as to whether human thought processes have any commonality with neural networks:

The first neural network I learned about, in the early 1990’s, was trained to recognize hand-written decimal digits, from 0 to 9. It had this property: to whatever you scribbled in one of its input boxes, it would assign a numerical digit. It would never indicate that it wasn’t sure or couldn’t respond.

My eyesight is getting bad, and I recently squinted at a two-digit number on my computer screen. My optical neurons showed me “89”. Then I removed my glasses (I’m near-sighted) and looked again with my nose almost on the screen. This time, with equal confidence, I saw “55”.

So it seems to me my optical neurons made their best guess the first time, wrongly, and presented the hallucination to me with no indication of a lack of confidence.

I understand that brain neurons and their synapses are about 1000 times more complex than the capability of a single node in a neural network, so it takes huge numbers of them to do anything interesting, and it will probably take combinations of many of them with different tasks. Which so far has not been the case, I think; AlphaGo had only two sets according to the paper. I don’t know how many an LLM has, but I think they lack a separate executive function to estimate things like confidence.

24

J-D 03.05.26 at 2:42 am

Thanks to those asking for details about the Oklahoma school I described.

Knowing the name of the school, I was able to search the Web for more information about it. It seems it specialised in online education, which doesn’t seem to match up with the specific description given earlier.

(That it should turn out that the people who set up a charter school were grifting is no surprise to me.)

25

J-D 03.05.26 at 2:50 am

I am always puzzled by this, because whenever I try them, they fail me. A few months ago, I tried a presumably “agentic” system that was purpose-built for research, and its literature review function was like a clown car driven into a ditch that is also on fire.

Bret Devereaux at A Collection Of Unmitigated Pedantry reported asking a Large Language Model to describe the relationship between two specified books (neither of which I have read myself, so I can’t comment from that kind of knowledge). The answer supplied gave brief (and, apparently, flawed) summaries of the two books and then concluded with a characteristically generalised bromide about how the topic was similar for the two books but they approached it in different ways.

Using as a research tool not a Large Language Model but an ordinary Web search engine easily turned up a review of the later published book which revealed the key fact which the Large Language Model had failed to report, that it (the later book) was presented in large part as a rebuttal to the earlier book.

The Large Language Model’s ability to provide (even faulty) summaries of the books showed that its training data set contained information about them, presumably reviews; but it was unable to pick out from that information the point which was really salient.

26

Kaleberg 03.05.26 at 6:03 pm

Someday, LLMs will be as quaint as Diderot’s Encyclopedia. They’ll have a certain dated charm and provide historians with a distorted but valuable window into the past. They stand on the shoulders of giants, having digested all of those pirated books, papers, and websites, personal and corporate. Even now, that commons is being enclosed. Worse, once LLMs come into production, the waters will get muddier.

Training future LLMs becomes more challenging when LLM or LLM-assisted output becomes part of the training mix. The eigenvectors of mediocrity will come to dominate. Experiments with image iteration show that the end result is a dozen or so stock photos, and there is no reason to believe text or video will fare better.

27

Alex SL 03.06.26 at 8:53 am

J-D,

Yes, the key here is that LLMs do not actually understand anything. They extrude words, and often they guess right what words we expected, and then we are also biased towards assuming that something that can produce language is sentient, so we fill in the rest.

It helps if the outputs of an LLM can be easily verified. Two good examples are coding and literature/references. Run the code, see if it does what we expect, if not try again. An “agent” can do that, and in the end the code does what the user wanted, and the question is only how many tokens it cost. In research, the “agent” can check if a paper that is mentioned in the generated text can actually be found in a literature database. What is much more difficult to check is if the LLM understood correctly what that actually existing paper was saying.

Kaleberg,

I don’t know how worried I have to be about model collapse, but it has occurred to me long ago that AI engineers are uniquely relaxed about that possibility. No other field that uses any kind of model is like them. For example, when the possibility of running out of training data was first discussed, many AI CEOs and their hangers-on said they would just train the next models on the outputs of the previous generation. If we translate that idea into the context of, say, ecological or evolutionary models, it is immediately revealed as completely deranged.

What now, I fit a model to the data, then I make some predictions, then I fit the model on my predictions, I make more predictions, then I fit the model again on its predictions… if you suggested that to colleagues, they would start asking questions to the effect of whether you have bumped your head or how many figures you think they are showing to you. But in LLM land, that idea is just Tuesday.

28

Zamfir 03.06.26 at 12:09 pm

@Alex,
The LLM developers are gathering a huge collection of real-world usage information. That means that “train on the outputs of the previous version” does not happen in an information-free vacuum. It comes with additional information on which output was considered useful by its users, and which was not.

Some of that is people extruding slop without any care. But it also includes dozens of millions of people with genuine expertise who are daily feeding and reviewing their work through the servers of the LLM people.

A typical example: the IT guy of the company where I work makes instructions on how to use all kinds of software. He would start with instructions from his suppliers, then modify them. Now he asks some Microsoft LLM tool for a first draft, and modifies that – or more typically tells it to make changes.

As he is very aware, he (and his thousands of colleagues in other companies) are basically doing free work for Microsoft. Even if they don’t provide any info to the LLM, just accepting the output or not is relevant information.

Something similar happens on a more zoomed-out scale. Think of software spit out by LLMs. The LLM people get detailed feedback from reviewers, but even for mostly unreviewed code they often get to see which code makes it to production and stays fairly unmodified, and which code ends up in the wastebin.

29

J, not that one 03.06.26 at 4:07 pm

“What now, I fit a model to the data, then I make some predictions, then I fit the model on my predictions, I make more predictions, then I fit the model again on its predictions…”

Yes, the idea seems to be that you fit a model to the data, and that gives you truths (not predictions), so naturally feeding those truths back to the same system would produce even better truths! There’s a failure here which should be obvious, and apparently a lot of people don’t notice. We should fix that.

30

JimV 03.06.26 at 6:50 pm

From Dr. Aaronson’s “Shtetl-Optimized” website today:

“Donald Knuth has published a 5-page document* about how Claude was able to solve a tricky graph theory problem that arose while he was working on the latest volume of The Art of Computer Programming—a series that Knuth is still writing after half a century. As you’d expect from Knuth, the document is almost entirely about the graph theory problem itself and Claude’s solution to it, eschewing broader questions about the nature of machine intelligence and how LLMs are changing life on Earth. To anyone who’s been following AI-for-math lately, the fact that Claude now can help with this sort of problem won’t come as a great shock.”

https://www-cs-faculty.stanford.edu/~knuth/papers/claude-cycles.pdf

Excerpt:

Shock! Shock! I learned yesterday that an open problem I’d been working on for several weeks had just been solved by Claude Opus 4.6—Anthropic’s hybrid reasoning model that had been released three weeks earlier! It seems that I’ll have to revise my opinions about “generative AI” one of these days. What a joy it is to learn not only that my conjecture has a nice solution but also to celebrate this dramatic advance in automatic deduction and creative problem solving. I’ll try to tell the story briefly in this note.

Here’s the problem, which came up while I was writing about directed Hamiltonian cycles for a future volume of The Art of Computer Programming:

Consider the digraph with m^3 vertices ijk for 0 ≤ i, j, k < m, and three arcs from each vertex, namely to i+jk, ij+k, and ijk+, where i+ = (i+1) mod m. Try to find a general decomposition of the arcs into three directed m^3-cycles, for all m > 2.

31

Alex SL 03.06.26 at 9:22 pm

Zamfir,

What I am talking about is e.g. not having enough videos to train video generating models and then claiming that they can be trained on videos they created themselves, or the same for novels. Sure, if humans invest enormous amounts of time to create fresh data, like your colleague times hundred million other people, then there are fresh data. But the claim I am addressing was that this can all be automated except for some annotation of the training data.

The problem is not what if the LLMs of 2030 have been trained on a carefully quality controlled, guaranteed accurate corpus of data, but what if the LLMs of 2030 have been trained on social media that are by then 50% LLM-generated propaganda, scams, and shrimp jesuses, on hundreds of thousands of LLM-generated eBooks that were mass-produced by a handful of scammers at a rate of five per day in the hope that each would be bought by about eight or ten people by accident, and on websites that are low-effort LLM-generated “information” existing entirely to sell something.

I have suspicions regarding Botanical Realm, for example; it has alleged “information” on most plant species I have ever googled for in my line of work. It features tens of thousands of pages, an effort that would require an enormously large team or community effort to write, maintain, update, and quality-check if it was done by humans. The text on every page I have checked so far screams LLM-generated:

“[Plant species] ([Latin name]) is a fascinating species within the [plant family], known for its striking appearance and resilient nature. This herbaceous perennial plant has garnered interest from botanists, ecologists, and gardeners alike due to its unique characteristics and ecological role. In this article, we will delve into the various aspects of this plant, from its physical traits to its importance in our ecosystems.”

By the way, this is from the page about a globally invasive weed that no gardener would find attractive, and it isn’t perennial either.

And then there is a shop tab where you can buy some indoor potting mix, which doesn’t feel like it could fund an operation large enough to write the website. This was almost certainly extruded by an LLM with no quality control whatsoever. Will people who train a later iteration of LLMs necessarily realise that? Will they do so for all the other garbage that is out there?
