There’s a great anecdote about Roman Jakobson, the structuralist theorist of language, in Bernard Dionysius Geoghegan’s book, Code: From Information Theory to French Theory. For Jakobson, and for other early structuralist and post-structuralist thinkers, language, cybernetic theories of information, and economists’ efforts to understand how the economy worked all went together:
By aligning the refined conceptual systems of interwar Central European thought with the communicationism of midcentury American science, Jakobson envisioned his own particular axis of global fraternity, closely tied to forces of Western capitalist production. (He colorfully illustrated this technoscientific fraternity when he entered a Harvard lecture hall one day to discover that the Russian economist Vassily Leontieff, who had just finished using the room, had left his celebrated account of economic input and output functions on the blackboard. As Jakobson’s students moved to erase the board he declared, “Stop, I will lecture with this scheme.” As he explained, “The problems of output and input in linguistics and economics are exactly the same.”)*
If you were around the academy in the 1980s to the early 1990s, as I was, just about, you saw some of the later consequences of this flowering of ambition in a period when cybernetics had been more or less forgotten. “French Theory” (to use Geoghegan’s language), or Literary Theory, or Cultural Theory, or Critical Theory** enjoyed hegemony across large swathes of the academy. Scholars with recondite and sometimes rebarbative writing styles, such as Jacques Derrida and Michel Foucault, were treated as global celebrities. Revelations about Paul de Man’s sketchy past as a Nazi-fancier were front-page news. Capital-T Theory’s techniques for studying and interpreting text were applied to ever more subjects. Could we treat popular culture as a text? Could we so treat capitalism? What, when it came down to it, wasn’t text in some way?
And then, for complex reasons, the hegemony shriveled rapidly and collapsed. English departments lost much of their cultural sway, and many scholars retreated from their grand ambitions to explain the world. Some attribute this to the Sokal hoax; I imagine the real story was more interesting and complicated, but have never read a good and convincing account of how it all went down.
Leif Weatherby’s new Language Machines: Cultural AI and the End of Remainder Humanism is a staggeringly ambitious effort to revive cultural theory, by highlighting its applicability to a technology that is reshaping our world. Crudely simplifying: if you want to look at the world as text, if you want to talk about the death of the author, then just look at how GPT 4.5 and its cousins work. I once joked that “LLMs are perfect Derrideans – ‘il n’y a pas de hors-texte’ is the most profound rule conditioning their existence.” Weatherby’s book provides evidence that this joke should be taken quite seriously indeed.
As Weatherby suggests, high-era cultural theory was demonstrably right about the death of the author (or at least, about the capacity of semiotic systems to produce written products independent of direct human intentionality). It just came to this conclusion a few decades earlier than it ideally should have. A structuralist understanding of language undercuts not only AI boosters’ claims about intelligent AI agents just around the corner, but also the “remainder humanism” of the critics who so vigorously excoriate them. What we need going forward, Weatherby says, is a revival of the art of rhetoric, which would combine some version of cultural studies with cybernetics.
Weatherby’s core claims, then, are that to understand generative AI, we need to accept that linguistic creativity can be completely distinct from intelligence, and also that text does not have to refer to the physical world; it is to some considerable extent its own thing. This all flows from Cultural Theory properly understood. Its original goal was, and should have remained, the understanding of language as a system, in something like the way that Jakobson and his colleagues outlined.
Even if cultural theory seems bizarre and incomprehensible to AI engineers, it really shouldn’t. Rather than adapting Leontieff’s diagrams as an alternative illustration of how language works as a system, Weatherby reworks the ideas of Claude Shannon, Warren McCulloch and Walter Pitts, to provide a different theory of how language maps onto math and math maps onto language.
This heady combination of claims is liable to annoy nearly everyone who talks and writes about AI right now. But it hangs together. I don’t agree with everything that Weatherby says, but Language Machines is by some distance the most intellectually stimulating and original book on large language models and their kin that I have read.
Two provisos.
First, what I provide below is not a comprehensive review, but a narrower statement of what I personally found useful and provocative. It is not necessarily an accurate statement. Language Machines is in places quite a dense book, which is for the most part intended for people with a different theoretical vocabulary than my own. There are various references in the text to this “famous” author or that “celebrated” claim: I recognized perhaps 40% of them. My familiarity with cultural theory is the shallow grasp of someone who was trained in the traditional social sciences in the 1990s, but who occasionally dreamed of writing for Lingua Franca. So there is stuff I don’t get, and there may be big mistakes in my understanding as a result. Caveat lector.
Second, Weatherby takes a few swings at the work of Alison Gopnik and co-authors, which is foundational to my own understanding of large models (there is a reason Cosma and I call it ‘Gopnikism’). I think the two can co-exist in the space of useful disagreement, and will write a subsequent piece about that, which means that I will withhold some bits of my argument until then.
Weatherby’s argument pulls together cultural theory (specifically, the semiotic ur-theories of Jakobson, Saussure and others) with information theory à la Claude Shannon. This isn’t nearly as unlikely a juxtaposition as it might seem. As Geoghegan’s anecdote suggests, there seemed, several decades ago, to be an exciting convergence between a variety of different approaches to systems, whether they were semiotic systems (language), information systems (cybernetics) or production systems (economics). All seemed to be tackling broadly comparable problems, using loosely similar tools. Cultural theory, in its earlier formulations, built on this notion of language as a semiotic system, a system of signs, in which the meaning of particular signs drew on the other signs that they were in relation to, and on the system of language as a whole.
Geoghegan is skeptical about the benefits of the relationship between cybernetics and structural and post-structural literary theory. Weatherby, in contrast, suggests that cultural theory took a wrong turn when it moved away from such ideas. In the 1990s, it abdicated the study of language to people like Noam Chomsky, who had a very different approach to structure, and to cognitive psychology more generally. Hence, Weatherby’s suggestion that we “need to return to the broad-spectrum, concrete analysis of language that European structuralism advocated, updating its tools.”
This approach understands language as a system of signs that largely refer to other signs. And that, in turn, provides a way of understanding how large language models work. You can put it much more strongly than that. Large language models are a concrete working example of the basic precepts of structural theory and of its relationship to cybernetics. Rather than some version of Chomsky’s generative grammar, they are based on weighted vectors that statistically summarize the relations between text tokens: which word parts are nearer to or further from each other in the universe of text that they are trained on. Just mapping the statistics of how signs relate to signs is sufficient to build a working model of language, which in turn makes a lot of other things possible.
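To make that concrete, here is a minimal sketch of the point (mine, not Weatherby’s: a bigram counter rather than a transformer, trained on an invented toy corpus) that counting how signs follow other signs is already enough to generate language-like sequences:

```python
# A deliberately tiny illustration of the structuralist point above: a
# model that has seen nothing but text, and whose "knowledge" is nothing
# but counted relations between signs, can still generate recognizable
# language. The corpus and function names are invented for illustration.
import random
from collections import Counter, defaultdict

corpus = (
    "the sign refers to another sign . the system of signs generates "
    "meaning . the model maps signs to signs . the author is dead . "
    "the system generates the author ."
).split()

# Count which token follows which: pure sign-to-sign statistics.
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def generate(start="the", length=12, seed=0):
    """Sample a sequence, choosing each next token in proportion to how
    often it followed the current token in the corpus."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length):
        options = follows[out[-1]]
        if not options:
            break
        tokens, weights = zip(*options.items())
        out.append(rng.choices(tokens, weights=weights)[0])
    return " ".join(out)

# Prints a sampled, corpus-flavored sequence (exact output depends on the seed).
print(generate())
```

A transformer replaces the counts with dense vectors and attention over long contexts, but the in-principle move, predicting signs from nothing but other signs, is the same.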
LLM, then, should stand for “large literary machine.” LLMs prove a broad platform that literary theory has long held about language, that it is first generative and only second communicative and referential. This is what justifies the question of “form”—not individual forms or genres but the formal aspect of language itself—in these systems. Indeed, this is why literary theory is conjured by the LLM, which seems to isolate, capture, and generate from what has long been called the “literary” aspect of language, the quality that language has before it is turned to some external use.
What LLMs are, then, is a practical working example of how systems of signs can be generative in and of themselves, regardless of their relationship to the ground truth of reality.
Weatherby says that this has consequences for how we think about meaning. He argues that most of our theories of meaning depend on a ‘ladder of reference’ that has touchable empirical ground at the ladder’s base. Under this set of claims, language has meaning because, in some way, it finally refers back to the world. Weatherby suggests that “LLMs should force us to rethink and, ultimately, abandon” this “primacy of reference.”
Weatherby is not making the crude and stupid claim that reality doesn’t exist, but saying something more subtle and interesting. LLMs illustrate how language can operate as a system of meaning without any such grounding. For an LLM, text-tokens only refer to other text-tokens; they have no direct relationship to base reality, any more than the LLM itself does. The meaning of any sequence of words generated by an LLM refers, and can only refer to, other words and the totality of the language system. Yet the extraordinary, uncanny thing about LLMs is that without any material grounding, recognizable language emerges from them. This is all possible because of how language relates to mathematical structure, and mathematical structure relates to language. In Weatherby’s description:
The new AI is constituted as and conditioned by language, but not as a grammar or a set of rules. Taking in vast swaths of real language in use, these algorithms rely on language in extenso: culture, as a machine. Computational language, which is rapidly pervading our digital environment, is just as much language as it is computation. LLMs present perhaps the deepest synthesis of word and number to date, and they require us to train our theoretical gaze on this interface.
Hence, large language models demonstrate the cash value of a proposition that is loosely adjacent to Jakobson’s blackboard comparison. Large language models exploit the imperfect but useful mapping between the structures within the system of language and the weighted vectors that are produced by a transformer: “Underneath the grandiose ambition … lies nothing other than an algorithm and some data, a very large matrix that captures some linguistic structure.” Large language models, then, show that there is practical value to bringing the study of signs and statistical cybernetics together in a single intellectual framework. There has to be, since you can’t even begin to understand their workings without grasping both.
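As a toy illustration of that “very large matrix” (again my own sketch, not Weatherby’s, with an invented corpus; real models learn dense embeddings rather than raw counts), one can build a word-by-word co-occurrence matrix and watch words that fill the same slots in the sign system end up with similar rows:

```python
# A toy "matrix that captures some linguistic structure": co-occurrence
# counts over an invented corpus. Words used in the same contexts
# ("king"/"queen") get similar rows; nothing here refers outside the text.
import numpy as np

corpus = ("the king rules the realm . the queen rules the realm . "
          "the poet writes the poem . the critic reads the poem .").split()
vocab = sorted(set(corpus))
idx = {w: i for i, w in enumerate(vocab)}

# Count co-occurrences within a +/- 2 token window.
C = np.zeros((len(vocab), len(vocab)))
for i, w in enumerate(corpus):
    for j in range(max(0, i - 2), min(len(corpus), i + 3)):
        if i != j:
            C[idx[w], idx[corpus[j]]] += 1

def similarity(a, b):
    """Cosine similarity between the co-occurrence rows of two words."""
    va, vb = C[idx[a]], C[idx[b]]
    return float(va @ vb / (np.linalg.norm(va) * np.linalg.norm(vb)))

print(round(similarity("king", "queen"), 2))   # high: they fill the same slots
print(round(similarity("king", "writes"), 2))  # lower: different contexts
```

The matrix knows nothing about kings or queens; the similarity falls out of the distribution of signs alone, which is the structuralist point in miniature.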
Similarly, large language models suggest that structural theory captures something important about the relationship between language and intelligence. They demonstrate how language can be generative, without any intentionality or intelligence on the part of the machine that produces them. Weatherby suggests that these models capture the “poetics” of language: not simply summarizing the innate structures of language, but allowing new cultural products to be generated. Large language models generate poetry: “language in new forms,” which refers to language itself more than to the world that it sometimes indirectly describes. The value matrix in the model is a kind of “poetic heat-map,” which
stores much more redundancy, effectively choosing the next word based on semantics, intralinguistic context, and task specificity (set by fine-tuning and particularized by the prompt). These internal relations of language—the model’s compression of the vocabulary as valued by the attention heads—instantiate the poetic function, and this enables sequential generation of meaning by means of probability.
Still, poetry is not the same as poems:
A poem “is an intentional arrangement resulting from some action,” something knit together and realized from the background of potential poetry in language: the poem “unites poetry with an intention.” So yes, a language model can indeed (and can only) write poetry, but only a person can write a poem.
That LLMs exist; that they are capable of forming coherent sentences in response to prompts; that they are in some genuine sense creative without intentionality, suggests that there is something importantly right about the arguments of structuralist linguistics. Language demonstrably can exist as a system independent of the humans who employ it, and exist generatively, so that it is capable of forming new combinations.
This cashes out as a theory of large language models that are (a) genuinely culturally generative, and (b) incapable of becoming purposively intelligent, any more than the language systems that they imperfectly model are capable of becoming intelligent. Under this account, the “Eliza effect” – the tendency of humans to mistake machine outputs for the outputs of human intelligence – is not entirely in error. If I understand Weatherby correctly, much of what we commonly attribute to individual cognition is in fact carried out through the systems of signs that structure our social lives. In this vision of the cultural and social world, Herbert Simon explicitly rubs shoulders with Claude Lévi-Strauss.
This means that most fears of AGI risk are based on a basic philosophical confusion about what LLMs are, and what they can and cannot do. Such worries seem:
to rest on an implicit “I’m afraid I can’t do that, Dave.” Malfunction with a sprinkle of malice added to functional omniscience swims in a soup of nonconcepts hiding behind a wall of fictitious numbers.
Languages are systems. They can most certainly have biases, but they do not and cannot have goals. Exactly the same is true for the mathematical models of language that are produced by transformers, and that power interfaces such as ChatGPT. We can blame the English language for a lot of things. But it is never going to become conscious and decide to turn us into paperclips. LLMs don’t have personalities, but rather compressions of genre that can support a mixture of ‘choose your own adventure’ with role-playing games. It is very important not to mistake the latter for the former.
This understanding doesn’t just count against the proponents of AGI. It undermines the claims of many of their most prominent critics. Weatherby is ferociously impatient with what he calls “remainder humanism,” the claim that human authenticity is being eroded by inhuman systems. We have lived amidst such systems for at least the best part of a century.
In the general outcry we are currently hearing about how LLMs do not “understand” what they generate, we should perhaps pause to note that computers don’t “understand” computation either. But they do it, as Turing proved.
And perhaps for much longer. As I read Weatherby, he is suggesting that there isn’t any fundamental human essence to be eroded, nor can there reasonably be one. The machines whose gears we are trapped in don’t just include capitalism and bureaucracy, but (if I am reading Weatherby right) language and culture too. We can’t escape these systems via an understanding of what is human that is negatively defined in contrast to the systems that surround us.
What we can do is better map and understand these systems, and use new technologies to capture the ideologies that these systems generate, and perhaps, to some limited extent, shape them. On the one hand, large language models can create ideologies that are likely more seamless and more natural-seeming than the ideologies of the past. Sexy murder poetry and basically pleasant bureaucracy emerge from the same process, and may merge into much the same thing. On the other, they can be used to study and understand how these ideologies are generated (see also).
Hence, Weatherby wants to revive the very old idea that a proper education involves the study of “rhetoric,” which loosely can be understood as the proper understanding of the communicative structures that shape society. This would not, I think, be a return to cultural studies in the era of its great flowering, but something more grounded, combining a well-educated critical imagination with a deep understanding of the technologies that turn text into numbers, and numbers into text.
This is an exciting book. Figuring out the heat maps of poetics has visible practical application in ways that AGI speculation does not. One of my favorite parts of the book is Weatherby’s (necessarily somewhat speculative) account of why an LLM gets Adorno’s Dialectic of Enlightenment right, but makes mistakes when summarizing the arguments of one of his colleagues’ books about Adorno, and in so doing reveals the “semantic packages” guiding the machine in ways that are reminiscent of Adorno’s own approach to critical theory:
Dialectic of Enlightenment is a massively influential text—when you type its title phrase into a generative interface, the pattern that lights up in the poetic heat map is extensive, but also concentrated, around accounts of it, debates about it, vehement disagreements, and so on. This has the effect of making the predictive data set dense—and relatively accurate. When I ask about Handelman’s book, the data set will be correspondingly less concentrated. It will overlap heavily with the data set for “dialectic of enlightenment,” because they are so close to each other linguistically, in fact. But when I put in “mathematics,” it alters the pattern that lights up. This is partly because radically fewer words have been written on this overlap of topics. I would venture a guess that “socially constructed” comes up in this context so doggedly because when scholars who work in this area discuss mathematics, they very often assert that it is socially constructed (even though that’s not Handelman’s view). But there is another group that writes about this overlap, namely, the Alt Right. Their anti-Semitic conspiracy theory about “cultural Marxism,” which directly blames Adorno and his group for “making America Communist,” will have a lot to say about the “relativism” that “critical theory” represents, a case in point often being the idea that mathematics is “socially constructed.” We are here witnessing a corner of the “culture war” semantic package. Science, communism, the far right, conspiracy theory, the Frankfurt School, and mathematics—no machine could have collated these into coherent sentences before 2019, it seems to me. This simple example shows how LLMs can be forensic with respect to ideology.
It’s also a book where there is plenty to argue with! To clear some ground, what is genuinely interesting to me, despite Weatherby’s criticisms of Gopnikism, is how much the two have in common. Both have more or less independently converged on a broadly similar notion: that we can think about LLMs as “cultural or social technologies” or “culture machines” with large scale social consequences. Both characterize how LLMs operate in similar ways, as representing the structures of written culture, such as genre and habitus, and making them usable in new ways. There are sharp disagreements too, but they seem to me to be the kinds of disagreements that could turn out to be valuable, as we turn away from fantastical visions of what LLMs might become in some hazy imagined future, to what they actually are today.
[cross-posted at Programmable Mutter]
* I can’t help wondering whether Leontieff might have returned the favor, had he re-used Jakobson’s blackboard in turn. He had a capacious intellect, and was a good friend of the poet and critic Randall Jarrell; their warm correspondence is recorded in Jarrell’s collected letters.
** Not post-modernism, which was always a vexed term, and more usually a description of the subject to be dissected than the approach to be employed. Read the late Fredric Jameson, to whom I was delighted to be able to send a fan letter, thinly disguised as a discussion of Kim Stanley Robinson’s Icehenge, a year or so before he died (Jameson was a fan of Icehenge and one of Stan’s early mentors).
{ 29 comments }
MisterMr 07.11.25 at 1:46 pm
As someone who had to study some structuralist theory at uni, I’ll note that there are two different reasons it “broke” at some point:
At a strictly theoretical level, linguists realized we can’t take out pragmatic inferences about what the other person is trying to say from the interpretive process (pragmatic in the sense of the distinction pragmatics/semantics/syntax); basically structuralism was an attempt to take away the theory of mind from “meaning”, but they realized this didn’t work.
In the late 90s and early 00s I studied for my degree in media studies [scienze della comunicazione] at the University of Bologna, and in that period Umberto Eco worked there, so I ended up reading a lot of Eco’s theoretical books about semiotics (Eco being the most important semiotician in Italy).
In his early book “Semiotics”, which is super boring, he starts with a definition of semiotics that uses, as an example of a “code”, a system that reads the level of water in a dam and sends a signal to open it; however his later books, like Lector in fabula, are all about interpretive bets.
The reason is that structuralists realized that there were too many problems, even in very simple sentences, that couldn’t be resolved without an interpretive bet, and therefore without the attribution of intentionality (theory of mind).
On the other hand, the cultural impact of these theories faded because they were so overused as to become abused (so, for example, the Sokal hoax has nothing to say about structural linguistics; it just shows that a certain language was abused to the point of becoming meaningless), and IMHO also because of the shift of importance in the academy, with the USA eating 99% of the pie and other places like France becoming much less important. Since this kind of structuralism was largely a Euro thing, it simply lost relevance (being substituted, as the generic catch-all well-educated way of thinking, IMHO, by analytic philosophy).
LFC 07.11.25 at 3:13 pm
On a quick reading, I think the OP is probably too favorable to parts of (what it presents as) Weatherby’s position.
For example: “Weatherby is ferociously impatient with what he calls ‘remainder humanism,’ the claim that human authenticity is being eroded by inhuman systems.”
What’s being eroded is not human authenticity, but the ability to distinguish between situations where human authenticity is operating and situations where it is not operating. If, for instance, a student writes an essay, composing the sentences and deciding which words to use and which order to put them in, that is an example of human authenticity, even if it is an authenticity conditioned by the “systems of signs that structure our social lives,” in the OP’s words. If, on the other hand, the student puts a prompt into an LLM and the LLM “writes” the essay, that is not an example of human authenticity. It may be an example of a certain sort of human ingenuity (or cleverness), if it took some tries to get the “right” prompt, but ingenuity and authenticity are not the same thing. (I’ve never actually used an LLM to do anything, so I’m just going by what I’ve read about them and how they work.)
Bill Benzon 07.11.25 at 3:42 pm
FWIW, I was a sophomore at Johns Hopkins when the notorious structuralism conference was held. Though I didn’t attend any of the sessions, which were held in French, you’ll find a comment from me in the volume that came out of the conference: Comment on Dyson-Hudson’s essay on “Levi-Strauss and Radcliffe-Brown,” The Languages of Criticism and the Sciences of Man, 1970, pp. 244–245. I became hooked on structuralism (and went through it to cognitive science) before it dissolved in deconstruction.
Thus, not long after I began playing with ChatGPT, I did some experiments on stories that were modeled on Lévi-Strauss: The structuralist aesthetics of ChatGPT, New Savanna, January 8, 2023. I subsequently developed that into a working paper: ChatGPT tells stories, and a note about reverse engineering, March 3, 2023. Here’s the abstract:
Somewhere on the web you’ll find Stevan Harnad (founding editor of Behavioral and Brain Sciences) saying something to the effect that what’s most remarkable is how much LLMs can accomplish without understanding. That seems right to me.
Aaron Tan 07.11.25 at 6:56 pm
I’ve recently begun experimenting with using LLMs to compose essays. It started as just trying to consolidate my more interesting conversations into a more presentable and easily referenced form factor, but has transformed into viewing AI-mediated writing as a legitimate medium in itself. Last night, I was dialoguing with an LLM on LFC’s exact Heideggerian concern: “What’s being eroded is not human authenticity, but the ability to distinguish between situations where human authenticity is operating and situations where it is not operating.” What came out of that was an essay, On Writing with AI and the Persistence of Sorge (“Care”), if anyone is curious to read it and provide their thoughts:
https://substack.com/@autumncamouflage/p-168049340
Mtn Marty 07.11.25 at 8:23 pm
Does this take a position on how semantics works? The action always seemed to me to be in how definitions occur. I’m not sure if we have found out that we are AIs, and also that definitions therefore don’t exist (like undefined terms in math), so something else.
Kenny Easwaran 07.11.25 at 10:17 pm
I like the idea that a pure transformer-based LLM is just like a Derridean model of language – “il n’y a pas de hors-texte”!
But I wonder how much that is all still true with the contemporary multimodal assistants like Claude 4.0, Gemini 2.5, and ChatGPT 4.5. In addition to their basic training on trying to predict language from language, and their extra training on how to be a “helpful, harmless, and honest” assistant, they also now include a lot of multimodal training connecting text to images and audio and video, and they also get some reinforcement learning on actually using their words to get to verifiable answers in mathematical and computational problems. This last bit in particular seems to me like it introduces some more direct connections between the words and the things they are used for, rather than just between words and more words, like the ones from 2017-2023 did.
JPL 07.12.25 at 4:06 am
Probably more prominent than cybernetics, in the days of Jakobson, and just as megalomaniacal in its intellectual ambitions, was general system theory, as propounded by people like Ludwig von Bertalanffy. It was big, but it looks like people got bored with it after a while; but these were probably part of the same general intellectual tendency that we find today in the “generative AI” people. (BTW, one of the stated aims of the cyberneticists was to construct, via the engineering process, an actual human person, like a Frankenstein monster.)
I’m sorry to say that your account and the account of Leif Weatherby that you’re reviewing strike me as an elaborate jambalaya of ideas, tasty, but lacking precision. The big problem in talking about the phenomenon of human language, for linguists as well as philosophers, let alone literary critics, is the problem of getting clear about precisely what aspect of this complex phenomenon they are referring to when they want to talk about language.

(And I would disagree with your use, apparently following Weatherby, of the term ‘reference’ in the above. Reference is all about how language “hooks on to the world”, to use Putnam’s phrase, whereas relations of meaning within texts are covered under the heading of textual cohesion, such as relations of anaphora. Even when I say “language” in the previous sentence it’s crude usage, since the ultimate relation of “hooking on to the world” involves the semantic categories (meanings of lexemes and morphemes) and propositional schemata and their instances in acts of perception and language use. BTW, formal language texts (e.g., proofs) have no capacity for reference. However, these systems of categories do have an objective existence independently of any particular individual user, in the systems of speech-community norms for natural languages. The big question is: given a description of what, e.g., a category can refer to in the world and how it differs from a related category, how did it come to be that way? And so forth.)

I would guess that LLM-based AI robots would not work as efficiently or not at all if all they had to deal with were the sequences of letters in texts; to a great extent they are helped by the fact that, by the analysis implicit in human orthographic conventions, these symbols are organized into linguistically significant units, like words and sentences. For the human user who knows the language used, these significant forms provide a constant relation of indexation to the system of semantic categories and schemata, which make possible the further significance of the concrete symbols. The AI model/robot has no access to these categorical systems of further significance. However, you might say, in a literary sort of way, that when a user instantiates the categories in an act of language use, it is the categories that generate the significance as possible instances, and that there is an “activity of the categories”. The AI robot can produce the linguistically significant form, the sentence, but only the human user can produce the form’s further significance, what we call “the meaning of the sentence” (or “what is expressed”, or, to use Frege’s term, “the thought”). (And it is something that has to be produced by action, not an abstract ideal object.)
Chris Bertram 07.12.25 at 6:37 pm
If you take a text like Rousseau’s dedicatory epistle to the 2nd Discourse, you have a set of words, sentences, grammatical structures and the like. But the meaning of the text is neither a simple matter of how it hooks onto the world, nor a question of intra-linguistic relationships, because, notwithstanding Theory (circa 1980), we have to examine the question of what Rousseau intended to communicate and how the audience he had in mind would have received the text. Possibilities include (a) that the text was produced as a sincere effort to praise his Genevan compatriots by someone with woefully false beliefs about the nature of the Genevan polity, or (b) that the text was intended sarcastically, taking the form of apparent praise, by someone all-too-well-informed about the nature of 18th-century Geneva. Now I don’t doubt that LLMs can produce texts, and even that those texts may contain ambiguities and other interpretive puzzles, but the fact of human authorship and audience means that there is in principle something to get right or wrong about what the text is for and how it achieves that goal (e.g., without getting too Straussian, by containing clues “between the lines” that the intended reader can recognize). And that getting right and wrong can’t simply treat the text as a discrete and purely linguistic object.
engels 07.12.25 at 7:32 pm
#1 According to François Cusset at least, French Theory is American. Like Pizza (ducks).
https://parisinstitute.org/depictions-article-french-theory-an-anti-american-american-invention/
https://www.dailymail.co.uk/news/article-12076531/Pizza-know-invented-America-NOT-Italy-declares-Italian-professor-food-history.html
David in Tokyo 07.13.25 at 6:00 am
JPL said a bunch of sensible stuff, but:
“would not work as efficiently or not at all if all they had to deal with were the sequences of letters in texts; to a great extent they are helped by the fact that, by the analysis implicit in human orthographic conventions, these symbols are organized into linguistically significant units, like words and sentences. ”
That would have been my thought as well, but Japanese has no word breaks*, and ChatGPT was doing Japanese just fine almost immediately. Japanese does have sentence breaks, so it’s not a complete negation of your sensible sensibility.
My opinion of the LLM idea (statistics on undefined tokens without any sort of “world model” (whatever that means)) is that when the smoke clears, it will be seen as the most incredibly stupid god-awful waste of time and money in the history of computer science. But that was at the second generation of these things (pre-ChatGPT), and the stupidity continues. Go figure.
*: In real life, the word breaks are completely obvious. In over 45 years at it, I only remember being badly bitten by it once.
MisterMr 07.13.25 at 12:48 pm
@engels 9
“Pizza as we know it was invented in America NOT Italy, declares Italian professor of food history”
The words “as we know it” are doing a lot of work here, but basically traditional Neapolitan pizza was bread + tomato + mozzarella (and perhaps anchovies, but it’s not certain), which is to say what we today call “pizza Margherita”.
It is dubious whether the idea of putting other stuff on it happened first among Italian communities in the USA (where it is first attested) or in Italy (where it is attested slightly later, but since it is not a big change it might have happened earlier).
Keep in mind that even pizza Margherita became well known in Italy quite late, in the late 19th century (it was named in 1889 after a queen of Italy and later became famous).
The article creates a bit of confusion because before 1889 the word “pizza” had a different meaning in Italian, so basically:
Since the time of the ancient Greeks there was in the Mediterranean the “pitta”, a loaf of bread with stuff on it, from which the Italian word “pizza” comes.
Then (quite a bit later, since the tomato came from the Americas and became a staple around 1600) in the port city of Naples the local “bread with stuff on it” became “bread with tomato on it, and perhaps anchovies or the local mozzarella”. We don’t know exactly when this happened because it was a poor man’s dish and so not attested.
Then in 1889, a few years after Italian unification, on the occasion of Queen Margherita’s visit to Naples, a local cook had the idea of producing a lot of this Neapolitan pizza in the version tomato + mozzarella + basil (the colors of the Italian flag), and this version became famous in Italy as pizza Margherita (so this version is clearly attested as Italian).
Then in the early 20th century it became common to have “pizza Margherita + something else on it”, which is what we mostly mean now by pizza. The first attested versions of this appear on the menus of Italian-American pizzerias, so it is likely (but not certain) that this change happened in the USA and was then retro-imported to Italy.
And this proves that my opinions on French structuralism are correct.
wetzel-rhymes-with 07.13.25 at 5:04 pm
Aaron Tan, I enjoyed your thoughtful essay. I have been thinking a lot about Heidegger and AI. At one point, you wrote:
“The question isn’t whether you chose each word, but whether the final product bears authentic witness to your encounter with your ideas.”
I think that an AI-produced text may be satisfying to its producer, and they may believe they have made an authentic expression, but a communication in which ideas are shared is like a cooperative game between author and reader. However, in AI-generated language a non-cooperative aspect is baked in, a kind of lie. Why do I hear the “voice” of the writing? Whose voice am I hearing? An existential property of language is this expressive dimension, the voice, or inwardness of the author. Without this, the expression is just a pantomime of the phenomenology of language. To accept “the death of the author” because AI seems to confirm the claims of post-structuralism is like accepting capitalism’s totalitarian view of history, where what wins must be what’s right.
Thoughtful essay, though I doubt it.
J, not that one 07.13.25 at 6:15 pm
Interesting anecdote at the beginning. I always assumed Jakobson borrowed from cybernetics-style theories deliberately, because he liked the metaphor.
J, not that one 07.13.25 at 6:37 pm
My hunch is that theory hasn’t gone away so much as it’s become invisible. It was incorporated into the English departments (and related departments) and their graduates to such a degree that it’s common sense.
Incidentally, I decided to read Lyotard finally and discovered that an awful lot of The Postmodern Condition is about how horrible it is that even the English departments think everything is a computer now. We’ll all have to be turned into numbers, all our words will have to be turned into numbers, so the managers can use their computers to control society, as they inevitably will. It seems very, very off-point in 2025. (Once I more or less figured out how to get the new Google Books search working, I saw that neither author under review seems to mention him. The OPer may be disappointed to learn that “inauthor:farrell” produces no results, but then so does “inauthor:derrida”.)
Bill Benzon 07.16.25 at 8:15 pm
Come to think of it, I don’t find the cultural technology thesis terribly interesting. It strikes me as being a dodge. It allows you to take a positive attitude toward LLMs without buying into the hype. That’s good. It saves you from having to sputter unilluminating invective about stochastic parrots.
But it doesn’t tell us much of anything about what the LLMs are doing, and that’s the BIG question that everyone wants answered. Oh, I suppose it’s a way of showing us that culture is mentation that’s so routinized that it can be “captured” by a relatively simple scheme: build a statistical model by predicting the next token. What we really want to know, though, is how that statistical model works. What kind of internal structure does it have – for it must have an intricate and sophisticated structure – and how does that structure allow it to generate often interesting and insightful statements in response to queries? All “cultural technology” achieves is slapping a label on this thing we don’t understand.
JPL 07.17.25 at 10:14 pm
Bill Benzon @15:
“But it doesn’t tell us much of anything about what the LLMs are doing, and that’s the BIG question that everyone wants answered.”
Back in the old days, the room full of monkeys pounding on typewriters once in a while produced an interesting sentence, but nobody credited the monkey with the thought the sentence expressed to us. Now we have machines that are much better at generating sentences that are interesting to us, but we don’t know how they do that, since they have apparently not been given any explicit generative principle (so the machine is not functioning as a theoretical model), but only a large number of texts to somehow find patterns in. Is that the question? (I wouldn’t say the robots have “discovered” something (nor would I expect them to even hazard a guess about it), but we probably would like to know what principles determine just the patterns that the texts display, and not other possible patterns. I’m not surprised that it’s probably not the principles of Chomsky’s syntactic theories, which used to be generative in the appropriate sense, because he’s been looking in the wrong place, due to the lingering Machean influence.)
MisterMr 07.18.25 at 11:15 am
I had a vague understanding that LLMs worked based more or less on Generative Semantics (Lakoff) but not on Generative Grammar (Chomsky), because Generative Grammar was tried and didn’t work; however, searching online I can find no reference to this.
engels 07.18.25 at 1:26 pm
Then (quite a bit later, since the tomato came from the Americas and became a staple around 1600) in the port city of Naples the local “bread with stuff on it” became “bread with tomato on it
As I understand, Signor Professore was denying this: it was Italian-Americans who had this idea.
But okay, apart from tomatoes, tomato sauce, using tomato sauce as a topping… along with mozzarella, pepperoni, mushrooms, ham, olives, etc… what have the Americans ever given pizza?
MisterMr 07.18.25 at 5:29 pm
@engels
I think he was either misquoted or mistranslated; the origins of pizza Margherita are quite uncontroversial, though the word pizza was used for other things too (the first attestation is for pizza ligure, which is indeed a focaccia, but not sweet: https://en.m.wikipedia.org/wiki/Sardenaira )
engels 07.18.25 at 9:53 pm
MrMr, this has a bit more detail on the Margherita origins point:
https://www.pmq.com/food-historian-pizza-as-we-know-it-today-originated-in-the-u-s-not-italy/
engels 07.19.25 at 12:11 am
Wake up, sheeple!
https://www.nationalgeographic.com/history/history-magazine/article/pizza-margherita-may-be-fit-for-a-queen-but-was-it-named-after-one
engels 07.19.25 at 7:43 pm
All your pizza bases are belong to US.
https://www.bbc.com/travel/article/20250227-is-there-no-such-thing-as-italian-cuisine
MisterMr 07.20.25 at 8:40 am
@engels
Noooh I can’t accept it! How could you… [head explodes in a gush of blood and tomato sauce]
Seriously though, in defense of French structuralism, I have to say that if a dish of bread with mozzarella, slices or dices of tomato, and perhaps basil was common in Naples around 1880, this is enough for me to say that modern-day pizza originated in Naples. Also, tomato sauce was originally made at home as a way to conserve tomatoes, hence both pizza and most Italian pasta sauces (googling around I found a reference that says the first mention of tomato sauce is in a cookbook from 1790 called “l’Apicio moderno”, so later than I expected but still a century earlier than 1889, and I think before the invention of canned food).
J, not that one 07.20.25 at 6:16 pm
@15 Is how the structure works really what we want to know? Or is that an abbreviation for “when does it work, where does it work, what does it work with,” which all imply “when does it NOT work?” Sometimes the mechanics of it all will answer those questions. Sometimes treating it like a black box will. Anyway what I’ve mostly seen wrt LLMs is an assumption that it works as described because it’s obvious (when both those approaches seem to suggest clear flaws) and that all the other questions demonstrate something like a lack of education.
engels 07.20.25 at 9:00 pm
Italians: you can’t put pineapple on pizza
Also Italians: who cares whether or not you chop the tomatoes
I suppose we’d better agree to disagree in case someone wants to talk about Barthes.
MisterMr 07.21.25 at 2:36 pm
@engels 25
[twirls mustache]
MisterMr 07.21.25 at 4:46 pm
An additional reflection:
When I was a kid, I was a cub scout. One day my mother gave me my lunch in a closed pack, told me that it was pizza, and sent me to the cub scouts.
That day the chiefs told us that we had to share our lunches, either with the whole group or with our subgroup of 5-6 kids.
I told my subgroup to come with me instead of sharing with the others, because I had pizza.
But when I opened my lunch I discovered that it wasn’t pizza, but a lousy focaccia with olives; the other kids were very disappointed with me, it was a humiliation.
When I went home I asked my mother for an explanation, but she answered that in the city where she was from (in the Marche region in central Italy, while the incident happened in Lombardy, northern Italy) that kind of focaccia was actually called “white pizza”.
The moral of this is that pizza is, and has always been, a social construction, that participants define and redefine, like bricoleurs, in order to maximize their social capital.
engels 07.22.25 at 12:12 am
pizza is, and has always been, a social construction, that participants define and redefine, like bricoleurs, in order to maximize their social capital
I interpret this to mean I can have pineapple and stuffed crust.
TM 07.23.25 at 7:28 am
Pizza notes:
1 Many famous and iconic culinary creations were originally poor people’s food, perhaps because the rich had no need to be inventive.
2 Many supposedly old cultural /national “traditions” are quite recent.
3 The concept of a national cuisine is quite dubious given that food culture is intensely regional. This is especially true for a country like Italy, which didn’t even exist before 1860.
4 And of course, the idea of pure cultural / national traditions independent of outside influence is bullshit.