The text is not the product

by Lisa Herzog on May 12, 2026

Academics, especially in the humanities, produce texts, and they teach students to produce text. This is a standard assumption, often taken for granted, and maybe not too surprising in times in which productivity is a supreme social norm. Think of the relief – by students and faculty alike – when a text has been submitted before the deadline. Think of all the praise for writers and texts that goes around in our fields (“prolific,” “rigorous,” “accessible,” …). Think of the proud social media posts with a pile of books fresh off the press (I’ve been guilty of that myself).

Generative AI, for all its problems, has one virtue: it forces us to rethink that assumption. The ease with which AI can spit out seemingly coherent text, or help rewrite a few convoluted sentences into elegant prose, has been perceived by some academics as a threat to the very meaning of our professional existence. “I feel like one of those coal miners must have felt when it was already clear that the mines would be closed soon,” a colleague recently said to me.

I want to resist this idea – maybe out of a desperate desire to cling to my professional identity, but with what I have come to think of as an important distinction: texts as products, or texts as means to something very different.

There may be situations in which texts are really products in and of themselves. I wanted to provide examples (certain types of cheap fiction writing? user manuals? the small print in contracts?), but the longer I think about it, the harder I find it to come up with examples that would really fit. We treat texts as products; they get bought and sold (think of everything around copyright and IP). But in reality, texts are almost always something else. Here is an incomplete list of what texts can be:

a means for communicating certain facts or ideas,
a means for communicating that one knows certain facts or ideas,
a means for helping others solve problems,
a means for establishing a certain formal status, e.g. by defining or shifting or excluding legal liability,
a means for establishing a certain informal status, e.g. by claiming authority over certain ideas,
a means for establishing social bonds, e.g. ingroup and outgroup relations,
a means for transferring emotions,
a means for opening up one’s soul to another human being.
…

Depending on what it is that a text is actually meant to do, it is more or less appropriate to use AI (and by the way – if we had less IP on certain texts, we could do much more with good old copy+paste, for many practical purposes, and would not need AI). Around some things, legal or social norms will probably change (Is it okay to use AI to write a birthday poem for grandma? Will proud authorship claims be made about prompts rather than final products?).

Back to academia, and the doomsaying about the humanities that my colleague expressed. The “product” of a course is not the stack of essays that lands in our inbox, or on our desks, at the end of it – and that might be corrupted by the use of AI. The point of a humanities education, I would argue, is not even to “produce students who can write good texts.” The point is to produce human beings of a certain kind: who understand certain things, who have certain forms of knowledge, who have certain skills such as critical thinking and creativity – and who, as a byproduct, can write good texts.

The availability of AI tools forces us to rethink what it is that we want to achieve with our pedagogical methods. Enter (drum roll) – the “Dublin descriptors.” If you work in European academia and ever had to set up a new program or get an existing one reaccredited, you’ve probably come across this list of words that are meant to describe what students learn (here, for example; for specific programs one then needs to specify them at each level, and show how each element of the program contributes to the overall goal). When I first came across them, I found this a tedious bureaucratic exercise. Many traditional pedagogical strategies, after all, are meant to achieve a combination of them, e.g. knowledge about a certain classical text and critical exegetic skills and the ability to formulate arguments and exercise judgment. But in times in which AI requires us to rethink many traditional forms of examination, this exercise is actually quite useful for thinking about what one wants to achieve in one’s teaching (and which pedagogical strategies and forms of examination fit with those goals).

It is a widespread fallacy that by using AI, students can learn faster. Another dean of my university (let’s be gracious and not mention the discipline) recently said in a meeting that students could use AI to let it summarize “500 pages of text” for them. But why should an employer want to hire graduates who have just read the AI summary of these 500 pages, rather than actually having worked through them? How would such a student later contribute to *expanding* knowledge in the relevant field, by thinking creatively about what is already known and by asking the right questions about what is still unknown? This will still require the cognitive process of going through the 500 pages and understanding them.

The hard work of suffering through such learning processes cannot be replaced by AI. They include many emotional side effects – enthusiasm, frustration, triumph, disappointment when the sense of triumph turns out to be premature, etc. From that pedagogical perspective, insofar as writing is part of it, it is very much the process of writing and rewriting that matters, the reaction to feedback, the refinement that comes from someone saying: “I don’t understand what you mean.” It is no accident that learning has almost always been organized in social settings.* You need peers to go through these processes together, and someone to guide and motivate you when things don’t go as smoothly as you would wish. I very much doubt that AI will take over that deeply human role of pedagogy; certainly not for younger children, but probably also not for the young adults we typically teach at universities.

And then there is a last thing over which I’ve recently been mulling a lot. A key point of a good text, written by a person, is that it expresses a sense of that person standing by the words they wrote, of taking a stance because it matters to them: because they want to correct what they see as a fallacy or wrong position, because it is connected to certain interests or values, because they care.

AI, in contrast, cannot care about anything because it is a machine and not a person: it has no vulnerabilities, no dignity, nothing that could be hurt. Insofar as it sounds emotional and engaged, it has copied that tone from texts written by humans who were emotional and engaged. Despite that copying, all too often – at least in the experiments I have done with AI so far – it sounded incredibly bland and indifferent, producing bullshit without accountability. I often couldn’t help thinking, about its tone: a privileged kid, a bit drunk and therefore overconfident, who grew up knowing their daddy will pay for the lawyer to get them out of whatever nonsense they produce with their indifference to truth….

Learning to write, as a human, also means learning to understand what one cares about, and what one is willing to take a stance on. It means learning to weigh one’s words, in written even more than in spoken contexts, because the words are there to stay (the same holds for spoken words that are recorded, of course). The texts may come to contribute to defining who one is, or at least how others perceive one’s public persona. There are still many settings in today’s world in which what you write can get you shunned, or unemployed, or even killed. In such cases, it takes bravery to stand up for one’s words – and yet it is precisely this courage that often leads to texts that really matter.

Maybe it is this attitude, the virtue of truthfulness and the courage to find the right words for what one really thinks, that our ways of teaching students should focus on much more? Then we can be quite sure that no AI will ever replace us.

* I very much enjoyed reading this text about the social nature of human intelligence.

{ 54 comments }

1 Kenny Easwaran 05.12.26 at 5:58 pm: This is a really important set of points! The texts that serve to show one has mastered some material, the texts that aim to open one’s soul to another person, and the texts that establish social bonds, really can’t usefully be produced mechanically. The texts that aim to communicate facts and ideas, or to help others solve problems, might be usefully produced mechanically. One category you didn’t mention that probably also fits in the first category is the kind of texts we produce as ways to improve our own understanding of something – this includes both things like notes (that we usually don’t show other people) but also often things like grant proposals and dissertation abstracts, which are nominally about something else. (And I think that this is one of the big things we want our students to produce.)

I do want to push back on a few things though. For one thing, I do think that “produce students who can produce good texts” is one point of a humanities education, though it’s certainly not the only one. There is value in producing good texts – especially as contrasted with superficially good texts (which are a lot of what AI produces).

More importantly, I don’t think it’s a fallacy that by using AI, students can learn faster. I think it’s a fallacy to think that by using AI to produce the assigned texts students can learn faster. But there are other ways of using AI that can enable faster (and better) learning. We all know that we can get a deeper understanding of a new subject from a one-hour conversation with someone who is familiar with it than we can from spending one hour reading a textbook. And we can get an even deeper understanding by then following that one-hour conversation with a one-hour task where we have to explain it to someone else who pushes back at various points. Students can use AI to have versions of these experiences – surely not as good on the former as an hour one-on-one conversation with the instructor, but probably better on the latter than assigning them to debate with a random other student in the class.

Of course, it’s rare for students to attempt to use AI in these ways, and even rarer for them (or an instructor) to have a good idea of how to set up the AI to effectively provide these kinds of interaction. Figuring out how to tap into this possibility of using AI to improve some parts of education, while figuring out how to avoid the role of AI in pretending to improve other parts, is important.
2 Ray Vinmad 05.12.26 at 8:49 pm: If I am understanding your argument, I believe I agree with you.

The paper is given value and critique in order to structure the experience of writing the paper for the student–so the student will value it, enter into the process, discover the intensity of creating something, struggle with it, etc.

It’s agonizing at times but also an achievement–and one develops skills at reasoning, persuasion, seeing ideas take shape in your own mind, discovering something either in a reading or in the world. Doing this gives people a sense of their own mind working–you are seeing who you are and what you are capable of. You can be curious about what your mind might come up with next.

Eventually anyone who tries it will get better and better at it, as one would always see with students who could only give us rudimentary papers in their intro class and complex and impressive papers in their upper level classes. (We can also look at our own rudimentary papers if we saved them from college.)

The ‘paper’ as the object wasn’t the point but the quality of it had to matter and the feedback on it had to matter in order to motivate and give meaning to the process. The professor is the audience for what you can do with your mind.

The assessment of a ‘thing you turn in’ wasn’t about the thing–it was about you doing the thing.

So the chatbots have wrecked that for the kind of students who dislike the struggle too much or don’t know what it is for–something the Gen AI mavens like to help them with by telling them it has no meaning. So they don’t try it (it is hard) and they never find out if it does have meaning.

These are generally the weaker students or the ones who are more gullible about what they are being told about AI–that in the future no one will need to reason, persuade anybody, struggle with their own thoughts or depend on their minds to assess ideas. They will simply need to ‘have something to say’ and the AI will check to see if the thing they say will pass muster with others as ‘the right thing to say.’ It needn’t come from them at all. They become close to superfluous, a passive recipient of information and not a thinking being.

I do think it’s a significant loss to many people to have so much interference with their chances to have this experience of delving into words and ideas and formulating something that arises from their own mind. There are still many ways around this interference in order to create some of what it was once fairly easy to give students by assigning a paper as the ‘big project’ at the end of the semester. Also, there are certain students one can still assign papers to – though it can demoralize a teacher a bit to have 10 students take up the challenge of writing a paper in multiple drafts, presenting, meeting, etc., while one student insists on trying to do this through AI even if class policy tells them not to, failing a number of assignments miserably because they cannot explain anything and the chatbot has hallucinated stuff, etc.

Also, other types of assignments, such as in-class writing and oral components, research narratives, etc., that require one to verbally explain and be accountable for a process can give a lot of the same effect that a big paper project was supposed to provide. This is why I like debate and presentation assignments (and students even like them though they can freak out about them initially). Most students get that same jazzed feeling about proving their mettle we got about giving our papers to the professor but it is about demonstrating the worth of one’s thoughts to the other students as well. Most do see the point of being able to think for themselves since just repeating what a computer tells you to say doesn’t motivate people the same way–there’s very little ‘you’ in what AI produces but people still have selves and interact with others. People still bake cookies rather than pick them up at the supermarket.
3 Alex SL 05.12.26 at 10:07 pm: Yes, I have concluded the same for my field (in biology): AI cannot replace us. But that doesn’t help much. If we get shut down, it won’t be because AI now does what we do, it will be because not enough people care about what we do. They will either have convinced themselves that AI can indeed replace us because they do not care if the outcomes of our work (be it education or research) are fit for purpose, accurate, and reliable, or they will have simply decided that society can do without our work, full stop.

And unfortunately, there is no guarantee that every society has to care about education, critical thinking, expertise and competence, its own history and high culture, the conservation of its environment, or sustainable economic practices more than it cares about not paying any taxes and paying as little as possible for any service or product ever. Just a few minutes ago I saw a news item about private colleges in Canada skipping basic elements of training and examining truck drivers, such as turning left at intersections. This isn’t even “can you explain what the author might have been trying to say here”; this is “I don’t want my child to be run over by an incompetent moving forty tonnes at terminal speed”. The kind of mind that accepts that corner-cutting as normal in their work life despite also being participants in traffic themselves will not understand why it might be a problem when everybody summarises 500-word texts and stops thinking. They simply will not grasp what you are saying if you try to explain it to them; it would be like teaching supply chain logistics to a cat.

“Will proud authorship claims be made about prompts rather than final products?”

We went through that ca. two to three years ago. The term “prompt engineer” was used non-ironically instead of as a sneer, and there were widely circulated cases of people ripping off the works of others through generative AI but also complaining that somebody else “copied my prompts”. All that was met with so much derision and shaming that hardly anybody does it anymore. They are still puzzlingly proud of their inane prompts (see Andreessen’s recent example of “you are very smart and educated and never make mistakes; do not hallucinate; don’t be woke”), but now the approach seems to be to share them publicly with each other for clout instead of treating them like a new protected class of intellectual property.
4 Jim Harrison 05.13.26 at 2:30 am: Figuring things out for yourself is the new cursive.
5 Dingbat 05.13.26 at 4:59 pm: In my first year of teaching, I had a seventh-grader blurt out in frustration, “just tell me what to write!”

Five years of teaching in, now, and it’s very much a process to get students to move from “Write this” to “Put thinking into this, and write to show what you thought.” Some get further than others in the time I have with them.
6 Liam 05.13.26 at 11:34 pm: This is fundamentally right that human text is purposeful in a way that AI text can by definition never be. Even bad text, semi-literate text, is distinct from perfect products of a language model, because it is authored. But I’m cautious of the lesson being drawn that the difference means human text can’t easily be substituted, to everyone’s loss: there’s plenty of examples in the history of capitalism where the ersatz has been good enough to displace the real article—and perhaps it’s more likely, automatic text competing with human text will just create classes of the commodity, at different price points.
7 engels 05.14.26 at 2:17 pm: “human text is purposeful in a way that AI text can by definition never be. Even bad text, semi-literate text, is distinct from perfect products of a language model, because it is authored”

What about Wikipedia pages?

I suppose you could say this about paintings vs photos, but the latter obsolesced the former for almost all purposes bar a few privileged niches.
8 Lisa Herzog 05.14.26 at 7:01 pm: Thanks for all the comments! Just to clarify, I’m not categorically opposed to any use of AI in pedagogy. Though for some of the points mentioned here, my spontaneous reaction is: wouldn’t it be nicer if students did this together with other students? And is there a risk that the availability of AI seems an easy way out of the pressure to make new friends at university? That would be a big loss, after all…
I’m also convinced that people will continue to write in order to learn to think. For a while, I was really convinced that this is the only way; at the moment I think that in-depth conversation can really also play an important (and maybe complementary) role.
Sadly, it is true what Alex SL says, that the value of humanities degrees also comes under pressure from those who simply don’t see the value of critical thinking and expertise – e.g. because they think that their religion or ideology already tells them all that needs to be known. I have some semi-formed thoughts about the challenges in this area, maybe I’ll write more about them in a future post….
And Liam’s point, about different texts for different classes – that’s something to be expected in many areas, AI for the masses, human service for the privileged (think of how banking is going…).
9 engels 05.14.26 at 11:38 pm: “Will proud authorship claims be made about prompts rather than final products?” We went through that ca. two to three years ago.

Andy Warhol was doing this in the 60s.
10 MisterMr 05.15.26 at 7:56 am: @engels 7
“I suppose you could say this about paintings vs photos, but the latter obsolesced the former for almost all purposes bar a few privileged niches.”

I think a more relevant analogy would be bicycle vs. motorcycle: if you need something to go from point A to point B, a motorcycle is more efficient; if you need something for doing physical exercise, a bicycle is better, as you will do more physical exercise.

Something similar goes for AI and art: if you compare art with sport, you can have bicycle races and motorcycle races, but you can’t really have a race where there are both motorcycles and bicycles.
In a similar way, you can see painting as art or photography as art, but you can’t really compare a painter to a photographer because these are two very different arts.
The problem with AI compared to e.g. illustration is that it is difficult to distinguish between a handmade illustration and one done by the AI, so it would be like a motorbike camouflaging as a bicycle, or a photo camouflaging as (and competing with) painting.

OTOH if we get out of this “art as sport” comparison, and e.g. I just need an illustration to put on the cover of my new book to sell it or something like this (where the “art” is not the actual content and is just decoration), there is no reason to pay for an actual illustration and so AI will kill the jobs for all those illustrators who used to work for this stuff (though for example many books just had a random old painting on the cover and that’s it, so the problem isn’t exactly new).
11 MisterMr 05.15.26 at 8:14 am: Following my previous comment, I think we should distinguish between 3 different types of AI use:

Instrumental: when the AI product is something we use for the final product but is not the focus of interest, like if I search for the recipe of tiramisù on the net, or if I need “art” for purely decorative reasons, like a cover of a book or random images on flyers; Wikipedia falls in this category IMHO.

Training: this is the problem in academia, when the purpose of a text or an exercise is not the text in itself but the fact that the student needs to train a certain skill (so obviously it’s the student who has to do the work, if the AI does it it makes no sense). This is comparable to doing pushups in the gym: the pushup itself isn’t the product, it is the effect of the pushup on the one doing it.

Humanistic: for lack of a better word. This is the situation where the text is important because it somehow expresses someone’s opinion, and this is the point of the text. This is the non-decorative, arguably non-entertainment conception of art and literature (art and literature as we study them at school, basically).

The question is how much and how often this humanistic kind of text is important. For example if we speak of literature, certainly the idea that literature expresses the author in some way is important in most conceptions of “art”; however, we generally study only a comparatively small number of texts, and most people aren’t “authors”, let alone famous ones.
In fact a large part of our scholastic concept of art consists in discriminating true art from lower art, but of course every person can potentially express themselves, so the scholastic concept of art also has limits.

Other situations where the “I’m expressing my opinions” is important are also dubious, like e.g. politics (in some cases it is important that political statements express someone’s will, but the specific authorship is not).
12 engels 05.15.26 at 6:11 pm: Also worth considering: “Writing is that neutral, composite, oblique space where our subject slips away, the negative where all identity is lost, starting with the very identity of the body writing.”
https://dn721906.ca.archive.org/0/items/TheDeathOfTheAuthor/the%20death%20of%20the%20author_text.pdf

(I could not possibly comment.)
13 Thomas Hewitt 05.15.26 at 6:29 pm: I’m thinking of a somewhat different field of learning: math, science, and learning/teaching programming. But I think the pedagogical decisions are similar. AI can definitely help learning if used properly. It can also be used as a crutch for knocking off assignments, with the only goal of the student being to get a grade. So a lot depends on the attitude of said student: whether he/she is interested in advancing their knowledge base, or in simply being awarded a certificate.
There is also the element of time and learning efficiency. I could comb through fifty pages of text, looking for a few salient facts. Maybe using AI to dig out those facts, saving me a couple of hours of my scarce time, will help me to learn even more. And prompt engineering is becoming an important skill; it shouldn’t be derided. It requires considerable time, effort, and thought to become good at it.
14 somebody who remembers the underground grammarian discussing the perfection of algebra 05.15.26 at 7:13 pm: Here are my own blinders at work. When obtaining a mathematics degree, it was very clear, very early, that writing out proofs and formulating answers to exam questions or project questions was not a test of the assertion in question – it’s quite rare that a mathematics exam will ask you to prove something that’s false! The critical determination was how you developed an approach to the answer, showing your understanding of the underlying subject by seeing which suppositions and connections could be drawn to the specific question. And the same was true of my philosophy degree; my Aristotle professor (the wonderful Julia Annas) knew more about Aristotle than I ever would. Her exams were not meant to produce text about Aristotle. She didn’t need a squirrelly undergraduate to do that, and this was extremely evident. We were tested not simply on our understanding of the facts of “what Aristotle thought” but on the consideration evident in our response. From the moment I entered college I never felt I had to, even once, produce what you might call an instrumental text, where I was asked to simply report the facts of something without demonstrating something about my own internal self. I must admit that this training led me to be devotedly repulsed by everything ever produced by the lying-and-asskissing machines the ultra-wealthy have crammed into our society like wet dynamite, exploding into suicides, homicides, and people insisting that they must eat three rocks a day along with their glue pizza. These fucking things can’t even count, and I’m supposed to be so dissatisfied with my own mind that I want them around? I can count! I can play a woodwind instrument!
The art all looks the same – garbage – the writing all sounds the same – complete trash – the music all sounds the same – bland, pathetic – and the only people impressed with it even slightly are the most crawling, diseased minds who have ever slithered into first class on an airplane. No wonder they think the rest of us should be thrilled with the glue pizza, yellow-tinted cartoons of ICE agents smashing an elderly woman into a wall, and nonconsensual porn of “the girl at starbucks who wouldn’t give me her number even though i was fucking NICE to her, and i have a COMPUTER job, a REAL job, grok!! i have a COMPUTER!!”
15 JPL 05.15.26 at 10:57 pm: There’s way too much money involved in just the simple aim of getting a college education. The immense amounts of money involved are all out of proportion to the straightforward task of learning the best way to think about things. It makes it impossible to “get back to basics”, like cultivating the intellectual passions. All that gets lost in the shuffle. Now the neoliberal managers say it has to be all about “data analytics”, quantifiable metrics and a centralized totalitarian administration of the money-making business. But papers students write are not just autonomous texts; it’s a matter of writing up the results of a passion-driven inquiry to understand a problem they themselves have identified as significant, and the office-hours discussions of these experiences are much more important than the “objective” mark you have to put on the paper for the bureaucratic process. Students these days seem to be more forced to be bogged down worrying about their grades than they used to be, instead of being engaged by the substance of the learning experience. I would venture to guess that this “objective” “data-analytics” approach will just tend to produce more of the kind of crackpots, incompetents and unserious people we have in the Trump administration, making a lot of money, but doing a lot of damage. The idea of the university as a lucrative money-making venture needs to be made obsolete.
16 JBL 05.15.26 at 11:30 pm: Ray Vinmad, “the kind of students who dislike the struggle too much or don’t know what it is for … they don’t try it (it is hard) and they never find out if it does have meaning. These are generally the weaker students or the ones who are more gullible …” resonates strongly with my experience this year teaching a proof-based linear algebra class to prospective math majors.
17 mw 05.16.26 at 11:27 am: “But why should an employer want to hire graduates who have just read the AI summary of these 500 pages?”

They shouldn’t, but that seems far from the best way to learn using AI. Think, instead, of interacting with AI as if it were a free, always available, infinitely patient expert. “How does x work?”, “Would y be a good way of thinking about it?”, “Do experts in the field agree that z is true or do they disagree, and why?”. Asking AI to produce a summary, reading it, and stopping there only begins to take advantage of what AI can do.

Having recently been thrust into the role of a trustee, I have been using AI and have learned a lot about trusts, contracts, appraisals, taxes, and so on. Learning has not been the primary point — figuring out how to handle fairly complex matters is the goal — but doing the same thing with a series of sample problems seems like it would be an excellent approach. I could imagine giving an assignment: “Here are 5 sample problems. Choose 3, pick the AI system of your choice, and craft your recommended solutions together. Post both your final recommendations and your AI transcript”
18 engels 05.17.26 at 5:43 pm: “Think, instead, of interacting with AI as if it were a free, always available, infinitely patient expert.”

…that occasionally spews complete bullshit.
19 Alex SL 05.17.26 at 9:25 pm: mw,

Leaving aside a towering stack of ethical problems (theft of IP for training, their use being effectively laundered plagiarism, and the ecological footprint), an expert is precisely what these models aren’t. A key problem is that they do not understand. And that is where that sentence ends. I have twice tried an LLM-based system that was specifically built for literature reviews and writing in science and advertised to me as “transforming how we do research”, and the results were in both cases equivalent to a clown car that has just been driven into a ditch while also on fire.

LLMs do not understand what the references they cite actually say. They use them like an extremely lazy author who vaguely realises they have to cite references to support what they want to say but doesn’t care if a reference actually supports the statement and hopes nobody will check. How do I know? Because I tried the system with queries where I already know the answer, and it cited my own papers to support statements that those papers would not support if anybody with understanding actually read them. Just last week, a colleague asked me for a few references on a method we have been using and, when I had listed some, wrote back that I “did much better selecting these than copilot”. I mean, yes, because I am not a model whose entire ‘reasoning’ process derives from being trained on variations of guessing what comes next after “veni, vidi”

LLMs stochastically predict text from text and/or images from text and/or images. This means that plagiarism and spam generation are their optimal use cases. Fortuitously for the LLM salespeople, it turns out that coding is a potentially lucrative field for earning money from blatant plagiarism, especially because code is a highly rule-based language and can quickly be checked for running without error. But none of that is the case in other applications such as summarising documents or doing research. Using LLMs for those kinds of purposes is today’s equivalent of people a few years ago trying to apply blockchain to everything because it was fancy and new.
20 J-D 05.18.26 at 6:18 am: … it’s a matter of writing up the results of a passion-driven inquiry to understand a problem they themselves have identified as significant, and the office-hours discussions of these experiences are much more important than the “objective” mark you have to put on the paper for the bureaucratic process …

Ursula Le Guin, The Dispossessed:

He was appalled by the examination system, when it was explained to him; he could not imagine a greater deterrent to the natural wish to learn than this pattern of cramming in information and disgorging it at demand. At first he refused to give any tests or grades, but this upset the University administrators so badly that, not wishing to be discourteous to his hosts, he gave in. He asked his students to write a paper on any problem in physics that interested them, and told them that he would give them all the highest mark, so that the bureaucrats would have something to write on their forms and lists. To his surprise a good many students came to him to complain. They wanted him to set the problems, to ask the right questions; they did not want to think about questions, but to write down the answers they had learned. And some of them objected strongly to his giving everyone the same mark. How could the diligent students be distinguished from the dull ones? What was the good in working hard? If no competitive distinctions were to be made, one might as well do nothing.

“Well, of course,” Shevek said, troubled. “If you do not want to do the work, you should not do it.”
21 J-D 05.18.26 at 6:20 am: The idea of the university as a lucrative money-making venture needs to be made obsolete.

And we need unicorns!
22 John Q 05.18.26 at 7:04 am: Engels @18. It wouldn’t be a very good simulation of a research assistant if it didn’t occasionally spew complete bullshit.
23 Tm 05.18.26 at 2:46 pm: Turns out the kids are all right? They might have better judgment than 90% of university administrators…

“Former Google CEO Eric Schmidt was booed throughout this commencement speech at the University of Arizona for his praise of AI. This comes just a week after another commencement speaker who mentioned AI was booed at a school in Florida.”

https://bsky.app/profile/404media.co/post/3mm2ivguvq22x
24 agcooper 05.18.26 at 3:21 pm: @14: Julia Annas’s “An Introduction to Plato’s Republic” is the best kind of textbook. In the same vein as you’re saying, it’s a guide to learning how to understand the text as opposed to mere exposition. I don’t know if it’s typical to refer to textbooks as enriching but this one certainly was.

As for the bulk of your post, my response would be a non-debate stimulating: “I wholly concur”.
25 Alex SL 05.18.26 at 10:35 pm: The beauty of social media is that after yesterday seeing a post linking to a substack that does the centrist both sides are wrong dance, arguing that there is a bubble but that LLMs are revolutionary:

“The revolution is real. The technology is going to transform the production of software, the operation of businesses, the texture of everyday life. The fact that the current capital allocation is misaligned does not mean the technology is a fraud.”

…I today get a post from a scientist who demonstrates that his institution’s “new lab-sanctioned chatbot” reproducibly produces fake stats including P values and correlations of different factors and a discussion of the implications of those stats if he puts the name of an excel sheet into the prompt but does not give the text prediction model any actual excel sheet with data. This being a text prediction model instead of an intelligence that thinks, wait a second, there are no data, all the stats are, well, text prediction outputs, or as the media politely say, hallucinations.

And that is not surprising at all. Again, sorry for the repetition, but this is clearly not appreciated widely enough: LLMs are stochastic text (or image) predictors. That’s it. Yes, when we see something responding in language, we are extremely good at fooling ourselves into concluding that there is reasoning in it, just like we can see faces in clouds or rock formations (and for the same reason). Yes, if some company like Anthropic puts an LLM into a ‘harness’ that checks in the boring, old non-LLM way that actually works whether the code the LLM spits out runs without error and shows the expected behaviour, unskilled developers can create enormous amounts of thoughtless spaghetti code that runs, very quickly, and that seems magic and revolutionary.

None of that means that these things can be trusted with research or with writing reports or with analyses or with providing information or with anything that cannot be checked so quickly and easily that we can just as easily do it ourselves without the LLMs. Because they just guess (albeit extremely well!) what text might be a reasonable-sounding response to the text we gave them as a prompt instead of doing research or understanding anything. And also, their reactionary billionaire creators have the finger on the post-training and system prompt scale to make those responses more helpful to their interests, so that can’t be good for research and for relying on them for information either.
26 mw 05.18.26 at 10:47 pm: AlexSL @ 19 “LLMs do not understand what the references they cite actually say. They use them like an extremely lazy author who vaguely realises they have to cite references to support what they want to say but doesn’t care if a reference actually supports the statement and hopes nobody will check. ”

Asking an AI for the best references to support a point isn’t something I’ve used models for, but apparently it’s not a good use case, and the AI model will tell you so — but it will also suggest what to do instead (don’t just ask for a list of relevant papers, but also ask for an explanation of why they are relevant). The best approach to using AI in this is going to require back and forth and active participation by the researcher, not just the ‘lazy’ pasting the answer of the first query. Would it be better to ask the expert in the office across the hall who happens to have just the right expertise in the area? Probably — but how often is that going to be a viable option? Here’s what CoPilot/ChatGPT has to say about using AIs for references:

1. Why “explain why this paper is relevant” works better

When you ask an LLM:

“Give me references on X.”

…it will often hallucinate or pick papers based on superficial keyword associations.

But when you ask:

“Give me references on X and explain why each one is relevant.”

…you force the model to perform a second layer of reasoning. This does three things:

It exposes hallucinations (the explanation won’t make sense).
It forces the model to check internal consistency.
It gives you a way to evaluate whether the citation is actually appropriate.

This doesn’t eliminate errors, but it dramatically reduces blind hallucination.

And once you’ve identified your target papers, if you provide the text, AI models are good at analyzing and critiquing them, pulling out relevant excerpts, etc.
27 J-D 05.19.26 at 2:07 am: In this blog post, Bret Devereaux describes asking ChatGPT the relationship between two books: the response missed the fundamental fact that one book was intended as a rebuttal to the other:
https://acoup.blog/2023/02/17/collections-on-chatgpt/
In addition, when asked to list the most important critics of the second book, the response omitted the most important critics, presumably because they weren’t mentioned in ChatGPT’s training dataset.
28 agcooper 05.19.26 at 10:18 am: @Alex SL: as, I believe, a biologist, do you have any opinions on AlphaFold (I appreciate this may be meandering slightly away from the core subject of this thread)? In my unqualified opinion, that seems to be one of the more significant contributions from this new age of “AI”. Also, it’s no coincidence that while it partially uses LLMs, like with all of Google DeepMind’s models, there are also bespoke non-LLM machine learning elements which are necessary and sufficient to address the task at hand. While last year’s Nobel Physics prizes awarded to “AI” pioneers made no sense to me, including Google DeepMind’s Demis Hassabis and John Jumper among the Chemistry Laureates seemed like suitable recognition for their achievements. That said, I should stress I have no professional qualifications to substantiate such a judgement; I’m just going on conversations with other biologists and the headline description “AlphaFold, ~200 million protein structure predictions”.
29 mw 05.19.26 at 6:37 pm: J-D @27

I don’t doubt Bret Devereaux’s experience, but when I posted his question into ChatGPT powered CoPilot (the current, bog-standard free version), I got the response below (including immediately pointing out that one book is a rebuttal to the other). In the world of AI models, 3+ years ago is a long time. Even so, does the response given now still suck? If so, why?

How the Two Works Relate
Their relationship can be summarized in three ways:

A. Isaac as a direct critique of Luttwak
Isaac’s book is widely understood as a systematic dismantling of Luttwak’s framework. He challenges Luttwak’s use of evidence, his assumptions about Roman rationality, and his interpretation of frontier archaeology.

B. A debate about how to read Roman evidence
Luttwak reads patterns in fortifications, troop deployments, and imperial actions as signs of strategic planning.

Isaac argues that these patterns are coincidental or driven by non-military factors.

C. A methodological clash
Luttwak: Top-down, theoretical, influenced by modern strategic studies.

Isaac: Bottom-up, skeptical, grounded in ancient ideology and the limits of administrative capacity.

Why This Debate Matters
The Luttwak–Isaac debate shapes modern scholarship on Roman imperialism:

It forces historians to question whether ancient states had the capacity for long-term strategic planning.

It highlights the danger of reading ancient evidence through modern frameworks.

It has influenced frontier archaeology, military history, and studies of imperial governance.

Concise Summary
Luttwak argues that Rome had a rational, centrally planned grand strategy; Isaac argues that this is an anachronistic projection and that Rome’s actions were inconsistent, locally driven, and ideologically motivated. Isaac’s work is essentially a rebuttal to Luttwak’s model, and the two books represent opposing interpretations of how the Roman Empire managed its frontiers and military power.
30 Alex SL 05.19.26 at 9:35 pm: And once you’ve identified your target papers, if you provide the text, AI models are good at analyzing and critiquing them, pulling out relevant excerpts, etc.

That is not my experience. As mentioned, a system specifically designed for literature review and research did not understand what papers actually said or meant. I just pulled one of my attempts up again to refresh my memory. In one case it cited two papers in support of the statement that some species previously considered to be native to Australia have now been demonstrated to be introduced/exotic, but the second of the two references was about reclassifying undeniably Australian native species from one genus to another and was not based on genetic data either. In another, the agentic whatnot rambled vaguely about some populations of a species displaying “cryptic diversity or morphological intermediacy that may reflect ongoing hybridisation or parallel introductions” and referred to the same obscure, high-level taxonomic classification paper that it already misused above plus a DNA sequence dataset that I had published on a data repository. If I was peer reviewer or editor for a manuscript doing that I would have to have some words with the authors.

To my understanding, LLMs are fundamentally incapable of reasoning, analysing, and critiquing, and no surprise. Sorry to repeat myself again, but an LLM is, while very, very large and complex, ultimately still an in -> out network of weights trained to guess what word is missing in “It was the best of times, it was the _____ of times”. They cannot reason, second layer or first. They merely simulate communication, nothing more, and then the human mind fools itself into thinking that something more is going on because what is simulated is words from human language. Expecting them to analyse and critique is equivalent to expecting a search algorithm to model protein folding; it can’t, it isn’t designed for that, it is a search algorithm. Same for LLMs; we need to understand that they are stochastic text guessers.
31 J-D 05.20.26 at 9:49 am: It wouldn’t be a very good simulation of a research assistant if it didn’t occasionally spew complete bullshit.

It wouldn’t be a very good simulation of any kind of human being if it didn’t occasionally spew complete bullshit.
32 J-D 05.20.26 at 9:57 am: Even so, does the response given now still suck? If so, why?

How would I know? Why are you asking me? Bret Devereaux might have an answer to that question; I do not.

I know this much, though; Bret Devereaux explicitly said in the post that some things which had not been achieved at the time of writing might be achieved at a later date. That didn’t change the fact that claims that were being made at that time about what had already been achieved were inaccurate. My conclusion about what has been achieved now: more than had been achieved three years ago, but not as much as boosters like to claim. Remain suspicious of hype.
33 engels 05.20.26 at 3:26 pm: Democracy dies in drivel.

ChatGPT and other AI bots made huge errors before Scottish election, study finds
https://www.theguardian.com/technology/2026/may/20/ai-chatbots-chatgpt-replika-grok-gemini-misinformation-scottish-election-demos

ChatGPT, the most heavily used AI service, gave wrong information in 46% of its answers, including making up an expenses scandal, giving inaccurate replies on voter eligibility rules and getting the date of the election wrong by two months.
34 mw 05.20.26 at 5:47 pm: Alex SL @30. “That is not my experience.”

But you’re not describing the same thing. LLMs ‘know’ about works they have been trained on, and that’s what they’re depending on when you ask them for citations. But they don’t preserve anything at all close to verbatim copies of every training document in their models. However, if you provide a copy of the text (or part of the text or the abstract, etc) directly into their context window (by pasting or uploading), then — during that session only — they DO have access to the text and can provide a much more detailed analysis with greater fidelity, and you can easily check the work (if the AI pulls out a quote from page 22, you can easily verify that it’s there). That’s more effort obviously, but it is a viable way of getting to trustworthy results.

engels @33 The old admonition was not to automatically believe anything you read on the Internet. The new one is not to automatically believe anything that an AI model tells you. You could add, “Don’t automatically believe anything an expert tells you” too (they are sometimes biased, misinformed, outdated, etc). That doesn’t make the Internet, AI, or experts useless.
35 Alex SL 05.21.26 at 11:48 am: agcooper,

I am not in the protein folding field, so I have no direct experience with it. In my own research, I have used CNN computer vision models. One of the great advantages they have for science is that their outputs are reproducible: if I show such a model an image, and it says this is species xyz with 89.127% confidence, and then I show it the same image another nine times, it will say this is species xyz with 89.127% confidence another nine times. I have also done (and published the results of) a little test project on using LLMs for extracting information from text and processing it downstream with Python scripts. One of the great disadvantages they have for science is that their outputs are not reproducible, and they have a habit of ignoring direct, even very simple instructions such as “do not skip any table rows”, which made my downstream Python scripts fall over.

So, I must admit to ignorance regarding protein folding models, but there is my take on LLMs in research: I expect scientific workflows to be reproducible. That is kind of one of the things that makes them science! Apart from that, I am exasperated at the number of papers even in my field that publish a way of doing with an LLM what we did much more efficiently twenty years ago using approaches like heuristic search. Long solved problem, meet somebody who wants an entry in their publication list that says “AI”.

mw,

I am not sure I understand. If there is a single text that is small enough to fit into the context window, why wouldn’t I just read it myself*? And even if that is some kind of time saving and ethical, I still wouldn’t trust the LLM to understand what it sees. I have not done careful experiments on that myself, but I have read a few indications in the writings of others who apparently looked into it that LLMs do not really analyse and summarise but merely excerpt from the larger text. And again, my direct, personal, lived experience in my own work is that they do not understand anything but merely produce language-shaped guesses. Nearly every time I use them, they fail me. They don’t understand the contents of papers, simple Python scripts need six or so iterations of correcting the AI to get it right, and just a few months ago I even still got an image of a person who had three hands. This string of failures includes cases where I try to ask for information from the weights and cases where I put things into the context window to extract information from. And those are the good models, like ChatGPT or Claude; when I try running open weights models on device in the hope of lower environmental footprint, they perform much more poorly and sometimes just output sixty pages of gibberish.

*) I recently saw with some exasperation that a full quarter of the web page of a journal whose paper I was checking out as a potential reference was covered by a sparkly AI frame that suggested various buttons to press to variously summarise the contents of the paper or ask questions about its contents. Even leaving aside my general distaste at constantly getting LLMs shoved into my face against my will, I genuinely struggle to understand what the purpose of this is apart from wasting electricity. The paper had the admittedly old-fashioned feature we in the business call an “abstract”, which is a perfectly fine summary that makes an AI summary unnecessary. It also had a “highlights section”, which was in itself a deeply idiotic attempt by some corporate suits to reinvent the concept of a summary despite abstracts already existing. Now we have three things all trying to be summaries, only the LLM wastes untold times the resources that the other two do, and it is the one out of the three that I would not trust because it wasn’t written by the people who understand the paper best, the authors. What are we doing here? Is everybody going insane?
36 engels 05.21.26 at 9:55 pm: The old admonition was not to automatically believe anything you read on the Internet.

Yes I remember that well: “actually it’s good that the internet is full of BS: it teaches people to read critically and check everything they see on ‘net against trusted sources… which by definition won’t be those blogs, social media, etc but the very “legacy media” they are putting out of business… and it won’t be a problem when they do because HEY LOOK IT’S THE GOODYEAR BLIMP!”
37 mw 05.22.26 at 11:58 am: Alex SL@35 “I am not sure I understand. If there is a single text that is small enough to fit into the context window, why wouldn’t I just read it myself*?”

Because the context window is not small — even with the free versions of current LLMs. Gemini says, “As of May 2026, the free versions of major LLMs offer competitive context windows, often surpassing 100,000 to 1 million tokens. While maximum context windows are large, practical PDF analysis is often constrained by file size limits (usually 30-100 MB) and “effective” context limits, where performance degrades on very large documents.”

A token corresponds roughly to a word in a text document, so at this point you can upload very long texts.

“And even if that is some kind of time saving and ethical, I still wouldn’t trust the LLM to understand what it sees.”

Do you worry about the ethics of running a paper or a book through a tool that calculates word frequencies or translates a foreign language work into English as you do about using an LLM to analyze a document? If not, why not?

And are current LLMs more or less trustworthy in this regard than, say, a typical research assistant?

engels @36. The legacy media have been in steep decline for decades — long before LLMs were even an idea. Craigslist, for example, was the original ‘killer app’ for local newspapers (classified ads were a major revenue source), not ChatGPT.
38 Alex SL 05.22.26 at 10:43 pm: mw,

I am surprised that you are asking about the ethics, because there has been a lot of public discussion about this over the last few years. Concerns include but are not limited to:

The ecological footprint of using LLMs, including not only CO2 emissions but also the effects on people living near data centres. From my personal perspective, it offends me how much of LLM use is in areas that already had much more energy-efficient and perfectly fine solutions long before LLMs, like web search or translation.

The entire technology being built on massive and blatant theft of intellectual property and clearly not being economically viable if the creators were to be properly compensated. (Whether it is economically viable at all even if they are allowed to get away with theft several orders of magnitude larger than what open science advocates or computer game bootleggers were driven into suicide for is still an open question, because all frontier model use is currently heavily subsidised, and all these companies are making billions in losses on LLMs.)

Feeding text into an API call to a model makes the text available to the API provider, potentially against the wishes of the author. This applies to published books, poems, or research papers but even more so to situations like peer review of manuscripts or grant proposals, where the assumption is that it is done in confidence, with the risk that ideas could be stolen before publication.

Beyond that, the societal effects of LLM adoption like deskilling people, the spread of AI psychosis, and the flooding of eBook stores, scientific journals and grant programs, video repositories, and the internet overall with disinformation and slop. This last one is perhaps not as directly relevant to your use case, but I would say it falls under wanting to avoid the normalisation of LLM use out of concern for one’s society and the general welfare of humanity, thus also an ethical question tangential to your use case.

I am not counting here that LLM use for writing is plagiarism, because, again, that is not what your use case of asking the LLM to simulate that it understands a text is about. But in a broader sense, not understanding something, asking the LLM to summarise it, and then pretending to understand it to one’s peers, reviewers, or stakeholders when one is merely repeating the LLM summary is IMO at the very least somewhat intellectual fraud adjacent.

I know nothing about you and am not claiming that that is what you do, but you must certainly understand that that is what a lot of other people will do if they think, mw is right, I can ethically have the LLM explain stuff to me that I am too lazy to read myself, just look at how long this is, come on now. For evidence that that is so, there has been an entire discourse cycle on BlueSky in the last week over whether researchers should actually read the references they cite in their works, and the ‘no’ side exhibits the same exasperated confidence that they are right as the ‘yes’ side. No shame whatsoever, nor any understanding of what their job as researchers is, apparently.

And are current LLMs more or less trustworthy in this regard than, say, a typical research assistant?

Bit difficult to answer for me because I have never had a research assistant and am not entirely sure what they are. I lead a team of other scientists and technical staff, and if I need to understand something, I read it myself instead of outsourcing my thinking to one of my colleagues, who have their own projects to work on or a lab to run. I know some of our team may occasionally use LLMs for coding assistance or for checking a text they wrote themselves for errors or suchlike, but unless I am very mistaken, we all want to understand what we are doing. I genuinely struggle with the idea of making either an inexperienced assistant or a word predictor bot run by a psychopathic billionaire do the core thing that I am paid to do and that my reputation and self-esteem hinge on: being an expert.
39 J-D 05.23.26 at 12:30 am: Do you worry about the ethics of running a paper or a book through a tool that calculates word frequencies or translates a foreign language work into English as you do about using an LLM to analyze a document?

I have used Google Translate myself, recreationally, and don’t think it was unethical to do so. What I do worry about is people placing greater reliance on machine translation tools than is justified by current performance (as opposed to speculation about what will be possible three, five, or ten years from now). The official practice of my employer (the University of New South Wales) has been to require documents presented in languages other than English to be accompanied by translations prepared by NAATI-accredited translators. Of course it would be physically possible to just accept the documents and then run them through Google Translate or something similar, but there’s good reason not to offer that as an official option.
40 mw 05.23.26 at 11:46 am: AlexSL @38. “Feeding text into an API call to a model makes the text available to the API provider, potentially against the wishes of the author.”

I’m aware of the general ethics arguments. I was specifically asking about your concerns about uploading documents for analysis. You can opt out of having LLMs use any text you provide or data you upload for training (though you may have to opt for a paid version to do that). I’m aware of a case of using LLMs on medical records (for translation and also some basic analysis), and an LLM that guaranteed privacy protection and no retention of uploaded data was required. By the same token, if you use an automated translation tool (LLM or not), the same concerns apply — is the site retaining a copy of the work you’ve uploaded?

As for ‘theft of intellectual property’ — that’s not how copyright works. As a copyright holder, you are protected from people illegally reproducing your works (beyond the limits of ‘fair use’). You are not protected from having people read, learn from, remember, and explain your works. The analogy is obvious — LLMs are reading and learning from copyrighted works, not reproducing them (which is why LLMs can’t provide you verbatim excerpts). One thing LLMs can do is an automated version of Cliff Notes, and Cliff Notes has not been found liable for copyright infringement. Now, you could argue that we should not treat human readers and AI readers the same, and we should expand copyright law to ban the practice of LLMs reading copyrighted works for training. But that is not covered by existing copyright law. Some people have started including ‘No AI Training’ notices on their works, but compliance is voluntary at this point. Perhaps respecting ‘no training’ should be codified in law? But as an author, would it make sense to opt out, given that AI queries are rapidly becoming the most likely way people would find out about your works? The analogy to the same question with respect to search engines ~20 years ago is clear.

“I can ethically have the LLM explain stuff to me that I am too lazy to read myself,”

Surely it’s a question of time rather than laziness? If an LLM leads you to a reference, should you verify that the reference is truly relevant? I’d agree that you should. Must you give the entire paper a close reading as if you were a reviewer? No — that seems like overkill. So upload the paper, ask for the relevant pages/excerpts to support the point. Read those to verify. Read the abstract. Check the journal, etc.

One benefit of finding references this way is it may level the playing field somewhat, and lead to more awareness of works that aren’t published in the leading western journals and to more citations of works other than your ‘usual suspects’ and the ‘usual suspects’ found in the references in papers by other well-known western researchers — out of, you know, ‘laziness’.

J-D @39. Don’t you expect that NAATI-accredited translators are using machine translation too (at least as a first pass)?
41 engels 05.23.26 at 1:02 pm: Anyway to go back to the original post: no, the text is not the product of higher education. The product is the thing that produces the text: the human cognitive labourer. Unfortunately it is that very product that is already being made unsaleable in huge numbers by AI; as others have pointed out, not because AI’s better but because it’s cheaper.

On the plus side, there’ll always be a few rich people who’ll want to read philosophy for fun (I hear Marcus Aurelius is popular with the tech titans…)
42 MisterMr 05.23.26 at 8:29 pm: @mw 40
“You are not protected from having people read…”

This is a bit of a pedantic point because I don’t think this is the core issue, but:
Some 30 years ago I had a copyright law exam in Italy and yes, legally you have to pay copyright to read stuff, but it is included in the price you pay (e.g. when you buy a book), at least under Italian/EU law.

A logical consequence of this is that if I buy a book I can lend it to close family, but in theory I couldn’t lend it to a friend. This part of the law is ignored by everyone, however this is the reason libraries have to pay more than the normal price for books.

I’m not sure of USA law but I suspect it is the same because some decades ago the USA largely adapted its copyright laws to the EU ones, that were more capitalist friendly.
43 J-D 05.23.26 at 9:20 pm: My ethical concern would be, as I’ve already indicated, not about whether accredited translators are making use of machine translators but about the level of reliance they’re placing on them and whether it’s justified by the actual current performance of those automated tools.
44 Alex SL 05.24.26 at 3:17 am: mw,

This is going increasingly off-topic and in circles, so as concisely as I can (which may not be very, sorry):

if you use an automated translation tool (LLM or not), the same concerns apply

Good point; these days, one should probably assume that any storage or transfer of electronic text comes under some small print in a user agreement somewhere that allows a software company to steal the contents. But I was comparing not Google Translate with Gemini, but, for example, reading the PDF I exported from a journal’s editorial manager system with plopping that PDF into ChatGPT to have it do the peer review in my stead.

As for ‘theft of intellectual property’ …

I am not an IP lawyer, but we are not talking here about a human reading and learning from copyrighted works. We are talking about commercial companies building a commercial product whose intended and most appropriate use case is plagiarism of the works they trained the model on. You may use it only for having it simulate that it understands a text, but many others are using it for what LLMs are actually optimised to do, which is generate text, which means plagiarism and spam, be it slop websites like Botanical Realm, slop eBooks, slop grant applications, or slop journal articles like that recent one in Alexandria Engineering Journal 123 (2025) 448-459 (still not retracted, it seems, which is just amazing).

And the vision all these companies are very openly moving towards is one where the bot gives you an answer based on somebody’s work without leading you to that work, because you going off to read the original source reduces your time in their system when they can show you ads and/or make you use more tokens; see Google’s recent announcements. There is no analogy to search engines, which are (were?) merely indexing works so that a reader can find them.

So, again, not an IP lawyer. It is well possible that you are right, and the theft is all fine under the law. However, I was talking ethics.

Surely it’s a question of time rather than laziness? …

As with the first point, we are slightly talking past each other. No, I do not have to read an entire paper up to the last word if I use it as a reference. But I also do not need an LLM to tell me where the relevant pages are. I can, yes, read the abstract, read the methods and results sections, look at the key figures, or skim the introduction and discussion for other useful references. I fail to understand where the LLM adds value unless perhaps we are talking about a book of 500 pages instead of a paper of 20 pages, and the former are rarely relevant in my field except as undergraduate teaching materials. That being said, when I recently did have reason to build an argument in one of my papers on an entire book, I borrowed it from our site library and read most of it.

You do you. But I will draw my own conclusions about how seriously I can take and how much I will respect one of my colleagues if and when I realise that they rely on LLMs to summarise the literature for them instead of reading and thinking themselves.

it may level the playing field somewhat

It is unclear to me how an LLM, which is (a) by design regurgitating the most probable response (it is a probabilistic model!) and thus converges on the common and the mean and (b) post-trained and harnessed to serve the needs of the company that hopes to earn money from it, would be more likely to lead me to previously overlooked papers in an Indian or Bolivian journal than a keyword search with an engine like Google Scholar. The opposite seems more plausible.
45 engels 05.24.26 at 10:13 am: This connects what Alex has been saying with what I’ve been saying:

https://www.bbc.co.uk/future/article/20250611-ai-mode-is-google-about-to-change-the-internet-forever

Numerous analyses have found that AI Overviews appear to cut the amount of traffic Google sends to websites – known as the “click-through rate” – by between 30% and 70%, depending on what people are searching for. Analyses have also found that some 60% of Google searches are now “zero-click”, ending without the user visiting a single link.
46 engels 05.24.26 at 10:34 am: Perhaps this is simplistic but it seems to me LLMs are inherently bad at giving sources for their claims because they are not actually reasoning: that poses problems for the economy of intellectual credit and for our collective epistemic health.
47 mw 05.24.26 at 11:24 am: engels @ 41 “Unfortunately it is that very product that is already being made unsaleable in huge numbers by AI; as others have pointed out, not because AI’s better but because it’s cheaper.”

Well, of course. As Schumpeter put it, “The capitalist achievement does not typically consist in providing more silk stockings for queens, but in bringing them within reach of factory girls in return for steadily decreasing amounts of effort.” There are countless things that once were best produced slowly and exactingly by highly-trained artisans rather than via machines and mass production (though even afterwards, a few highly trained artisans may continue to find work crafting bespoke custom goods for the wealthy). But making the broad array of goods and services affordable to the masses usually involves accepting something less than the finest possible quality (though, that said, some of the most important modern goods are impossible to provide except through precision automated mass production).

MisterMr @42 “Some 30 years ago I had a copyright law exam in Italy and yes, legally you have to pay copyright to read stuff, but it is included in the price you pay…I’m not sure of USA law but I suspect it is the same”

Interesting. As far as I can tell, that is not the case in US law. The ‘First Sale Doctrine’ means that once you buy a physical copy of a book, you can legally give it away, resell it, or loan it to as many people as you like (so long as you’re not creating additional reproductions).

Alex SL @44 “We are talking about commercial companies building a commercial product whose intended and most appropriate use case is plagiarism of the works they trained the model on.”

But that’s not the case, not as plagiarism has been understood up until the present. LLMs are generally incapable of producing verbatim excerpts from their training, they do not claim any of the text they produce to be their own original work, and they do include references. If you go to Chrome right now and enter, “How has plagiarism been traditionally understood?”, the response includes references (yes, to articles about plagiarism). But I somehow suspect that your complaint here really isn’t that LLMs are insufficiently consistent in providing citations. If you don’t think book reviews or Cliff Notes are plagiarism (both published by private companies seeking profits), then it’s hard to see why the output of LLMs should be considered plagiarism.

“be it slop websites like Botanical Realm, slop eBooks, slop grant applications, or slop journal articles”

We didn’t need AI for there to be slop of all these types: ‘Creation Science’ websites for example, very low-quality journals that accept virtually anything, and vanity book publishers. LLMs do make it easier to produce these kinds of things, but it’s an old issue, not a new one. And it’s not as if it’s all downside. For example, consider an ordinary person who’d like to produce a memoir. They’re not famous, and the book probably wouldn’t be of interest to more than maybe a few friends and family. And they’re not even a passably good writer, and not wealthy enough to pay for a ghost. Might they not, using a voice interface, sit down with an LLM and put something together that they simply could not have managed otherwise?

“And the vision all these companies are very openly moving towards is one where the bot gives you an answer based on somebody’s work without leading you to that work”

Again, try the little experiment I suggested. The links are there; whether or not you click on them is up to you. It’s certainly easier to do so than it was historically to track down references in a published journal article (which usually involved at least a new trip to the stacks if not writing to authors to ask for a reprint and waiting for it to show up in the mail). It hasn’t been that many decades since none of this was electronic and you might have even found yourself having to mess with microfiche.

“The opposite seems more plausible.”

But even if an LLM doesn’t do this without prompting, you can easily ask, “Please provide some of the most interesting results in medicine that were published recently in Indian or Chinese journals.” Again, the output includes many relevant links. How easily could you have done the equivalent before AI?
48 Alex SL 05.24.26 at 10:16 pm: Here is a fun Bluesky post I saw over breakfast today, by @sashagusevposts.bsky.social:

“I assigned random gender/ethnicity labels to scientific abstracts from the literature and then asked Claude to do a thematic analysis. Claude identified a clinical versus computational split for female/male authors and a DEI focus for Black/URM authors. All in completely random data.”

This is not surprising, because the prejudice that such patterns exist is presumably all over the training data. What is surprising is that anybody thinks these models can be used to train, educate, analyse, or research. This goes far beyond “AI, in contrast, cannot care about anything” and straight into you cannot trust them to do anything where accuracy matters. And Claude is what most people would currently consider the best one available, including some of my uncritically AI-enthusiastic colleagues!
49 mw 05.25.26 at 12:13 pm: Alex SL @48 And if you performed the same experiment with humans rather than Claude, would you expect them to be more or less likely to produce the same results? Personally, I doubt that they would do better.

Can we reasonably expect LLMs to outperform humans in this way when the knowledge that makes up LLM training data is human knowledge to begin with? Well maybe we can (at least eventually) — the AI companies already do apply post-hoc filters on the output to prevent models from saying politically incorrect things. And, of course, with humans, the approach to prevent this sort of bias would be to blind them to the personal characteristics of the authors. Wouldn’t you expect the same to work with LLMs?
50 engels 05.25.26 at 12:29 pm: making the broad array of goods and services affordable to the masses usually involves accepting something less than the finest possible quality

In this case: truthiness instead of truth, slop instead of music, simulated flattery instead of professional concern, …
51 MisterMr 05.25.26 at 2:22 pm: @mw 47

“Interesting. As far as I can tell, that is not the case in US law. The ‘First Sale Doctrine’ means that once you buy a physical copy of a book, you can legally give it away, resell it, or loan it to as many people as you like (so long as you’re not creating additional reproductions).”

After a quick googling, it seems that it is true that the First Sale Doctrine in the USA does apply to lending, whereas in the EU it doesn’t (you can resell but not lend a book you bought).
That said, in strictly literal terms “reading” a book is still protected by copyright, it just happens that in the USA the lending right is also automatically assumed for the first sale; in a pure legalistic sense, if you read a book that someone forgot in the park you are committing copyright infringement.

All this for pedantry, because nobody in the EU checks if you are lending your books to your friends; at most the problem is for libraries.
52 Alex SL 05.25.26 at 10:26 pm: mw,

I assume if I re-write your last post under my name but make a few changes to every sentence like replacing “But that’s not the case — not as plagiarism has been understood up until the present” with “No, plagiarism hasn’t been understood like that, up to now”, you will conclude that I am not plagiarising you? There have been more blatant cases, with genAI regurgitating entire sections of training data, the most famous case being with the New York Times, IIRC. And note also that the end user will not know if the words in the text they slap their name on have been reshuffled sufficiently compared to training data to fool a plagiarism investigation or if they have not, unless they do so much work to ensure that they could just as well have written the paper or proposal or book themselves.

We didn’t need AI for there to be slop of all these types

I am getting extremely tired of this argument. That humans sometimes make mistakes or do bad things does not mean that I have to like and use a bot that makes more mistakes and does more bad things at much greater speed. Imagine a large machine that increases our electricity prices by half, that hurls free meals into every direction for miles so that we may slip on them on our way to school or work if we aren’t careful, and 40% of the meals are inedible or poisoned or contaminated, and reacting with “well, some human cooks are also bad”. To stay more on topic of the OP, imagine accepting that level of automated, high-throughput failure in the education of your own children.

The links are there

“are very openly moving towards”

How easily could you have done the equivalent before AI?

We are clearly living in completely different worlds. I go to Google Scholar, and instead of “Please provide some of the most interesting results in the systematics of the genus Acaena that were published by researchers in developing countries”, I type “Acaena taxonomy”, and I get pages of results. The second entry in the results as currently presented to me is from Chile. There is no problem to be solved here, just as genAI wasn’t necessary to record an autobiography using voice-to-text. Except for the slop generation and, admittedly, the slop code generation*, which truly seems to make a difference to many coders, we had everything you describe before ChatGPT 4 blew up in public consciousness and the companies behind these tools dropped their climate goals to build the slop production infrastructure.

*) I have seen a large chunk of code generated by a colleague using an LLM and was impressed (derogatory) by how difficult it is to read and how non-modular it is.
53 mw 05.26.26 at 10:59 am: AlexSL @52 “There have been more blatant cases, with genAI regurgitating entire sections of training data, the most famous case being with the New York Times, IIRC.”

To induce them to do this, it was necessary to prompt with multiple paragraphs of the original articles. Since then, AI companies have implemented many changes to prevent this, but apparently there are still some more artificial techniques to get similar results (although it is increasingly difficult to do). But LLMs remain able to summarize individual articles with good completeness and accuracy. That, however, is clearly not copyright infringement, legally speaking. And it hardly makes sense to complain, on the one hand, that LLMs are so error prone as to be practically useless but then on the other to complain that they can summarize particular articles too well.

“Imagine a large machine that increases our electricity prices by half, that hurls free meals into every direction for miles so that we may slip on them on our way to school or work if we aren’t careful, and 40% of the meals are inedible or poisoned or contaminated”

That’s not a half-bad analogy. Wouldn’t the logical approach be to develop some quick tests to determine which meals were safe and, say, create a public education campaign about using the tests, while also working on improvements to the machine to increase the yield of safe meals but generally enjoying the incredible (and improving) bonanza in low-cost food and energy?

“are very openly moving towards”

It’s all a work in progress. Are you sorry to see complaints addressed for fear that it will undercut arguments against them?

“I have seen a large chunk of code generated by a colleague using an LLM and was impressed (derogatory) by how difficult it is to read and how non-modular it is.”

I have seen chunks of code generated at my own request and, although perhaps not the most elegant (and maybe I sent it back to be redone with some improvements), the results were functional and took a small fraction of the time that would have been required to create the equivalent from scratch. Good enough is very often good enough.

“We are clearly living in completely different worlds.”

Clearly. It’s probably time to wrap this up for now.
54 engels 05.26.26 at 12:00 pm: Please stop calling it artificial intelligence (AI), it’s human capital that has been stolen, accumulated and privatised, proletarianising its owners in the process: intangible enclosure (IE).

Comments on this entry are closed.
