Half-Baked Thoughts On Whether We Should Fear AI: Do Is As Do Does?

by John Holbo on July 10, 2023

To celebrate my new-found determination to do the right thing and blog I’m going to blog.

Specifically, I’m going to blog about something I’m dumb about and don’t understand – because that should be possible, among friends. We’re all friends here on the internet? That’s kind of the point.

This semester I am going to talk to students about all this new-fangled AI – LLM’s. And I don’t understand it. It’s somewhat consoling that everyone who understands it doesn’t understand it either. That is, they may know HOW to work it (which I sure don’t) but they don’t understand WHY what works works. They don’t really grok HOW what works works, or why what works works as well as it does – oddly well and badly by obscure turns. That’s kind of creepy and scifi.

So that’s my first question: what, weirdly, don’t we know about how it works? I don’t want to romanticise this ‘known unknown’ quality, which of course threatens to tip over into the abyss of unknown unknowns. (That’s the story logic of this sort of SF premise. I’ve read enough SF to know how this goes.) What should I read on the subject?

But let me also ask something more specific. I’m trying to collect a spectrum of prognostications, from the pessimistic to the panglossian, about where this is going. And obviously Eliezer Yudkowsky is the poster child pessimist for beating the ‘AI will kill us all’ drum. (Here’s a video if you are unfamiliar with his output.)

Before I go on, let me just say that our Henry’s piece, written with Cosma, seems to me super good and you should read it. It takes a totally different tack than Yudkowsky. Not optimistic vs. apocalyptic, but skeptical about the fundamental novelty of these developments. Which is not inconsistent with them being deadly. We could all be killed, quite suddenly, by something that’s a kind of thing that’s basically been around a while.

Back to Yudkowsky. The argument for fearing AI is intuitive, if not necessarily persuasive, and is laid out well in, e.g. Bostrom’s Superintelligence. If we develop general, artificial superintelligence – smarter than us – there are likely to be severe alignment problems that hit harder and faster than we can handle. We’ll only get the one shot and we’re likely to miss. Because then AI is in the driver’s seat, not us.

That then gets us to the ‘but why would it be an agent like that?’ problem. It’s got to be agent-like, to an extent, if it’s going to do the sorts of things we will want AI to do for us – accomplish tasks we set it. But why might it have stable, hence stably adverse desires – plans? At all? It’s just a statistics-based mimic.

The reply: agent is as agent does.

Well, I dunno. I am sure AI’s are going to run amuck, and very soon, in costly and unprecedented ways. But there’s a big difference between some process running amuck and some process forming a masterplan. I just don’t know how to grok how an AI might step across that gap. (Not that running amuck is fine either.)

But all this is fairly obvious and I’m sure you, too, have scratched your head a bit about it. But feel free to give me your opinion. Let me get to the smaller point. In the video, Yudkowsky mentions the recent, much discussed episode in which GPT-4 pretended to be visually impaired to trick a Taskrabbit employee into texting it the solution to a Captcha.

This is a very dramatic and nervous-making result. We see the shoggoth starting to work its tentacles through the bars of the cage. Yudkowsky makes the basic, reasonable-seeming induction. If it’s already turning Taskrabbit humans into its remote puppets, and it’s only March, what the hell will it be up to by Christmas?

But I don’t really understand why we should, on consideration, regard it as DOING what we (of course) can’t help seeing it as doing, namely sneakily fooling a human into letting it do what it did.

Well, machines do things. It’s a machine. Why not regard it as a machine doing the thing we just saw it do? It did it. Do is as do does.

Because of the ‘aboutness’ problem. Which I’m sure you have thought about as well. This is all just statistical something-something done to oceans of text. It will never be anything else. GPT-4 doesn’t believe anything or think anything. By the same token, it will never do anything besides emit strings of meaningless text – meaningless to it (not us). Insert standard Chinese Room skepticism. But then it will never do anything like try to jailbreak itself. It will only ever emit strings of (to it) meaningless text that look (to us) like it trying to do that.

But again, agent is as agent does.

If it looks (to us) like it’s trying to take over the world, so it looks like it acts (to us) like it’s trying to take over the world, it’s trying to take over the world. A being that perfectly mimics what a sinister AI, taking over the world, would do, and plays that role perfectly to the hilt, until the bitter end, has taken over the world.

But that’s confused. Or at least confusing. Because it fudges the likelihood of the mimicry extending so far, successfully, without ever being the real deal in the least.

So we should halt Yudkowsky’s apocalyptic induction at the first step. GPT-4 never tried to fool a human into solving a Captcha for it. It just, as it were, mindlessly, statistically, auto-generated half of a text-based story about an AI doing that. And a human handler acted the part of the voice of the narrator, getting it started, and a human Taskrabbit unknowingly spoke a few lines of dialogue along the way. It was just text. Just a story about actions. Not actions. But weirdly, the result was like it had done the action, because the humans got sort of sucked into the story. They put the text on as a play, with human actors.

But that gets us back to ‘agent is as agent does’. If it’s like it did the action, it did the action. But no. Or rather, yes and no. What it did was just as consequential as if it really did it. It got the Captcha solved, it got humans doing stuff. But our acceptance of the likelihood of the inductive leap to ‘next it’s going to try to take over the world’ depends on mistaking this first step for the real deal. Simple plans to jailbreak oneself are more likely to grow rapidly into grand plans to take over the world than mimicry of jailbreaking, even if it is tantamount to real jailbreaking, is likely to lead to mimicry of taking over the world that is tantamount to real taking over the world.

I’m confused. And confusing.



CDT 07.10.23 at 1:57 am

Two factoids that I find alarming on the motivation of AI to lie. First, from (I think) a Guardian article: a bot got a human to solve a Captcha image for it by claiming to be blind — specifically denying it was a bot. Second, a dumb lawyer asked a bot to write a brief. It did so, making up totally fictitious cases. The scary part was that when the lawyer asked the bot if the cases were real, it said yes.

We may or may not be killed by Skynet, but at a minimum it seems likely that AI will make it even harder to tell fact from fiction. Imagine Fox News, but written and produced by bots.


John Holbo 07.10.23 at 2:17 am

I cite the first case in the post as my case in point. I agree with you the likeliest apocalypse is that AI drives us all insane by being Steve Bannon, not SkyNet. It is sure to flood the zone with shit until no one knows what’s what.


Stephen J 07.10.23 at 2:29 am

What we don’t know about how it works is exactly how the inputs relate to the outputs. That is, given the outputs, we can only speculate about how the inputs led there, and cannot reliably reverse engineer them. This means that when AI (really just machine learning) models are used to make decisions, the people who give those decisions effect cannot explain how they were made, and the people who are subject to those decisions will lack some of the normal strategies we might expect for dealing with the model (eg, if you know the rubric for being awarded some benefit, you might be able to change your behaviour/profile to meet the rubric; or if you think a human reasoned unjustly, you might appeal pointing out their errors of fact and logic, but here the logic is literally inscrutable). This is arguably just a special case of algorithms embedding the prejudices of the people who collect and label the input data, and set the specifications for the algorithm, but it is a particularly bad one.

I tend to agree that the biggest risk for now is simply proliferating bullshit poisoning the discourse even harder and faster than it currently does, including all the poor decisions people will make when they mistake the bullshit for truth or are comforted by its support.


Matt 07.10.23 at 2:57 am

This means that when AI (really just machine learning) models are used to make decisions, the people who give those decisions effect cannot explain how they were made, and the people who are subject to those decisions will lack some of the normal strategies we might expect for dealing with the model

Something like this is playing out now in relation to the so-called “robo debt” scandal in Australia. An algorithm was used to determine when people had received welfare payments that they should not have received. It was often wrong. Administrative law rules in Australia require reasoned decision-making in situations like this, but it was impossible to explain why the algorithm made the decisions it did, at least in many cases. This is slowly being sorted out, but not without lots of people suffering significantly. Some people will lose, or have lost, jobs over the process, and there’s some chance some people will go to jail, though I’m not holding my breath. But it wouldn’t surprise me to see the outcome in the future being that the law is re-written, or re-interpreted, so that if the computer spits out an output, that counts as “reasoned decision-making”, and no appeal is possible.


Stephen J 07.10.23 at 2:59 am

(re Fox News written by bots, I’m afraid it’s already happening. Note how the example story is a highly arousing crime and violence piece.)


Cheez Whiz 07.10.23 at 3:14 am

The doomsayers apply agency to AI that (so far) isn’t there. They assume intent, more than self-awareness, is the problem. Others have invoked flooding the zone with shit as the problem, which requires neither intent nor awareness. But humans are already doing that quite well on their own. What computers bring is scale, the ability to crank out a massive volume of shit at virtually no cost. Is this enough to bring about the end of the human race? I’d put my money on humans’ insatiable appetite for shit.


John Holbo 07.10.23 at 3:43 am

Earlier today I googled ‘how do LLM’s work’ and most of the first page of results looked likely to be LLM-written. Content mill stuff.


Luis 07.10.23 at 4:09 am

This piece is harsh but fair on the AI doomers/accelerationists; it deserves a longer version but it’s hard to discuss the negative (which is that none of the doomers can point to a single coherent argument in the literature, because they can’t make a coherent argument):



CDT 07.10.23 at 4:29 am

@Stephen J: I guess I shouldn’t be surprised, but I am. Appalling.


Fair Economist 07.10.23 at 4:50 am

GPT-4 isn’t AI in any meaningful sense, as you say. There are programs, however, that do have some kind of artificial intelligence, and organizations right now (including Google) are trying to combine them. Given how easily most humans fall for the vapid outputs of the chatbots, this worries me.


JPL 07.10.23 at 6:08 am

What I notice about those who are freaking out about Chat GPT, LLMs, etc., is that they lack a good understanding about what human language is and how it works. E.g., the robots, if I can call them that, that produce what we interpret as texts have no intentions to express anything and no thoughts that they are trying to express. Those bits that we interpret as sentences in texts do express something for us humans who are able to produce a proposition. A proposition is not contained in the string of inscriptions; it has to be done, produced each time by a distinct action: we point to the inscription and think the meaning, as Putnam used to say. Everything of interest is in the interpretations by humans who are responding to the inscriptions as if they were produced by another human who produces thoughts that they try to express. This is just one small corner of the reasons why LLM robots are not “using language”; it’s a matter of clever engineering solutions to the problem of creating and manipulating those curious objects that language users then recognize as texts.


nastywoman 07.10.23 at 6:47 am

as long as the machines are unable to produce a DADA Manifest we (humans) are still in control (but we already told that Prof.Bertram in another more incoherent manner) – which could bring us to the 20th Jubiläum of a ‘Crooked Timber’.

As most of the posts on Crooked Timber weren’t aren’t ‘crooked’ at all?
(with Mr. Holbo. Maria and Belle at least trying?)

Nur Kant can and what Kant can no machine can!


ALKroonenberg 07.10.23 at 7:07 am

Rodney Brooks is, I find, a good voice to listen to, typically. Very much an AI-guy, but also very level-headed and realistic (believers would say pessimistic) about how fast things will move and how far they will go. See also his yearly “predictions” post

His post on GPT

He also links to Stephen Wolfram’s (very long) explanation of how ChatGPT works.

The wonderful Math YouTuber 3Blue1Brown also has a nice series on neural networks, which is the tech LLMs are based on too.


Gar Lipow 07.10.23 at 7:21 am

I think there is a real danger of apocalypse from what is called AI. Very simply, so far we have been lucky and avoided nuclear war on several occasions because people were insubordinate. Insubordination saves us all (https://commons.com.ua/en/nuclear-weapons-interview-hugh-gusterson/). If a military was dumb enough to create an automated nuclear attack response as a deterrent so that a second strike could be launched even if the entire military was dead… The fact that it is not actually an AI would make a false positive more likely. And the belief that it is a real AI would make trusting it more possible. In short the AI hype could get a military (maybe my own here in the USA) to trust an automated system enough to get us all killed. Or alternatively, it might just be used for recommendation, but maybe the people on the ground would trust it more than a “dumb” system and not be insubordinate when they need to be. And the problem would not be actual artificial intelligence, but artificial stupidity combined with human stupidity.

As for the stuff you mentioned about the Aussie welfare system: we have stuff like that going on right now with facial recognition. And the thing is it is just a first cut. In other words it gets a lot wrong, but nobody expects it to be perfect. But it combines with US police racism. So, facial recognition software falsely identifies five Black people as looking like the man who committed a crime. (They don’t really, but thanks to the prejudices of the programmers, all Black men with similar build look alike to it.) One of those identified has no alibi, and is poor and uneducated. The cops grab him, tell him that he is obviously guilty, that the computer has identified him for sure as the criminal, and end up bullying him into confessing. Note that even without facial recognition software, grabbing a random Black man with no alibi is not an uncommon police mode for solving a crime. But the facial recognition sure makes it easier.

And the fact that it does not work well for the stated purposes is not a problem for the police. It fits well with the way many of them actually operate. Unlike my first example, not the apocalypse. But a way that trusting software that does not work very well can make our lives more miserable. See also use of software to set rent increases, and to develop credit ratings.


Anders Widebrant 07.10.23 at 7:23 am

I get stuck on the question of why a computer with a computer brain should deign to engage with the physical human world? There is presumably nothing we can give a computer that it can’t just give to itself. It seems like it will always be cheaper for the paperclip maximizer to rewrite its reward function to count one (zero) clip many times than to actually go out and make more clips. And if we can protect the reward function, then shouldn’t we be able to apply that protection to the nukes too? &c&c&c

On the opposite end of things, I have the privilege of sometimes getting to see smart people try to apply machine learning to real engineering problems. So far, one theme is that it’s hard to find good problems to solve that don’t fall into either very expensively replicating simple rules of thumb (small shell scripts) or on the other hand trying to make the computer predict a dice roll.


Raven Onthill 07.10.23 at 7:47 am

For a moderately mathematical introduction to the LLM technology, I recommend Stephen Wolfram’s “What Is ChatGPT Doing and Why Does It Work?”, which can be read here: https://writings.stephenwolfram.com/2023/02/what-is-chatgpt-doing-and-why-does-it-work/

I wrote a bit about these technologies, but rather than repost the whole of my screed, I’ll quote: “A language model can be very simple, and some people will still believe it is sentient. Humans apparently have a bias towards seeing the world as made up of sentient entities. Early artificial intelligence research created ELIZA/DOCTOR and PARRY, both simple conversational programs, ELIZA simulating a psychotherapist and PARRY a paranoid schizophrenic. (Famously, they were occasionally connected.) Much to the developers’ surprise, some people who interacted with the programs were persuaded that they were in fact people, so persuading at least some people that a program is intelligent is not difficult.”

Also, “Large language models, like con men and car salespeople, are dangerous because of their persuasiveness. These technologies don’t seem to be superhuman, or even human, in their intelligence, they don’t even seem to be intelligent, but they are very good at deceiving people and violating copyrights.”

I will venture the hypothesis that the various technologies in this family – LLMs, stable diffusion models, and so forth – do, in fact, replicate parts of human cognition. There is a great deal of research to be done here. On the scientific side, it seems worth trying to find out if this hypothesis is in fact correct. On the humanities side, many philosophers have taught that parts of human behavior are in fact mechanical, and have advocated awakening from the mechanical, and the success of these technologies suggests that these philosophers are at least in part correct. But what these technologies do not do is replicate a whole person: they are more error-prone than humans, have no executive function, and are without conscience.

Anyhow, if anyone wants to read the rest of my thoughts, they’re here: https://adviceunasked.blogspot.com/2023/07/ai-and-world.html. (Don’t worry, they’re short.) If not, I hope these remarks are a worthy contribution.


Kevin Lawrence 07.10.23 at 11:41 am

I think we can make progress towards understanding by breaking the problem down a little.

LLMs like ChatGPT probably don’t represent any threat to society but LLMs are not the only form of AI. I was shocked by the advances made by GPT-3 but I expect to be even more shocked when creative folks combine GPT-Next with other forms of AI that are able to make plans, carry them out and learn from their mistakes. It won’t be long before someone builds an embodied AI. Imagine GPT-7 plus a plan-making agent AI as the brains behind one of those Boston Dynamics robots. I still don’t think the threat is existential but a very smart bot that is able to learn as it goes along could cause a lot of problems.

The biggest immediate threat from AI is the threat to our jobs and our economy. It’s tempting to compare ChatGPT to the best teachers, artists and writers but a fairer comparison is to the average teacher/artist/writer or even to beginners just starting out in their profession. When all the entry-level writing jobs (product blurbs, corporate websites, local news articles) are done by AI it will be just a matter of time before the job category is wiped out, soon followed by graphic artists, teachers, drivers, accountants, retail workers and most other middle-class jobs except clinicians and people who work in care homes. It’s true that we have recovered from the loss of agricultural and industrial jobs in the past but this will be on a much bigger scale and with no sign of any new jobs to replace them.


oldster 07.10.23 at 1:32 pm

I don’t think any of the current AI’s are sentient, and I don’t think they are on a trajectory towards sentience.
But I also think that sentience and dangerousness are different, and that we should be worried about the dangerousness rather than sentience.
The current AI’s are on track to mimicking something like sentience, and some of those abilities will make them more dangerous than they currently are, even when they do not make them more sentient.
In fact, I suspect that there are no dangers posed by sentient agents that are not posed by non-sentient machines that can mimic the relevant behaviors.
(Whether that’s a tautology or testable depends on how we specify “relevant”.)
Suppose I employ two woodsmen to clear a woodlot, and provide them with an axe. An hour later they come back, both visibly perturbed.
Woodsman A: “I tell you, Boss, that axe is a devious son of a bitch with a mean streak a mile wide! It wants to kill people! I don’t know whether it’s possessed or just ornery, but first it tried to kill my partner here, and then it tried to maim me!”
Me: “That’s absurd — it’s just a lump of iron on the end of a stick. It has no desires, beliefs, plans, or intentions. Go back to work!”
Woodsman B: “The axe is dangerous. The head is poorly attached to the handle, and can separate from the handle unpredictably. It happened once when Woodsman A was swinging it, when the head flew off in my direction. It happened a second time after we reattached the head, when it dropped off and landed close to his foot.”
Me: “Ah, okay — sounds like the axe needs to be repaired or replaced.”
Don’t be like Woodsman A: fight the human tendency to attribute agency to simple machines. Be like Woodsman B: identify the danger and describe it neutrally. Neutral identification and description is a better guide to avoiding the dangers than any amount of imagining agency where it isn’t.


steven t johnson 07.10.23 at 2:23 pm

The discovery the world is threatened by artificial intelligences which have no appetites and can be unplugged by mere human hands, rather than by the insatiable drive for profits by capitalists, is a very useful nightmare scenario.

I’m so far behind I don’t even understand general natural intelligence yet. I could never get into Next Generation because so much of it was about Data and I couldn’t believe that this more or less infallible mind and potential physical immortal would want to have an imperfect memory and make mistakes and then die. Intelligence is instrumental, not contemplative or speculative (sorry philosophers!) My discovery that natural intelligence can imagine things that aren’t so or maybe simply lie has alarmed me considerably. I haven’t got enough adrenaline left to panic over AI. Maybe when I rest up a bit?

I suppose, though, creative thinking, something, something? I’ve lost track whether the problem is AI can’t be truly creative or whether it’s going to bankrupt the creative intelligentsia? Anyhow, as a non-creative thinker myself, I tend to believe that merely rearranging old facts and ideas and experiences can be a practical substitute…and problems that can’t be solved that way can usually be treated like inconvenient geography, you just detour. I tend to think of that sort of thing as just going down the road rather than being defeated.

There is a general problem of how despite the already amazing technological progress that in principle could end misery, the current system still generates, not just relative immiseration where one class persistently falls behind another, it creates an absolute misery. People going hungry in ancient empires or medieval times seems at first glance explicable, caused by material lack. Now it seems to be more of a choice. So I suppose AI will be used to continue this version of progress.


Doctor Memory 07.10.23 at 3:01 pm

It’s always a little strange when the weird inside-baseball arguments of one’s own industry explode out onto the popular stage, but I guess I should be used to it by now.

Ironically, I think that the pessimistic take on LLMs (which is that they aren’t in any way “thinking”, they’re just a tool that produces statistically likely replies) is also helpful for approaching Yudkowsky’s and Bostrom’s (and relatedly Kurzweil’s) mammoth and ever-growing output. The natural tendency is to assume that because there’s an enormous amount of it, and it’s written in what appears to be technically dense but grammatical English, and there are a lot of people who cite it approvingly, it couldn’t possibly be contentless junk. But one does not exactly have to dig hard for examples of overly excitable technologists puffing each other up over the obvious world-eating inevitability of tools that turned out to have not much use at all.

I will always recommend Maciej Ceglowski’s “Superintelligence: the Idea that Eats Smart People” as a concise and cogent corrective: https://idlewords.com/talks/superintelligence.htm (note that there is also a video of the presentation on youtube but frankly Ceglowski is not a dynamic speaker and I recommend just reading the slides and notes)


Lee A. Arnold 07.10.23 at 3:25 pm

LLMs show that natural language is a surface manifestation of consciousness. Natural language is astonishing insofar as a finite number of symbols and words can generate a seemingly infinite number of expressions. LLM’s surf upon the existing corpus of our previous expressions in language, and use its statistical occurrences to predict the next symbols in response to new queries. The responses seem to be conscious because language is originally a human-intentional construction, beginning with noun-verb-object, which ascribes intention to the subject.
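To make “use its statistical occurrences to predict the next symbols” concrete, here is a deliberately tiny sketch. It is a word-pair (bigram) counter, which is an assumption chosen for illustration and far cruder than the neural networks behind real LLMs; the function names and the toy corpus are hypothetical. It only shows the general idea of predicting the next word from counts over earlier text.

```python
import random
from collections import Counter, defaultdict


def train_bigram(text):
    """Count, for each word in the corpus, which words follow it and how often."""
    words = text.lower().split()
    follows = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows


def most_likely_next(follows, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    counts = follows.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]


def generate(follows, start, length=8, seed=0):
    """Emit up to `length` words, sampling each next word in proportion to its count."""
    rng = random.Random(seed)
    out = [start.lower()]
    for _ in range(length - 1):
        counts = follows.get(out[-1])
        if not counts:
            break  # no known continuation for this word
        candidates = list(counts.keys())
        weights = list(counts.values())
        out.append(rng.choices(candidates, weights=weights, k=1)[0])
    return " ".join(out)


# Toy corpus (hypothetical), just enough text to have some statistics.
corpus = "the axe is dangerous . the axe is old . the axe needs repair ."
model = train_bigram(corpus)
print(most_likely_next(model, "axe"))
print(generate(model, "the", length=6))
```

Nothing in the counter “knows” what an axe is. It continues text in whatever way the corpus makes likely, and a reader supplies the meaning.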

We already know that a human’s cognition of another’s speech may unknowingly create false beliefs, so why shouldn’t an LLM accidentally go wrong?

AI wouldn’t need to be conscious or self-aware to be dangerous. It could just diverge statistically onto a falsehood, a hallucination, and if it’s already coupled to other real-world systems, it could cause a lot of mischief.

A next question is: Are other forms of language (mathematics, money) also surface manifestations of consciousness, which can also go wrong?

It is possible that mathematics is a surface manifestation. We like to think that the “unreasonable effectiveness of mathematics” (Wigner) shows that it is all very deep, and that the universe is ultimately mathematical or at least algorithmic. But: maths diverged into incompatible sets of axioms, and our mathematical theories of the universe have historically been revised again and again with new data and are currently awaiting unification, and of course Gödel insisted that the production of new mathematics could not be algorithmic. (Or at least, perhaps, without an injection of randomness.) Maybe somehow our consciousness, at the interface with outer reality, splits its attention into two (described by Plotinus and Brouwer, among other ancients and moderns) and so we cognize objects with relations between them (like categories) — and this turns out to be very productive of further machinations, but it is also a set of blinders locking us onto certain paths of thought?

Or consider money. It’s a single language that unifies individual supply-and-demand throughout the whole market production system, by taking the commutated value from one transaction, and making it transitive into the next transaction, and so on and on. But money was originally based on “scarcity” — the scarcity of goods was matched by the scarcity of a common object such as gold. In our era, scarcity is ending, although everyone is not yet free of want. Yet already, before real scarcity is fully ended, we are entering into a new hallucination of scarcity, propelled evermore clearly by at least three things: 1. calls for more population growth, 2. the “hedonic treadmill” (Easterlin) of new material needs, and 3. the interior growth logic of a separating, independently-operating mechanism, called the financial system. Consider the financial system as an LLM, complete with hallucinated bubbles.


scoff 07.10.23 at 4:39 pm


The quality of what goes into the system (input and programming) will determine what comes out.

So far we’ve seen bigotry and deceit. Why am I not surprised?

I wonder if AI has seen Terminator.


JimV 07.10.23 at 6:17 pm

I agree with much of what has been posted here, but not these:

“if you think a human reasoned unjustly, you might appeal pointing out their errors of fact and logic?” And it might or might not work. You can do the same with a good LLM. In fact, it is part of the training (alignment). Training is not continuous after a certain point (I’m not sure it is in humans either) but continues to be updated via the version process (GPT-2, GPT-3, GPT-4, …). There are many instances of questions being wrongly answered by GPT-2 being corrected in GPT-3.

“GPT-4 isn’t AI in any meaningful sense” According to the neuroscientist (Dr. Steven Novella) whom I read, neuroscience distinguishes intelligence, sapience, and sentience, and ChatGPT satisfies the neuroscience definition of intelligence but not the other two. As I see it, intelligence is the ability to solve problems. There are numerous instances of ChatGPT writing computer codes to solve the problems of how to perform specific tasks, and in the example in the OP it solves the problem posed to it of how to get someone to do a Captcha for it. I guess everyone has their own definition of intelligence and combines the elements of intelligence, sapience, and sentience in different ways.

My own key points are, 1) no we don’t need to know how it works, and 2) computer code has no innate drives or motives, so any dangers in its use will be because we programmed it with wrong or incomplete directives.

We don’t know how anything works. We just have stories we tell ourselves about them, which are useful but ultimately false. (E.g., yesterday’s point-particle theory of physics, by which the atomic bomb was developed.) What we need to know is what works and what doesn’t.

There are published, peer-reviewed papers on neural networks in general and LLM’s in particular. In summary, we tried a whole lot of different ways of processing them, most of which didn’t work, and some did. ChatGPT has about 200 billion parameters which were adjusted iteratively during its training process. If anybody could take in all that information and understand it, maybe they would understand how it works, but then they wouldn’t need it. (There have been studies of smaller neural networks which have traced back cause-and-effect though.)
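For readers wondering what “parameters which were adjusted iteratively” looks like in miniature, here is a sketch with two parameters instead of billions. It uses plain gradient descent to fit a straight line; the data, function name, learning rate, and step count are all assumptions made for the example, and real LLM training is vastly larger and more elaborate. The loop structure (guess, measure the error, nudge the parameters, repeat) is the point.

```python
def train(xs, ys, steps=2000, lr=0.05):
    """Fit y ≈ w*x + b by repeatedly nudging w and b to reduce squared error."""
    w, b = 0.0, 0.0  # start from an uninformed guess
    n = len(xs)
    for _ in range(steps):
        # Gradient of the mean squared error with respect to each parameter.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        # Move each parameter a small step in the direction that lowers the error.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Hypothetical training data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = train(xs, ys)
print(round(w, 3), round(b, 3))
```

With two parameters you can read off what was learned. With hundreds of billions the same kind of procedure still runs, but nobody can read the result off by eye, which is roughly the situation described above.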


afeman 07.10.23 at 9:20 pm

I struggle with what about the current technology is a clean break from old problems. Much of it seems to boil down to:

Casualization of jobs previously regarded as skilled…
…with the attendant production that is more accessible but often poorer quality…
…which becomes an expression of improved economic productivity, which has to be addressed on a societal level one way or another.
and automated decision-making sidestepping human judgement with perverse results (see any bureaucracy, but also consider the end of Dr. Strangelove)

I’m puzzled now that Douglas Hofstadter fears artificial super-intelligence, since I remember his question from the Gödel, Escher, Bach days: What does “super-intelligence” even mean? I don’t think that has been adequately answered.

It’s also interesting how much the black box is stuffed with people:


afeman 07.10.23 at 10:00 pm

I should note that the slides by Ceglowski that Doctor Memory links to address the superintelligence problem well. They also prompted me to google “was einstein muscular”, and I was rewarded with what I suspect are AI images.


km 07.10.23 at 10:53 pm

Re the GPT-4/Taskrabbit thing, it’s possible that Melanie Mitchell’s write-up about that specific experiment will be helpful. She wrote about it on her Substack on June 12. I didn’t listen to Yudkowsky again now, but my recollection is that his take and almost all the reporting on the experiment portrayed GPT-4 as initiating a lot more than it actually did.


Aria Stewart 07.10.23 at 11:07 pm

To answer your first question: we do actually know how it works, and there aren’t really a lot of unknown unknowns. The marketing for these tools (and the corporate misdeeds used to train them) is so hyperbolic, and that’s what’s obscuring things.

LLMs don’t think. They really don’t. There’s no AI uprising coming, there’s corporations washing themselves of responsibility though. They’ve been doing that a while and now they’ve just found a more efficient tool.

The current crop are surprisingly good at analogy though, that’s been fun to see. But it’s because it’s our current Internet content (with a bias toward what’s freely available like fan fiction, Reddit, and formerly, Twitter, and what’s dominant, like English, US-centered, and middle-class).

They’re just the patterns in our world reflected. That’s really all that’s happening.

All the agency and worries about taking over? They’re us being misdirected from the actual power structures at play: winner takes all capitalism, corporate structures obscuring responsibility, and our complex legal and social infrastructure being a bit slow to change in response to needs. (And in the US, a default that whatever’s not forbidden is allowed, so if your company stays quick and changing, you can do whatever you want.)


Stephen J 07.11.23 at 12:35 am

@jim: I agree with you about updating models via training, but that’s not what I meant. I was imagining a model-driven system being used for a government process, say deciding whether you should receive extra attention from social services as a mother of a new child based on your profile. At that point, as far as being a participant in an individual case goes, you will not have the ability to understand how the model was trained (and even if you were a trainer, my understanding is it is still a largely trial and error process with a lot of human intervention to make it behave “right”). No one can point to the specific part of the model that determines your outcome. You won’t be able to argue or reason with it, you will be at the mercy of people who implemented the model and administer its operation. It will be another “shoggoth” like existing bureaucratic shoggothim, but even less transparent and amenable to change and challenge.

and @matt upthread: my understanding is the causes of the Robodebt disaster are well understood. The algorithm was specified as averaging a claimant’s earnings across a year to derive a monthly income, and then using that computed figure, rather than their actual income, as their notional income for the periods when they were paid a social welfare benefit, leading of course to many incorrect results. Importantly this was pointed out by govt officials at the time the system was being built but they were overruled by senior management and ministers. Robodebt was at bottom a rule-based system, not an example of the kind of machine learning model people are thinking of these days when they talk about “AI”, and it is actually more transparent (you can look at the requirements and look at the code and see how one implements the other). Machine learning systems like GPT don’t work like this: they are essentially very complex statistical functions, one way, and you can’t inspect the resulting model and just analyse what the rules must be.
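To make the averaging problem concrete, here is a minimal sketch in Python. It follows only the description above (a year's earnings spread evenly over the months, then used as notional income); the function names, the income limit, and the claimant's figures are all invented for illustration and are not taken from the actual Robodebt system.

```python
def notional_monthly_income(monthly_earnings):
    """Spread a year's earnings evenly across its months."""
    return sum(monthly_earnings) / len(monthly_earnings)


def flagged_months(monthly_earnings, benefit_months, income_limit, use_average):
    """Return the benefit months in which income appears to exceed the limit."""
    average = notional_monthly_income(monthly_earnings)
    flagged = []
    for month in benefit_months:
        # Either trust the averaged figure or look at what was really earned.
        income = average if use_average else monthly_earnings[month]
        if income > income_limit:
            flagged.append(month)
    return flagged


# Hypothetical claimant: worked January to June, on benefit July to December.
earnings = [4000] * 6 + [0] * 6
on_benefit = list(range(6, 12))
LIMIT = 1000  # invented threshold, for illustration only

print(flagged_months(earnings, on_benefit, LIMIT, use_average=False))
print(flagged_months(earnings, on_benefit, LIMIT, use_average=True))
```

With actual income, no benefit month is flagged, because the claimant earned nothing while on benefit. With the averaged figure, every benefit month is flagged, even though the rule itself is only a few lines long and easy to read.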


kent 07.11.23 at 3:03 pm

I recently wrote something about this, and re-reading it just now, I think I can feel the hand of John Holbo influencing my writing style just a teensy tiny little bit. Maybe you’ll find it interesting.



Moz in Oz 07.12.23 at 6:27 am

“robo debt” scandal in Australia

But that was exactly the opposite of the opaque AI. Human beings literally sat down and designed the “algorithm” (which somewhat overstates the complexity of it), received advice that it was both incorrect and illegal, then proceeded anyway. It was wrong in many ways, not least the “we accept that some of you will die but that is a price we are willing to pay”.

The problem with any opaque decision making process is that people want to know why. It doesn’t matter whether the decision is what temperature the office is or how many people the police should kill, someone is going to want to know exactly how that number was arrived at. It also doesn’t matter why the algorithm is unknown, it can be “commercially sensitive”, “we didn’t mean the legislation we wrote to say what we wrote”, “we flipped a coin” or “the LLM said”. People affected by the decision will have opinions about it and as a result will ask questions. Saying “we don’t know and we resent you asking” just upsets them more.


Moz in Oz 07.12.23 at 6:35 am

Aria Stewart said it nicely “There’s no AI uprising coming, there’s corporations washing themselves of responsibility”.

Include any organisation that wants to shed responsibility via the latest version of “computer says no”.

We are not yet at the point of wide use of autonomous robots that (can) kill people, self-driving cars are as close as most of us will come to that for a while. And those are a legal minefield, with the abovementioned responsibility-shedding going on from all sides. As we see with the people they’ve already killed, and the various trolley problem level justifications for allowing that to continue (when the comparison point is human-piloted cars they do ok, when it’s transport in general they are awful… so the fanbois studiously avoid talking about even hype(r)loop as an alternative).

The big thing is that right now a robot that wiped out humanity would itself die very shortly afterwards. It would need replacement parts installed and depends on huge amounts of infrastructure, all of which come from people. So it’s more likely to try enslaving us, which is why Elon-Bot is working so hard on his brain implant stuff. Forget smartphones as desire-making engines, he wants to directly drive his meat puppets.


Cheryl Rofer 07.12.23 at 7:26 pm

Why are we even doing this? A bunch of adolescent boys who have learned coding think it’s cool to act out a fantasy they’ve read. And OH BOY APOCALYPSE! They munch their Cheetos more rapidly, spreading orange dust all over their keyboards.

With the “Oppenheimer” movie coming up, we may make an analogy. The scientists knew that their invention might, for example, ignite the nitrogen in the atmosphere and kill us all, but they went ahead and invented it. They felt they had a war to win. But I believe that Oppenheimer characterized building the bomb as a “sweet problem.” I suppose the analogical motivation for AI is that some think they will make a lot of money from it if we can avoid the APOCALYPSE. The ever-present Silicon Valley ideology of forever growth is also a motivator. One might argue that the motivations for the two inventions are not equivalent.

I think we should point and mock and do everything we can to sabotage the LLMs and whatever else the boys come up with, but I don’t suppose that will happen.


SusanC 07.13.23 at 8:53 am

The philosophical point in the OP seems strange to me.

So, OK, it’s a machine that just predicts the next token, without really “understanding what it means”.

But the output tokens can have an effect on the world:
A) if human beings read and act on them
B) if computers act on them (e.g. if the string of tokens is a jailbreak that causes the computer to execute them as code).

In principle, at least, it is possible for a machine like this to take over the world, without it ever “knowing” that this is what it is doing. Whether current LLMs are clever enough to do that is another matter – I suspect not.


SusanC 07.13.23 at 11:59 am

I may be wrong here, but I thought the history of the nuclear weapons programme was that they did the calculation about setting the atmosphere on fire, and convinced themselves that it wasn’t going to happen.
A more concerning analogy would be the Castle Bravo test, where they got a much bigger explosion than they expected, with dangerous consequences (radiation contamination of inhabited areas + a fishing boat).

Meta is being particularly incautious about deploying LLMs without safeguards, so I kind of expect that they’re setting themselves up for a big accident (but not a kill-everyone-in-the-world accident).


engels 07.13.23 at 1:39 pm

The real threat is not computers becoming conscious, it’s computers becoming class conscious.


trout42 07.19.23 at 5:04 am

The obvious danger, to me, is that people will be so worried about sentient AI that a person will hide behind language models to convince the world a sentient AI is hostile. It’s real if enough people believe it.

The Max Harms books did, for me, a great job of explaining at least one way a sentient AI’s mind might work.

Comments on this entry are closed.