Last September, Abe Newman, Jeremy Wallace and I had a piece in Foreign Affairs’ 100th anniversary issue. I can’t speak for my co-authors’ motivations, but my own reason for writing was vexation that someone on the Internet was wrong. In this case, it was Yuval Harari. His piece has been warping debate since 2018, and I have been grumpy about it for nearly as long. But also in fairness to Harari, he was usefully wrong – I’ve assigned this piece regularly to students, because it wraps up a bunch of common misperceptions in a neatly bundled package, ready to be untied and dissected.
Specifically, Harari argued that AI (by which he meant machine learning) and authoritarian rule are two flavors that will go wonderfully together. Authoritarian governments will use surveillance to scoop up vast amounts of data on what their subjects are saying and doing, and use machine learning feedback systems to figure out what they want and manipulate it, so that “the main handicap of authoritarian regimes in the 20th century—the desire to concentrate all information and power in one place—may become their decisive advantage in the 21st century.” Harari informed us in grave terms that liberal democracy and free-market economics were doomed to be outcompeted unless we took Urgent But Curiously Non-Specific Steps Right Now.
In our Foreign Affairs piece, Abe, Jeremy and I argued that this was horseshit. What we know about authoritarian systems (Jeremy has a great new book on this topic) is that it’s really hard for them to generate and use good data. When data is politically important (e.g. it shapes the career prospects of self-interested officials), there is invariably going to be a lot of stats-juking. We suggested that machine learning is very unlikely to solve this fundamentally political problem – garbage in most definitely leads to garbage out. And we claimed that machine learning will introduce its own particular problems. Many of the worries that have been expressed in US and European debates about machine learning – that it can create self-perpetuating forms of bias – are likely to be much more pervasive and much worse in authoritarian regimes, where there are far fewer mechanisms of open criticism to correct nasty feedback loops that keep on getting nastier.
That was a Foreign Affairs think-piece, which means that you wouldn’t want to have staked your life on its thesis: its boldly stated arguments are better described as plausibility claims. But now there is actually some Real Social Science by a UCSD Ph.D. student, Eddie Yang, that makes an argument very similar to ours. Eddie came up with his argument independently, and furthermore did the hard and serious work of figuring out whether there is good evidence to back it up. Whereas we (and Harari) have sweeping arguments, Eddie has specific hypotheses that he tests against quantitative data. I think this piece is going to get a lot of attention, and I think it deserves it.
What Eddie argues is the following. Authoritarian regimes face a well-known set of trade-offs involving information. Ideally, they would like to be able to do two things at once.
First, they would like to make sure that their domestic enemies (which are often, in practice, frustrated minorities within the ruling class) can’t mobilize against them. This is why they so often impose extensive censorship regimes – they don’t want people mobilizing around their collective unhappiness with their rulers.
Second, they would like to know what their citizens actually think, believe, and are frustrated about. If citizens are really, quietly, angry, there is a much greater likelihood that crises will unexpectedly lead to the regime’s demise, as they did in East Germany before the Berlin Wall fell, or in Tunisia before the Arab Spring. When there is diffuse and strong unhappiness with the regime, a small incident or a minor protest might suddenly cascade into general protests, and even regime collapse. This doesn’t necessarily lead to a transition to democracy (see: Arab Spring) but that is scant comfort to the rulers who have been deposed.
The problem is that these two desires are hard to reconcile. If you, as an authoritarian ruler, suppress dissent, you flatten out public opinion: people become less likely to say what they truly think and believe. But then you also have a very hard time figuring out what people actually do think and believe, and you may find yourself in for nasty surprises when things begin to go wrong.
This tradeoff explains why authoritarian regimes do things that might seem initially surprising, such as quietly running opinion polls (to be read only by officials), or creating petition systems. They want to know what their publics think, but they don’t want their publics to know what their fellow citizens think. It’s quite hard to do the one without the other.
None of this is news – but what Eddie does is to ask whether machine learning changes the dilemma. Do new technologies make it possible for authoritarian governments to see what their publics want, while still squashing dissent? The answer is nope, but it’s a very interesting nope, and one that undermines the Harari thesis.
So how does Eddie answer his question? He starts with a dataset of 10 million Chinese social media posts from Weibo (think Twitter with Chinese characteristics). He then applies an actual political sensitivity model from a Chinese social media company (which unsurprisingly goes unnamed) to score how spicy the content is. I’ve no idea how he got this model, and I imagine he has excellent reasons not to tell, but it allows him to approximate the actual machine learning techniques used by actual Chinese censors. His core finding is that machine learning bias is an enormous problem for censorship regimes – “AI that is trained to automate repression and censorship can be crippled by bad data caused by citizens’ strategic behavior of preference falsification and self-censorship.” In other words, the biases in bad data give rise to systematic blindness in the algorithms, which in turn may very plausibly be destabilizing.
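To picture what that scoring step looks like in practice, here is a deliberately toy sketch in Python. The real sensitivity model is proprietary and unnamed, the real posts are in Chinese, and nothing below describes what Eddie or the company actually built: the made-up training posts and the TF-IDF-plus-logistic-regression pipeline are stand-ins, chosen only to show the general shape of “apply a trained classifier to a pile of posts and get a sensitivity score for each.”

```python
# Toy stand-in for the unnamed, proprietary sensitivity model described above.
# Everything here (examples, labels, model choice) is assumed for illustration.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical labeled examples: 1 = politically sensitive, 0 = innocuous.
train_posts = [
    "the local officials should all resign",
    "we should gather at the square tonight",
    "lovely weather for a walk today",
    "this restaurant's noodles are excellent",
]
train_labels = [1, 1, 0, 0]

sensitivity_model = make_pipeline(TfidfVectorizer(), LogisticRegression())
sensitivity_model.fit(train_posts, train_labels)

# Score a new batch of posts, the way a censor's pipeline would score Weibo posts.
new_posts = [
    "the local officials at the square should resign",
    "lovely noodles at the restaurant today",
]
scores = sensitivity_model.predict_proba(new_posts)[:, 1]  # probability of 'sensitive'
for post, score in zip(new_posts, scores):
    print(f"{score:.2f}  {post}")
```

The real model almost certainly looks nothing like this. The point is only that “sensitivity,” for any such model, is whatever the training data says it is – which is exactly where the trouble starts.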
The fundamental problem is that the data that the machine learning algorithm has access to are a lousy representation of what Chinese citizens actually believe. Instead, the data reflect what Chinese citizens are prepared to say in public (after decades of censorship), which is … not quite the same thing. That makes it really hard to train a machine learning system to predict what Chinese citizens might say in a time of crisis, and rapidly squelch any dangerous-sounding talk, which is presumably what the CCP wants to be able to do (the first priority of China’s leaders is and always has been domestic political stability). Eddie’s simulations suggest that the traditional solution – throw more data at the model! – doesn’t work so well. Exposing the model to more crappy data – even lots and lots of crappy data – leads only to negligible improvements.
And this is where the interesting twist comes in. What does lead to significant improvements is drawing on a different data source: specifically, data from an uncensored Western social media service. When you feed the model data from Chinese-language users talking about similar stuff on Twitter, the accuracy of the model improves significantly. Now that the model is able to ‘see’ the kind of potentially politically dangerous language that self-censoring citizens don’t use on Chinese social media services, it can do a better job of identifying it, and potentially censoring it in the places where the regime does have the power to censor. Free and open Western social media (i.e. Twitter before it became the shitshow it is now) perform an unexpected service for authoritarian regimes, supplying a relatively unbiased source of data on which to train their algorithms of oppression. There are limits to this – the uncensored conversation outside China is obviously different from the conversation that Chinese citizens would have if they weren’t censored. And the uncensored conversations are obviously still imperfect proxies for what people really think (a concept that gets blurrier the harder you poke at it). But still, it provides much better information than the censored alternative.
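To see why the censored data cripples the model and why the offshore data rescues it, here is a small self-contained simulation – a caricature, not a replication. The vocabularies, the self-censorship rule, the bag-of-words classifier and the sample sizes below are all invented; Eddie’s actual analysis uses real Weibo and Twitter data and a real industry model. The sketch only makes the mechanism visible: a censor model trained on self-censored posts never sees the dangerous vocabulary, so it misses uncensored ‘crisis’ speech; ten times more of the same data doesn’t help; a modest slice of uncensored data does.

```python
# Synthetic simulation of preference falsification and censorship bias.
# All vocabularies and parameters are invented for illustration.
import random
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import recall_score
from sklearn.pipeline import make_pipeline

random.seed(0)

SAFE = ["weather", "noodles", "football", "holiday", "movie", "traffic"]
DARING = ["petition", "unfair", "layoffs", "pollution"]      # mild complaints people still post
FORBIDDEN = ["protest", "corruption", "resign", "strike"]    # words people avoid at home

def post(sensitive, self_censored):
    """8 everyday words, plus 3 grievance words if the author is unhappy.
    Self-censoring authors swap the dangerous vocabulary for milder complaints."""
    words = random.choices(SAFE, k=8)
    if sensitive:
        words += random.choices(DARING if self_censored else FORBIDDEN, k=3)
    random.shuffle(words)
    return " ".join(words)

def corpus(n, self_censored):
    posts = [post(i % 2 == 1, self_censored) for i in range(n)]
    labels = [i % 2 for i in range(n)]
    return posts, labels

def crisis_recall(train_posts, train_labels, test_posts, test_labels):
    """Train a censor model, then measure how much sensitive speech it catches
    at 'crisis' time, when people stop self-censoring."""
    model = make_pipeline(CountVectorizer(), LogisticRegression(max_iter=1000))
    model.fit(train_posts, train_labels)
    return recall_score(test_labels, model.predict(test_posts))

test_X, test_y = corpus(2000, self_censored=False)   # uncensored 'crisis' speech
dom_X, dom_y = corpus(2000, self_censored=True)      # domestic, self-censored posts
big_X, big_y = corpus(20000, self_censored=True)     # ten times more of the same
ext_X, ext_y = corpus(2000, self_censored=False)     # uncensored offshore posts

print("censored data only:         ", round(crisis_recall(dom_X, dom_y, test_X, test_y), 2))
print("10x more censored data:     ", round(crisis_recall(big_X, big_y, test_X, test_y), 2))
print("censored + uncensored data: ", round(crisis_recall(dom_X + ext_X, dom_y + ext_y, test_X, test_y), 2))
```

By construction, the censored-only model is blind to the forbidden vocabulary, so the first two numbers come out near zero while the third is near one. The real result is of course far messier, but the direction is the same.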
This tells us two things. One is that machine learning offers new tools to authoritarians, but it also generates new problems. None of these new technologies make politics go away. There are a lot of people who treat machine learning as some kind of automated sorcery – they may disagree over whether it will miraculously cure the world’s problems, or cast mass publics under an evil and irrevocable enchantment. They don’t usually have much idea what they’re talking about. Machine learning is applied statistics, not magic.
The other is a little less obvious. When the world begins to fill up with garbage and noise (borrowing a term from Philip K. Dick, when it is overwhelmed by ‘gubbish’), probably approximately correct knowledge becomes the scarce and valuable resource. And this probably approximately correct knowledge is largely the product of human beings, working together under specific kinds of institutional configuration (think: science, or, more messily, some kinds of democracy). The social applications of machine learning in non-authoritarian societies are just as parasitic on these forms of human knowledge production as authoritarian governments are. Large language models’ occasional ability to approximate the right answer relies on their having ingested large corpora of textual data on how humans have answered such questions, and on drawing inferences from the patterns in those answers.
So how do you keep these forms of knowledge production alive in a world where every big tech company wants to parasitize them and/or subvert them to their own purposes? Eddie’s article suggests that authoritarian governments have a hidden dependency on more liberal regimes’ ability to produce more accurate information. The same is true of the kinds of knowledge capitalism that are dominant in free societies.
Decades ago, Albert Hirschman described cultural depletion theories under which market societies undermined the cultural conditions that they needed to reproduce themselves, by devouring trust relations, honesty, good will and the like. It’s at the least prima facie plausible that knowledge capitalism – in its current form – does much the same thing. Software eats the world, consuming the structures that produce more or less reliable human knowledge – and excreting gubbish instead. You don’t have to do the full Harari-we’re-all-doomed-thumbsucker-article-for-the-Atlantic to worry that the long-term implications of this may not be so great. The future is not an inevitable dystopia, whether authoritarian or capitalist. Equally, there is no reason to assume that new technologies and forms of production will automatically regenerate the kinds of knowledge structures that we need to figure things out collectively in any even loosely reliable way.
[also published at Programmable Mutter]
{ 7 comments }
steven t johnson 07.25.23 at 5:56 pm
“… times of crisis, when people reveal their true preferences…”
This principle is not generally accepted. It is highly unlikely Yang or any other acceptable thinker would accept the notion that revolutionary periods, which certainly count as crises, lead to people expressing their true opinions. Given the historical experience of what Yang calls the “free world” in suppressing the extraordinary variety of opinions that suddenly burst out from the most unexpected places when people can get away with it, some elementary comparisons should be required.
Also, conceptually, the notions of repression and authoritarianism really need some unpacking. If authoritarians do surveys to identify urgent issues and address them in some fashion by some sort of change in policy, it is not sufficient to dismiss this as meant to split the presumed mass opposition. A free world where people can say what they want because no policy will change is free in what sense exactly?
I am well aware that formal legalities such as absence of government pre-censorship are in the end held to define democracy. The thing is, since I reject the principle that the means (in this case, the formal laws) justify the end (in this case, that such laws are the rule of the people, aka “democracy”), this is a motte-and-bailey. It’s even worse in my opinion because it is the legalism, held to be the motte, that is the bailey, the extreme claim.
oldster 07.25.23 at 8:38 pm
“Eddie’s article suggests that authoritarian governments have a hidden dependency on more liberal regimes’ ability to produce more accurate information.”
A variation on the old joke that after communism and planned economies sweep the world, the Soviets will still keep Finland around as a capitalist state “in order to know what the prices of goods are.”
LFC 07.25.23 at 8:54 pm
The current Chinese regime has been in power since 1949, and while the economic approach has changed quite drastically, and while self-defeating and disastrous episodes such as the Great Leap Forward and the Cultural Revolution are no longer on the agenda, the basic structure of the polity as a one-party state/regime has remained. In recent years there has been popular dissatisfaction (e.g. with the approach to Covid) that the regime has managed, not always gracefully or easily, to navigate. Cultural obliteration (genocide) of the Uighurs and repressive policies in Tibet and Hong Kong do not seem to have provoked an enormous amount of opposition in the general population.
Some of the reasons for the Chinese regime’s longevity are suggested in Levitsky and Way’s recent Revolution and Dictatorship. Those reasons have to do with the regime’s origins in social revolution and civil war, and the way those experiences created a cohesive and loyal political elite and a tight fusion between the party and the military apparatus. Chinese civil society remains relatively fragmented, weak, and disorganized.
In this context, the Chinese regime doesn’t need to gather especially reliable, good data on what its subjects are thinking in 2023, any more than it had to do that in 1989, or 1979, or 1960. The mechanisms — reports from local officials “on the ground” or certain human informants keeping their “ears to the ground” and reporting on what their neighbors say in perhaps rather rare unguarded moments, or whatever (“whatever” because this is not my field, but then it’s not H. Farrell’s either) — that worked in the past will likely continue to work for the regime. And when and if they stop working, it may well not be because of the dilemmas inherent in trying to use AI/machine learning to find out what the population is thinking in order to censor it.
Some social scientists have built their academic careers partly or wholly by studying the Internet and social media (and now AI and machine learning), and that’s fine and entirely appropriate. But older questions, e.g. about regime durability and survival, that predate the digital era and that of “knowledge capitalism,” remain interesting, and the relevance of a study such as Yang’s to those questions appears perhaps somewhat (?) tangential, at least to judge from the OP’s summary. Though if a very substantial percent of the Chinese population has managed to evade the regime’s restrictions and post uncensored material on Western social media, and if the regime finds a way to use that fact (if it is a fact) to its advantage, that might be a different story.
Cultural depletion theories like those described by Hirschman, or like Bell’s “the cultural contradictions of capitalism,” whereby market societies were to undermine either the personality traits (e.g., frugality, abstention from excessive consumption) or the social characteristics (e.g., honesty and “trust relations”) that ensured their reproduction, do not seem to have panned out. If they had, “market societies” would have collapsed by now. The inference may generalize. Ecological apocalypse, refugee crises, food insecurity (if not “famine equilibria”), housing crunches, epidemics, over-urbanization, and the like may prove to be greater threats to the long-run survival of authoritarian governments — and, for that matter, non-authoritarian ones — than the dilemmas inherent in AI.
John Q 07.26.23 at 3:40 am
AFAICT, machine learning in this context is a combination of stepwise regression and discriminant analysis, both of which became popular in the 1970s, thanks to the development of computers powerful enough to implement them. The only difference is that there is a lot more data. But these techniques had fundamental problems from the start, and adding more data never helped much.
Stepwise regression is just automated p-hacking, usually with a fairly low implied value for p. And discriminant analysis assumes a population divided into distinct and fixed subgroups, which rarely works when applied to humans who can change their behavior in response to signals and incentives. When I was doing it, we were pretty happy if we could get 70 per cent right on a two-way split.
It’s not clear to me how, if at all, “machine learning” differs from the kind of statistical analysis done (with machines, and aimed at learning) 50 years ago. When I’ve asked people who should know, I haven’t been able to get a clear answer.
Moz in Oz 07.26.23 at 7:17 am
JQ, I think the difference is that the machine is now able to imagine equations you can’t, and often equations you can’t understand when presented with them. But sadly the AI can’t explain either the result or how it came up with it, only that it meets the criteria you gave it better than other equations it tried.
It’s more like giving an Econ101 class of 500 students an assignment and marvelling that someone got every single question right. Ask them how and “I studied really hard”. Assuming that the same student will also get 100% the next time would be unwise, and likewise any model from “the same” AI aimed at forecasting rather than matching retrospective data.
One easy example for me is “if it can translate Japanese to German why can’t it translate Linear B to English”… insufficient data is only part of the problem. Even better when applied to legal questions, where there is generally no right answer only good and bad reasoning (which is not to say that even terrible reasoning can’t be “correct in law” (not looking at the US Supreme Court at all))
KT2 07.27.23 at 6:40 am
“There’s No Fire Alarm for Artificial General Intelligence”, nor, I suspect, will there be an alarm for, as you outline, technologies that “make it possible for authoritarian governments to see what their publics want, while still squashing dissent” – to which “the answer is nope”.
Maybe ‘they’ – and many of ‘us’ – don’t or won’t think or know, and haven’t become accustomed to the new signals necessary to be alert to newly acquired authoritarian tools, when combined with propaganda, bread & circuses. “Flat” opinions due to profiling and signaling may work to dampen the populace’s sensors, and leave them floundering at a crucial moment about who to trust and which way to jump.
But in hindsight…!
*
“There’s No Fire Alarm for Artificial General Intelligence”
October 13, 2017
Eliezer Yudkowsky
…
“If you look at what people actually say at the time, historically, they’ve usually got no clue what’s about to happen three months before it happens, because they don’t know which signs are which.”
https://intelligence.org/2017/10/13/fire-alarm/
KT2 07.28.23 at 2:29 am
Moz in Oz said: “JQ, I think the difference is that the machine is now able to imagine equations you can’t, and often equations you can’t understand when presented with them.”.
And crush them now, in real time, with “36 exaFLOPs” – the Condor Galaxy. “Setting up a generative AI model takes minutes, not months and can be done by a single person. CG-1 is the first of three 4 ExaFLOP AI supercomputers to be deployed across the U.S. Over the next year, together with G42, we plan to expand this deployment and stand up a staggering 36 exaFLOPs of efficient, purpose-built AI compute.”
Compute power vs humans brings us to “The Bitter Lesson”: “We have to learn the bitter lesson that building in how we think we think does not work in the long run”. Anyone old enough to remember the storage vs code vs compute argument way back in the ’80s? Computing power made the argument moot.
Almost unbelievable power: “The humongous ‘Condor Galaxy’” (25-30x more compute than anything available – incl military.) “It supports up to 600 billion parameter models, with configurations that can be expanded to support up to 100 trillion parameter models. Its 54 million AI-optimized compute cores and massive fabric network bandwidth of 388 Tb/s allow for nearly linear performance scaling from 1 to 64 CS-2 systems, according to Cerebras.”
Cerebras to Enable ‘Condor Galaxy’ Network of AI Supercomputers: 36 ExaFLOPS for AI
https://www.anandtech.com/show/18969/cerebras-to-enable-a-network-of-ai-supercomputers-36-exaflops-for-ai
Students will, imo, need The Bitter Lesson…. “There were many examples of AI researchers’ belated learning of this bitter lesson, and it is instructive to review some of the most prominent.”
“The Bitter Lesson
Rich Sutton
March 13, 2019
…
“And the human-knowledge approach tends to complicate methods in ways that make them less suited to taking advantage of general methods leveraging computation. There were many examples of AI researchers’ belated learning of this bitter lesson, and it is instructive to review some of the most prominent.
…
“… We have to learn the bitter lesson that building in how we think we think does not work in the long run. The bitter lesson is based on the historical observations that 1) AI researchers have often tried to build knowledge into their agents, 2) this always helps in the short term, and is personally satisfying to the researcher, but 3) in the long run it plateaus and even inhibits further progress, and 4) breakthrough progress eventually arrives by an opposing approach based on scaling computation by search and learning. The eventual success is tinged with bitterness, and often incompletely digested, because it is success over a favored, human-centric approach. ”
http://www.incompleteideas.net/IncIdeas/BitterLesson.html
“Richard S. Sutton FRS is a Canadian computer scientist. He is a distinguished research scientist at DeepMind and a professor of computing science at the University of Alberta. Sutton is considered one of the founders of modern computational reinforcement learning,[1] having several significant contributions to the field, including temporal difference learning and policy gradient methods.” https://en.wikipedia.org/wiki/Richard_S._Sutton
Comments on this entry are closed.