Amazon has a new feature:
_Amazon.com Statistically Improbable Phrases_: Amazon.com’s Statistically Improbable Phrases, or “SIPs”, show you the interesting, distinctive, or unlikely phrases that occur in the text of books in Search Inside the Book. Our computers scan the text of all books in the Search Inside program. If they find a phrase that occurs a large number of times in a particular book relative to how many times it occurs across all Search Inside books, that phrase is a SIP in that book.
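Amazon's description amounts to a frequency ratio: how often a phrase turns up in one book compared with how often it turns up everywhere. Here is a minimal sketch of that idea. Amazon has not published its method, so this is not it. The function names, the restriction to two-word phrases, the `min_count` cutoff, and the add-one smoothing are all assumptions made for illustration.

```python
from collections import Counter


def bigrams(text):
    """Split text on whitespace and return its two-word phrases, lowercased."""
    words = text.lower().split()
    return [" ".join(pair) for pair in zip(words, words[1:])]


def improbable_phrases(book, corpus, top_n=5, min_count=2):
    """Rank a book's phrases by (rate in book) / (rate across the corpus).

    Hypothetical scoring, not Amazon's: `corpus` is a list of texts,
    ideally including `book` itself. Phrases seen fewer than `min_count`
    times in the book are ignored as noise.
    """
    book_counts = Counter(bigrams(book))
    corpus_counts = Counter()
    for text in corpus:
        corpus_counts.update(bigrams(text))

    book_total = sum(book_counts.values())
    corpus_total = sum(corpus_counts.values())

    scores = {}
    for phrase, count in book_counts.items():
        if count < min_count:
            continue
        book_rate = count / book_total
        # Add-one smoothing keeps the ratio finite when the corpus is
        # empty or does not contain the book.
        corpus_rate = (corpus_counts[phrase] + 1) / (corpus_total + 1)
        scores[phrase] = book_rate / corpus_rate

    return sorted(scores, key=scores.get, reverse=True)[:top_n]
```

Calling `improbable_phrases(text_of_one_book, texts_of_all_books)` returns the phrases that are heavily used in that book but rare elsewhere. A real system would also need to handle punctuation, longer phrases, and books of very different lengths.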
Experimenting with this, I find that SIPs effectively convey the essence of an author’s ideas, provided that the author is a phrase-maker. Very useful for cocktail parties. Here, by way of example, is the condensed essence of a number of sociological theorists.
Capital, Vol 1 by Karl Marx: average social labour, appetite for surplus labour, direct exchangeability, social labour process, abstract human labour, labour fund, labour objectified, specifically capitalist mode, surplus value, necessary labour.
Democracy in America by Alexis de Tocqueville: dogmatical belief, amongst democratic nations, all democratic nations, aristocratic ages, aristocratic people, democratic times, democratic armies, aristocratic communities, democratic ages, aristocratic countries, democratic people.
The Division of Labor and The Elementary Forms of Religious Life by Emile Durkheim: restitutory law, criminological types, segmentary type, segmentary organisation, penal rules, lower societies, social volume, organised type, collective type, negative solidarity, droit criminel, penal character, contractual law, common consciousness, moral density, regulatory organs, social similarities, collective sentiments, domestic sentiments, become specialised.
The Protestant Ethic and the Spirit of Capitalism by Max Weber, and From Max Weber: Essays in Sociology edited by Hans Gerth and C. Wright Mills: dueling corps, virtuoso religiosity, magical asceticism, expert officialdom, old civilized countries, civic strata, active asceticism, emissary prophecy, office prebends, certitudo salutis, ascetic conduct, capitalistic development.
Social Theory and Social Structure and On Theoretical Sociology by Robert K. Merton: empirical uniformities, middle range theory, manifest and latent functions, functional unity, sensate system, reference group behavior, environing social structure, cosmopolitan influentials, serendipity pattern, reference group contexts.
The Social Construction of Reality by Peter Berger and Thomas Luckmann: socializing personnel, deviant conceptions, legitimating apparatus, discrepant worlds, institutionalized conduct, socially objectivated, subjective biography, reciprocal typification, reality maintenance, legitimating theories, social stock, universal experts, finite provinces, conceptual machinery, ultimate legitimation, institutionalized actions, paramount reality, plausibility structure, institutional meanings, symbolic universe, relevance structures.
Natural Symbols and Purity and Danger by Mary Douglas: bodily dissociation, strong grid, main morality, positional family, elaborated speech, high classification, restricted code, condensed symbols, big spirits, elaborated code, symbolic behaviour, witch beliefs, purity rule, religious behaviour, millennial movements, magical efficacy, excremental magic, pangolin cult, sex pollution, sorcery beliefs, primitive world view, pollution rules, pollution ideas, pollution beliefs, female pollution, parts the hoof, sexual pollution, healthy mindedness, primitive ritual.
The Field of Cultural Production by Pierre Bourdieu: art competence, cultural consecration, symbolically dominant, symbolic revolution, symbolic goods, dominated fractions, aesthetic disposition, scholarly culture, dominant fractions, artistic field, positions and dispositions, bourgeois artists, autonomous field, dominated position.
Identity and Control by Harrison White: embedding ratios, commit interfaces, servile elite, compound actors, council species, select arena, molecular disciplines, discipline species, valuation orderings, differentiation ratio, social molecules, acquaintance dance, fresh control, hieratic style, arena disciplines, multiplex tie, network populations, getting action, interface species, dual hierarchy.
You’re invited to browse authors in the field of your choice. Some fields are a bit thin on SIP-able texts. _The Wealth of Nations_ wasn’t that good for phrases, but Keynes’ _Essays in Persuasion_ yielded “intensifying unemployment, inside opinion, internal price level” while Volume II of Hayek’s _Law, Legislation and Liberty_ came up with the very nice “catallactic possibilities”.
joel turnipseed 04.02.05 at 1:05 am
Kieran – Excellent. Yeah, I noticed this a couple days ago and thought: “hmmmm…” At least now I know it’s good for CT posts and cocktail parties. In my browsing, though, seems like the SIP is limited to those titles with “Look Inside” authorized by publisher, no?
Matt McGrattan 04.02.05 at 2:46 am
Descartes’ Meditations [in the Cottingham edition] gives us:
striated particles, alimentary juices, venous artery, arterial vein, corporeal imagination, tiny fibres, little fibres, mental intuition, coarser parts, contiguous bodies, external sense organs, supremely perfect being, celestial matter, modal distinction, continued proportion, corporeal substance, internal place
…which is rather splendid.
Jorn Barger 04.02.05 at 4:56 am
Is there any way to browse just titles that include the SIP list? (Searching on “statistically improbable phrases” draws a blank.)
ArC 04.02.05 at 6:28 am
Am I a bad person for immediately thinking of looking for the SIPs of certain SF hacks?
carla 04.02.05 at 11:00 am
I quit going to Amazon after the November election when I saw how much money they give to Republicans.
Does this make me shallow? LOL
Ben Hyde 04.02.05 at 11:28 am
Mavin mimic madness!
A belief in average social labor only leads to a dogmatical belief in the necessity of restitutory law. The competition frame with its insistence on dueling corps leads to clear empirical uniformities which only furthers the goal of socializing personnel. What next? Bodily dissociation from labor’s art competence. Embedding ratios, need we say more?
Dylan 04.02.05 at 1:01 pm
I bet that the phrases from “Capital”, for instance, are just the terms that Marx coined that did _not_ catch on in a broad way.
don hosek 04.02.05 at 2:46 pm
Actually, presence of SIPs in some ways really is evidence of stereotyped phrasing in writing. I know that I’m guilty of this in my sloppier writing moments: If I’m not careful there are words and phrases that repeat themselves.
In a non-fiction book, you may get certain phrases which repeat themselves because of the nature of the work, but looking at some fiction that was indexed by amazon, it seemed that a large number of SIPs was indicative of poor writing style. I kind of think that this may also apply to non-fiction, where poor writing is much more common.
A. Cephalous 04.02.05 at 3:11 pm
Not to toot my own horn–not that I can (lacking equipment vital to tooting)–but I’ve written a bit on SIPs in literature that everyone’s more than welcome to check out, comment on, and mock me mercilessly for…
Andrew Edwards 04.02.05 at 3:18 pm
The Trial, by Franz Kafka: ostensible acquittal; definite acquittal.
Brilliant.
Barry Freed 04.02.05 at 3:22 pm
At some risk of stepping on norbizness’ toes, I endeavored to find out the SIPs contained in “Chicken Soup for the NASCAR Soul.” Alas, there were none. I tremble at the implications this has vis-a-vis Don’s hypothesis.
John Emerson 04.02.05 at 3:34 pm
Thus Spake Zarathustra:
you higher men, lust after eternity, voluntary beggar, wretched contentment, fire hound, famous wise men, ass festival, great noon, nuptial ring, wild wisdom, poisonous flies, great nausea, stillest hour, old tablets, ugliest man, final sin, higher man, old pope, old magician
John Emerson 04.02.05 at 3:40 pm
Naked Lunch:
old gash, old junky, his cock, sick morning
Jim Anderson 04.02.05 at 6:29 pm
The Bald Soprano and Other Plays:
adore hashed brown potatoes, such caca, clicks his tongue, three noses, invisible crowd, upstage center, invisible guests, such cascades, general factotum
A. Cephalous 04.02.05 at 6:58 pm
The books all appear in the works cited page of my current dissertation chapter, but until now I really didn’t understand them:
The Education of Henry Adams: diplomatic education, accidental education, supersensual chaos, larger synthesis, rebel agents
John Barleycorn, Jack London: salmon boat, oyster pirates, night coal, long sickness, desire for alcohol
Origin of Species, Charles Darwin: temperate productions, genera descended, transitional gradations, unknown progenitor, fossiliferous formations, our domestic breeds, modified offspring, doubtful forms, closely allied forms, profitable variations, enormously remote, transitional grades, very distinct species, mongrel offspring, and, of course, diversified habits (a nice indication of his annoyance with Lamarck)
The SIPs for The Descent of Man aren’t as indicative of the larger argument: more beautiful males, colour from the females, rivalry with other males, build concealed nests, stridulating organs, from some lower form, build open nests, anthropomorphous apes
SusanC 04.03.05 at 6:32 am
Dan Brown, The Da Vinci Code:
cilice belt, lame saint, seeded womb, lettered dials, corporal mortification, rosewood box, sacred feminine, royal bloodline, stone cylinder, sweater pocket
Julia Kristeva, Powers of Horror:
biblical impurity, incest dread, impure opposition, primal repression, phobic object, pure signifier, symbolic law, paternal function
And I really must read Mary Douglas to find out what that Pangolin cult was all about..
clew 04.04.05 at 2:14 am
I would like to see what SIPs are Improbable w/r/t everything published before the work being scanned, and Probable w/r/t the works after.
And Google and Amazon between them may manage to tell us.
Jeff R. 04.04.05 at 11:57 am
Jared Diamond, Guns, Germs, and Steel: food production arose, big wild mammals, wild mammal species, archaeological hallmarks, blueprint copying, societies with writing, mammal domestication, founder crops, indigenous food production
Ayn Rand, Atlas Shrugged: pull peddlers, furnace foreman
(hm, not so good.) The Fountainhead adds “drafting room” and Anthem brings “our brother men”. And We the Living’s “ragged gray uniforms”
Neal Stephenson, Cryptonomicon: grand wazir, substitution alphabet, data haven, hive mind, dive plan, making license plates, strange information, main vault, math whizzes
Ken Houghton 04.04.05 at 1:52 pm
“grand wazir” is Improbable? Only in the spelling; never seen the “vizier” in any form of English without the “Grand” being used…
arc – no.
snuh 04.04.05 at 6:00 pm
statistically improbable phrases for norman mailer’s why are we in vietnam: “mixed shit, medium assholes, bull fuck, grade asshole, gone ape, dead ass”.
let us hope this feature is never pg-13ed.
Anthony 04.04.05 at 7:13 pm
Barry –
Your discovery requires a reworking of Don H’s thesis:
from “presence of SIPs in some ways really is evidence of stereotyped phrasing in writing.” to “presence of SIPs … is evidence of stereotyped phrasing in writing despite avoidance of the most common cliches.”