All your base really are belong to Google

by Eszter Hargittai on November 16, 2005

A few months ago I posted an entry called Google World in which I talked about the amount of information Google and other companies such as Yahoo!, MSN and AOL are amassing about their users.

This week’s launch of Google Base is another step in the direction of building elaborate profiles of users. Moreover, it is an interesting move by the company to get users to fill up Google’s own Web property with lots of valuable material for free.

Google Base is a collection of user-submitted content hosted on Google’s site. Let’s say you have some recipes (I mention these because that part of my own Web site seems to be one of its most popular sections, and Google Base already has a recipe section at this early stage). Instead of simply living on your own site and having Google (and other search engines) drive traffic to them, the recipes can now live on Google’s own Web property. Other types of content range from classifieds about housing and jobs to course syllabi. Some have suggested it is like a gigantic expanded version of the popular Craig’s List, which I mention in case that is a service with which you are familiar. Google Base will be a collection of information that users provide for free, but for which Google gets credit when people find it.

It is hard not to wonder how much more prominent Google Base content will be in Google’s search results compared to other content on the Web.

In addition to providing a means to get lots of content on its own servers, Google Base also allows the company to refine its profile of its Google Account holders – a prerequisite for posting material on Google Base. Not only will it have information about people’s interests through their searches and users’ networks through their email and chat communication, but it will also know what content is of so much interest that the person posts material to Google Base about it. This information often also comes with geographic specifics and other data yet again refining Google’s profiles of its users.

In case one decides not to shift all of one’s content on to Google Base and continues to maintain one’s own Web site, it is now possible to use Google Analytics to see how people find one’s pages. Since this is yet another Google service tied to a user’s Google Account, Google has information about how people find users’ sites even if they are not getting to them through Google’s search engine.

All this has implications for privacy (increasingly refined profiles of users) and for how a few players – ones driven by commercial interests – influence the types of content to which users are most easily exposed.

In a fascinating way we are seeing a move back to the model of the mid/late-90s where portal sites (e.g. Yahoo!, Lycos, Excite) were focused on creating vast content empires in the hopes that users would never leave their properties. One difference is that there was little talk back then about the privacy implications of all this. It is not clear if there is enough talk about that aspect even today.

PS. If you don’t get the title of this post, you can read up on the reference here.

{ 1 trackback }

Darwiniana » Totalitarian Google?
11.18.05 at 10:54 pm

{ 24 comments }

1

John Quiggin 11.16.05 at 7:05 am

I agree that this is back to the 90s stuff and predict its failure for much the same reasons. People who take up this offer will be locked inside a walled garden, away from the most interesting content, and steered, all the time, towards the kind of content that is conducive to the sale of ads. Sooner or later, they’ll get sick of it.

It’s sad, if not surprising, to see Google going this way. But AOL was pretty cool in 1990, and we all learned to live without it.

2

John Landon 11.16.05 at 8:35 am

Welcome to the world of Totalitarian Google. Did you know that Google cheats? That it quietly sinks the rankings of certain sites?
I learned this the hard way in the Darwin debate. Some clever fellow behind the scenes is manipulating site retrievals. Having studied this carefully over six months, my initial disbelief has given way to conviction.
So I think we need a Google monitor.
It was always too good to be true, this Google mania.

I have withheld details due to paranoia. You can check out ‘Does Google Cheat?’ at http://darwiniana.com

3

Chris Karr 11.16.05 at 10:03 am

I think you are being a bit too harsh on Google here and I think you’re missing the bigger point about Google Base. The purpose is not to build a “walled garden” of content, it is to move web searching beyond the crude keyword phase. I took a look at the specs and documents for Google Base and it’s clear that a Google Base item is little more than a set of key-value pairs of data.

Why is this important? Let’s say that I want to search for all reviews written by “Eszter”. Using the existing search technologies, I have no way of getting just this data without creating extremely contorted and error-prone search queries. I can do these types of things within the walled gardens of sites like Amazon, but not across the Internet as a whole. Google Base is the first step in this direction. (“Give me all documents of type ‘review’ where author is ‘Eszter’.”)
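
To make the idea concrete, here is a minimal sketch of that kind of query, assuming each item is nothing more than a dictionary of key-value pairs. The attribute names (`type`, `author`, `title`), the sample items, and the `find_items` helper are all hypothetical illustrations; none of them is taken from the Google Base specs.

```python
from typing import Any

# An item is just a bag of key-value pairs (hypothetical attribute names).
Item = dict[str, Any]

# Made-up sample data, for illustration only.
ITEMS: list[Item] = [
    {"type": "review", "author": "Eszter", "title": "Review of a search engine"},
    {"type": "recipe", "author": "Eszter", "title": "Walnut cake"},
    {"type": "review", "author": "Chris", "title": "Review of a photo site"},
]


def find_items(items: list[Item], **criteria: Any) -> list[Item]:
    """Return the items whose attributes match every key-value pair in criteria."""
    return [
        item
        for item in items
        if all(item.get(key) == value for key, value in criteria.items())
    ]


# "Give me all documents of type 'review' where author is 'Eszter'."
reviews_by_eszter = find_items(ITEMS, type="review", author="Eszter")
```

A keyword search for "Eszter review" would also match pages that merely mention both words; matching on named attributes is what removes that ambiguity.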

With regard to your points about Google becoming a content portal – I just don’t buy it. It’s true that Google will host content for you, but if you’re already a web site owner, they make it pretty easy to tag your content with the appropriate key-value pairs and submit your stuff to also be searched in the catalog. If someone’s searching for all documents where I am the creator, I will probably want them to find my blog entries. If Google Base is like any of the other Google properties (search, Froogle, news, etc.), it’ll present the results and get out of the way. Unlike the portal players of the 1990s, Google is not playing a lock-in game – it’s playing a ubiquity game. They’re not interested in hosting everyone’s content in a modern AOL – they’re interested in being the card catalog everyone uses to get to that content.

As for the privacy implications of Google Analytics… I don’t know about you, but Google already refers well over half the traffic to my website. I figure that without Analytics, they already have a good idea about how people get to my site. For me, the price of letting Google know how visitors navigate within my site is worth the benefit of more sophisticated logging tools. I also include a Flickr toolbar on every one of my pages, and I could care less whether Yahoo is doing the same.

4

neil 11.16.05 at 10:04 am

5

Chris Karr 11.16.05 at 10:08 am

FWIW, I think that the course listing pages for linear algebra disprove the walled garden theory:

http://base.google.com/base/search?q=&btnG=Search+Base&nd=1&scoring=r&us=0&a_n409=linear+algebra&a_y409=0&a_s409=0&a_r=1

I can refine the search and report bad items, but everything else seems to punt me to MIT’s pages instead of a Google page hosting MIT content.

6

Andrew Edwards 11.16.05 at 10:23 am

If you don’t get the title of this post, you can read up on the reference here.

Come on! Is there anyone who reads an entire post about Google and doesn’t get it? Is there anyone left who doesn’t respond to an All Your Base joke by sort-of rolling their eyes and sort-of screaming “Take Off Every Zig!”?

7

Cryptic Ned 11.16.05 at 10:24 am

Oh no! Google has removed the link and the search term from post #4!!!

8

Andrew Edwards 11.16.05 at 10:27 am

Oh, and as of 10:25am EST a Google Base search for “porn” yielded no actual pornography. What’s the over/under on how long this lasts? I say 5:00pm.

This is, of course, the test of the thing. If it’s actually open, people will post sexually explicit material. If there is no sexually explicit material, then it’s not actually open.

9

almostinfamous 11.16.05 at 10:36 am

i for one welcome our new geek overlords.

also: good point, andrew. there is no such thing as openness without spam and porn creeping in there.

10

Slocum 11.16.05 at 10:39 am

Google Base will be a collection of information that users provide for free, but for which Google gets credit when people find it.

Oh, you mean like…everything already on the whole damn Internet?!? Users provide for free, Google gets credit when people find it–the very friggin’ definition of a search engine. The value Google adds is in making the information accessible.

If you want to worry about something, worry about this–will information that Google hosts be accessible to other search engines? But even that, I think, is not much of a worry. If the info is freely linkable, it’s freely indexable. And the negative publicity Google would get from any attempt to wall off hosted pages would be intense.

11

Tim 11.16.05 at 10:46 am

Submit [your information] to Google! Resistance is futile!

12

matt 11.16.05 at 11:29 am

PS. If you don’t get the title of this post…

Wow… It never even occurred to me that people might not get that reference…

13

Tad Brennan 11.16.05 at 11:41 am

Remember: “The Base” is just English for Al Qaeda. And you still believe in coincidences….

14

fyreflye 11.16.05 at 12:31 pm

Those who, like me, would prefer that Google not collect information about them can download the free extension CustomizeGoogle (works for sure on Firefox and Windows, maybe on Unix). Find it by Googling the extension name.

15

des von bladet 11.16.05 at 12:33 pm

PS. If you don’t get the title of this post…

What you say?!

(Sigh, come on persons!)

16

goatchowder 11.16.05 at 5:22 pm

This is why I’d never use Google for hosting. I’m willing to pay so as not to have to deal with these questionable privacy practices. I’ve been lucky enough to find a webhoster (http://www.nearlyfreespeech.net) that has both an excellent privacy policy and dirt-cheap prices for hosting. It’s “Nearly Free” in both senses of the word.

Google’s nowhere near as bad as AO-Hell, but that’s damning (heh) with faint praise. Hopefully I won’t get sucked into the Google orbit any more than is absolutely necessary.

17

Peter Clay 11.16.05 at 5:23 pm

One or two people around me have floated the idea of search-as-public-utility; that is, using some sort of decent search is becoming an important part of productivity and public debate, and also that having so many people rely on a closed system to choose what they see gives that system a worrying amount of power (Murdoch press passim).

18

John Landon 11.17.05 at 12:20 am

Re: Totalitarian google, here’s the link.

http://darwiniana.com/2005/09/05/does-google-cheat/

I have studied Google carefully over time, and I quite doubt it is paranoia to say they tamper with certain site rankings

19

phil 11.17.05 at 9:47 am

Would true conservatives countenance the fiscal rape of their children and grandchildren?

One thing the Bush Administration clearly has been very good at is focusing the attention of the press (and by extension the American people) on issues that they want to highlight. This has had the effect of advancing the Bush agenda, but has had the added effect of deflecting focus away from things that the Administration does not want to highlight. One of those issues is clearly the rampant, runaway spending of your tax dollars by Bush and the Republican majority congress. At this point there can be no doubt that, as they try to focus your attention on issues like stem cells and Supreme Court nominations, Bush and the Republican Congress are spending us all into a hole from which it will take us, our children and our grandchildren years to recover.

You don’t need to take my word for this, nor the words of any Democrat or Bush-hater. You need only read what conservatives like George Will are saying, or the people at conservative think tanks like the Heritage Foundation and the Cato Institute. The Cato Institute recently completed a report on the spending habits of all US presidents during the last 40 years. If you’re interested in reading the report, I’ve included a link at the end of this post.

If you want to continue to believe that Bush and Congressional Republicans are “on your side” or if you care only about saving stem cells and banning gay marriage perhaps you should read no further. But if you’re interested in the truth and are concerned about your financial well-being and that of your children, perhaps you should read on. Here’s some of what the Cato Institute report had to say about presidential spending over the last 40 years:

All presidents presided over net increases in spending. As it turns out George W. Bush is one of the biggest spenders of them all. In fact he is an even bigger spender than Lyndon B. Johnson in terms of discretionary spending.

The increase in discretionary spending in Bush’s first term was 48.5% in nominal terms. That’s more than twice as large as the increase in discretionary spending during Clinton’s entire 2 terms (21.6%) and higher than Lyndon B. Johnson’s entire discretionary spending spree (48.3%).

Adjusting the budget trends for inflation, Bush looks even worse; his spending rate is much higher than Lyndon Johnson’s. In other words, Bush expanded federal non-entitlement programs in his first term almost twice as fast each year as Lyndon Johnson did during his entire presidency.

George W. Bush is the biggest spending president of the last 40 years in both the defense and discretionary spending categories by a long shot. He beats Johnson by almost 4% in defense spending growth and more than 3% in domestic discretionary spending growth.

And conservative columnist George Will points out in his column today that federal spending has grown twice as fast under President Bush and congressional Republicans as under President Clinton. And with respect to the argument that this profligacy is related to 9/11 and homeland security, Will and the conservative think tanks have noted that over 65 percent of the spending increase is unrelated to national security.
Will further reports that Congressional Republicans (who achieved their majority by promising fiscal discipline) have presided over an orgy of pork spending with your tax dollars the likes of which have never been seen before. In 1991, the 546 pork projects in the 13 appropriation bills cost $3.1 billion. In 2005, the 13,997 pork projects cost $27.3 billion.

You may support Bush and the congressional Republicans because of some vague promise of “progress” on social issues with which you and the Republicans agree. In that case perhaps you are entitled to refer to yourself as a “social conservative.” But nobody who calls themselves a fiscal conservative could support Bush and the Republican Congress, who are spending your tax dollars in an orgy of profligacy the likes of which has not been experienced in our lifetimes. You can continue to deny yourself this truth, but be assured that true conservatives know the truth. Bush and the Republican Congress are asking you to mortgage your future and the futures of your children and grandchildren in exchange for soft “promises” on social issues. You are justifying the fiscal rape of your children and grandchildren perpetrated by your “moral” leaders in exchange for a vague promise of gains on social issues. Do yourself and your kids a favor; look them in the eye and explain to them why you have chosen to saddle them with these financial burdens, explain to them your reasoning. Then look in the mirror and explain to yourself how you can continue to support the people who you know in your heart are screwing you and your kids. Is that morality? Is that conservatism?

Read the whole Cato article here:
http://www.cato.org/pubs/tbb/tbb-0510-26.pdf

Read the Will column here:
http://www.suntimes.com/output/will/cst-edt-geo17.html

20

Gary Farber 11.17.05 at 9:52 am

“But AOL was pretty cool in 1990….”

Also, pocket-protectors.

21

cm 11.17.05 at 11:35 am

The corrective factor in this is that when content is manipulated for commercial gain, it loses relevance. And the latter is figured out rather quickly.

22

Eszter 11.18.05 at 6:59 am

Chris, just because the majority of visitors to your blog/site happen to come through Google doesn’t mean that’s generalizable to all sites. I’m sure they have much less information about visitors to certain sites. They probably have more about visitors to techie sites since people interested in that type of content may be more likely to use Google and use it in a more informed manner than some others in the first place.

Regarding your attempt to disprove the walled garden theory: it’s a bit early to disprove something like this based on data from the service given that it’s only been around for a few days.

CM – Content can be manipulated for gains outside the realm of commerce (e.g. politics). Manipulation of content at this level is not always that easy to figure out.

As to the “All your base…” quote, I can’t believe how many of you live in a bubble to think that it’s a reference anyone in the world would get. (Or even to think that anyone interested in a post about Google would get it.) Stop the next person you see who is not a techie/geeky person and ask them if they understand. Chances are quite good that they will have no idea what you are talking about. (No, I don’t have a systematic study on this from which to lend support to this statement, I’m afraid. But if someone wants to fund such a study I’ll do it.:)

23

Chris Karr 11.18.05 at 8:54 am

The point I was trying to make with respect to the privacy implications regarding Google is that they’re not unique in what they’re doing, yet people seem to single out Google when they are doing the exact same thing everyone else is doing. For example, are you as concerned about the privacy implications of people adding Flickr content on their own sites? The inclusion of any sort of content from remote sites (from client-side Javascript, to images, to nifty Flash utility movies) is just as capable of serving as a tool to tear down the veil of users’ privacy as Google Analytics.

Given their search market share, is Google uniquely suited to build detailed user profiles? Absolutely. Are they any more uniquely suited to do this than any company that interacts with users in the form of search services, e-mail, photo sharing, and so on? I don’t think so. Maybe I’m just being naive, but as an Internet company, Google has a history of being a better citizen than its previous counterparts. I fully expect Google to create some sort of user profile of me based on my online activities and searches. I also expect that they’ll use that data to advertise to me in the form of text ads on their site. That’s the unfortunate reality of dealing with commercial search engines. However, given that the switching cost in selecting search engines is so low, Google also understands that if they become obnoxious and degrade the quality of their service, I’ll abandon them for someone who doesn’t. Just like how I switched from Altavista to Google in the first place. That’s the fundamental fact that led to their success and I think they’re smart enough to recognize that “do no evil” is the reason that they’re so popular to begin with.

Regarding my attempt to disprove the walled garden theory, you’re right that the service has only been available a few days. However, in those few days, I have used it to submit my own content and to find others’. In my own personal experience using the site, it’s clear that Google is not attempting to build a walled garden within Google Base. Furthermore, Google Base is not a Craig’s List competitor, a replacement for online recipe books, or anything like that. It’s a simple structured metadata entry and search engine that allows users to add metadata to the engine for content that already exists on the web in addition to providing hosting for content not yet on the web.

Google Base may become a building block for Google-branded services that compete with some of these other services, but at the moment it’s just a catalog. The people at Craig’s List (or any other classifieds “competitor”) could (if they were smart) put their own listings on Google Base and pop up just the same as everything else when someone does a search for “find me a bicycle for $50 in NYC”. Not only would they expose their stuff to people using their own site, they’d capture the additional audience using Google Base. (If you do a car search on the site now, you’ll notice that auto vendors are doing exactly that.)

So with all due respect, the view that Google is building AOL 2.0 or trying to kill Craig’s List is entirely missing the point of Google Base. Google Base is not a new marketplace or a new place to post classified ads any more than a MySQL database engine is an online store. Google Base is the first step towards the next generation of search engines. The purpose of Google Base is not to compete with or displace the industries like the classified ads business – the purpose is to unify the content in these specific niches under a single search engine that can be used to search them all. The purpose of Google Base is not to bring all this content under a Google umbrella, rather it is to move the art of searching beyond the crude language of “show me all pages that contain the keyword ‘foo'” to a more natural and useful language of “show me all records of books authored by ‘Jack’ that were published between 1936 and 1945”.
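
As a rough illustration of the difference, here is a minimal sketch of a structured range query over typed records. The `Book` record, its fields, the placeholder titles, and the `books_by_author_between` helper are hypothetical and invented for this example; they do not describe how Google Base itself stores or queries data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Book:
    """A hypothetical structured record with typed attributes."""
    title: str
    author: str
    year: int


# Made-up sample records, for illustration only.
BOOKS = [
    Book("First Title", "Jack", 1934),
    Book("Second Title", "Jack", 1938),
    Book("Third Title", "Jill", 1940),
    Book("Fourth Title", "Jack", 1945),
]


def books_by_author_between(books, author, first_year, last_year):
    """Return books by `author` published in [first_year, last_year], inclusive."""
    return [
        book
        for book in books
        if book.author == author and first_year <= book.year <= last_year
    ]


# "Show me all records of books authored by 'Jack' published between 1936 and 1945."
matches = books_by_author_between(BOOKS, "Jack", 1936, 1945)
```

The date range is the part a keyword engine cannot express: a year has to be stored as a number attached to the record before "between 1936 and 1945" means anything.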

Is there value in gobbling up Craig’s List’s and others’ content and classified listings? Absolutely. Is it as valuable as being the central point where people use structured queries to find information? Not even close. Being the world’s content “card catalog” is immensely more valuable.

24

Chris Karr 11.18.05 at 11:46 am

And if my lengthy comments here haven’t earned me a red mark in your book, I posted a more detailed explanation about how Google Base is more significant than classifieds and how it may change the way we deal with the web at

http://www.aetherial.net/personal/2005/11/being_bullish_a.html

As always, comments are appreciated.

Comments on this entry are closed.