Reinventing the wheel in social network theory

by John Q on May 15, 2004

I was thinking idly about Erdos numbers, and it suddenly struck me that I could easily prove the necessity of a couple of ‘stylised facts’[1] about the associated networks. It’s well-known that the collaboration network for mathematicians contains one big component, traditionally derived by starting with Pal Erdos. The same is true of the network generated by sexual relationships. Although there is no generally agreed starting point here, it is a sobering thought that a relatively short chain would almost certainly connect most of us with both George Bush and Saddam Hussein.

Anyway, the thought struck me that, given a simple two-parameter model, I could prove (at least in a probabilistic sense) not only the existence of a large component but its uniqueness. One parameter would characterise the distribution of the number of connections made by each person, and the other would characterise the bias in favor of endogamy or exogamy. Provided, in an appropriate sense, that these parameters multiplied to a number greater than 1 for some large segment of the population, a network with a starting point in that segment would expand until it contained a substantial portion of the whole population.
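
One way to make the product condition concrete is a toy branching process. The sketch below is only an illustration, under assumptions that go beyond the model as described: `mean_links` stands for the average number of connections per person, `outward_share` for the fraction of those connections that reach someone not already in the network (a crude stand-in for the exogamy bias), and only expected values are tracked. The function name is made up for this example.

```python
def expected_generation_sizes(mean_links: float, outward_share: float,
                              generations: int) -> list[float]:
    """Expected number of newly reached people in each generation.

    Hypothetical toy model: every newly reached person makes `mean_links`
    connections on average, and a fraction `outward_share` of them reach
    someone new, so each generation is multiplied by their product.
    """
    growth = mean_links * outward_share
    sizes = [1.0]  # generation 0 is the starting person
    for _ in range(generations):
        sizes.append(sizes[-1] * growth)
    return sizes


if __name__ == "__main__":
    # Product 1.5 (> 1): the expected generation sizes keep growing.
    print(expected_generation_sizes(3.0, 0.5, 5))
    # Product 0.75 (< 1): the expected generation sizes shrink towards zero.
    print(expected_generation_sizes(3.0, 0.25, 5))
```

The sketch ignores saturation, so it describes only the early expansion; in a finite population the growth must slow once a substantial share of people has been reached.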

It’s easy to see, then, that there can’t be two large components (where large means, say, more than 100 members and more than 10 per cent of the relevant population), because the probability that at least one of the possible connections between them (more than 10 000, by assumption) will be made approaches 1.
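
The arithmetic behind this can be sketched as follows, assuming, purely for illustration, that each possible link between the two components is made independently with the same small probability `p`. The function name and the value of `p` are hypothetical; the 100 × 100 pairing is the "more than 10 000" case from the text.

```python
def prob_at_least_one_link(size_a: int, size_b: int, p: float) -> float:
    """Probability that at least one cross link exists between two groups.

    Assumes each of the size_a * size_b possible links is made
    independently with probability p (an illustrative simplification).
    """
    possible_links = size_a * size_b
    prob_no_link = (1.0 - p) ** possible_links
    return 1.0 - prob_no_link


if __name__ == "__main__":
    # Two components of 100 members each give 10 000 possible links.
    for p in (0.0001, 0.001, 0.01):
        print(p, prob_at_least_one_link(100, 100, p))
```

Even a one-in-a-thousand chance per pair leaves almost no room for the two components to stay separate, and the probability only rises as the components get larger.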

I’m recording this not because I think it’s a new discovery, but to raise a general point about research strategy in theoretical problems. The recommended strategy in most fields is to acquaint yourself thoroughly with the literature, then work out what new contribution you might be able to make. My preferred strategy is to begin with only a cursory knowledge of the field in question, work out how I would answer a question of interest and only then consult the literature.

The disadvantage of this approach is that you spend a lot of time reinventing wheels, since most questions of interest have already been answered in one way or another. The advantages, though, are substantial. First, it’s easier to understand something you’ve worked out for yourself than something you’ve read by somebody else. Second, in most research topics, the literature bears the marks of its history. What this means is that the substantive theoretical insights are inextricably mixed with accidental effects. If Professor A, the author of the first big paper in the field, thought that axiom X was crucial and axiom Y was uncontroversial, it’s likely that axiom X will continue to get a lot of attention, whether or not it’s justified, and that anyone who questions axiom Y will be regarded as ill-informed. If you come to the problem afresh, you may see it differently (not necessarily a good thing if you want to publish lots of articles in journals edited by the students of Professor A, but if you already have tenure this isn’t such a problem).

[1] This is economist jargon for things we think are true but for which we have no solid evidence.

Anyway, I’d welcome anyone pointing me to where my results have been anticipated, as well as any thoughts on research strategy.

{ 16 comments }

1 charliew 05.15.04 at 12:33 pm: Your research strategy sounds a bit like Hal Varian’s advice in his excellent essay, “How to Build an Economic Model in Your Spare Time”. Varian says you won’t find good research ideas in journals. He looks in the Wall Street Journal and wonders about a phenomenon – how would I explain this? Then he pulls a standard model out of micro and starts modifying it to recreate the phenomenon.

I like this approach, but as untenured faculty I concentrate my research in a single area where I know the literature. Focus amplifies impact.
2 Keith M Ellis 05.15.04 at 12:43 pm: My opinion is that a foundational strategy (“reinventing the wheel”) is appropriate if you’re A) intellectually suited to doing creative work that relies upon a “fresh” perspective; and B) doing such work. Even then, you’re gambling that you’ll get back your time and effort investment with either a new insight, or a deeper, stronger insight than you otherwise could possibly have had.

My intuition tells me that this is, as a practical matter, a losing proposition for most people.

But then, is this purely a practical matter? And is this a matter of choice or temperament?

Personally, I can’t really understand anything, or at least feel as if I understand anything, until I’ve essentially recapitulated much of the work through my own creativity. This means that there’s a bazillion things that I will never give myself a chance to “comprehend”; or, more to the point, have a satisfactory technical facility with. But I don’t think I could be another person. And I’m quite certain that the only valuable intellectual work I’ve ever done has been very essentially creative in the way that requires at least a moderately autodidactical sort of mind.
3 Barry 05.15.04 at 1:37 pm: Krugman, IIRC, said that he was advised to consult the WSJ, Fortune, Businessweek, etc., to come up with questions which might lead to a good dissertation.

And somebody whose name I can’t recall suggested that Ph.D. students should first formulate questions, then convert them into economic hypotheses suitable for dissertations, and only then consult the literature. The idea was that this would build skill in economic thinking.
4 Max Weber 05.15.04 at 1:59 pm: It’s an especially effective strategy when one is working in an area that doesn’t need to bother with a lot of data.

And of course, one can reliably assume that the formulation of the problems one is about to solve is so simple and obvious that one doesn’t really need others’ efforts.
5 tim 05.15.04 at 4:54 pm: There is a way to leverage a fresh insight: collegiality. Once you’ve become intrigued by a problem, but before you’ve done too much work on it, have a chat with a colleague who is more familiar with the field, but not deeply embedded in it. Ask a grad student for a brief oral summary of the literature, probably something they have done for their own purposes. If the problem has been solved or if your approach has led to a dead end, they can point you to the death-blow paper directly, and you can see if there is anything about your own idea that can be salvaged, or maybe a similar use for your reinvention of the wheel (like gambling instead of transportation).

This kind of informal collegiality is supposed to be the real advantage of the ivory tower; otherwise we could work anywhere. If being a professional means something more than free library access to journals, this is it. We need to do more of this in the social sciences, imho. Perhaps you could lead the way?
6 Walt Pohl 05.15.04 at 4:56 pm: There are theorems along the lines of what you suggest. Who proved these theorems? Why, Erdos, of course.
7 Walt Pohl 05.15.04 at 6:38 pm: Some more detail: I didn’t quite understand your two-parameter model, but consider this one-parameter model: suppose there is a fixed probability p that any two people wrote a paper together. Then as the number of people goes to infinity, the probability that the graph is connected goes to 1.

The area is known as “random graphs”, and was originated by Erdos. Most graduate-level textbooks on graph theory (such as Diestel’s, which is available on-line for free download) will have a chapter on it, and there are books devoted to the subject. I think Bollobas has one (called “Random Graphs” of course — mathematicians don’t have the cleverest method of naming books).
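
A quick way to see the one-parameter claim in action is to simulate it. The sketch below is a rough illustration and not taken from any of the books mentioned: it builds random graphs in which every pair is linked independently with probability `p`, and estimates how often the whole graph is connected. The function names, seed and trial counts are arbitrary choices.

```python
import random


def is_connected(n: int, p: float, rng: random.Random) -> bool:
    """Build one random graph on n people and test whether it is connected."""
    neighbours = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:  # each pair is linked independently
                neighbours[i].append(j)
                neighbours[j].append(i)
    # Depth-first search from person 0; connected if everyone is reached.
    seen = {0}
    stack = [0]
    while stack:
        v = stack.pop()
        for w in neighbours[v]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == n


def connected_fraction(n: int, p: float, trials: int, seed: int = 0) -> float:
    """Fraction of simulated graphs that turn out to be connected."""
    rng = random.Random(seed)
    return sum(is_connected(n, p, rng) for _ in range(trials)) / trials


if __name__ == "__main__":
    # Same p throughout; only the number of people changes.
    for n in (10, 50, 200):
        print(n, connected_fraction(n, 0.1, trials=20))
```

With `p` held fixed, the estimated fraction should climb towards 1 as `n` grows, which is the behaviour the theorem describes.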
8 Lindsay Beyerstein 05.15.04 at 7:14 pm: I heard from a Libertarian friend that someone has mapped the sexual connections between Ayn Rand and the rest of the Libertarian/Objectivist community.

I think a JFK number would be a useful sexual analogue of an Erdos number.
9 John Quiggin 05.15.04 at 9:45 pm: On collegiality, and working anywhere, one of the reasons I’m keen on blogging is that Brisbane is some distance from the centre of the academic world. So I see the potential for academically-oriented blogs to substitute for some of the casual interactions with colleagues that are rather limited here.

I should say that some of my colleagues here at UQ are very good, but they are concentrated in a couple of fields.* So most of my collaborative work is done by email.

*Actually, network theory is one of them. So I’ll be able to discuss this one after the weekend.
10 David 05.15.04 at 11:40 pm: Walt Pohl’s comments on random graph theory are correct; a search on mathscinet (go to ams.org and follow the links) should lead you to good survey papers on random graphs. One of the surprises of random graph theory is that, in order to have a component involving half the people with high probability, you only need the average number of papers by a typical author to be a little over log 2 ≈ 0.69, which is less than 1!

I don’t know what your two-parameter model is, but I think a lot of people have worked with models where there is one probability for writing a paper with someone who shares your interests closely and another for writing a paper with someone in a distant field.
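
The log 2 figure can be checked against the standard random-graph result that, when each person has c links on average, the fraction s of people in the large component satisfies s = 1 − e^(−cs). The sketch below assumes that “average number of papers” means links per person with each joint paper counted once, so that it equals c/2; that reading, and the function name, are assumptions made for illustration.

```python
import math


def giant_component_fraction(mean_degree: float, iterations: int = 200) -> float:
    """Solve s = 1 - exp(-mean_degree * s) by fixed-point iteration.

    Starting from s = 1, the iteration settles on the positive solution
    when mean_degree > 1 and drifts to 0 when mean_degree < 1.
    """
    s = 1.0
    for _ in range(iterations):
        s = 1.0 - math.exp(-mean_degree * s)
    return s


if __name__ == "__main__":
    papers_per_author = math.log(2)      # about 0.69, each paper counted once
    mean_degree = 2 * papers_per_author  # each paper gives a link to two authors
    print(giant_component_fraction(mean_degree))
```

On this reading, log 2 papers per author corresponds to a mean of 2 log 2 links per person, which is the value at which the large component covers half the population.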
11 Bill Tozier 05.16.04 at 12:04 am: [You couldn’t’ve summoned me with more certainty if you’d said my name twenty times in a row… [[points for unnecessary D&D gag?]]]

I’m worn out a bit too much to address this in detail (cf. blog), but one observation that arose in a number of discussions at the very meeting I was just attending this week:

Reinventing the wheel only seems to be useless until you’ve done it often enough to see the underlying pattern that ties all the “redundant” designs together. At that point, you’ve done something much more interesting.

As in, “Give a man a fire and you keep him warm for a night. Set him on fire and you keep him warm for the rest of his life….” Or perhaps that other proverb about fish. I get them confused.
12 neil 05.16.04 at 1:04 am: There is already an Internet-based social network where sexual contact denotes a link. It’s the sexchart. There are several celebrities as well as a tool to measure your proximity to the few well-known personages on the chart (but not including George Bush or Saddam Hussein).
13 Giles 05.16.04 at 10:36 pm: A more interesting question than the personal one is whether this approach is socially optimal: is better or more original research produced from isolated or unconnected lines of thought?

An example of where isolation has produced similar but not identical ideas might be the discovery of integration. Would two independent theories have developed if Leibniz and Newton had been connected by the instantaneous literature reviews of today? Would maths be richer if only Leibniz had published?

Might the world of economics be richer if Brisbane’s flight, telephone and internet connections were cut?
14 John Quiggin 05.17.04 at 8:44 am: But the fact of simultaneous discovery implies some sort of contact. I should reread Merton on multiples I guess.
15 Giles 05.17.04 at 7:27 pm: Yes, there was obviously some contact, but since it was not instantaneous contact, I’m speculating that this allowed them to go in different directions. If Leibniz had published before Newton, and Newton had read it, might Newton then not have published, or have amended his work to publish something more similar? And would the world be poorer for it?
16 John Quiggin 05.18.04 at 10:18 am: I think we’ve converged here, Giles. What you say here is pretty much the point I made in the post.

Comments on this entry are closed.
