by Eszter Hargittai on September 26, 2004

I was hesitant to blog about technical details of my work here, but then I realized that if my fellow economist and philosopher bloggers can post about the details of their work then why couldn’t the sociology geeks?:) I’ll tuck it below the fold though as it likely only has limited appeal.

One of the goals of my dissertation project was to figure out survey measures of people’s actual online skills. In most of the existing literature, when people include measures of computer skills (the existing lit is mostly about computer-use skill, not online skills), they rely on people’s self-perceived abilities. That is, researchers simply ask users to rate their skill. As you can imagine, this measure may not be very good. However, collecting data on actual skill is quite time-consuming, labor-intensive and expensive, so we often don’t have a choice but to rely on survey measures. The question then is whether we can come up with survey measures better than the ones currently in use.

In my project, I measured people’s (one hundred randomly selected adult Internet users’) ability to find various types of information online and their efficiency (speed) in doing so.[1] I also asked participants to rate their skills (as per the traditional skill measures) and to rate their understanding of a few dozen computer and Internet-related items. (There’s more on what I did to see whether perceived understanding is a good proxy for actual knowledge, but for that you’ll have to read the paper.;-)[2]

I then checked the correlation of the various survey measures with actual skill. I constructed an index measure of skill based on the most highly correlated survey questions.[3] I then looked to see to what extent the self-perceived skill measure explains the variance in actual skill versus the extent to which my index measure based on knowledge items explains the variance in actual skill. I am happy to report that my index measure is a better predictor of skill than people’s self-perceived abilities.
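For readers who want to see the general shape of such a comparison, here is a minimal sketch. It is not the procedure from the study: the data are synthetic, the names `pearson`, `build_index`, `item_0` and so on are made up for illustration, and the index is a plain average of the items most correlated with observed skill. It assumes every item is coded so that a higher value means more reported understanding.

```python
import random
from statistics import mean

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def build_index(items, skill, top_k):
    """Average the top_k survey items most correlated with observed skill.

    items maps an item name to its list of responses (one per respondent).
    Returns the chosen item names and the per-respondent index values.
    """
    ranked = sorted(items, key=lambda name: pearson(items[name], skill), reverse=True)
    chosen = ranked[:top_k]
    index = [mean(items[name][i] for name in chosen) for i in range(len(skill))]
    return chosen, index

# Synthetic stand-in data (hypothetical, NOT the study's data):
# 100 respondents, five knowledge items of varying quality, one self-rating.
rng = random.Random(0)
n = 100
skill = [rng.gauss(0, 1) for _ in range(n)]
items = {
    f"item_{j}": [w * s + rng.gauss(0, 1) for s in skill]
    for j, w in enumerate([0.9, 0.8, 0.7, 0.1, 0.0])
}
self_rating = [0.3 * s + rng.gauss(0, 1) for s in skill]

chosen, index = build_index(items, skill, top_k=3)

# With a single predictor, the share of variance explained (R^2)
# is just the squared Pearson correlation.
r2_index = pearson(index, skill) ** 2
r2_self = pearson(self_rating, skill) ** 2
print("items in index:", chosen)
print("R^2 index:", round(r2_index, 2), " R^2 self-rating:", round(r2_self, 2))
```

One caveat with a sketch like this: picking items and evaluating the index on the same respondents flatters the index, so a held-out sample or a replication on another data set gives a fairer comparison.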

An additional exciting bonus is that some of my survey measures were replicated on a national data set (the General Social Survey 2000 & 2002 Internet modules) so others can use these better measures as well.

I’m excited. The study I did was pretty risky in some ways. There was no guarantee that I would even find any variance on the most crucial variables (such as skill). But I did. And now these findings with the new versus traditional survey measures of skill suggest that there is something generalizable there, which is exciting.

Yes, I’m a data geek.

[1] Yes, I realize there are all sorts of other ways one may measure skill. I had to pick something and I picked this measure because I believe it is very relevant to many other types of online actions.

[2] If you’re interested in the methodological details of the study, you’ll find related publications here.

[3] Yes, I know, it’s complicated because computer and Internet-related knowledge changes over time so it’s hard to know whether/how my measures will stand the test of time. But they should be useful at least for the GSS data as those surveys were conducted close to when I did my project.

dsquared 09.26.04 at 6:47 pm

Which is the specific paper where the methodology is described? It all sounds very interesting.


eszter 09.26.04 at 6:58 pm


The paper that describes all of the above in detail is not available online because I am about to send it off for review (and I don’t post things online at that stage).

I have a paper out about the details of how the data were collected (i.e. recruitment, technical specifications, types of questions asked). I have another paper out about how I coded and classified users’ online actions. That’s less directly relevant although it does show how I have the data by second on every action, which is relevant given that one of my skill measures is time-to-completion of task.


dsquared 09.26.04 at 7:17 pm

But just in general terms, are we talking about a principal component here?


eszter 09.26.04 at 8:06 pm

I didn’t simply run a PCA on all the survey measures to figure out what variables to include in the index, if that’s what you’re asking. There were some other issues I wanted to consider while constructing the index variable.


PG 09.26.04 at 8:47 pm

Were there certain searches that you categorized as more difficult than others? For example, was finding the full text of a NYT article understood as a greater accomplishment than finding the full text of a Washington Post piece?


David Tiley 09.27.04 at 1:56 am

I want sociology geekdom. I want historian geekdom too. They hurt my brain less than philosophy and I am sure you have a heap of readers like this.

I would have been amaaazed if there was much correlation between self-assessed skills in any areas of computers and shared datum lines of ability. For a start, we have particular areas of expertise (applications) where higher levels of ability seem simple, and a nodding acquaintanceship with others where we seem by comparison to be inept. That is, we are attuned to small ability differences which are magnified.

And our own geek subcultures accept as normal levels of skill which seem magical to outsiders.

There is also a tendency, across the broad computer domain, for a significant tribe of people to simply not understand that other areas of expertise exist at all. Ask your IT folk about interface design.

If you wanted to post the tests, we could all have a go. I would be fascinated. Have I learnt anything from these late night skitterings across the world’s databases? I don’t know.


eszter 09.27.04 at 6:17 am

PG – I just gave people tasks to perform, I didn’t decide ahead of time which ones were more difficult than others. The tasks were usually general enough that you could find relevant information on all sorts of Web sites. There was never just one correct response. More details are in the paper(s).

David – Your comments remind me of something I’ll want to blog separately another time.. how people with different abilities draw the line of what counts as skilled computer knowledge. But my study focused on average random users so I didn’t end up with too many IT professional types who would classify related knowledge on a whole other spectrum.

I’d be happy to post the survey here sometime.. but that would require approval from my Institutional Review Board for Human Subjects research.. a pretty tedious process. (I guess if I wasn’t going to use the data for research purposes it might not, but if I’m going to collect data, I’d like to have that option available.) Stay tuned.:)


dsquared 09.27.04 at 1:40 pm

I love pontificating about social sciences research … while watching my son at playgroup, I jotted down a few notes.

My guess is that you did a PCA of the scores on the skills tests and took the first principal component as a measure of skill. Then you carried out a regression of the survey questions on the skill measure to get a set of weights to apply to the questions, giving you a weighted average of the survey questions which would be an instrument for the skill score.

Or at least that’s what I think I would have done. Am I right?
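The two-step approach guessed at in this comment can be sketched roughly as follows. This illustrates the commenter's suggestion only, not the method used in the study (the author says above that she did not simply run a PCA). The data are synthetic and every name here (`first_pc_scores`, `ols_weights`, `tasks`, `survey`) is hypothetical. The first principal component is found by power iteration and the regression by solving the normal equations, so that nothing beyond the standard library is needed.

```python
import random

def center(cols):
    """Subtract each column's mean; cols is a list of equal-length columns."""
    out = []
    for c in cols:
        m = sum(c) / len(c)
        out.append([v - m for v in c])
    return out

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def first_pc_scores(cols, iters=200):
    """Scores on the first principal component (power iteration on the covariance matrix)."""
    cols = center(cols)
    n = len(cols[0])
    cov = [[dot(a, b) / (n - 1) for b in cols] for a in cols]
    w = [1.0] * len(cols)
    for _ in range(iters):
        w = [dot(row, w) for row in cov]
        norm = dot(w, w) ** 0.5
        w = [x / norm for x in w]
    return [sum(w[j] * cols[j][i] for j in range(len(cols))) for i in range(n)]

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    k = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, k):
            f = m[r][col] / m[col][col]
            m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    x = [0.0] * k
    for r in reversed(range(k)):
        x[r] = (m[r][k] - sum(m[r][c] * x[c] for c in range(r + 1, k))) / m[r][r]
    return x

def ols_weights(pred_cols, y):
    """Least-squares weights of y on the predictors (both centered, so no intercept)."""
    xs = center(pred_cols)
    yc = center([y])[0]
    xtx = [[dot(a, b) for b in xs] for a in xs]
    xty = [dot(a, yc) for a in xs]
    return solve(xtx, xty)

# Synthetic stand-in data (hypothetical): 100 respondents,
# four observed task scores and three survey questions.
rng = random.Random(1)
n = 100
latent = [rng.gauss(0, 1) for _ in range(n)]
tasks = [[s + rng.gauss(0, 0.5) for s in latent] for _ in range(4)]
survey = [[w * s + rng.gauss(0, 1) for s in latent] for w in (0.8, 0.5, 0.2)]

# Step 1: summarize the task scores by their first principal component.
skill_pc = first_pc_scores(tasks)

# Step 2: regress that skill score on the survey questions to get weights,
# then form the weighted combination of (centered) survey answers.
weights = ols_weights(survey, skill_pc)
centered_survey = center(survey)
instrument = [sum(w * col[i] for w, col in zip(weights, centered_survey)) for i in range(n)]
print("survey weights:", [round(w, 2) for w in weights])
```

Weights fitted this way are tuned to one sample, so they would need checking on fresh respondents before being reused elsewhere.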


eszter 09.27.04 at 4:00 pm

Dsquared – see above. But yes, it seems like that would certainly be one way to approach it.

