Social media repertoires

by Eszter Hargittai on October 29, 2021

I’d like to blog more about my research, but I’m not sure yet how to go about it (e.g., whether to write more about research already completed, about projects currently in the works, or both; feel free to voice your preference). Today, I’m posting a link to a paper that was just published (and is available open access, so no paywall to battle): Birds of a Feather Flock Together Online: Digital Inequality in Social Media Repertoires, which I wrote with my friend Ágnes Horvát.

There is some work (not a ton, but a growing literature) on who adopts various social media platforms (e.g., are men or women more likely to be on Facebook or on Pinterest, are more highly educated people more likely to be on Twitter or on Reddit), but as far as we can tell, no one has looked at the user base of pairs of such services. (I am always very cautious about claiming that we are the first to do something, as it’s nearly impossible to have a sense of all the work out there, but we could not find anything related. Do let me know if we missed something.)

Why should anyone care who adopts a social network site (SNS), and what’s the point of knowing how user bases overlap across such sites? There are several reasons for the former and then, by extension, the latter. I started doing such analyses back in the age of MySpace and Facebook, finding socioeconomic differences in who adopted which platform even among a group of college students. More recent work (mine and others’) has continued to show differences in SNS adoption by various sociodemographic factors. This matters at the most basic level for two reasons: (a) whose voices are heard on these platforms shapes what content millions of people see, share, and engage with; and (b) many studies use specific platforms as their sampling frame. If a platform’s users are not representative of the population (as is true in most cases) and the research questions pertain to the whole population (or at minimum all Internet users, which is again often the case), then the data will be biased from the get-go.

Knowing which platforms have similar users helps researchers who want to diversify their samples: they can focus on including data from SNSs with lower overlap in their user bases without having to sample from too many of them. Also, for campaigns (these could be health-related, political, or commercial) that want to reach diverse constituents, it is again helpful to know which sites have similar users and which reach different groups of people. Our paper shows (with graphs that I am hoping are helpful for interpreting the results) how SNS pairs differ by gender, age, education, and Internet skills.



Kiwanda 10.29.21 at 10:55 pm

I’d be curious to know if the results look much different using Jaccard instead of cosine similarity; probably not.

As the number of users of an SNS X approaches the population (or sub-population) size, the pairwise similarity of X with some Y becomes a function of just the popularity of Y. The normalization handles this, in some sense, but still, the pairwise similarity becomes less informative. Possibly related to this: it sounds more striking somehow to say that 92% of women who use Pinterest also use Facebook (as your data implies, I think) than that the cosine similarity of Pinterest and Facebook users is 75%. But maybe that’s just because I’m not familiar enough with cosine similarity.
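
For readers less familiar with the measures mentioned in this comment, here is a minimal Python sketch. It assumes each platform's user base is represented as a plain set of respondent IDs (binary adoption), in which case cosine similarity reduces to the overlap divided by the geometric mean of the two set sizes. The function names and the toy data are invented for illustration; they are not taken from the paper and may not match how its analysis was actually computed.

```python
from math import sqrt


def cosine_similarity(a: set, b: set) -> float:
    """Cosine similarity of two binary adoption vectors, given as user sets."""
    if not a or not b:
        return 0.0
    return len(a & b) / sqrt(len(a) * len(b))


def jaccard_similarity(a: set, b: set) -> float:
    """Overlap divided by the size of the union of the two user sets."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def share_also_using(a: set, b: set) -> float:
    """Fraction of a's users who also use b (the "X% of A users are on B" figure)."""
    if not a:
        return 0.0
    return len(a & b) / len(a)


# Toy data (made up): 100 respondents with IDs 0..99.
everyone = set(range(100))
platform_x = set(range(0, 90))    # hypothetical very popular platform
platform_y = set(range(60, 100))  # hypothetical smaller platform

cos_xy = cosine_similarity(platform_x, platform_y)
jac_xy = jaccard_similarity(platform_x, platform_y)
share_y_on_x = share_also_using(platform_y, platform_x)

# If a platform were used by everyone, its cosine similarity with any other
# platform would depend only on that other platform's popularity.
cos_all_y = cosine_similarity(everyone, platform_y)
popularity_only = sqrt(len(platform_y) / len(everyone))
```

In this toy case the three measures give different numbers for the same overlap, and `cos_all_y` equals `popularity_only`, which is the saturation effect described above.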

It might be possible to use your data to design a campaign, phrasing it as an optimization problem and then solving that problem. It would be interesting to then compare campaigns that were specifically designed for diversity of impact, vs. campaigns that are not.
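
One possible reading of the optimization idea, sketched in Python: treat it as a maximum-coverage problem in which each platform reaches some set of demographic groups and a campaign can afford only a fixed number of platforms. The greedy heuristic below is a standard approximation for that kind of problem, not something specified in the comment or the paper. The platform labels, group labels, and the `budget` parameter are all hypothetical.

```python
from typing import Dict, List, Set


def greedy_platform_choice(reach: Dict[str, Set[str]], budget: int) -> List[str]:
    """Pick up to `budget` platforms, each time adding the one that reaches
    the most groups not yet covered by the platforms already chosen."""
    chosen: List[str] = []
    covered: Set[str] = set()
    for _ in range(budget):
        candidates = [p for p in reach if p not in chosen]
        if not candidates:
            break
        # Ties go to the platform listed first in `reach`.
        best = max(candidates, key=lambda p: len(reach[p] - covered))
        if not reach[best] - covered:
            break  # no remaining platform adds a new group
        chosen.append(best)
        covered |= reach[best]
    return chosen


# Toy data (made up): which groups each hypothetical platform reaches.
toy_reach = {
    "A": {"g1", "g2", "g3"},
    "B": {"g2", "g3"},
    "C": {"g4", "g5"},
    "D": {"g1", "g4"},
}

picked = greedy_platform_choice(toy_reach, budget=2)
```

Here the greedy choice skips platform B, whose audience is already covered by A, in favor of C, which reaches groups A does not. Real adoption data would give graded overlap rather than all-or-nothing group membership, so this is only a starting point.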


John Quiggin 10.30.21 at 8:35 am

Just an aside, but applying for a US visa is a good reason for reducing the number of SNS accounts. You have to list all of them. Worse, the application software is buggy and crash-prone, so you end up doing it several times over.


Sumana Harihareswara 11.01.21 at 4:49 pm

I am looking forward to hearing more about your research via CT! I think my order of preferences is:

1) in-progress stuff where there’s any way we can help — e.g., by spreading the word that you’re looking for related work or research subjects
2) completed work I can read, especially with extra nuggets about stuff you learned during the work that didn’t quite fit into the published version
3) in-progress work where I just get to go “ooooo I look forward to the published work with anticipation!” because I can always use more things to look forward to


Eszter 11.03.21 at 7:28 am

Kiwanda, it’s not clear there would be much difference in findings using that method. We had a back-and-forth with the reviewers about which method to use; there’s a bit in the paper justifying our final choice.

John, fortunately I haven’t had to apply for a US visa in quite some time so I am no longer familiar with the details. Given the number of such platforms one may sign up for over the years and then abandon, this seems like a tricky request. I deal with other types of details now like the exact dates I was in the US in a calendar year (that’s required for US citizens with non-US residency for tax-filing purposes).

Sumana, thanks for the feedback; those all sound like good ways to engage. I’d like to do more on in-progress research, as getting feedback can be helpful and I can see how that’s more exciting to follow. I’ll probably be posting a request for help with recruiting participants soon.
