Will big data lift the veil of ignorance?

by Lisa Herzog on January 31, 2025

(Hi all, wonderful to become part of this great blog! But now, directly on to some content!)

Imagine that you have a toothache, and a visit to the dentist reveals that a major operation is needed. You phone your health insurer. You listen to the voice of the chatbot and press the buttons to go through the menu. And then you hear: “We have evaluated your profile based on the data you have agreed to share with us. Your dental health behavior scores 6 out of 10. The suggested treatment plan therefore requires a co-payment of [insert some large sum of money here].”

This may sound like science fiction. But many other types of insurance, e.g. car insurance, already build on automated data sharing. Health insurers would certainly like to access our data as well – not only data from smart toothbrushes, but also credit card data, behavioral data (e.g. from step-counting apps), or genetic data. If they were allowed to use them, they could move towards segmented insurance plans for specific target groups. As two commentators, to whose research I return below, recently wrote about health insurance: “Today, public plans and nondiscrimination clauses, not lack of information, are what stands between integration and segmentation.”

If, like me, you’re interested in the relation between knowledge and institutional design, insurance is a fascinating topic. The basic idea of insurance is centuries old – here is a brief summary (skip a few paragraphs if you know this stuff). Because we cannot know what will happen to us individually, but we can know what will happen to people at the aggregate level, it can make sense to enter an insurance contract, creating a pool that a group jointly contributes to. Those for whom the risks in question materialize get support from the pool. Those for whom they do not may go through life without receiving any money, but they still know that they could get support if something happened to them. As such, insurance combines solidarity within a group with individual precaution.
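
To make the arithmetic of pooling concrete, here is a toy calculation in Python – all numbers are invented for illustration, not drawn from any real insurance market:

```python
# Toy illustration of risk pooling (all numbers invented).
pool_size = 1_000      # members of the pool
p_loss = 0.01          # each member's chance of the bad event per year
loss = 20_000          # cost if the event happens

expected_loss = p_loss * loss     # 200 per member per year
premium = expected_loss * 1.1     # plus a small loading for administration

# Each member trades a small chance of an unaffordable loss
# for a modest, predictable premium.
print(f"Premium per member: {premium:.0f}")                              # 220
print(f"Expected claims on the pool: {pool_size * expected_loss:,.0f}")  # 200,000
```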

Private insurance, especially when offered by commercial companies, typically follows the logic of what Martin O’Neill describes as “mutual” insurance: individual contributions are defined “in accordance with the best estimate of the level of risk brought to the pool.” Pricing according to estimated individual risks is sometimes described as “actuarial fairness.” It means that different groups receive different contracts, depending on what is known about their risks; some, whose risks are very high, may not receive an offer at all. “Social insurance,” however, can also follow the model O’Neill describes as “solidaristic” insurance, in which risk levels are not taken into account. Because it can be made mandatory by government, social insurance can avoid the problem that individuals with low risks would opt out of pools. Mandatory insurance is particularly relevant for basic goods that society would have a moral duty to provide for individuals anyway, should the risks materialize – one might see basic healthcare as a case in point.
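
The difference between the two pricing logics is easy to see in a small sketch. The risk classes, probabilities, and pool shares below are hypothetical; the point is only that under actuarial fairness each class pays its own expected loss, while under solidaristic pricing low risks subsidize high risks:

```python
# Sketch: risk-rated ("mutual") vs. flat ("solidaristic") premiums.
# Risk classes, probabilities, and pool shares are invented.
loss = 20_000
risk_prob = {"low": 0.005, "medium": 0.01, "high": 0.04}
pool_share = {"low": 0.5, "medium": 0.4, "high": 0.1}

# Actuarial fairness: each class pays its own expected loss.
mutual = {k: p * loss for k, p in risk_prob.items()}

# Solidaristic pricing: everyone pays the pool-wide average.
avg_prob = sum(risk_prob[k] * pool_share[k] for k in risk_prob)
flat = avg_prob * loss

for k in risk_prob:
    print(f"{k:>6}: risk-rated {mutual[k]:6.0f} vs. flat {flat:.0f}")
# Low risks pay more under the flat rate (100 vs. 210);
# high risks pay much less (800 vs. 210).
```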

How insurance works in detail, however, crucially depends on what is known, to whom, and how reliably it can be shared with others.

A first question is who gets included in which risk pool. If you have no information whatsoever about the risks of different groups, it makes most sense to include everyone in one pool on the same terms. If, however, individuals have reliable knowledge that their own risk is different, and if they can reliably share this knowledge with the insurer, then risk pools might segregate. If individuals cannot share the knowledge with the insurer, but are sufficiently certain about it, you can get the phenomenon of “adverse selection”: those with low risks might opt out, and the insurer might be left with a pool for whom premiums would have to rise so high that the whole scheme unravels.
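
A toy simulation can illustrate how such a pool unravels. Assume, purely for illustration, that the insurer must charge everyone the average expected loss of whoever remains in the pool, and that each person knows their own risk and opts out whenever the premium exceeds their own expected loss:

```python
# Toy adverse-selection spiral (invented numbers). The premium is set to
# the average expected loss of whoever remains; anyone whose own expected
# loss is below the premium opts out, and the premium is recomputed.
loss = 20_000
pool = [0.005, 0.01, 0.02, 0.04, 0.08]   # members' risk probabilities

while pool:
    premium = sum(p * loss for p in pool) / len(pool)
    stayers = [p for p in pool if p * loss >= premium]
    print(f"premium {premium:7,.0f}: {len(pool)} members, {len(stayers)} stay")
    if stayers == pool:
        break
    pool = stayers
# The pool shrinks until only the highest risks remain: the scheme unravels.
```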

A second knowledge issue arises once people are insured: “moral hazard.” Once insured, individuals might behave in ways that increase their risks. For example, if you have good dental insurance, you might be more negligent with dental hygiene than if you expect to carry all costs yourself. Insurers, and also the other members of an insurance pool, therefore have an incentive to collect information about people’s behavior, to make sure that they insure them only against independently existing risks – hence the imaginary dental insurance scenario above. Here, the luck egalitarian intuition that there is a moral difference between “choice” and “circumstances” can be applied, even if one does not endorse luck egalitarianism as an overall theory of justice. If someone chooses to incur an avoidable risk, and it materializes, this puts costs on others that are not meant to be covered by the insurance contract. To exclude such cases, reliable information about what exactly happened, and whether a risk was avoidable or not, is needed.*
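
A back-of-the-envelope comparison shows the incentive shift. The treatment cost, effort cost, and probabilities below are made up; the point is only that full coverage can make preventive effort individually irrational:

```python
# Toy moral-hazard calculation (all numbers invented). Careful hygiene
# costs effort but lowers the chance of needing an expensive treatment.
treatment = 2_000
effort_cost = 50                       # cost of preventive effort
p_careful, p_negligent = 0.02, 0.10    # chance of needing treatment

def expected_cost(p_bad: float, covered_share: float) -> float:
    """Expected out-of-pocket cost given risk p_bad and insured share."""
    return p_bad * treatment * (1 - covered_share)

# Uninsured: prevention pays off (50 + 40 = 90 < 200).
print(expected_cost(p_careful, 0.0) + effort_cost)   # 90.0
print(expected_cost(p_negligent, 0.0))               # 200.0

# Fully insured: prevention no longer pays (50 > 0).
print(expected_cost(p_careful, 1.0) + effort_cost)   # 50.0
print(expected_cost(p_negligent, 1.0))               # 0.0
```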

In the last decades, “big data” has massively changed the informational landscape, and industry reports and research suggest that insurers want to make use of it. In a recent paper, I explored some of the consequences with regard to moral hazard. Yes, big data might provide better information about people’s behavior. But there are also reasons to think that this might create new injustices. For one thing, there is the unequal quality of data and the risk of errors, e.g. for people who change their names (who, in many societies, tend to be women, because of patriarchal family norms). For another, there are the many forms of background injustice that make a difference to the conditions under which people choose. Think, for example, about the problem of “food deserts” in poor neighborhoods. If your insurance company sees your credit card data, its algorithmic systems might categorize you as choosing hedonism over a healthy lifestyle (and increase your co-payment), while it is in fact a matter of circumstances that you cannot buy healthier food.

I therefore warned against two things. First, there is a risk of “misguided responsibilization,” which pays insufficient attention to the social circumstances of people and to biases in the data. Second, there is the risk of shifting towards mutualist instead of solidaristic insurance, in O’Neill’s terms, simply because this becomes technically possible, without asking the political question of which type of insurance is the right one, normatively speaking, for which kinds of risks.

However, the problem is far bigger. Big data not only changes what can be known about moral hazard; importantly, it also changes how much information about their risk class individuals can share with insurers at the point of choosing insurance, or of making political choices about social insurance, e.g. through their voting behavior. I learned about these things from an extremely interesting book by Torben Iversen and Philipp Rehm, Big Data and the Welfare State: How the Information Revolution Threatens Social Solidarity. It is too rich to do full justice to here, but one of its key arguments is precisely this: big data about people’s health, lifestyle, and maybe also genetics might massively improve the predictability of risks, and this will have effects on how people think about social insurance.

Iversen and Rehm show that those branches of insurance that were legally allowed to use such data, e.g. life insurance, have already changed massively thanks to the availability of more information – in the expected direction of more segmentation of risk pools, following the logic of mutualism rather than solidarity. They write: “Because today’s insurers have much better data enabling them to draw a distinction between good and bad risks, the middle class now has a new institutional incentive to exclude bad risks by privatizing those risks or by differentiating public insurance” (p. 6). One thus sees a “process, whereby large national insurance pools are parsed into smaller ones with more differentiation in contributions and benefits,” which they describe as “segmentation” (ibid.).

And they draw a fascinating connection to one of the most famous thought experiments in political philosophy: the Rawlsian “veil of ignorance,” which asks us to imagine what a just society should look like if we did not know where in society we stand. As they write, “the data revolution has raised the veil and allowed people to see more clearly whether they are likely to lose or gain from public insurance” (p. 8). Their gloomy prediction is that people with low risks – typically, the privileged – will no longer be willing to contribute as much to general social insurance systems, once they know that their own risks are low and they can reliably transmit this knowledge to insurers.

This raises interesting questions about the role that ignorance about risks played in A Theory of Justice – think about the state of medical research and data collection in the 1960s, when Rawls developed his ideas! One dimension of the communitarian critique was that the difference principle, which justifies redistributive taxation, requires solidaristic motives – e.g. a sense of national belonging – in order to be acceptable. But maybe sufficient ignorance about one’s own risks can also go quite far? And yes, I know that there were discussions about risk aversion at the time – but my point is that even risk-neutral agents, if they have no idea whatsoever about the risk distribution, might be more prone to agree to redistribution. Moreover, Rawls’s ideas come from the context of a growing economy, where the question was mostly who would win really big, and who would just do fine. If you have no idea what your chances are of being among the really lucky ones, it may not be so crazy to endorse the difference principle even if you feel no particular bonds with your fellow citizens.

As Rachel Friedman argues in her book Probable Justice (another book too rich to do justice to here), social insurance has always combined a prudential and a solidaristic logic. Tracing the development of different understandings of risk and of social insurance, between individual prudence and collective solidarity, across the centuries, she warns against giving up this combination. Precisely because of its complexity, she argues, it can speak to different political constituencies. But if Iversen and Rehm are right, this delicate balance will be tipped over in a social environment so information-rich that the pull towards the prudential side becomes irresistible for those with low risks.

However, there is a twist, and it has to do with the luck egalitarian intuitions I mentioned earlier. While it may become more and more knowable to which risk class someone belongs, it is likely to become more and more evident, thanks to the same data, that this is often not a matter of choice. If someone has had bad luck in the “genetic lottery,” they should not carry that risk, morally speaking. The same holds for one’s upbringing, educational opportunities, etc. And big data can make visible many of the structural injustices that condition people’s choices. The idea that you somehow “deserve” your low risk because of your good behavior becomes untenable in many cases.

Maybe this is why the logic that Iversen and Rehm describe, plausible as it is, is not the only one that one sees in the interplay between big data and social insurance. As a reviewer of the book noted, the predicted “destruction-of-solidarity logic” is “difficult to square” with the introduction of the Affordable Care Act in the USA, which was an extension of solidarity.

As always with questions of institutional design, there are some deeper philosophical questions lurking here (that’s one of the reasons why I find them so fascinating). Many ethical and religious doctrines invite us to imagine ourselves in the shoes of others, as if we could not quite know whether we might end up in their situation. “There but for the grace of God go I” is a famous dictum variously ascribed to St. Francis or John Bradford. Adam Smith’s idea of the impartial spectator and the Kantian categorical imperative are likewise about overcoming our parochial perspective, and hence also our knowledge about our own risk class. But if some people know for sure that they have low risks, what does this mean for their motivation to imagine themselves in the shoes of others? Or will the knowledge that they do not really deserve their position, because it is all the work of “circumstances,” have the opposite effect? Big data will probably force us to rethink quite a few of our moral intuitions – all those that have to do with how much we can know about our circumstances, and our choices.

* This does not, of course, mean that victims of their own reckless behavior should not receive any support – as Elizabeth Anderson has famously argued, the reckless motorcyclist should not be left dying on the street. I share this view, which applies most centrally to goods such as life, health, etc. The choice-circumstances distinction remains relevant for social insurance in two ways: first, for chosen behavior, those who have the means might be made to pay more themselves; second, for non-essential goods, e.g. the repair costs for the wrecked motorcycle, chosen outcomes might not need to be covered by insurance.

John Q 01.31.25 at 7:39 am

Welcome, Lisa! There’s a lot to chew on here, so it will be a while before I can give a proper response, but I am really looking forward to the discussion.

Chris Bertram 01.31.25 at 8:31 am

Welcome Lisa! Great piece. I’m reminded very much of Goodin and Le Grand’s argument in Not Only the Poor (1987), which seeks to explain the postwar expansion of the welfare state by the unknowability (and hence uninsurability) of the risks people faced in time of war, meaning that the state was required to step in. There’s still quite a range of big risks out there whose distribution (and correlation with things like wealth and income) is hard to know with any kind of granularity (even if the poor always end up worse off): I’m thinking particularly of novel diseases causing epidemics and some of the effects of climate change.

Lisa Herzog 01.31.25 at 8:55 am

It’s interesting that you’re mentioning the war experience, Chris – it makes me think that there is a psychological dimension of risk perception as well, which may, to some extent, differ from the purely epistemic dimension. And yes, of course – I’m not saying that we won’t face great collective risks anymore. But if you’re very rich, you can do a lot to protect yourself from pandemics, and probably also, to some extent, from climate change (do your private risk diversification by having property in different world regions, etc.). So even there, some groups of society might develop a sense that they will be fine, no matter what…
