(Hi all, wonderful to become part of this great blog! But now, directly on to some content!)
Imagine that you have a toothache, and a visit to the dentist reveals that a major operation is needed. You phone your health insurer. You listen to the voice of the chatbot and press the buttons to go through the menu. And then you hear: “We have evaluated your profile based on the data you have agreed to share with us. Your dental health behavior scores 6 out of 10. The suggested treatment plan therefore requires a co-payment of [insert some large sum of money here].”
This may sound like science fiction. But many other types of insurance, e.g. car insurance, already build on data being shared with them automatically. Health insurers would certainly like to access our data as well – not only data from smart toothbrushes, but also credit card data, behavioral data (e.g. from step-counting apps), or genetic data. If they were allowed to use them, they could move towards segmented insurance plans for specific target groups. As two commentators, to whose research I come back below, recently wrote about health insurance: “Today, public plans and nondiscrimination clauses, not lack of information, are what stands between integration and segmentation.”
If, like me, you’re interested in the relation between knowledge and institutional design, insurance is a fascinating topic. The basic idea of insurance is centuries old – here is a brief summary (skip a few paragraphs if you know this stuff). Because we cannot know what will happen to us individually in the future, but we can know that, at the aggregate level, things will happen to people, it can make sense to enter an insurance contract, creating a pool that a group jointly contributes to. Those for whom the risks in question materialize get support from the pool. Those for whom they do not materialize may go through life without receiving any money, but they still know that they could get support if something happened to them. As such, insurance combines solidarity within a group with individual precaution.
Private insurance, especially when offered by commercial companies, typically follows the logic of what Martin O’Neill describes as “mutual” insurance: individual contributions are defined “in accordance with the best estimate of the level of risk brought to the pool.” Pricing according to estimated individual risks is sometimes described as “actuarial fairness.” It means that different groups receive different contracts, depending on what is known about their risks; some, whose risks are very high, may not receive an offer at all. “Social insurance,” however, can also implement the model O’Neill describes as “solidaristic” insurance, in which risk levels are not taken into account. Because it can be made mandatory by government, social insurance can avoid the problem that individuals with low risks would opt out of pools. Mandatory insurance is particularly relevant for basic goods that society would have a moral duty to provide to individuals anyway if the risks materialized – one might see basic healthcare as a case in point.
How insurance works in detail, however, crucially depends on what is known, to whom, and how reliably it can be shared with others.
A first question is who gets included in which risk pool. If you have no information whatsoever about the risks of different groups, it makes most sense to include everyone in one pool on the same conditions. If, however, individuals have reliable knowledge that their own risk is different, and if they can reliably share this knowledge with the insurer, then risk pools might segregate. If individuals cannot share the knowledge with the insurer, but are sufficiently certain about it, you can get the phenomenon of “adverse selection”: those with low risks might opt out, and the insurer might be left with a pool for whom the fees would have to rise so high that the whole scheme unravels.
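To make the unravelling mechanism concrete, here is a minimal sketch in Python (a toy model; the loss size, risk levels, and willingness-to-pay multiple are all invented for illustration, not taken from any real market):

```python
# Toy model of adverse-selection unravelling (all numbers invented).
# Ten equally sized risk groups start in one pool. Each round, the insurer
# sets the premium at the pool's average expected loss; anyone for whom
# insurance is no longer worth the price (here: worth at most 1.2x their
# own expected loss) opts out, and the premium is recomputed.

LOSS = 10_000                              # insured loss if the risk materializes
WILLINGNESS = 1.2                          # max premium as multiple of own expected loss
pool = [r / 100 for r in range(1, 11)]     # annual loss probabilities: 1% .. 10%

while pool:
    premium = sum(pool) / len(pool) * LOSS          # break-even premium for this pool
    print(f"premium {premium:6.0f}  pool {pool}")
    stayers = [p for p in pool if WILLINGNESS * p * LOSS >= premium]
    if stayers == pool:                             # nobody else wants to leave: equilibrium
        break
    pool = stayers
```

With these numbers, the premium starts at 550 for the full pool and ratchets up to 900 as first four, then two, then one of the lowest-risk groups exit; only the three highest-risk groups remain insured. It is the “market for lemons” logic in miniature.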
A second knowledge issue arises once people are insured: “moral hazard.” Once insured, individuals might behave in ways that increase their risks. For example, if you have good dental insurance, you might be more negligent about dental hygiene than if you expected to carry all costs yourself. Insurers, and also the other members of an insurance pool, therefore have an incentive to collect information about people’s behavior, to make sure that they insure them only against independently existing risks – hence the imaginary dental insurance scenario above. Here, the luck egalitarian intuition that there is a moral difference between “choice” and “circumstances” can be applied, even if one does not endorse luck egalitarianism as an overall theory of justice. If someone chooses to incur an avoidable risk, and it materializes, this puts costs on others that are not meant to be covered by the insurance contract. To exclude such cases, reliable information about what exactly happened, and whether a risk was avoidable or not, is needed.*
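The incentive structure behind moral hazard is easy to put in toy numbers (again, everything below is invented for illustration):

```python
# Toy model of moral hazard (all numbers invented).
# Careful brushing costs effort but lowers the chance of an expensive treatment.
LOSS = 2_000                          # cost of the major dental treatment
P_CAREFUL, P_CARELESS = 0.05, 0.15    # probability of needing it, by behavior
EFFORT = 100                          # monetized hassle of careful hygiene

# Uninsured, you bear the loss yourself, so careful hygiene is the cheaper option:
uninsured_careful = EFFORT + P_CAREFUL * LOSS    # 100 + 100 = 200
uninsured_careless = P_CARELESS * LOSS           # 300

# Fully insured, the pool bears the loss, so skipping the effort looks free -
# while the extra expected cost (0.10 * 2000 = 200 per person) lands on the pool:
insured_careful = EFFORT                         # 100
insured_careless = 0

print(f"uninsured: careful {uninsured_careful:.0f} vs careless {uninsured_careless:.0f}")
print(f"insured:   careful {insured_careful} vs careless {insured_careless}")
```

This is exactly why insurers (and fellow pool members) want behavioral data: to tell the 5%-risk people from the 15%-risk people, and to price or exclude the avoidable part of the risk.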
In the last decades, “big data” has massively changed the informational landscape, and industry reports and research suggest that insurers want to make use of it. In a recent paper, I explored some of the consequences with regard to moral hazard. Yes, big data might provide better information about people’s behavior. But there are also reasons to think that this might create new injustices. For one thing, there is the unequal quality of data and the risk of errors, e.g. for people who change their names (who, in many societies, tend to be women, because of patriarchal family norms). For another thing, there are the many forms of background injustice that make a difference to the conditions under which people choose. Think, for example, about the problem of “food deserts” in poor neighborhoods. If your insurance company sees your credit card data, its algorithmic systems might categorize you as choosing hedonism over a healthy lifestyle (and increase your co-payment), while it is in fact a matter of circumstances that you cannot buy healthier food.
I therefore warned against two things. First, there is a risk of “misguided responsibilization,” which pays insufficient attention to the social circumstances of people and to biases in the data. Second, there is the risk of shifting towards mutualist instead of solidaristic insurance, in O’Neill’s terms, simply because this becomes technically possible, without asking the political question of which type of insurance is the right one, normatively speaking, for which kinds of risks.
However, the problem is far bigger. Big data not only changes the scope for moral hazard; importantly, it also changes how much information about their risk class individuals can share with insurers at the point of choosing insurance, or of making political choices about social insurance, e.g. through their voting behavior. I learned about these things from an extremely interesting book by Torben Iversen and Philipp Rehm, Big Data and the Welfare State: How the Information Revolution Threatens Social Solidarity. It is too rich to do full justice to it here, but one of its key arguments is precisely this: big data about people’s health, lifestyle, and maybe also genetics might massively improve the predictability of risks, and this will have effects on how people think about social insurance.
Iversen and Rehm show that those branches of insurance that are legally allowed to use such data, e.g. life insurance, have already changed massively thanks to the availability of more information – in the expected direction of more segmentation of risk pools, following the logic of mutualism rather than solidarity. They write: “Because today’s insurers have much better data enabling them to draw a distinction between good and bad risks, the middle class now has a new institutional incentive to exclude bad risks by privatizing those risks or by differentiating public insurance” (p. 6). One thus sees a “process, whereby large national insurance pools are parsed into smaller ones with more differentiation in contributions and benefits,” which they describe as “segmentation” (ibid.).
And they draw a fascinating connection to one of the most famous thought experiments of political philosophy: the Rawlsian “veil of ignorance,” which asks us to imagine what a just society should look like if we did not know where in society we stand. As they write, “the data revolution has raised the veil and allowed people to see more clearly whether they are likely to lose or gain from public insurance” (p. 8). Their gloomy prediction is that people with low risks – typically, the privileged – will no longer be willing to contribute as much to general social insurance systems once they know that their own risks are low and can reliably transmit this knowledge to insurers.
This raises interesting questions about the role that ignorance about risks has played in A Theory of Justice – think about the state of medical research and data collection in the 1960s, when Rawls developed his ideas! One dimension of the communitarian critique had been that the difference principle, which justifies redistributive taxation, requires solidaristic motives, e.g. based on a sense of national belonging, to be acceptable. But maybe sufficient ignorance about your own risks might also get you quite far? And yes, I know that there were discussions about risk aversion at the time – but my point is that even risk-neutral agents, if they have no idea about the risk distribution whatsoever, might be more prone to agree to redistribution. Moreover, Rawls’s ideas come from the context of a growing economy, where the question was mostly who would win really big, and who would just do fine. If you have no idea what your chances are of being among the really lucky ones, it may not be so crazy to endorse the difference principle even if you feel no particular bonds with your fellow citizens.
As Rachel Friedman argues in her book Probable Justice (another book too rich to do justice to here), social insurance has always combined a prudential and a solidaristic logic. Tracing different understandings of risk and of social insurance, between individual prudence and collective solidarity, across the centuries, she warns against giving up this combination. Precisely because of its complexity, she argues, it can speak to different political constituencies. But if Iversen and Rehm are right, this delicate balance will be tipped over in a social environment so information-rich that the pull towards the prudential side becomes irresistible for those with low risks.
However, there is a twist, and it has to do with the luck egalitarian intuitions I mentioned earlier. While it may become more and more knowable which risk class someone belongs to, it is likely also to become more and more evident – again thanks to big data – that this is often not a matter of choice. If someone has had bad luck in the “genetic lottery,” they should not carry that risk, morally speaking. The same holds for one’s upbringing, educational opportunities, etc. And big data can make visible so many of the structural injustices that condition people’s choices. The idea that you somehow “deserve” your low risk because of your good behavior becomes untenable in many cases.
Maybe this is why the logic that Iversen and Rehm describe, plausible as it is, is not the only one that one sees in the interplay between big data and social insurance. As a reviewer of the book noted, the predicted “destruction-of-solidarity logic” is “difficult to square” with the introduction of the Affordable Care Act in the USA, which was an extension of solidarity.
As always with institutional design questions, there are some deeper philosophical questions lurking here (that’s one of the reasons why I find them so fascinating). Many ethical and religious doctrines invite us to imagine ourselves in the shoes of others, as if we couldn’t quite know whether we might not end up in their situation. “There but for the grace of God go I” is a famous dictum ascribed variously to St. Francis or John Bradford. Adam Smith’s idea of the impartial spectator and the Kantian categorical imperative are both about overcoming our parochial perspective, and hence also our knowledge about our own risk class. But if some people know for sure that they have low risks, what does this mean for their motivation to imagine themselves in the shoes of others? Or will the knowledge that they really do not deserve their position, because it is all the work of “circumstances,” have the opposite effect? Big data will probably force us to rethink quite a few of our moral intuitions – all those that have to do with how much we can know about our circumstances, and our choices.
* This does not, of course, mean that victims of their own reckless behavior should not receive any support – as Elizabeth Anderson has famously argued, the reckless motorcyclist should not be left dying on the street. I share this view, which applies most centrally to goods such as life, health, etc. The choice-circumstances distinction remains relevant for social insurance in two ways: first, for chosen behavior, those who have the means might be made to pay more themselves; second, for non-essential goods, e.g. the repair costs for the wrecked motorcycle, chosen outcomes might not need to be covered by insurance.
{ 16 comments }
John Q 01.31.25 at 7:39 am
Welcome, Lisa! There’s a lot to chew on here, so it will be a while before I can give a proper response, but I am really looking forward to the discussion.
Chris Bertram 01.31.25 at 8:31 am
Welcome Lisa! Great piece. I’m reminded very much of Goodin and Le Grand’s argument in Not Only the Poor (1987), which seeks to explain the postwar expansion of the welfare state by the unknowability (and hence uninsurability) of the risks people faced in time of war, meaning that the state was required to step in. There’s still quite a range of big risks out there whose distribution (and correlation with things like wealth and income) is hard to know with any kind of granularity (even if the poor always end up worse off): I’m thinking particularly of novel diseases causing epidemics and some of the effects of climate change.
Lisa Herzog 01.31.25 at 8:55 am
It’s interesting that you’re mentioning the war experience, Chris – it makes me think that there is a psychological dimension of risk perception as well, which may, to some extent, differ from the purely epistemic dimension. And yes, of course – I’m not saying that we won’t face great collective risks anymore. But if you’re very rich, you can do a lot to protect yourself from pandemics, and probably also, to some extent, from climate change (do your private risk diversification by having property in different world regions, etc.). So even there, some groups of society might develop a sense that they will be fine, no matter what…
dsquared 01.31.25 at 9:14 am
Hullo Lisa! (and everyone else!)
I gulp to realise that it was TWENTY years ago that I argued that “financial risk to the insurer” was very difficult to map on to “health risk to the insured” and that consequently, people were overestimating the extent to which big data would be used to price-discriminate (https://crookedtimber.org/2004/09/22/blame-it-on-fatty/). And of course, two decades is a very long time in data science, so I’ll be buying and reading the Iversen and Rehm book with great interest. (As an analyst once said to me, the ideal life assurance customer is one who works and pays premiums for forty years then drops dead on the day of retirement).
But I think the general point of my ancient post might still be relevant – it’s an important result of insurance economics that you can often get into a situation where the only stable equilibrium is a state-enforced pool. Insurers offering competing contracts on the basis of differentiated information can not only cause social problems but tear the entire industry apart.
Megan Blomfield 01.31.25 at 9:45 am
Hi Lisa! Really interesting post. Like Chris, I was going to mention Goodin on the post-war expansion of the welfare state; and climate change (there’s some economic research out there on how uncertainty might influence international environmental agreements). Also Buchanan and Brennan on the ‘veil of uncertainty’ effect for long-standing political rules – I think the basic idea is that the longer a chosen rule will be in place, the harder it is to know how it will play out for you individually, so a rule that protects everybody looks more attractive even from a self-interested standpoint. I wonder if big data might undermine this effect too, by making it easier to update rules quickly as new information comes to light.
MisterMr 01.31.25 at 9:55 am
Welcome to Lisa Herzog.
IMHO there is an ambiguity because, when we speak of something like the “genetic lottery”, we are implying some sort of egalitarian moral principle, “luck egalitarianism” or something, whereas when we speak of the “veil of ignorance” we are speaking of something more like “enlightened egoism”.
The increased level of segmentation due to big data changes the “enlightened egoism” calculation, so that it is less and less rational, in terms of self-interest, for those with few risks to share with those with many risks.
OTOH, it also depends on the power of those who have more risks: in their egoistic calculations, it makes a lot of sense to push risks and costs on others (let’s call this “egoistic leftism” when the poor just snatch the money from the rich).
However, this kind of naked egoism is rarely accepted or made explicit. Leftish policies, for example, are rarely described or argued for in a logic of egoism of the poor; and right-wing policies are sometimes described in terms of the rational egoism of the rich, but then this is rationalised or moralised with some sort of social-Darwinist logic on which this egoism is needed or somehow good.
I’ll just say that both in terms of “altruism” and “egoism of the poor”, increased segmentation sucks.
So it seems to me that this logic of segmentation should be avoided by either nationalising stuff (like an NHS) or regulating stuff a lot with non-discrimination clauses, and this was already the case before Big Data, although Big Data makes the situation more extreme.
wackz 01.31.25 at 10:55 am
Big data? Where I live (in Europe somewhere) private insurance (auto, home) works like this:
1. Entice new customers by offering them super-low policy prices.
2. Make the contract auto-renewable and extremely difficult to cancel.
3. Keep jacking up prices for the existing customers.
And that’s all there is to it.
Trader Joe 01.31.25 at 11:42 am
Welcome Lisa. A very thoughtful piece (and I’m a big insurance nerd).
As health insurers begin to employ greater levels of big data in segmentation, I would expect the pricing mechanism they choose to resemble what auto insurers employ when they ask customers whether they would voluntarily use a device that tracks their driving behavior over a period of time in order to get a discount.
In these models the customer is incentivized to provide the additional input, and the insurer can then use that data as part of a broader mosaic to make even more detailed inferences about other customers who don’t use the device.
To use your dental analogy – I might willingly use the ‘smart toothbrush’ they offer and answer detailed questions about my eating and drinking habits to save, say, 20-30% relative to whatever the standard group rate might be. I would voluntarily do this because I think I have superior habits. Those who choose not to may or may not have habits just as good – but if the non-self-selecting pool has worse loss experience, the base rates for them will go up faster than the rates for the positive pool.
It’s basically the opposite of adverse selection – it’s incentivizing good risks to segment themselves, and then you can price-adjust the remainder.
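To put toy numbers on that mechanism (everything here is invented, not actual insurer data):

```python
# Toy illustration of self-selection segmentation (all numbers invented).
monitored_share, monitored_cost = 0.6, 400    # good risks who accept the toothbrush
residual_share, residual_cost = 0.4, 650      # everyone else

# Flat group rate before the program: the pool-wide average expected cost.
group_rate = monitored_share * monitored_cost + residual_share * residual_cost  # 500

# A 20% discount for sharing data prices the monitored group
# at exactly its own expected cost ...
monitored_premium = group_rate * 0.80         # 400

# ... so the residual pool must now cover its own, higher, average cost.
residual_premium = residual_cost              # 650

print(f"before: everyone pays {group_rate:.0f}")
print(f"after:  monitored pay {monitored_premium:.0f}, the rest pay {residual_premium}")
```

Nobody’s rate was “raised” as such – the good risks simply priced themselves out of the old average, and the flat rate quietly became the bad-risk rate.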
Scott P. 01.31.25 at 5:20 pm
The predecessors to various forms of insurance were burial clubs, as existed e.g. in ancient Rome. You paid a small monthly or yearly sum to the club, and then when you died they would pay for your funeral, ensure an appropriate number of mourners were present, and make sure you got a good burial.
marcel proust 01.31.25 at 8:44 pm
Way off topic, but I am (very pleasantly) stunned to see dsquared gracing these pages once again./1
1/I get his (semi-weekly? biweekly? triweekly?)/2 newsletter via (a different) email but for old times’/this old timer’s sake, it is great to see this name here. Much like seeing Belle’s (which is almost as rare).
2/Strange: triweekly means “three times a week” while biweekly means “every two weeks”!
Moz of Yarramulla 01.31.25 at 9:22 pm
if you’re very rich, you can do a lot to protect yourself from pandemics, and probably also to some extent to climate change
At the risk of ignoring the primary topic, I wonder about the accuracy of personal risk models. Ideas like the above are extremely common, and rely on a great deal of risk sharing that’s not obvious to the people making those comments.
As we see with the new US cabal defunding the NIH, “the rich can avoid pandemics” assumes that the rich person in question is both able and willing to support a medical research industry involving many thousands of people and billions of dollars every year, just on the off chance that a global pandemic happens in their lifetime (and that said pandemic can be addressed by researchers focussed on the problem “stop our one rich dude and their support pyramid getting sick”). Covid vaccines came from basic biomedical research, rapidly tweaked for Covid; mRNA technology wasn’t invented from scratch at the time.
Quite a lot of thinking has been done on the theme of “controlling the guards in my post-apocalypse bunker”, but it’s more generally applicable to rich people planning for social collapse. What pyramid of peons do they depend upon, exactly? Can you really be a tech billionaire without advanced semiconductors and the billion-odd people directly involved in producing them?
(“Climate change and bird extinctions in the Amazon” is worth reading if you’re thinking about buying your way out of the consequences of the climate catastrophe)
Lisa 02.02.25 at 10:39 am
Thanks for the further thoughts and comments and reading tips! A few points in response.
– Yes, state enforcement will often be the only stable equilibrium. But there will probably also be many forms of insurance where there is a possible market equilibrium in which, say, 60% of the population are sufficiently low risk to be included, and the rest are not. And then the question is whether this is normatively desirable (or even permissible). A lot depends on non-discrimination rules, obviously.
– The point about long-term policies is interesting, especially in the face of climate change. And you’d expect people also to consider the effects of policies on their children’s lives. I wasn’t so sure about the point about big data allowing for quicker updating of rules – this would hold in commercial systems, for sure, but given how slow democratic legislation tends to be, I’m not quite sure how it would work for public insurance.
– About the ambiguity between moral principles and enlightened egoism: that’s exactly the point that I find interesting about insurance, and that Friedman makes in far more detail: this ambiguity can be helpful in politics because it can help build broader coalitions than you might otherwise get. But the point about state enforcement and non-discrimination stands, of course.
– Big data being used via different kinds of contracts, where you get lower prices (or co-payments) if you share data: that’s exactly what I meant, I was a bit brief for reasons of space. It leads to more actuarial fairness, but also to segmentation – and the question is whether that’s the right direction to go.
– Burial clubs: yes, and then you have the guilds and other forms of associations. Political philosophy / theory pays far too little attention to all the ways in which people lived in such associational structures throughout the centuries…
– can the rich really avoid pandemics: good points, I was a bit polemical there. But as a rich person, you don’t have to fund the research for the new vaccine. You can free ride on those countries that still fund it, bribe a few health officials there (or buy a passport), and that’s it! I agree with the overall picture that you draw, of course – I’ve learned a lot from Gar Alperovitz’s work on how technological development is all about the division of labor and we’re all benefitting from the work of others. But is that how billionaires see the world? Given meritocratic myths, probably not!
John Q 02.04.25 at 5:02 am
How important is Big Data (which I take to mean inferences drawn from multiple sources in large data sets) compared to rules about how much use can be made of ‘small’ pieces of data that have always been a possible basis for discrimination, justified or otherwise: age, gender, race, residential area, health status (determined by a checkup), driving record and so on?
Companies can use most of these (except for overt use of race) in car insurance in Australia, but much less so in health insurance.
Matt 02.04.25 at 5:21 am
Companies can use most of these (except for overt use of race) in car insurance in Australia, but much less so in health insurance.
In the US, quite a lot of companies, often by means of corporate (or, in my experience, university) “wellness” programs, aim to monitor and collect information about employee health and habits that are shared with insurance companies. Sometimes taking part in these (including the data sharing, sometimes involving taking various biometric tests) is incentivized but not required. In other cases, it’s essentially been required, often by means of passing on higher insurance premiums to people who don’t take part or provide the information. The information is not supposed to be used in relation to setting premiums for individuals, but I’m not very optimistic about how it is used. I set out an argument against much of this sort of thing, especially but not only when required, in this paper if anyone is interested.
Lisa Herzog 02.04.25 at 7:45 am
One of the problems of using big data is that it does reveal (at least in terms of likelihood) protected characteristics. But my sense is that big data will allow far better predictions by combining different sources of data. Maybe I’m wrong here, and the quality of the data and the degree of predictability will in fact be rather bad – e.g. because there is too much noise in the data, or because they are polluted (e.g. different people using the same credit card), or because some events are really too random for big data to be useful for describing risk groups. From the perspective of building or maintaining solidaristic insurance, I’d welcome that. But given the incentives for companies to find out, and given their ability to pay for the best data scientists, I don’t think one should rely on that…
Trader Joe 02.05.25 at 12:21 pm
Health insurance is a completely different thing from auto/home insurance. Three variables that are readily available – credit score, age, and address – explain about 60% of loss variation (add in a couple of other demographic factors and you get to about 80%).
The fact is credit score (for all its faults) was the original “big data”, and it remains the case that there is no single data point that is more explanatory for auto and home coverage – unfortunately, it’s not nearly as powerful for health. It’s partly why health coverage is almost always ‘group’ coverage while P&C is almost always individual.
For health coverage – there is no such variable right now. Age obviously is predictive, but beyond that there is really nothing that helps. The losses a health insurer is most concerned about are things like cancer, Alzheimer’s and chronic conditions. None of these are routinely explainable even with genetic data (a few cancers excepted), let alone something demographic.
Smoking was added as a risk variable many years ago; back when a high proportion of people smoked it was explanatory, and it still is for the relatively small proportion of people who still do. Obesity is explanatory for cardiovascular illness (though not for as high a percentage as you’d think), but far less so for cancers and not at all for memory illness.
The fact that scientists label practically everything as “cancer-causing” tells us more about the prevalence of cancer than it does about what causes it; accordingly, no single variable ends up having more predictive power than any other, and few of them are meaningful.
Everyone knows a person who smoked and drank and lived to be 90, and another person who only ate lettuce and ran marathons and dropped dead at 60. Health insurers don’t care about these people – they care about those who, say, contract cancer at 50 and then battle it until they are 80… those are the expensive patients, and right now there is no way to predict who they might be.