> A few years later, the gap between the premiums charged for car insurance to men and women had grown much wider.
This doesn't actually make any sense.
Based on a quick Google (the link doesn't work), the difference is in average premiums, but men are no longer paying more simply for being men.
The point of the policy isn't to avoid high cost drivers (skewed male) paying more on average, but rather to avoid stereotyping drivers on the basis of sex.
When it turns out that insurance companies are able to use other non-sex based factors to assess cost accurately, that isn't a loss for this policy, but is a win, since a man is now just as eligible for lower premiums as a woman, whereas before it was literally impossible.
EDIT:
Also
> And it turned out that the previous gender discrimination policy had been nothing like discriminatory enough; women were much safer drivers, and hadn’t previously been getting anything like enough credit for it
The question isn't who is a safer driver, it is who is a cheaper driver. It isn't a question of whether women were getting enough credit for being safer, but for being cheaper. Women drive less than men on average, for example, so even at the same dollar cost of accidents per mile, they should pay less. You're looking at the outcome of many variables (total expected cost to the insurance company) and attributing it to just a single variable (women cause less accident damage).
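A minimal sketch of the mileage point, with made-up numbers: the mileages and the per-mile cost below are assumptions chosen only for illustration, and `expected_annual_cost_pence` is a hypothetical helper, not anything an insurer actually uses.

```python
def expected_annual_cost_pence(miles_per_year: int, claim_cost_pence_per_mile: int) -> int:
    """Expected yearly claims cost for one driver, in pence."""
    return miles_per_year * claim_cost_pence_per_mile


# Assumption: both drivers cause exactly the same accident cost per mile.
COST_PENCE_PER_MILE = 5

# Assumption: illustrative annual mileages, not real averages.
low_mileage_cost = expected_annual_cost_pence(6_000, COST_PENCE_PER_MILE)
high_mileage_cost = expected_annual_cost_pence(10_000, COST_PENCE_PER_MILE)

# Same safety per mile, different expected cost to the insurer.
print(low_mileage_cost, high_mileage_cost)
```

Under these assumptions the two drivers are equally "safe" per mile, yet the one who drives less is cheaper to insure, which is the distinction being drawn above.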
Spot on. Fairness is a causal problem. I am getting increasingly convinced that the first step in any data analysis should be obtaining a causal model, either by causal discovery or subject matter knowledge.
Sure it makes sense. If your original model just had a term for “is male” which you then had to delete and replace with other terms that are correlated with “is male”, it’s possible that these new terms are predictive of higher risk individuals that were previously masked when pooled with all other males (for example, “is felon”).
The second link ("had grown much wider") also links to the Sheila's Wheels car ad, which seems likely to be an error.
Your point about big features in datasets being big (even for genetics!) is something I have firsthand experience with. My PhD was about predicting the effects of cancer mutations on drug efficacy. I built two models, one using physics and one using machine learning. They both ended up showing the same thing: that a mutation which changed the charge of an amino acid was more likely to show a measurable effect. Not that this was the only important kind of mutation, just that the effect sizes were small in other cases. This, not quite incidentally, is something I could have told you at the start of my PhD, because this is a rather obvious fact of biochemistry. My work did improve the outcomes of a clinical trial, but the question to ask is whether we actually learned anything new, or just came up with a better justification for following instinct.
Coming up with better justifications for following instinct is a good idea, though. Sometimes our instincts are wrong, and it's very meaningful if that's what the data end up showing. It's better to know we're doing the right thing than to hope we are but wonder if our intuitions are right or not.
Re your footnote: I was once working for a moderately successful indie band who had twice supported the Rolling Stones. The drummer was despairingly trying to renew the insurance on his 15-year-old Peugeot 106 (which was driven most of the time by his wife, because you tend to be driven on tour). His broker gave the same hypothetical to explain the ludicrous prices.
He was particularly upset to find out he was just a rounding error in the potential cost of a crash involving him and his new best mates Mick and Keith.
There seem to be two tiers of people allowed to price insurance: the ones who tot up the price of Mick and Keith to give a huge number, and the ones who adjust for the likelihood of Jonny Rockstar actually getting into a shitty motor. The second set actually manage to sell the insurance.
Sadly, most pit musicians don’t have Strads.
I watched a guy get a lower quote for car insurance when he changed his answer to 'Occupation?' from 'computer programmer' to 'systems analyst'.
Yes indeed. Martin Lewis (of Money Saving Expert) recommends UK drivers try variations of their job title, as the impact on premiums can be significant.
Great point on lawyers! As an old lawyer, I've noticed it--errrr--with myself.
This is relevant to the observation about large features in data sets (or rather, large effects in real life): "The piranha problem: Large effects swimming in a small pond" https://arxiv.org/abs/2105.13445
In the US car insurers are allowed to price based on sex and in many states women pay more. So not sure about this…
«Insurance is odd stuff. I was reminded this morning of a strange event in European motor insurance, which arose out of a decision in 2011 that it was illegal, under the Gender Directive, to discriminate in pricing between men and women. This came as a bit of a shock to the industry; it was considered, at the time, to be a very well established fact that women drove more safely than men»
«it’s possible that these new terms are predictive of higher risk individuals that were previously masked when pooled with all other males»
I had also wondered about all these arguments because they are absurd, until I read an article about the insurance industry that contained the "tl;dr": it is nowadays common "insurance" industry practice to aim for each *account* to be profitable in the long term. If you understand the implications that is enough.
A bit longer discussion:
* The essence of insurance is to have a risk pool, and for the insurer to make a book so members of the pool that do not have a loss pay for the members who do have a loss (and in the meantime the insurer invests the premiums and gets a profit).
* Charging different premiums for men and women, or for other categories with different risks, means not just market segmentation but also *risk pool* segmentation, that is, splitting a large risk pool into many.
* Risk pool segmentation is absurd for several reasons *from an insurance point of view*: it means higher volatility, it does not mean higher prices as in product market segmentation, and it leads to moral hazard:
** If it is possible to segment a large risk pool into two pools, one lower and one higher probability, a premium of say £100 per year charged to everybody becomes two different premiums of £80 and £120, but the average premium paid across the two pools will still be £100.
** The moral hazard is that people who know they are high probability but are misclassified as low probability will be given a lower price and will be more likely to buy the policy, and people who know they are low probability but are misclassified as high probability will find the higher price too expensive and will not buy the policy; thus the "low probability" pool will be selected for higher probability members and vice-versa.
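The arithmetic in the two sub-points above can be sketched roughly as follows. Everything beyond the £100, £80 and £120 figures is an assumption made for illustration: the pools are taken to be equal in size, premiums are set equal to expected loss with no loading, and the 10% misclassification rate is invented.

```python
from fractions import Fraction


def average_per_member(groups):
    """Weighted average over (member_count, amount_per_member) pairs."""
    total_members = sum(count for count, _ in groups)
    total_amount = sum(count * amount for count, amount in groups)
    return Fraction(total_amount, total_members)


# One pool: everybody pays 100 per year.
single_pool = [(1000, 100)]

# Assumption: the split produces two equal-sized pools.
split_pools = [(500, 80), (500, 120)]

# Average premium paid is unchanged by the split.
avg_single = average_per_member(single_pool)
avg_split = average_per_member(split_pools)

# Assumption: 50 of the 500 "low probability" buyers are really high
# probability (expected loss 120) and know it, so they are keen to buy at 80.
low_pool_expected_losses = [(450, 80), (50, 120)]
low_pool_cost = average_per_member(low_pool_expected_losses)

print(avg_single, avg_split, low_pool_cost)
```

With these invented numbers the "low probability" pool is priced at £80 per member but costs more than that per member to cover, which is the selection effect described in the second sub-point.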
The only case where there might be an advantage to the insurer is when the insurer only wants to serve the low probability pool, in pursuit of lower assumed volatility and pricing advantage (at the cost of lower volume) over those insurers that have a single pool; but because of price competition from other insurers pursuing the same choice, the acceptable level of probability will fall ever lower and the size of the low probability pool will shrink ever smaller.
All these arguments are far less important because today's "insurers" as a rule do not actually sell insurance. Because they aim for every account to be profitable over the long period, they actually sell contingent loan facilities: the "premiums" are contingent loan facility fees until an insured loss happens and a loan to pay for the loss is made, and then become loan repayments after the insured loss (this is particularly obvious in markets in which insurance is mandatory in law or practice).
That means that "insurance" customers are not made part of a risk pool, and are assessed as to *creditworthiness* rather than probability of loss (which has a significant influence on creditworthiness).
There are several interesting consequences to selling contingent loan facilities instead of insurance...
«common "insurance" industry practice to aim for each *account* to be profitable in the long term»
To be sure: "common" does not mean "always", in some markets it is still possible to buy actual insurance (membership of a risk pool).
«"insurance" customers are not made part of a risk pool, and are assessed as to *creditworthiness* rather than probability of loss»
Somewhat similar to future mortgagees of a housing credit union/savings and loans/building society.
«There are several interesting consequences to selling contingent loan facilities instead of insurance...»
Also I have often wondered about the causes of this switch and my current speculations are:
* "market segmentation" is the second main message of business schools (the first main message is "lower wages") and I suspect that many "insurance" industry people (in particular salesmen) instinctively wanted to apply it.
* "financialization" is popular and "financializing" what used to be "insurance" into a credit product goes along with the trend.
* Buffett style: since USA stock and property gains are all but guaranteed to be high and reliable by the USA Treasury and Fed, "insurers" with a large stream of contingent fees can make fortunes by investing in stocks (and property), so low volatility and short term cash generation matter a lot.
There are other plausible causes; it would be interesting if it were possible to figure out which mattered more.
Analog integrated circuit designer here. Another take on the “obvious” is a win - and it sounds like you’ve developed 2. That’s valuable.
“This is what makes me a big sceptic about futurists talking about genetics in health insurance, telemetrics in motor or AI in everything – if something’s big enough to be relevant to pricing, it’s generally really big and easy to find).”
Why? Are you saying the information will be too good and not allowed to be used?
Laws can change what insurers say they look at. Doesn’t change where the real risks actually are.