(Content warning: cancer. No graphic description, but frank discussion of mortality rates.)
Today, we’ll be talking about a study by Shrager et al. on a statistical anomaly in cancer diagnoses. At Dr. Shrager’s thoracic surgery practice, they noticed a 2-fold increase of pulmonary lobectomies for lung cancer for 65 year olds as opposed to 64 year olds. Why such a stark increase over such a short time? Dr. Shrager had a theory: the increase was driven by eligibility for Medicare, which begins at age 65. Besides being exactly as depressing as it sounds, it’s a stark case study of the costs that come when we get clumsy about systemization.
It’s worth being precise about which system we mean, though. Diagnosing is technically a system: a tremendously complicated human being goes in, a yes/no flag comes out. But as reductionist as diagnosis is, it’s not exactly something you can avoid doing, and at least medicine is a domain that takes correspondence to the natural world seriously. There are all sorts of valid concerns about doing diagnosis right, but those concerns are being treated as problems by people with the power to fix them.
Receiving a diagnosis, though, is another matter entirely. It doesn’t matter how carefully you ensure that your tests have the appropriate accuracy if you never run the tests at all. And cancer, like many medical conditions, is tremendously sensitive to the timeliness of diagnosis. If you have healthcare access that’s cheap and straightforward to use, you might get tested on a regular cadence and be highly likely to catch it early. But many insurance plans technically count as “covering” a patient while still being expensive and requiring a lot of personal bureaucratic labor to actually utilize. This makes it even more complicated to evaluate the impact of how we’ve systemized access to diagnoses; it’s not a simple yes/no but a complicated dance of convenience, cost, and perceived risk.
Not all insurances are created equal - but the coverages of over-65 year olds in the United States largely are. Fewer than 1% of Americans over 65 are uninsured thanks to Medicare. The exact statistics of Medicare vs. other forms of insurance were not available to this study, but that lack of data makes the conclusion all the more striking. After all, you would expect to see a larger effect in a study of strictly uninsured people who went on to receive Medicare than in a study of everyone with pre-mortem cancer. (As this study was looking into access to diagnoses, the study excluded patients whose cancer was only found post-mortem.) But this study didn’t consider insurance information at all: it simply stratified patients by age and let the story about insurance pop out of its own accord.
And pop out it did! The primary metric they used was “age over age” (AoA), an adaptation of the financial “year over year”. It’s the difference between the diagnosis rate at one age and the rate at the previous age, divided by the rate at the previous age:
[(% of cancer diagnoses at age n) - (% of cancer diagnoses at age n-1)] / (% of cancer diagnoses at age n-1)
But you can tune out the details of the math: just know that the bigger the percentage, the more that diagnoses increase as patients get one year older. For example, Stage I lung cancer has a 4% AoA increase at age 64 - a slight increase over the diagnosis rates for 63 year olds. Stage I breast cancer and colon cancer actually go down, with an AoA of -2% each, and prostate cancer holds steady at a clean 0.
But the figures are dramatically different when the jump is from 64 to 65. Stage I lung cancer has a 23% AoA increase. Breast cancer, which was at -2% a mere year earlier, now goes to 10%, while colon cancer has a colossal 33% increase and prostate cancer goes from steady to 15% AoA. The fact that this is a catchup of “overdue” diagnoses is plainly evident in how the AoA figures crater on the jump from 65 to 66: -1%, -6%, -11%, -6%.
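For readers who do want the math, here is a minimal Python sketch of the age-over-age calculation. The function name `age_over_age` and the `example_shares` values are illustrative assumptions, not data from the study. The `reported_aoa` table holds only the percentages quoted above, and it assumes the four figures for age 66 are listed in the same order as the cancers before them (lung, breast, colon, prostate).

```python
def age_over_age(share_by_age):
    """Return {age: AoA}, where AoA = (share[n] - share[n-1]) / share[n-1]."""
    aoa = {}
    for age, share in share_by_age.items():
        prev = share_by_age.get(age - 1)
        if prev:  # skip ages with no previous year (or a zero baseline)
            aoa[age] = (share - prev) / prev
    return aoa


# Hypothetical shares of diagnoses by age, made up purely for illustration.
example_shares = {63: 2.00, 64: 2.08, 65: 2.56, 66: 2.53}
example_aoa = age_over_age(example_shares)

# AoA percentages quoted in the text for ages 64, 65, and 66.
# Assumption: the age-66 figures follow the order lung, breast, colon, prostate.
reported_aoa = {
    "lung": {64: 4, 65: 23, 66: -1},
    "breast": {64: -2, 65: 10, 66: -6},
    "colon": {64: -2, 65: 33, 66: -11},
    "prostate": {64: 0, 65: 15, 66: -6},
}

# For each cancer, find the age with the largest AoA.
spike_ages = {
    cancer: max(by_age, key=by_age.get)
    for cancer, by_age in reported_aoa.items()
}

for cancer, age in spike_ages.items():
    print(f"{cancer}: largest AoA at age {age}")
```

Picking the maximum is a deliberately crude spike detector. It is enough here because, in the quoted figures, every one of the four cancers peaks at the same age.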
In other words, you can learn exactly which age people become eligible for Medicare if you just watch for the spike in diagnoses. (And to forestall any discussion of some sort of exactly-age-65 specific time bomb in the human body, Shrager et al. note that this pattern was decidedly not observed in a study of lung and breast cancer off of the Canadian Cancer Registry.) The system that decides who gets diagnoses in the United States is based on insurance, and insurance is broken, with obvious and disastrous consequences for the health of millions. Indeed, Shrager et al. cite a study showing that insured patients with cancer over the age of 65 are more likely to undergo surgical intervention and have better 5-year cancer-specific mortality rates than people who are younger but uninsured. Access is so critically important that it even trumps age!
Their paper ends with a pointed note that this study should be considered in the political context of Medicare-for-all. The American insurance system creates this gap in diagnoses that doesn’t correspond to cancer’s existence in the real world; universal coverage would go a long way toward closing it. It’s not an easy fix, but it is a simple one - access to healthcare makes your health better. But since we’re here to look at this from a desystemic perspective, let’s ask ourselves: what do we do with American cancer data before we have universal coverage? Because these biases aren’t just affecting the real human beings who live in the diagnosis gaps - those human beings are also being turned into data that flows upwards, into algorithms that drive decisions affecting even more human beings.
Let’s say that some well-meaning hospital executive reads the same study we did and thinks - wow, okay, we need to fix these diagnosis inequities. Let’s use machine learning to predict which patients are most likely to test positive for cancer and proactively reach out and get them tested. We’ll ignore the insurance gap entirely and look purely at the data! Well - we know what looking purely at the data gets us, don’t we? We just finished figuring out that it shows a massive spike in cancer diagnoses at age 65, and sniffing out massive spikes is what machine learning does best. As far as a predictive model trained on data from the United States is concerned, there really is an exactly-age-65 specific time bomb in your body that causes a spike in cancer diagnoses.
How do you control for this bias in your model? Well...you don’t, really. You could artificially weight the scores to some target AoA, but what’s the “right” target for AoA, anyway? The fundamental problem is that you want your model to guess who has undetected cancer, but the only data you can feed it with are patients with detected cancer. So any correspondence break between cancer in the general population and the patients that actually get diagnosed can’t help but feed that bias into that model, compounding the tragedy of the original problem. The impact of insurance inequity is a group of 64 year olds who have undiagnosed cancer because they’re waiting for Medicare. The impact of doing statistical analysis on data generated by inequity and then using it to drive decisions is another group of 64 year olds who have undiagnosed cancer because they’re 64 year olds.
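To make that mechanism concrete, here is a toy Python sketch with entirely made-up numbers. It assumes that true cancer prevalence rises smoothly with age while the chance of an existing cancer actually being diagnosed jumps at 65. The names `true_prevalence`, `detection_rate`, `naive_score`, and `undetected` are hypothetical, and the "model" is nothing more than the observed diagnosis rate per age, which is the signal any model trained on diagnoses would be chasing.

```python
MEDICARE_AGE = 65


def true_prevalence(age):
    # Assumption: real cancer risk creeps up smoothly, with no jump at 65.
    return 0.010 + 0.0005 * (age - 60)


def detection_rate(age):
    # Assumption: an existing cancer is less likely to be diagnosed
    # before Medicare eligibility than after it.
    return 0.6 if age < MEDICARE_AGE else 0.9


def naive_score(age):
    # What the training data shows: only detected cancer is ever labeled.
    return true_prevalence(age) * detection_rate(age)


def undetected(age):
    # What we actually care about: cancer that exists but was never found.
    return true_prevalence(age) * (1 - detection_rate(age))


for age in range(62, 68):
    print(
        age,
        f"true={true_prevalence(age):.4f}",
        f"score={naive_score(age):.4f}",
        f"undetected={undetected(age):.4f}",
    )

# Relative change from 64 to 65 in the truth vs. in what the data shows.
true_jump = true_prevalence(65) / true_prevalence(64) - 1
score_jump = naive_score(65) / naive_score(64) - 1
```

With these invented inputs, true prevalence rises by about 4% from 64 to 65, while the naive score jumps by about 56%. The 64 year olds get the lower score even though, in this toy world, they carry more undetected cancer than the 65 year olds. The sketch leaves out the catchup of overdue diagnoses at 65, which would make the jump in the observed data larger still.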
If the first post was the “what” of Desystemize, this is the “why”. Studies and journalistic exposés of broken systems are everywhere, making it clear that we need to look down. But on the heels of “we need to look down” is an equally crucial, all too often ignored echo: “And until we do, we can’t look up.” A broken system at the interface with the real world poisons analysis from the root and makes it inevitable that you’ll perpetuate existing inequity. We need to finish our vegetables before we eat our cake: when the vegetables are slow, human-scale correspondence work and the cake is flashy data science that gets more automatic every year, you can see where the resistance comes from.
This article will not start a revolution. Predictive models for cancer (and all sorts of other diseases - who knows how many other conditions have this same bias?) are being created at health systems all around the country as we speak. They will be turned on, and they will be used to drive clinical decisions, and they will have a significantly lower score for 64 year olds than 65 year olds because 65 is a much more common year to get diagnosed with cancer than 64. The message that we need to slow down is not a catchy one, because turning on the machine that neglects 64 year olds is a lot more fun than leaving it off. This state of affairs will continue until decision makers and everyone else are used to listening for broken correspondence, with an intuition for what analysis must be abandoned when it happens and the courage to replace something with nothing. If Desystemize can move that needle, however slightly, it will have been worth the effort.
This seems remarkably related to Nelson Goodman’s problem of “grue”. He defined something to be “grue” if it was observed before 2025 and turned out to be green, or if it wasn’t observed before 2025 and turned out to be blue. He notes that all observed emeralds have been green, but also that all observed emeralds have been grue. *We* obviously predict that future observed emeralds will still be green, but there’s a sense in which this post might be suggesting that a certain kind of improperly structured machine learning algorithm might predict they’ll be grue (because it doesn’t notice that the training data is based on the property of “detected cancer”, which is grue-some, rather than “cancer”, which is green-like).
> Predictive models for cancer (and all sorts of other diseases - who knows how many other conditions have this same bias?) are being created at health systems all around the country as we speak. They will be turned on, and they will be used to drive clinical decisions, and they will have a significantly lower score for 64 year olds than 65 year olds because 65 is a much more common year to get diagnosed with cancer than 64.
On the other hand, it seems like a health executive might know that it's beneficial to try to diagnose cancer earlier than they see it. I don't know how these predictive models work, but if there's a big spike at 65, couldn't they say, okay, let's test more at 64 to try to catch it earlier? That should help even with a mistaken theory of why the spike happens.
Though, maybe then they run into false positives because age isn't a very specific risk factor, and it's harder to get people to do the testing because of the access issues, so then they figure out what's going on.