I’m going to use a particle physics problem as an analogy to describe a public health issue. This may seem like an odd approach, but the rationale is this: they are both essentially the same intellectually demanding problem, but (even for a particle physicist) the physics one is less emotionally charged.
A lot of my time in physics has been spent measuring hadronic jets. These are the rather messy sprays of particles which happen when quarks and gluons collide. I wrote about them here, for example, but for present purposes, all you need to know about them is that when we try to measure the energy of a given jet, the result differs from the true value by an unpredictable, random amount. If you have thousands of jets to measure, you might be able to get the average energy correct, but the energy of any individual jet will be measured wrongly by some amount.
Generally, we can make sure the average is correct through a process called calibration. But we cannot remove the random spread.
Another kind of measurement which has some kind of random spread is medical testing or screening for some congenital condition or disease. The randomness could be because the test itself is not completely accurate, or because whatever it measures is not 100% correlated to the condition being tested for. Maybe it is a genetic factor which means an enhanced risk, but with no certainty as to whether the condition will actually occur.
The impact of the errors in the jet measurement is that the number of jets measured to have their energy in a particular range (say above 200 GeV, just to be definite) will depend not only on the number of jets with a true energy in this range, but also on the number of jets with a true energy below 200 GeV, because for some of them, their energy was randomly measured to be too high. One problem is that typically there are many more low-energy jets than high-energy jets. So if the random spread is too large, you can end up with the weird situation that most of the jets appearing in the range above 200 GeV actually had real energies below 200 GeV. Even though you have a good detector which on average measures the right energy!
Back to the medical screening test. Say you have a test for whether you are about to develop Alzheimer's disease, which is 87% accurate for a given individual. But say most people in the sample you are testing don't have Alzheimer's, just as most jets have lower energies. You can still end up with the situation that most of the people who test positive for the disease actually won't develop it.
Similar situations apply in other tests, as nicely discussed in this Times Higher Education review of Gerd Gigerenzer's book "Risk Savvy" using the example of breast cancer*.
To go through some numbers as an example, say you have detected 1100 jets in total. 1000 of them have true energies below 200 GeV (call them low-energy jets), 100 of them above (high-energy jets). Your detector gets the right answer 87% of the time. So of the 1000 low-energy jets, 870 of them are correctly measured as being low-energy jets, and 130 are wrongly measured as high energy. Of the 100 high-energy jets, 87 have their energies correctly measured as being high, and 13 show up as low-energy jets.
Of the jets measured as high-energy, how many really have high energy? Well, we have 217 jets measured as high-energy, made up of the 130 wrongly-measured low-energy jets and the 87 correctly-measured high-energy jets. So even though our detector is calibrated, and gets the right answer 87% of the time, fewer than half (87/217 = 0.40) of the jets we measure as high-energy really are high-energy.
Not easy. Read it back again, substituting "Alzheimer's" for high-energy and "No Alzheimer's" for low-energy**.
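For anyone who would rather check the arithmetic than take it on trust, here is a minimal Python sketch of the same calculation. The function and variable names are my own inventions for illustration, and it assumes, as the example above does, that the same 87% accuracy applies to both the low-energy and the high-energy group.

```python
def fraction_truly_high(n_low, n_high, accuracy):
    """Of everything measured as 'high', what fraction really is high?

    Assumes one accuracy figure applies to both groups, as in the
    worked example (a simplification, not a real detector model).
    """
    false_high = n_low * (1 - accuracy)  # low-energy jets measured as high
    true_high = n_high * accuracy        # high-energy jets measured as high
    return true_high / (true_high + false_high)


# The made-up numbers from the text: 1000 low-energy and 100 high-energy jets.
purity = fraction_truly_high(n_low=1000, n_high=100, accuracy=0.87)
print(f"{purity:.2f}")  # prints 0.40
```

Swap in different values for `n_low` and `n_high` and you can see how strongly the answer depends on how rare the high-energy jets (or the people with the condition) are in the sample.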
The result of this from the patient's viewpoint can be that the test is useless, perhaps worse than that. You test negative, fine - very small chance that you have the condition. You test positive... actually, still a small chance you have the condition, but now you definitely have a lot of stress, worry and possibly even risky treatment, depending on how well you and your doctor understand statistics.
As David Colquhoun says here, this whole analysis is far from new and not at all controversial, but aspects of it are counter-intuitive and to be honest it is always worth going over again, as some reports of the recent Alzheimer's test made clear. And the parallels with a common issue in my own field struck me quite strongly. Hence this article.
Actually there is more. Various statistical analysis techniques, often based on conditional probabilities and/or Bayes' theorem, allow us to make use of the information on jets despite the random errors. So we do manage to obtain a good measure of the true energy distribution of jets. This doesn't improve the situation for an individual jet - it still has the same chance of being measured with the wrong energy as it ever did - but we learn new physics by exploiting what the detector can tell us, while quantifying the uncertainties due to what it can't.
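To give a flavour of how the population-level correction can work, here is a toy Python sketch using the made-up numbers from the example above. It is not one of the Bayesian techniques used in real analyses: it simply inverts the two-by-two table of "measured versus true" probabilities, which is about the crudest approach possible. The function name is hypothetical, and it assumes the 87% accuracy is known exactly and is the same for both groups.

```python
def unfold_two_bins(measured_low, measured_high, accuracy):
    """Estimate true (low, high) counts from measured counts.

    Toy model, assumed for illustration only:
        measured_low  = a * true_low + b * true_high
        measured_high = b * true_low + a * true_high
    with a = accuracy and b = 1 - accuracy. We solve these two
    equations for the true counts.
    """
    a = accuracy
    b = 1 - accuracy
    det = a * a - b * b
    if det == 0:
        # accuracy of exactly 50% means the measurement carries no information
        raise ValueError("cannot unfold when accuracy is 0.5")
    true_low = (a * measured_low - b * measured_high) / det
    true_high = (a * measured_high - b * measured_low) / det
    return true_low, true_high


# Measured counts from the example: 870 + 13 = 883 low, 130 + 87 = 217 high.
true_low, true_high = unfold_two_bins(883, 217, accuracy=0.87)
print(round(true_low), round(true_high))  # prints 1000 100
```

The individual jets are no better measured than before, but the totals come back out as 1000 and 100. Real measurements have statistical fluctuations and many more energy ranges, which is where the more careful methods earn their keep.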
In the same way, this may mean that in some circumstances, doing such screening tests even on overwhelmingly healthy populations may be useful. The point was actually made by some of the scientists involved in the Alzheimer's test, in some of the better media reports. Knowing the risk factors and eventual illnesses of an entire population could allow early low-risk interventions (such as changes of lifestyle) to be recommended and tested. Done right, this might even halt the "causes/cures cancer/heart disease" merry-go-round most foods and drinks seem to endure. While it is unlikely to help the actual population under study, it could make their children healthier, happier and longer-lived.
And there's the problem of course. I don't want to know if I have an increased-but-still-low risk of developing an incurable disease, frankly. But I wouldn't mind medical researchers knowing, if it would help them reduce the chances of the next generation getting it. And if they wouldn't tell me. And if I could trust someone to store the data securely and not suddenly sell it off to insurance companies. And if the whole thing didn't cost huge amounts of money that could be spent more effectively elsewhere.
Still, worth thinking about.
* I haven't read the book itself yet, I confess, but it is on my list.
** The 1000/100 ratio is made up for ease of calculation; the real ratio will depend on the population you test, though it looks like in the Alzheimer's case these numbers aren't far off. See the NHS summary here.
Jon Butterworth’s book, Smashing Physics, is out now!
A bunch of interesting events where you might be able to hear him talk about it etc are listed here. Also, Twitter.