Bad Science (20 page)

Authors: Ben Goldacre


Asked about the evidence, the subjects confidently uncovered flaws in the methods of the research that went against their preexisting view, but downplayed the flaws in the research that supported their view. Half the proponents of capital punishment, for example, picked holes in the idea of state/state comparison data, on methodological grounds, because that was the data that went against their view, while they were happy with the before/after data; but the other half of the proponents of capital punishment rubbished the before/after data, because in their case they had been exposed to before/after data that challenged their view and state/state data that supported it.

Put simply, the subjects’ faith in research data was not predicated on an objective appraisal of the research methodology, but on whether the results validated their preexisting views. This phenomenon reaches its pinnacle in alternative therapists—or scaremongers—who unquestioningly champion anecdotal data, while meticulously examining every large, carefully conducted study on the same subject for any small chink that would permit them to dismiss it entirely.

This, once again, is why it is so important that we have clear strategies available to us to appraise evidence, regardless of its conclusions, and this is the major strength of science. In a systematic review of the scientific literature, investigators will sometimes mark the quality of the “methods” section of a study blindly—that is, without looking at the “results” section—so that it cannot bias their appraisal. Similarly, in medical research there is a hierarchy of evidence: a well-performed trial is more significant than survey data in most contexts, and so on.

So we can add to our list of new insights about the flaws in intuition:

 
  5. Our assessment of the quality of new evidence is biased by our previous beliefs.
 

Availability

 

We spend our lives spotting patterns and picking out the exceptional and interesting things. You don’t waste cognitive effort, every time you walk into your house, noticing and analyzing all the many features in the visually dense environment of your kitchen. You do notice the broken window and the missing television.

When information is made more “available,” as psychologists call it, it becomes disproportionately prominent. There are a number of ways this can happen, and you can pick up a picture of them from a few famous psychology experiments into the phenomenon.

In one, subjects were read a list of male and female names, in equal number, and then asked at the end whether there were more men or women in the list. When the men in the list had names like Ronald Reagan, but the women were unheard of, people tended to answer that there were more men than women, and vice versa.

Our attention is always drawn to the exceptional and the interesting, and if you have something to sell, it makes sense to guide people’s attention to the features you most want them to notice. When slot machines pay up, they make a theatrical “kerchunk-kerchunk” sound with every coin they spit out, so that everybody in the casino can hear it, but when you lose, they don’t draw attention to themselves. Lottery companies, similarly, do their absolute best to get their winners prominently into the media, but it goes without saying that you, as a lottery loser, have never had your outcome paraded for the TV cameras.

As we shall see, the tragic anecdotes about the MMR vaccine are disproportionately misleading, not just because the statistical context is missing, but because of their “high availability”: they are dramatic, associated with strong emotion, and amenable to strong visual imagery. They are concrete and memorable, rather than abstract. No matter what you do with statistics about risk or recovery, your numbers will always have inherently low psychological availability, unlike miracle cures, scare stories, and distressed parents.

It’s because of availability, and our vulnerability to drama, that people are more afraid of sharks at the beach, or of fairground rides on the pier, than they are of flying to Florida or driving to the coast. This phenomenon is even demonstrated in patterns of smoking cessation among doctors. You’d imagine, since they are rational actors, that all doctors would simultaneously have seen sense and stopped smoking once they’d read the studies showing the phenomenally compelling relationship between cigarettes and lung cancer. These are men of applied science, after all, who are able, every day, to translate cold statistics into meaningful information and beating human hearts.

But in fact, from the start, doctors working in specialties like chest medicine and oncology, where they witnessed patients dying of lung cancer with their own eyes, were proportionately more likely to give up cigarettes than their colleagues in other specialties. Being shielded from the emotional immediacy and drama of consequences matters.

Social Influences

 

Last in our whistle-stop tour of irrationality comes our most self-evident flaw. It feels almost too obvious to mention, but our values are socially reinforced by conformity and by the company we keep. We are selectively exposed to information that revalidates our beliefs, partly because we expose ourselves to situations in which those beliefs are apparently confirmed; partly because we ask questions that will, by their very nature and for the reasons described above, give validating answers; and partly because we selectively expose ourselves to people who validate our beliefs.

It’s easy to forget the phenomenal impact of conformity. You doubtless think of yourself as a fairly independent-minded person, and you know what you think. I would suggest that the same beliefs were held by the subjects of Solomon Asch’s experiments into social conformity. These subjects were placed near one end of a line of actors who presented themselves as fellow experimental subjects but were actually in cahoots with the experimenters. Cards were held up with one line marked on each of them, and then another card was held up with three lines of different lengths: six inches, eight inches, ten inches.

Everyone called out in turn which line on the second card was the same length as the line on the first. For six of the eighteen pairs of cards the accomplices gave the correct answer, but for the other twelve they called out the wrong answer. In all but a quarter of the cases, the experimental subjects went along with the incorrect answer from the crowd of accomplices on one or more occasions, defying the clear evidence of their own senses.

That’s an extreme example of conformity, but the phenomenon is all around us. Communal reinforcement is the process by which a claim becomes a strong belief, through repeated assertion by members of a community. The process is independent of whether the claim has been properly researched or is supported by empirical data significant enough to warrant belief by reasonable people.

Communal reinforcement goes a long way toward explaining how religious beliefs can be passed on in communities from generation to generation. It also explains how testimonials within communities of therapists, psychologists, celebrities, theologians, politicians, talk show hosts, and so on can supplant and become more powerful than scientific evidence.

When people learn no tools of judgment and merely follow their hopes, the seeds of political manipulation are sown.

—Stephen Jay Gould

 

There are many other well-researched areas of bias. We have a disproportionately high opinion of ourselves, which is nice. A large majority of the public think they are more fair-minded, less prejudiced, more intelligent, and more skilled at driving than the average person, when of course, only half of us can be better than the median. Most of us exhibit something called attributional bias: we believe our successes are due to our own internal faculties, and our failures are due to external factors; whereas for others, we believe their successes are due to luck, and their failures to their own flaws. We can’t all be right.

Last, we use context and expectation to bias our appreciation of a situation, because in fact, that’s the only way we can think. Artificial intelligence research has drawn a blank so far largely because of something called the frame problem: you can tell a computer how to process information, and give it all the information in the world, but as soon as you give it a real-world problem—a sentence to understand and respond to, for example—computers perform much worse than we might expect, because they don’t know what information is relevant to the problem. This is something humans are very good at—filtering irrelevant information—but that skill comes at a cost of ascribing disproportionate bias to some contextual data.

We tend to assume, for example, that positive characteristics cluster: people who are attractive must also be good; people who seem kind might also be intelligent and well informed. Even this has been demonstrated experimentally: identical essays in neat handwriting score higher than messy ones, and the behavior of sporting teams that wear black is rated as more aggressive and unfair than teams that wear white.

And no matter how hard you try, sometimes things just are very counterintuitive, especially in science. Imagine there are twenty-three people in a room. What is the chance that at least two of them celebrate their birthday on the same date? About one in two.
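
The birthday figure is easy to check for yourself. What follows is only a minimal sketch, and it rests on assumptions the text does not spell out: 365 equally likely birthdays, no leap years, and no twins in the room. The function name is invented for illustration. The trick is to work out the chance that everyone has a different birthday, then subtract that from one.

```python
def shared_birthday_probability(people, days=365):
    """Chance that at least two of `people` share a birthday.

    Assumes every day is equally likely and birthdays are independent,
    which is a simplification of real birth patterns.
    """
    p_all_different = 1.0
    for already_in_room in range(people):
        # The next person must avoid every birthday already taken.
        p_all_different *= (days - already_in_room) / days
    return 1.0 - p_all_different


p_22 = shared_birthday_probability(22)
p_23 = shared_birthday_probability(23)
print(f"22 people: {p_22:.3f}")
print(f"23 people: {p_23:.3f}")
```

Under these assumptions, twenty-three is the smallest group for which the chance creeps past one in two.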

When it comes to thinking about the world around you, you have a range of tools available. Intuitions are valuable for all kinds of things, especially in the social domain: deciding if your girlfriend is cheating on you, perhaps, or whether a business partner is trustworthy. But for mathematical issues, or assessing causal relationships, intuitions are often completely wrong, because they rely on shortcuts that have arisen as handy ways to solve complex cognitive problems rapidly, but at a cost of inaccuracies, misfires, and oversensitivity.

It’s not safe to let our intuitions and prejudices run unchecked and unexamined: it’s in our interest to challenge these flaws in intuitive reasoning wherever we can, and the methods of science and statistics grew up specifically in opposition to these flaws. Their thoughtful application is our best weapon against these pitfalls, and the challenge, perhaps, is to work out which tools to use where. Because trying to be “scientific” about your relationship with your partner is as stupid as following your intuitions about causality.

Now let’s see how journalists deal with stats.

11. Bad Stats

Now that you appreciate the value of statistics—the benefits and risks of intuition—we can look at how these numbers and calculations are repeatedly misused and misunderstood. Our first examples will come from the world of journalism, but the true horror is that journalists are not the only ones to make basic errors of reasoning.

Numbers, as we shall see, can ruin lives.

The Biggest Statistic

 

Newspapers like big numbers and eye-catching headlines. They need miracle cures and hidden scares, and small percentage shifts in risk will never be enough for them to sell readers to advertisers (because that is the business model). To this end they pick the single most melodramatic and misleading way of describing any statistical increase in risk, which is called the relative risk increase.

Let’s say the risk of having a heart attack in your fifties is 50 percent higher if you have high cholesterol. That sounds pretty bad. Let’s say the extra risk of having a heart attack if you have high cholesterol is only 2 percent. That sounds OK to me. But they’re the same (hypothetical) figures. Let’s try this. Out of a hundred men in their fifties with normal cholesterol, four will be expected to have a heart attack, whereas out of a hundred men with high cholesterol, six will be expected to have a heart attack. That’s two extra heart attacks per hundred. Those are called natural frequencies.

Natural frequencies are readily understandable, because instead of using probabilities, or percentages, or anything even slightly technical or difficult, they use concrete numbers, just like the ones you use every day to check if you’ve lost a kid on a bus trip or got the right change in a shop. Lots of people have argued that we evolved to reason and do math with concrete numbers like these, and not with probabilities, so we find them more intuitive. Simple numbers are simple.

The other methods of describing the increase have names too. From our example above, with high cholesterol, you could have a 50 percent increase in risk (the “relative risk increase”), or a 2 percent increase in risk (the “absolute risk increase”), or, let me ram it home, the easy one, the informative one, an extra two heart attacks for every hundred men, the natural frequency.
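
If you want to see that these really are three descriptions of one set of numbers, here is a short sketch using the hypothetical cholesterol figures above (four heart attacks per hundred men against six per hundred). The function and its field names are made up for this illustration; they are not taken from any statistics library.

```python
def describe_risk(baseline_cases, exposed_cases, group_size=100):
    """Return three descriptions of one and the same increase in risk."""
    baseline_risk = baseline_cases / group_size
    exposed_risk = exposed_cases / group_size
    absolute_increase = exposed_risk - baseline_risk
    relative_increase = absolute_increase / baseline_risk
    return {
        # The headline-friendly figure: increase as a share of the baseline.
        "relative_increase_percent": round(100 * relative_increase, 1),
        # The difference between the two risks, in percentage points.
        "absolute_increase_points": round(100 * absolute_increase, 1),
        # The natural frequency: extra cases per group of this size.
        "extra_cases": exposed_cases - baseline_cases,
        "group_size": group_size,
    }


# Hypothetical figures from the text: 4 in 100 versus 6 in 100.
summary = describe_risk(baseline_cases=4, exposed_cases=6)
print(f"Relative risk increase: {summary['relative_increase_percent']} percent")
print(f"Absolute risk increase: {summary['absolute_increase_points']} percentage points")
print(f"Natural frequency: {summary['extra_cases']} extra per {summary['group_size']} men")
```

Note that the relative figure cannot be interpreted without the baseline: the same 50 percent could describe two extra heart attacks per hundred or two extra per million.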

As well as being the most comprehensible option, natural frequencies contain more information than the journalists’ relative risk increase. Recently, for example, we were told that red meat causes bowel cancer, and ibuprofen increases the risk of heart attacks; but if you followed the news reports, you would be no wiser. Try this, on bowel cancer, from the Today program on Radio 4: “A bigger risk meaning what, Professor Bingham?” “A third higher risk.” “That sounds an awful lot, a third higher risk; what are we talking about in terms of numbers here?” “A difference…of around about twenty people per year.” “So it’s still a small number?” “Umm…per 10,000…”

These things are hard to communicate if you step outside the simplest format. Professor Sheila Bingham was the director of the MRC Centre for Nutritional Epidemiology in Cancer Prevention and Survival at the University of Cambridge and dealt with these numbers for a living, but in this (entirely forgivable) fumbling on a live radio show she was not alone; there are studies of doctors, and commissioning committees for local health authorities, and members of the legal profession that show that people who interpret and manage risk for a living often have huge difficulties expressing on the spot what they mean. They are also much more likely to make the right decisions when information about risk is presented as natural frequencies, rather than as probabilities or percentages.

For painkillers and heart attacks, another front-page story, the desperate urge to choose the biggest possible number led to the figures being completely inaccurate in many newspapers. The reports were based on a study that had observed participants over four years, and the results suggested, using natural frequencies, that you would expect one extra heart attack for every 1,005 people taking ibuprofen. Or as the Daily Mail, in an article titled “How Pills for Your Headache Could Kill,” reported: “British research revealed that patients taking ibuprofen to treat arthritis face a 24 percent increased risk of suffering a heart attack.” Feel the fear.

Almost everyone reported the relative risk increases: diclofenac increases the risk of heart attack by 55 percent; ibuprofen, by 24 percent. The Boston Globe was clever enough to report the natural frequencies: 1 extra heart attack in 1,005 people on ibuprofen. The U.K.’s Daily Mirror, meanwhile, tried and failed, reporting that 1 in 1,005 people on ibuprofen “will suffer heart failure over the following year.” No. It’s heart attack, not heart failure, and it’s 1 extra person in 1,005, on top of the heart attacks you’d get anyway. Several other papers repeated the same mistake.
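
The two ibuprofen figures can be put side by side with a little arithmetic. This is a rough sketch, not a reanalysis of the study: it assumes that the 24 percent and the 1 extra in 1,005 describe the same people over the same period, which the newspaper reports do not confirm, and the function name is invented for illustration. If the assumption holds, the two numbers together imply the baseline risk that the headlines left out.

```python
def implied_baseline_risk(extra_one_in, relative_increase_percent):
    """Back out the baseline risk from an absolute and a relative increase.

    Assumes both figures refer to the same population and time span.
    """
    absolute_increase = 1 / extra_one_in
    return absolute_increase / (relative_increase_percent / 100)


# Figures quoted in the text for ibuprofen.
baseline = implied_baseline_risk(extra_one_in=1005, relative_increase_percent=24)
baseline_one_in = 1 / baseline

print(f"Implied baseline: about 1 heart attack in {baseline_one_in:.0f} people")
print("Extra on ibuprofen: 1 in 1005 people")
```

On that reading, the frightening 24 percent is a 24 percent increase on a baseline of roughly 1 in 240.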

Often it’s the fault of the press releases, and academics can themselves be as guilty as the rest when it comes to overdramatizing their research. But if anyone in a position of power is reading this, here is the information I would like from a newspaper, to help me make decisions about my health, when reporting on a risk: I want to know whom you’re talking about (e.g., men in their fifties); I want to know what the baseline risk is (e.g., four men out of a hundred will have a heart attack over ten years); and I want to know what the increase in risk is, as a natural frequency (two extra men out of that hundred will have a heart attack over ten years). I also want to know exactly what’s causing that increase in risk: an occasional headache pill, or a daily tubful of pain-relieving medication for arthritis. Then I will consider reading your newspapers again, instead of blogs that are written by people who understand research and that link reliably back to the original academic paper, so that I can double-check their précis when I wish.

More than a hundred years ago, H. G. Wells said that statistical thinking would one day be as important as the ability to read and write in a modern technological society. I disagree; probabilistic reasoning is difficult for everyone, but everyone understands normal numbers. This is why natural frequencies are the only sensible way to communicate risk.

Choosing Your Figures

 

Sometimes the misrepresentation of figures goes so far beyond reality that you can only assume mendacity. Often these situations seem to involve morality: drugs, abortion, and the rest. With very careful selection of numbers, in what some might consider to be a cynical and immoral manipulation of the facts for personal gain, you can sometimes make figures say anything you want.

The U.K.’s Independent was in favor of legalizing cannabis for many years, but in March 2007 it decided to change its stance. One option would have been simply to explain this as a change of heart, or a reconsideration of the moral issues. Instead the change was decorated with science (as cowardly zealots have done from eugenics through to prohibition) and justified with a fictitious change in the facts. “Cannabis: An Apology” was the headline for its front-page splash: “In 1997, this newspaper launched a campaign to decriminalise the drug. If only we had known then what we can reveal today…Record numbers of teenagers are requiring drug treatment as a result of smoking skunk, the highly potent cannabis strain that is 25 times stronger than resin sold a decade ago.” Twice in this story we are told that cannabis is twenty-five times stronger than it was a decade ago. For the paper’s former editor Rosie Boycott, in her melodramatic recantation, skunk was “thirty times stronger.” In one inside feature the strength issue was briefly downgraded to a “can be.” The paper even referenced its figures: “The Forensic Science Service says that in the early Nineties cannabis would contain around 1 percent tetrahydrocannabinol (THC), the mind-altering compound, but can now have up to 25 percent.”

This is all sheer fantasy.

I’ve got the U.K.’s Forensic Science Service data right here in front of me, and the earlier data from the Laboratory of the Government Chemist, the United Nations Drug Control Program, and the European Monitoring Centre for Drugs and Drug Addiction. I’m going to share it with you, because I happen to think that people are very well able to make their own minds up about important social and moral issues when given the facts.

The data from the Laboratory of the Government Chemist goes from 1975 to 1989. Cannabis resin pootles around between 6 percent and 10 percent THC, herbal between 4 percent and 6 percent. There is no clear trend.

 

The Forensic Science Service data then takes over to produce the more modern figures, showing not much change in resin and domestically produced indoor herbal cannabis doubling in potency from 6 percent to around 12 or 14 percent (2003–05 data in table under references).

The rising trend of cannabis potency is gradual, fairly unspectacular, and driven largely by the increased availability of domestic, intensively grown indoor herbal cannabis.

“Twenty-five times stronger,” remember. Repeatedly, and on the front page.

If you were in the mood to quibble with The Independent’s moral and political reasoning, as well as its evident and shameless venality, you could argue that intensive indoor cultivation of a plant that grows perfectly well outdoors is the cannabis industry’s reaction to the product’s illegality itself. It is dangerous to import cannabis in large amounts. It is dangerous to be caught growing a field of it. So it makes more sense to grow it intensively indoors, using expensive real estate, but producing a more concentrated drug. More concentrated drug products are, after all, a natural consequence of illegality. You can’t buy coca leaves in South London, but you can buy crack.

 

 

There is, of course, exceptionally strong cannabis to be found in some parts of the British market today, but then there always has been. To get its scare figure, The Independent can only have compared the worst cannabis from the past with the best cannabis of today. It’s an absurd thing to do, and moreover, you could have cooked the books in exactly the same way thirty years ago if you’d wanted; the figures for individual samples are available, and in 1975 the weakest herbal cannabis analyzed was 0.2 percent THC, while in 1978 the strongest herbal cannabis was 12 percent. By these figures, in just three years herbal cannabis became “sixty times stronger.”
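
The book-cooking is plain division, and you can reproduce it in a few lines. This sketch uses only the THC percentages quoted in this chapter; the variable names are invented for illustration, and the "typical" comparison takes the 6 percent and the 12 to 14 percent figures for indoor herbal cannabis at face value.

```python
# Extreme individual samples quoted in the text (percent THC).
weakest_herbal_1975 = 0.2
strongest_herbal_1978 = 12.0

# Comparing the worst of the past with the best of the present.
cherry_picked_ratio = strongest_herbal_1978 / weakest_herbal_1975

# Typical indoor herbal figures quoted in the text (percent THC).
typical_herbal_before = 6.0
typical_herbal_after = (12.0, 14.0)
typical_ratios = tuple(
    after / typical_herbal_before for after in typical_herbal_after
)

print(f"Weakest 1975 sample vs strongest 1978 sample: {cherry_picked_ratio:.0f} times")
print(f"Typical vs typical: {typical_ratios[0]:.1f} to {typical_ratios[1]:.1f} times")
```

Comparing like with like gives a rise of a little over double; comparing one extreme with the other gives whatever multiple you fancy.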

And this scare isn’t even new. In the mid-1980s, during Ronald Reagan’s “war on drugs,” American campaigners were claiming that cannabis was fourteen times stronger than in 1970. Which sets you thinking. If it was fourteen times stronger in 1986 than in 1970, and it’s twenty-five times stronger today than at the beginning of the 1990s, does that mean it’s now 350 times stronger than in 1970?

That’s not even a crystal in a plant pot. It’s impossible. It would require more THC to be present in the plant than the total volume of space taken up by the plant itself. It would require matter to be condensed into superdense quark-gluon plasma cannabis. For God’s sake don’t tell the newspapers such a thing is possible.
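
To see why 350 times is impossible, chain the two claims and ask what THC content they imply. This is a sketch under a loudly stated assumption: no figure for 1970 appears in this chapter, so the 4 to 6 percent range for herbal cannabis from the 1975 to 1989 data stands in for it. The names are made up for illustration.

```python
# Multipliers claimed by campaigners and newspapers, as quoted in the text.
claimed_1970_to_1986 = 14
claimed_early_1990s_to_today = 25

# Chaining them, as the text does, treats 1986 and the early 1990s as equal.
compound_multiplier = claimed_1970_to_1986 * claimed_early_1990s_to_today

# Nothing can be more than 100 percent THC, so this is the highest
# starting strength for which the compound claim is even arithmetically possible.
highest_possible_start = 100 / compound_multiplier

# ASSUMPTION: 4 to 6 percent herbal THC (1975 to 1989 data) as a stand-in for 1970.
assumed_1970_range = (4, 6)
implied_today = tuple(start * compound_multiplier for start in assumed_1970_range)

print(f"Compound claim: {compound_multiplier} times stronger")
print(f"Highest possible starting strength: {highest_possible_start:.2f} percent THC")
print(f"Implied strength today: {implied_today[0]} to {implied_today[1]} percent THC")
```

Any starting strength above about 0.29 percent THC pushes the result past 100 percent, which is more THC than plant.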
