The Numbers Behind NUMB3RS

CHAPTER 6
Predicting the Future

Bayesian Inference

MANHUNT

When a bus transporting prison inmates is involved in a road accident, two of the prisoners escape, killing the guard in the process. Charlie provides some help in unraveling the case by carrying out a detailed analysis of the crash scene, which enables him to reconstruct what must have happened. His conclusion: The crash was not an accident, it was staged. The escape was planned.

This is the story NUMB3RS viewers watched in the first-season episode called “Manhunt,” broadcast on May 13, 2005.

Charlie's fictional mathematical reconstruction of the accident is based on the way accident investigators work in real life. But figuring out how the crash occurred is not the end of Charlie's involvement in this particular case. After one of the escapees is captured, attention focuses on finding the other, the man who planned the escape. The recaptured prisoner, a model prisoner who had almost completed his sentence, turns out to have had no prior knowledge of the escape plot. But he is able to tell Don about his companion, a convicted killer serving a life sentence with no possibility of parole—and hence a highly dangerous individual with little to lose from killing again. The most chilling thing the recaptured prisoner tells Don is that the killer intends to murder the key witness at his trial, a woman whose testimony had helped convict him.

Don tries to persuade the witness to leave town and go into hiding until the killer is caught, but she refuses. She is a hospital doctor with patients she feels she cannot walk away from. This places Don in a race against the clock to track down the escapee before he can make good his deadly intention.

Media coverage of the escape, including police photographs of the escaped killer, soon leads to reports of sightings from members of the public. Unfortunately, the reports flood in, several hundred in all, and they are scattered across Los Angeles, many of them claiming simultaneous sightings at locations miles apart. While some of the reports may be hoaxes, most are probably from well-meaning citizens who genuinely believe they have spotted the man whose photograph they had seen in the newspaper or on TV. But how can Don decide which sightings are accurate—or even which ones are most likely to be correct?

This is where Charlie makes his second contribution to the case. He says he has carried out a “Bayesian statistical analysis” of the sightings, which tells him which particular sightings are most likely reliable. Using Charlie's results, Don is able to determine where the killer probably is, and manages to get to him just in time to prevent him from killing the witness.

As is often the case with dramatic portrayals of mathematics or science at work, the length of time available to Charlie to produce his ranking of the reported sightings is significantly shortened, but the idea of using the mathematically based technique known as Bayesian analysis is sound. At the end of this chapter, we'll explain how Charlie most likely performed his analysis. (Viewers do not see him carrying out this step, and the script offers no details.) First, though, we need to describe in more general terms the hugely important techniques of Bayesian statistics.

PREDICTING THE FUTURE

Law enforcement would be much easier if we could look into the future and know about crimes before they actually occur.* Even with the help of mathematics, however, this is not possible. Mathematics can predict with as much accuracy as you wish the position of a spacecraft traveling at thousands of miles an hour at noon Greenwich mean time six months from now, but most of us find it hard to predict with any accuracy where we will be at noon even a week from now. Human behavior simply is not amenable to mathematical prediction. At least, not if you want the mathematics to give an exact answer. If, however, you are willing to settle for numerical estimates on things likely to happen, then mathematics can be of real use.

For instance, no one apart from the handful of Al Qaeda operatives who carried out the September 11, 2001, attacks knew in advance what was going to take place. But things might have turned out very differently if the U.S. authorities had known that such an attack was likely, what the most probable targets were, and which actions to take to prevent the terrorists from carrying out their plan. Could mathematics help provide such advance warning of things that might occur, perhaps with some kind of numerical measure of their likelihood?

The answer is, not only is this possible, it actually happened. A year before the attack took place, mathematics had predicted that the Pentagon was a likely terrorist target. On that occasion, no one took the mathematical prediction sufficiently seriously to do something about it. Of course, it's always easier to be smart after the event. What mathematics can do—and did—is (as we explain below) furnish a list of likely targets, together with estimates of the probabilities that an attack will take place. Policymakers still have to decide which of the many threats identified should be singled out for expenditure of the limited resources available. Still, given how events unfolded on that fateful day in 2001, perhaps next time things will turn out differently.

HOW MATHEMATICS PREDICTED THE 9/11 ATTACK ON THE PENTAGON

In May 2001, a software system called Site Profiler was fielded to all U.S. military installations around the world. The software provided site commanders with tools to help to assess terrorist risks, to manage those risks, and to develop standardized antiterrorism plans. The system worked by combining different data sources to draw inferences about the risk of terrorism, using a mathematical technique called Bayesian inference.

Prior to the system's deployment, its developers carried out a number of simulation tests, which they referred to in a paper they wrote the year before.* Summarizing the results of the tests, they noted: “While these scenarios showed that the RIN [Risk Influence Network] ‘worked,’ they tended to be exceptional (e.g., attacks against the Pentagon).”

As the world now knows, the Pentagon was the site of an attack. Unfortunately, neither the military command nor the U.S. government had taken seriously Site Profiler's prediction that the Pentagon was in danger—nor, for that matter, had the system developers themselves, who viewed the prediction as “exceptional.”

As experience has taught us time and time again, human beings are good at assessing certain kinds of risks—broadly speaking, personal risks associated with familiar situations—but notoriously bad at assessing others, particularly risks of novel kinds of events. Mathematics does not have such a weak spot. The mathematical rules the developers built into Site Profiler did not have an innate “incredulity factor.” Site Profiler simply ground through the numbers, assigning numerical risks to various events, and reported the ones that the math said were most probable. When the numbers said the Pentagon was at risk, that's what the program reported. Humans were the ones who discounted the prediction as too far-fetched.

This story tells us two things. First, that mathematics provides a powerful tool for assessing terrorist risks. Second, that humans need to think very carefully before discounting the results that the math produces, no matter how wild they might seem.

This is the story behind that math.

SITE PROFILER

Site Profiler was licensed by the U.S. Department of Defense in 1999 for use in developing an enterprise-wide antiterrorism risk management system called the Joint Vulnerability Assessment Tool (JVAT).

The JVAT program was started in response to the bombing of U.S. Air Force servicemen in Khobar Towers, Saudi Arabia, in June 1996, in which nineteen American servicemen and one Saudi were killed and 372 of many nationalities wounded, and the August 1998 bombings of United States embassies in the East African capital cities of Dar es Salaam, Tanzania, and Nairobi, Kenya, where a total of 257 people were killed and more than 4,000 wounded.

The investigations into those events revealed that the United States had inadequate methods for assessing terrorist risks and anticipating future terrorist events. Addressing that need was a major challenge. Since the intentions, methods, and capabilities of potential terrorists, and often even their identities, can almost never be forecast with certainty from the intelligence information available, much of the effort in countering the threat has to focus on identifying likely targets. Understanding the vulnerabilities of a potential target and knowing how to guard against attacks typically requires input from a variety of experts: physical security experts, engineers, scientists, and military planners. Although a limited number of experts may be able to understand and manage one or two given risks, no human can manage all of the components of hundreds of risks simultaneously. The solution is to use mathematical methods implemented on computers.

Site Profiler is just one of many systems that allow users to estimate—with some degree of precision—and manage a large risk portfolio by using Bayesian inference (implemented in the form of a Bayesian network, which we describe below) to combine evidence from different data sources: analytic models, simulations, historical data, and user judgments.

Typically, the user of such a system (often an expert assessment team) enters information about, say, a military installation's assets through a question-and-answer interface reminiscent of a tax preparation package. (Site Profiler actually modeled its interface on Turbo Tax.) The software uses the information it has gathered to construct mathematical objects to represent the installation's various assets and threats, to express the entire situation as a Bayesian network, to use the network to evaluate the various risks, and finally to output a list of threats, each one given a numerical rank based on its likelihood, the severity of its consequences, and so forth. Our interest here is in the mathematics that sits “under the hood” of such a system.

The key idea behind all this goes back to an eighteenth-century English clergyman, Thomas Bayes.

THOMAS BAYES AND THE PROBABILITIES OF WHAT WE KNOW

In addition to being a Presbyterian minister, Thomas Bayes (1702–1761) was a keen amateur mathematician. He was fascinated by how we come to know the things we know, specifically how we judge the reliability of information we acquire, and he wondered whether mathematics could be used to make such judgments more precise and accurate. His method of calculating how our beliefs about probabilities should be modified whenever we get new information (new data) led to the development of Bayesian statistics, an approach to the theory and practice of statistical analysis that has long attracted passionate adherents, as well as staunch critics. With the advent in the late twentieth century of immensely powerful computers that can crunch millions of pieces of data per second, both Bayesian statisticians (who always use his fundamental idea) and non-Bayesian statisticians (who sometimes use it) owe him a great debt.

BAYES' METHOD

Bayes' idea concerns probabilities about things that may or may not be true—that the probability of heads in a coin flip is between .49 and .51; that Brand Y cures headaches more frequently than Brand X; that a terrorist or criminal will attack target J or K or L. If we want to compare two possibilities, say, A and B, Bayes gives the following recipe:

  1. Estimate their relative probabilities P(A)/P(B)—the odds of A versus B.
  2. For each observation of new information, X, calculate the likelihood of that observation if A is true and if B is true.
  3. Re-estimate the relative probabilities of A and B as follows: P(A given X) / P(B given X) = P(A)/P(B) × Likelihood Ratio, where the Likelihood Ratio is the likelihood of observing X if A is true divided by the likelihood of observing X if B is true.
  4. Repeat the process whenever new information is observed.
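
The four steps above can be sketched in a few lines of Python. This is only an illustration, not anything taken from Site Profiler or from the show: the function names, the two coin hypotheses, and the sequence of flips are all made up for the example, and the sketch assumes that successive observations are independent of one another once we know which of A or B is true.

```python
from fractions import Fraction

def update_odds(prior_odds, likelihood_if_a, likelihood_if_b):
    """Step 3: multiply the current odds of A versus B by the likelihood ratio."""
    likelihood_ratio = likelihood_if_a / likelihood_if_b
    return prior_odds * likelihood_ratio

def odds_to_probability(odds):
    """Convert odds of A versus B into the probability of A (when A or B must hold)."""
    return odds / (1 + odds)

# Hypothetical setup (an assumption for illustration only):
#   A = the coin comes up heads 60% of the time, B = the coin is fair.
P_HEADS_IF_A = Fraction(3, 5)
P_HEADS_IF_B = Fraction(1, 2)

# Step 1: prior odds of A versus B. Here we assume even odds.
odds = Fraction(1, 1)

# Steps 2 to 4: fold in each new observation as it arrives.
flips = ["H", "H", "T", "H"]  # made-up data
for flip in flips:
    if flip == "H":
        odds = update_odds(odds, P_HEADS_IF_A, P_HEADS_IF_B)
    else:
        odds = update_odds(odds, 1 - P_HEADS_IF_A, 1 - P_HEADS_IF_B)

print("posterior odds of A versus B:", odds)
print("posterior probability of A:", float(odds_to_probability(odds)))
```

Exact fractions are used rather than floating-point numbers so that the odds stay readable as ratios, which is how the recipe is stated.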

The odds of A versus B in step one are called “prior odds,” meaning that they represent our state of knowledge prior to observing the data X. Often this knowledge is based on subjective judgments—say, what are the odds that a new drug is better than the standard drug for some illness, or what are the odds that terrorists will attack one target versus another, or perhaps even what are the odds that a criminal defendant is guilty before any evidence is presented? (The arbitrariness of putting a number on the last example is one reason that the use of Bayesian statistics in criminal trials is essentially zero!)

To understand Bayes' recipe, it is helpful to consider an example where these “prior odds” are actually known. When that situation occurs, the use of Bayesian methods is noncontroversial.

THE (FICTITIOUS) CASE OF THE HIT-AND-RUN ACCIDENT

A certain town has two taxi companies, Blue Cabs and Black Cabs. Blue Cabs has 15 taxis, Black Cabs has 75. Late one night, there is a hit-and-run accident involving a taxi. The town's 90 taxis were all on the streets at the time of the accident. A witness sees the accident and claims that a blue taxi was involved. At the request of the police, the witness undergoes a vision test under conditions similar to those on the night in question. Presented repeatedly with a blue taxi and a black taxi, in random order, he shows he can successfully identify the color of the taxi 4 times out of 5. (The remaining one fifth of the time, he misidentifies a blue taxi as black or a black taxi as blue.) If you were investigating the case, which company would you think is more likely to have been involved in the accident?
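
One way to work the numbers is sketched below in Python. It assumes, as the setup suggests, that each of the 90 taxis was equally likely to be the one involved and that the witness is equally accurate for both colors. The variable names are ours. The sketch does the calculation twice: once by counting how many taxis of each color the witness would be expected to call "blue," and once using prior odds and a likelihood ratio.

```python
from fractions import Fraction

blue_cabs, black_cabs = 15, 75
accuracy = Fraction(4, 5)  # witness names the right color 4 times out of 5

# Counting approach: imagine the witness viewing every taxi in town.
blue_called_blue = blue_cabs * accuracy           # blue taxis correctly called blue
black_called_blue = black_cabs * (1 - accuracy)   # black taxis wrongly called blue

# Of all the taxis he would call "blue," what fraction really are blue?
prob_blue_given_report = blue_called_blue / (blue_called_blue + black_called_blue)

# Odds approach: prior odds times likelihood ratio.
prior_odds = Fraction(blue_cabs, black_cabs)      # blue versus black, before the testimony
likelihood_ratio = accuracy / (1 - accuracy)      # "says blue" if blue vs. "says blue" if black
posterior_odds = prior_odds * likelihood_ratio

print("taxis he would call blue:", blue_called_blue, "blue and", black_called_blue, "black")
print("probability the taxi was blue:", prob_blue_given_report)   # prints 4/9
print("posterior odds, blue versus black:", posterior_odds)       # prints 4/5
```

Under these assumptions both routes agree: even after the witness says "blue," the odds are 4 to 5, so a black taxi remains slightly more likely, because black taxis outnumber blue ones five to one.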
