The Information


Authors: James Gleick



The earth is not a closed system, and life feeds upon energy and negative entropy leaking into the earth system.… The cycle reads: first, creation of unstable equilibriums (fuels, food, waterfalls, etc.); then use of these reserves by all living creatures.

 
 

Living creatures confound the usual computation of entropy. More generally, so does information. “Take an issue of The New York Times, the book on cybernetics, and an equal weight of scrap paper,” suggested Brillouin. “Do they have the same entropy?” If you are feeding the furnace, yes. But not if you are a reader. There is entropy in the arrangement of the ink spots.

For that matter, physicists themselves go around transforming negative entropy into information, said Brillouin. From observations and measurements, the physicist derives scientific laws; with these laws, people create machines never seen in nature, with the most improbable structures. He wrote this in 1950, as he was leaving Harvard to join the IBM Corporation in Poughkeepsie.

That was not the end for Maxwell’s demon—far from it. The problem could not truly be solved, the demon effectively banished, without a deeper understanding of a realm far removed from thermodynamics: mechanical computing. Later, Peter Landsberg wrote its obituary this way: “Maxwell’s demon died at the age of 62 (when a paper by Leó Szilárd appeared), but it continues to haunt the castles of physics as a restless and lovable poltergeist.”

10 | LIFE’S OWN CODE
 
(The Organism Is Written in the Egg)
 

What lies at the heart of every living thing is not a fire, not warm breath, not a “spark of life.” It is information, words, instructions. If you want a metaphor, don’t think of fires and sparks and breath. Think, instead, of a billion discrete, digital characters carved in tablets of crystal.

—Richard Dawkins (1986)

 

SCIENTISTS LOVE THEIR FUNDAMENTAL PARTICLES. If traits are handed down from one generation to the next, these traits must take some primal form or have some carrier. Hence the putative particle of protoplasm. “The biologist must be allowed as much scientific use of the imagination as the physicist,” The Popular Science Monthly explained in 1875. “If the one must have his atoms and molecules, the other must have his physiological units, his plastic molecules, his ‘plasticules.’ ”

Plasticule did not catch on, and almost everyone had the wrong idea about heredity anyway. So in 1910 a Danish botanist, Wilhelm Johannsen, self-consciously invented the word gene. He was at pains to correct the common mythology and thought a word might help. The myth was this: that “personal qualities” are transmitted from parent to progeny. This is “the most naïve and oldest conception of heredity,” Johannsen said in a speech to the American Society of Naturalists. It was understandable. If father and daughter are fat, people might be tempted to think that his fatness caused hers, or that he passed it on to her. But that is wrong. As Johannsen declared, “The personal qualities of any individual organism do not at all cause the qualities of its offspring; but the qualities of both ancestor and descendent are in quite the same manner determined by the nature of the ‘sexual substances’—i.e., the gametes—from which they have developed.” What is inherited is more abstract, more in the nature of potentiality.

To banish the fallacious thinking, he proposed a new terminology, beginning with gene: “nothing but a very applicable little word, easily combined with others.”

It hardly mattered that neither he nor anyone else knew what a gene actually was; “it may be useful as an expression for the ‘unit-factors,’ ‘elements,’ or ‘allelomorphs.’… As to the nature of the ‘genes’ it is as yet of no value to propose a hypothesis.” Gregor Mendel’s years of research with green and yellow peas showed that such a thing must exist. Colors and other traits vary depending on many factors, such as temperature and soil content, but something is preserved whole; it does not blend or diffuse; it must be quantized.

Mendel had discovered the gene, though he did not name it. For him it was more an algebraic convenience than a physical entity.

When Schrödinger contemplated the gene, he faced a problem. How could such a “tiny speck of material” contain the entire complex code-script that determines the elaborate development of the organism? To resolve the difficulty Schrödinger summoned an example not from wave mechanics or theoretical physics but from telegraphy: Morse code. He noted that two signs, dot and dash, could be combined in well-ordered groups to generate all human language. Genes, too, he suggested, must employ a code: “The miniature code should precisely correspond with a highly complicated and specified plan of development and should somehow contain the means to put it into action.”

Codes, instructions, signals—all this language, redolent of machinery and engineering, pressed in on biologists like Norman French invading medieval English. In the 1940s the jargon had a precious, artificial feeling, but that soon passed. The new molecular biology began to examine information storage and information transfer. Biologists could count in terms of “bits.” Some of the physicists now turning to biology saw information as exactly the concept needed to discuss and measure biological qualities for which tools had not been available: complexity and order, organization and specificity.

Henry Quastler, an early radiologist from Vienna, then at the University of Illinois, was applying information theory to both biology and psychology; he estimated that an amino acid has the information content of a written word and a protein molecule the information content of a paragraph. His colleague Sidney Dancoff suggested to him in 1950 that a chromosomal thread is “a linear coded tape of information”:

The entire thread constitutes a “message.” This message can be broken down into sub-units which may be called “paragraphs,” “words,” etc. The smallest message unit is perhaps some flip-flop which can make a yes-no decision.

 
 

In 1952 Quastler organized a symposium on information theory in biology, with no purpose but to deploy these new ideas—entropy, noise, messaging, differentiating—in areas from cell structure and enzyme catalysis to large-scale “biosystems.” One researcher constructed an estimate of the number of bits represented by a single bacterium: as much as 10^13. (But that was the number needed to describe its entire molecular structure in three dimensions—perhaps there was a more economical description.) The growth of the bacterium could be analyzed as a reduction in the entropy of its part of the universe. Quastler himself wanted to take the measure of higher organisms in terms of information content: not in terms of atoms (“this would be extremely wasteful”) but in terms of “hypothetical instructions to build an organism.”

This brought him, of course, to genes.

The whole set of instructions—situated “somewhere in the chromosomes”—is the genome. This is a “catalogue,” he said, containing, if not all, then at least “a substantial fraction of all information about an adult organism.” He emphasized, though, how little was known about genes. Were they discrete physical entities, or did they overlap? Were they “independent sources of information” or did they affect one another? How many were there? Multiplying all these unknowns, he arrived at a result:

that the essential complexity of a single cell and of a whole man are both not more than 10^12 nor less than 10^5 bits; this is an extremely coarse estimate, but is better than no estimate at all.

 
 

These crude efforts led to nothing, directly. Shannon’s information theory could not be grafted onto biology whole. It hardly mattered. A seismic shift was already under way: from thinking about energy to thinking about information.

Across the Atlantic, an odd little letter arrived at the offices of the journal Nature in London in the spring of 1953, with a list of signatories from Paris, Zurich, Cambridge, and Geneva, most notably Boris Ephrussi, France’s first professor of genetics.

The scientists complained of “what seems to us a rather chaotic growth in technical vocabulary.” In particular, they had seen genetic recombination in bacteria described as “transformation,” “induction,” “transduction,” and even “infection.” They proposed to simplify matters:

As a solution to this confusing situation, we would like to suggest the use of the term “interbacterial information” to replace those above. It does not imply necessarily the transfer of material substances, and recognizes the possible future importance of cybernetics at the bacterial level.

 
 

This was the product of a wine-flushed lakeside lunch at Locarno, Switzerland—meant as a joke, but entirely plausible to the editors of Nature, who published it forthwith.

The youngest of the lunchers and signers was a twenty-five-year-old American named James Watson.

The very next issue of Nature carried another letter from Watson, along with his collaborator, Francis Crick. It made them famous. They had found the gene.

A consensus had emerged that whatever genes were, however they functioned, they would probably be proteins: giant organic molecules made of long chains of amino acids. Alternatively, a few geneticists in the 1940s focused instead on simple viruses—phages. Then again, experiments on heredity in bacteria had persuaded a few researchers, Watson and Crick among them, that genes might lie in a different substance, which, for no known reason, was found within the nucleus of every cell, plant and animal, phages included.

This substance was a nucleic acid, particularly deoxyribonucleic acid, or DNA. The people working with nucleic acids, mainly chemists, had not been able to learn much about it, except that the molecules were built up from smaller units, called nucleotides. Watson and Crick thought this must be the secret, and they raced to figure out its structure at the Cavendish Laboratory in Cambridge. They could not see these molecules; they could only seek clues in the shadows cast by X-ray diffraction. But they knew a great deal about the subunits. Each nucleotide contained a “base,” and there were just four different bases, designated as A, C, G, and T. They came in strictly predictable proportions. They must be the letters of the code. The rest was trial and error, fired by imagination.

What they discovered became an icon: the double helix, heralded on magazine covers, emulated in sculpture. DNA is formed of two long sequences of bases, like ciphers coded in a four-letter alphabet, each sequence complementary to the other, coiled together. Unzipped, each strand may serve as a template for replication. (Was it Schrödinger’s “aperiodic crystal”? In terms of physical structure, X-ray diffraction showed DNA to be entirely regular. The aperiodicity lies at the abstract level of language—the sequence of “letters.”) In the local pub, Crick, ebullient, announced to anyone who would listen that they had discovered “the secret of life”; in their one-page note in Nature they were more circumspect. They ended with a remark that has been called “one of the most coy statements in the literature of science”:

It has not escaped our notice that the specific pairing we have postulated immediately suggests a possible copying mechanism for the genetic material.
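
The copying mechanism hinted at in that sentence can be illustrated with a minimal Python sketch. It assumes the standard pairing rule (A with T, C with G), which the passage itself does not spell out, and it ignores the fact that the two real strands run in opposite directions. The names `PAIR`, `complement`, and `replicate` are hypothetical, chosen only for this illustration.

```python
# Standard base pairing, assumed here; the passage only says the
# two sequences are "complementary."
PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand):
    """Return the partner strand, base by base."""
    return "".join(PAIR[base] for base in strand)


def replicate(strand_a, strand_b):
    """Unzip a two-strand molecule and rebuild a partner for each half.

    Each old strand acts as a template, so the result is two
    molecules, each a (strand, partner) pair.
    """
    return (strand_a, complement(strand_a)), (complement(strand_b), strand_b)


original = "GATTACA"
partner = complement(original)
first_copy, second_copy = replicate(original, partner)
print(partner)
print(first_copy == second_copy == (original, partner))
```

Because each strand fully determines its partner, either half of the unzipped molecule is enough to rebuild the whole, which is the sense in which the pairing "suggests" a copying mechanism.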

 
 

They dispensed with the timidity in another paper a few weeks later. In each chain the sequence of bases appeared to be irregular—any sequence was possible, they observed. “It follows that in a long molecule many different permutations are possible.”

Many permutations—many possible messages. Their next remark set alarms sounding on both sides of the Atlantic: “It therefore seems likely that the precise sequence of the bases is the code which carries the genetical information.” In using these terms, code and information, they were no longer speaking figuratively.
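
To put rough numbers on "many permutations," here is a small Python sketch. It assumes, as the text says, that any sequence of the four bases is possible, and it further assumes that every sequence is equally likely, which is a simplification made only for the arithmetic. The function names are hypothetical.

```python
import math

BASES = "ACGT"  # the four-letter alphabet described in the text


def sequence_count(length):
    """Number of distinct base sequences of the given length."""
    return len(BASES) ** length


def capacity_bits(length):
    """Upper bound on information carried, in bits, if every
    sequence is equally likely (an assumption of this sketch)."""
    return length * math.log2(len(BASES))


for n in (1, 3, 10, 100):
    print(n, sequence_count(n), capacity_bits(n))
```

Under these assumptions each base carries at most two bits, so even a short stretch of a hundred bases allows 4^100 different messages, far more than could ever be listed.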

The macromolecules of organic life embody information in an intricate structure. A single hemoglobin molecule comprises four chains of polypeptides, two with 141 amino acids and two with 146, in strict linear sequence, bonded and folded together. Atoms of hydrogen, oxygen, carbon, and iron could mingle randomly for the lifetime of the universe and be no more likely to form hemoglobin than the proverbial chimpanzees to type the works of Shakespeare. Their genesis requires energy; they are built up from simpler, less patterned parts, and the law of entropy applies. For earthly life, the energy comes as photons from the sun. The information comes via evolution.

The DNA molecule was special: the information it bears is its only function. Having recognized this, microbiologists turned to the problem of deciphering the code. Crick, who had been inspired to leave physics for biology when he read Schrödinger’s What Is Life?, sent Schrödinger a copy of the paper but did not receive a reply.

On the other hand, George Gamow saw the Watson-Crick report when he was visiting the Radiation Laboratory at Berkeley. Gamow was a Ukrainian-born cosmologist—an originator of the Big Bang theory—and he knew a big idea when he saw one. He sent off a letter:
