Life's Greatest Secret (31 page)

Authors: Matthew Cobb

And something else was odd. As Ochoa’s group noted in March 1962: ‘A striking feature of the code triplets is that they all contain U.’
¹⁴A month later, Nirenberg’s lab made the same observation: ‘a surprisingly high proportion of U has been found in coding units thus far’.
¹⁵Attempts to get amino acid incorporation with polynucleotides that did not contain U failed repeatedly, leading some researchers to conclude that the twenty-seven possible triplets that did not contain U were ‘nonsense’ triplets, with no meaning. But when the composition of virus RNA was studied, U was not present in particularly high levels, implying that coding units not containing U must exist somewhere in nature, or that something strange was going on in the test tube. Confusion was increased when both Ochoa and Nirenberg’s labs reported that poly(U), which everyone agreed led to the incorporation of phenylalanine, also seemed to code for leucine and valine.
¹⁶Everyone assumed that this must be an experimental error (it was), but until that could be explained, there was a real danger: if a triplet coded for more than one amino acid, then all existing ideas about coding would have to be scrapped. The whole code edifice could come crashing down.

The cascade of new data, and the lack of clarity about what it all meant, encouraged the theoreticians to return to the coding problem. As Crick put it in 1966, during this period there was ‘a flurry of theoretical papers, most of which are best forgotten’.
¹⁷Forgettable they may have been, but they reveal the thinking of scientists at the time and highlight how they were groping their way towards the right answer. In September 1961, Richard Eck re-raised the possibility that the code might be overlapping, with the bases in one coding unit also forming part of the subsequent unit (so, for example, the sequence CACGU would contain three triplets – CAC, ACG, and CGU).
¹⁸* Brenner had disproved this idea to most people’s satisfaction in 1957; his criticism had been reinforced by the fact that mutational studies of viruses had shown that changes to a single base only ever altered a single amino acid – if the code were overlapping, then two or more amino acids should be altered.
¹⁹Nevertheless, as Crick later put it, it was possible with ingenuity to come up with a complex overlapping code.
²⁰Another theoretician took as his starting point the complementary coding groups that would be found on the two strands of the DNA molecule, for example TAC and ATG, and argued they must code for the same amino acid, to avoid the problem of the cell having to know which strand of DNA to use.
²¹This bold and mistaken vision underestimated the skill of the cell, which can indeed distinguish between the two strands.

The most sophisticated attempt to crack the code by theoretical means was made by Carl Woese (pronounced ‘Wose’). His starting point was that any code had to be compatible with the known nucleotide composition of RNA in a variety of organisms, which was known not to be rich in U. After a tortuous set of calculations, Woese produced a code that used only twenty-four potential triplets out of sixty-four, with most amino acids being coded by both a triplet that was low in G + C and one that was high in these two bases. Like all the other theoretical schemes, this one was ingenious, but wrong.
²²

Other less complicated theoretical codes were dreamt up. Richard Roberts found an easy solution to the problem of the U-rich data coming from Ochoa and Nirenberg’s labs – the Us were simply not relevant, he argued, because the code was really composed of only two bases. Although this fitted with the known RNA base composition of viruses, which was not U-rich, it had the disadvantage of providing only sixteen possible combinations, when at least twenty were needed to code for the naturally occurring amino acids. To get out of this dilemma, Roberts suggested that the code was composed of both doublets and triplets, with some doublets, such as AA or GG, indicating the start of a triplet. There was no evidence to support this, but as Roberts cunningly pointed out, there was no evidence against it, either.
²³At the beginning of 1963, Thomas Jukes put forward a variant of this idea, suggesting that each triplet had what he called a ‘pivotal’ base, which could change without altering the amino acid that was coded for. He announced that U was the ‘pivotal base’ in all triplets containing U (it is not).

In a return to the numerology that had dominated the coding problem in the 1950s, Eck published a four-page article in Science in which he claimed to detect a symmetrical pattern in the attribution of triplets to amino acids – four amino acids were coded by four triplets, with the remaining sixteen each being coded by two.
²⁴Eck said all he had to do was to tabulate the known distribution of triplets ‘and the puzzle practically solved itself’. But the solution was based entirely on conveniently fitting as-yet unallocated triplet/amino acid combinations into the schema. The pattern was in Eck’s head, not in the data.

Finally, the pioneer of the biological application of information theory, Henry Quastler, came up with a schema based on data from amino acid changes induced by mutations. He was unimpressed by the cell-free studies, arguing that they did not necessarily measure protein synthesis, and above all he emphasised that in most cases the precise nature of the polynucleotides was unknown.
²⁵Crick was scornful of Quastler’s paper, claiming that it consisted of ‘a rather poor fit to some very doubtful data’ and was based on ‘an unspecified technique’. All of the triplets predicted by Quastler were wrong, with the exception of UUU = phenylalanine, which was hardly a prediction.

The real answer to the conundrum of the predominance of U nucleotides in the cell-free data was inadvertently provided by Solomon Golomb.
²⁶He performed various calculations and concluded that it was not possible to deduce anything about the role of non-U sequences without doing an experiment. Which is what the biochemists did, and by the middle of 1962, RNA with no U bases had been shown to encode amino acids.
²⁷The bewildering preponderance of U-rich polynucleotides was an artefact due to the solvents that were initially employed in the cell-free system.
²⁸

With the smell of competition in everyone’s nostrils, two meetings took place in the summer of 1962 at which progress on cracking the code was discussed. In July, a ‘Colloquium on Information in Contemporary Science’ was held in the glorious surroundings of the thirteenth-century Royaumont Abbey to the north of Paris. As one of the participants recalled, ‘the gardens, the musical evenings, and supper by candlelight’ were almost as significant as the discussions.
²⁹This meeting involved philosophers, mathematicians, sociologists and biologists and was one of the last attempts to explore the usefulness of information theory across scientific disciplines.

In his brief introduction on ‘the concept of information in molecular biology’, André Lwoff of the Pasteur Institute set out his position quite trenchantly, unwittingly repeating the critique of information theory as applied to biology that had been made at similar meetings in the US a few years earlier. Lwoff argued that it was not useful to calculate the information contained in a DNA sequence, using either Shannon’s equations or Wiener’s negative entropy (following Brillouin, Lwoff called it negentropy), because such calculations did not deal with the meaning or function of that information in the organism. As he put it: ‘the calculation of negentropy using Shannon’s formulas cannot in any way be applied to an organism.’ It would be like trying to calculate the information content of a tragedy by Racine, he said. For a biologist, argued Lwoff, the only meaning of information was ‘a sequence of small molecules and the set of functions they carry out’.
³⁰Wiener and the philosophers who were present could not see what the problem was, thereby inadvertently illustrating the gulf between the information theoreticians and the biologists.

Similar mutual incomprehension was revealed in the other sessions, which were often fractious. The mathematician Benoît Mandelbrot suggested that such information-focused cross-disciplinary meetings were pointless:

The implications of the strict meaning of information have been sufficiently explored for its consequences to be quite clear. What remains is so difficult that it can usefully be discussed only in private … we must consider that its scientific usefulness has ceased, at least for the time being.

³¹

Alongside these rather sterile plenary discussions there were workshops in which experts in the various fields explored their topic in more detail. The workshop on information theory in biology included a session on the genetic code, chaired by Delbrück, with contributions from Crick, Nirenberg, Woese, Jacob and Ochoa’s student Peter Lengyel. During the discussion, Crick introduced the term ‘codon’ to describe the group of bases that codes for an amino acid – the word had been invented by Brenner, apparently partly as a spoof on the other ‘-on’ words that had been coined by Benzer and by Jacob and Monod.
³²It stuck, and is still in use today.

Much of the discussion at the workshop focused on the uncertainty of the results from the cell-free system: some participants questioned whether the polynucleotides truly contained the proportions of bases that Ochoa and Nirenberg’s groups assumed they did. Woese outlined his proposed code, framed in terms of the informational content of the different bases, and Jacob described protein synthesis in terms of a ‘theory of informational transfer and regulation’. Whatever the insights these presentations may have had, they were not published and left no trace on subsequent research. As most people agreed, the influence of information theory on molecular biology had passed its peak. Information had now become a vague but essential metaphor, rather than a precise theoretical construct.

This was reflected in the ‘Symposium on Informational Macromolecules’, which took place at Rutgers University in New Jersey, at the beginning of September 1962. Despite the title, there was very little direct exploration of the informational content of the macromolecules that were the subject of the meeting – DNA, RNA and proteins. When Ochoa opened the conference, he nodded in the direction of the new vocabulary, referring to ‘information coded into the DNA molecule’ that was ‘transferred to an RNA tape’, but his real position was made abundantly clear in his very first sentence: ‘This symposium deals essentially with the molecular mechanisms concerned with the genetic control and regulation of protein synthesis.’
³³The focus was biochemistry, not information.

The first of two sessions on the genetic code was chaired by the veteran geneticist Ed Tatum, who referred back to his discovery of the ‘one gene, one enzyme’ principle with George Beadle, twenty years earlier:

I think back to the time when we started our work, so many years ago. I think we would not have been able to anticipate that we would, in this relatively short time, be present at a symposium on informational macromolecules. This is something that most of you take for granted, but I can assure you – and I think I speak for Dr Beadle too – that this is really an extraordinary phenomenon in the development of molecular biology.

³⁴

Around 250 people attended the meeting, but only 13 were from outside the US; neither Crick nor Brenner, nor anyone from Watson’s group, attended, and only François Gros was there from the Pasteur Institute. The stars of the show were the new kings of the code: Nirenberg and Ochoa.

Dozens of speakers summarised data from a range of species (including bacteria, mice and algae) that indicated that the genetic code was universal; they outlined the growing conviction that only one of the two strands in the DNA double helix was used to make protein via RNA; and they described the recent discovery that non-U-containing polynucleotides could code for amino acids. But on the question of questions – the nature of the genetic code – there was no agreement. During the coffee breaks and at mealtimes, attendees argued about whether the genetic code had been cracked or not.
³⁵Nothing was certain, beyond the fact that UUU coded for phenylalanine, AAA for lysine and CCC for proline.

During the meeting, both Ochoa and Nirenberg toyed with Roberts’s combined doublet–triplet code. Nirenberg assumed that the code was based on triplets, but warned, ‘it is not possible at this time to distinguish between triplet and doublet codes’.
³⁶As he put it with disarming clarity: ‘Almost all amino acids tested can be coded by polynucleotides containing only two bases.’ Ochoa was even clearer – during the discussion of Nirenberg’s paper he said:

I must say I have been very impressed by Dick Roberts’ ingenious doublet code idea. … It almost looks as if that third base does not matter and, in this regard, I cannot help but think of the possible significance of Roberts’ proposal.

³⁷

In a summary of the meeting that appeared in a book collecting the talks, the organisers of the symposium suggested that the status of the genetic code at the time was something like that of the periodic table first published by Mendeleev – it was fragmentary and not all of its predictions were correct, but ‘nevertheless, a fundamental system had been discovered!’
³⁸

Francis Crick was frustrated by the mixture of unclear experimentation, loosely argued theory and guesswork that had begun to infest studies of the genetic code. In the summer of 1962, as scientists involved in the coding race were either recovering from Royaumont, preparing to go to Rutgers, or both, Crick wrote a long, highly critical, review article on the topic. In typical patrician style it was entitled ‘The recent excitement in the coding problem’. Crick summarised the work of the Ochoa and Nirenberg labs, praising their results as being ‘of very considerable interest’ before changing gear and pulling no punches:
