Life's Greatest Secret

Authors: Matthew Cobb

The non-universality of the genetic code and the existence of introns were both completely unexpected, and went against all the assumptions of all the researchers who had been studying the genetic code. These discoveries showed that, strictly speaking, Monod was wrong – what is true for Escherichia coli is not necessarily true for an elephant in all respects. Nevertheless, the basic positions established during the cracking of the genetic code remain intact. The strict universality of the code and the linear organisation of genes were not laws, or even requirements. The only requirement is that any divergence from these assumptions can be explained within the framework of evolution, and through testable hypotheses about the history of organisms. This has been amply met for both the non-universality of the code and the existence of introns.
Although the genetic code is not strictly universal, this has not altered our view of the fundamental processes of evolution at all. There is no dispute that life as we know it evolved only once, and that we all descend from a population of cells that lived more than 3.5 billion years ago, known as the Last Universal Common Ancestor, or LUCA.
29
Because all organisms use amino acids with a left-handed orientation and RNA is universally used as a way of stringing amino acids together to make a protein, scientists are convinced that this hypothesis is true. In 2010, Douglas Theobald calculated that the hypothesis that all life is related ‘is 10^2,860 times more probable than the closest competing hypothesis.’
30
The variations in the code that have been discovered are in fact quite minor and can be explained either in terms of the deep evolutionary history of eukaryotes – thereby revealing the thrilling fact that our evolution has hinged on the chance fusion of two cells to create the eukaryotes – or in something recent and local in the life-history of a particular group of organisms such as the ciliates. Similarly, although eukaryotic genes are profoundly different from those of prokaryotes, because they are ‘split’, they still work according to the same principles. All that has happened is that the cellular machinery for taking the information in genomic DNA and turning it into protein has been revealed to be very complicated in a group of organisms that we are particularly interested in, because it includes ourselves. Our basic understanding of how the information in a DNA sequence becomes an amino acid sequence has not been altered; although things are far more complex than the code pioneers could have imagined, the basic framework they developed still stands. The simple models developed in the 1950s and 1960s were not universally correct, but they were a necessary step for the development of our current understanding. And they remain true for the oldest and most numerous organisms on our planet, the prokaryotes.
This final point highlights the power of the reductionist approach adopted by Crick, Delbrück, Monod and the others. They chose to use the simplest possible systems – bacteria and viruses – to understand fundamental processes. In so doing, they gambled that their findings would be applicable to all life. The models that they came up with were simple, elegant and susceptible to experimental testing. Had they been studying mammals and the tangled web of molecules and processes that lead from DNA to protein in these species, it is unlikely that much progress would have been made.
*
Over recent decades, the study of the genetic code has been transformed by one of the most significant technological changes that have taken place in biology – our ability to determine the sequence of DNA and RNA molecules. The breakthrough came with the work of Fred Sanger, who won the Nobel Prize in Chemistry twice, first in 1958 for determining the structure of insulin and other proteins, then in 1980 for sequencing nucleic acids (he shared the second prize with Wally Gilbert, who came up with a less widely used technique for sequencing DNA).
Sanger was not the first to sequence a nucleic acid – a small transfer RNA was sequenced in 1965, using techniques similar to those that had previously been used to sequence proteins.
31
But Sanger’s method made it possible to sequence up to 300 bases of a piece of DNA (in reality 200 bases was more often the limit), marking DNA chains of varying lengths with radioactive phosphorus-containing bases (A, C, G or T), and then visualising these fragments on an electrophoresis gel. Sanger obtained these DNA chains by carrying out four separate reactions to copy a DNA molecule. Each test-tube included four normal nucleotide bases (A, C, G and T), enzymes used to copy the DNA molecule, together with a radioactively labelled variant of one of the bases (hence the need for four reactions). As well as being radioactive, these special bases had been chemically modified so as to stop the chemical reaction when they were incorporated randomly into a new DNA chain. Because a typical extract contains so many identical copies of the DNA molecule and the radioactive base was incorporated at a random point in each new chain, the result was a large number of DNA molecules that were of different lengths and which were radioactive, and could therefore be detected on the gel. Each reaction (A, C, G and T) was then loaded side by side onto a gel and the electric current was turned on. Different lengths of DNA migrated at different speeds and so ended up at distinct points on the gel, enabling the sequence to be read by eye.
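The logic of reading a sequence from four chain-termination reactions can be sketched in a few lines of Python. This is only an illustrative toy, not Sanger's protocol: the function names, the number of copies and the stopping probability are all assumptions, and the model ignores the complementarity of the copied strand, the radioactive labelling and the physics of the gel.

```python
import random

def chain_termination_reads(sequence, base, copies=2000, stop_prob=0.2, seed=0):
    """Toy model of one of the four reactions.

    Each copy of the chain grows until a terminating variant of `base`
    happens to be incorporated; the length reached is recorded.
    All parameters are illustrative assumptions.
    """
    rng = random.Random(seed)
    lengths = set()
    for _ in range(copies):
        for position, nucleotide in enumerate(sequence, start=1):
            if nucleotide == base and rng.random() < stop_prob:
                lengths.add(position)  # this chain stops here
                break
    return lengths

def read_gel(sequence, **kwargs):
    """Run the four reactions and read the lanes from shortest to longest fragment."""
    lanes = {base: chain_termination_reads(sequence, base, **kwargs) for base in "ACGT"}
    length_to_base = {
        length: base for base, lengths in lanes.items() for length in lengths
    }
    # Shorter fragments migrate further, so sorting by length mimics reading the gel.
    return "".join(length_to_base[length] for length in sorted(length_to_base))

print(read_gel("GATTACAGGCTTAACG"))
```

Because each lane only reports fragment lengths that end in its own base, sorting all the observed lengths and noting which lane each one came from is enough to spell out the sequence.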
Sanger later described this technique, known variously as the chain termination method, dideoxy sequencing or, more simply, Sanger sequencing, as ‘the best idea I have ever had’.
32
The rest of the scientific community seems to agree – his 1977 paper describing the method has been cited more than 65,000 times, a staggering number that makes it the fourth most cited article in the history of science.
33
Using this technique, in 1978 Sanger and his colleagues sequenced the first complete genome, that of a bacteriophage. It was 5,386 base pairs long and represented months and months of work.
34
The technique soon became well established even though it was tedious and repetitive. It was also dangerous: as well as the omnipresence of radioactivity, the electrophoresis gel was made of toxic material and various steps in the procedure involved nasty chemicals that unravelled the DNA in the sample and, potentially, in the experimenter’s body. Despite these hazards, by 1984 researchers had sequenced the full genome of three viruses – two bacteriophages, and the Epstein–Barr virus, which causes glandular fever in humans. The Epstein–Barr virus sequence, which was described in Cambridge, was 172,282 bases long or thirty-two times the length of the first genomic sequence. This was a major feat, representing years of work, and involving what was then a large team of twelve researchers.
Sanger’s method became widely used in the late 1980s with the development of the polymerase chain reaction (PCR), which allows tiny samples of DNA to be amplified in a test tube. This method was invented by Kary Mullis, who was working at the biotech company Cetus Corporation in California, in a flash of insight during a night-time drive with his girlfriend.
35
* PCR involves heating a sample to very high temperatures (up to 95 °C); this separates the complementary DNA strands. The sample is then cooled slightly; DNA polymerase enzymes begin to copy the DNA molecules, and the complementary strands then pair up. A single cycle doubles the amount of DNA in the sample. By repeating this cycle of heating and cooling dozens of times, even minute amounts of DNA can be amplified millions of times over in a couple of hours.
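The arithmetic behind this doubling is easy to check. The short Python sketch below assumes an idealised reaction in which every cycle is perfectly efficient; the function names and the `efficiency` parameter are hypothetical additions for illustration, and real reactions fall short of perfect doubling.

```python
def pcr_copies(initial_copies, cycles, efficiency=1.0):
    """Copies present after `cycles` rounds of heating and cooling.

    With efficiency 1.0 each cycle exactly doubles the DNA (an idealisation).
    """
    return initial_copies * (1 + efficiency) ** cycles

def cycles_needed(fold_amplification):
    """Smallest number of perfect doubling cycles giving at least this fold increase."""
    cycles = 0
    copies = 1
    while copies < fold_amplification:
        copies *= 2
        cycles += 1
    return cycles

print(cycles_needed(1_000_000))  # cycles for a million-fold amplification
print(pcr_copies(1, 30))         # copies from a single molecule after 30 cycles
```

Under these assumptions, about twenty cycles are enough for a million-fold amplification, and thirty cycles turn a single molecule into roughly a billion copies.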
Mullis had a problem though – he needed a polymerase enzyme that could resist the relatively high temperatures his experiment required. As luck would have it, such an enzyme had recently been described in Thermus aquaticus (generally known as Taq), a bacterium that lives in hot springs.
36
The final addition to this procedure is that by adding to the test tube short pieces of DNA, fifteen to twenty bases long, which mark the beginning and end of a DNA sequence of interest, it is possible to target the PCR and thereby amplify only the section of DNA that you are interested in.
PCR rapidly overtook the previous technique of inserting a DNA fragment into a phage genome, then infecting bacteria and allowing the bacteria to reproduce, thereby amplifying the DNA. PCR is much simpler, and even a complete novice can soon amplify minute quantities of DNA. In 1993, less than a decade after his invention, Mullis was awarded the Nobel Prize in Chemistry. The initial application of the technique was diagnosis, and it is now routinely used in medicine as a tool for identifying diseases, both infectious and genetic. Coupled with sequencing, PCR has transformed the way in which biology and medicine work.
The practical application of DNA technology really took off in 1984, when Alec Jeffreys of the University of Leicester discovered the existence of small stretches of DNA that can be easily identified and which represent a unique genetic ‘fingerprint’ of each individual. The significance of these bits of DNA, known as minisatellites, was instantly obvious to Jeffreys, and he immediately wrote down a series of potential applications that included forensics, conservation biology and paternity testing. In less than a year, the technique was used to determine the outcome of an immigration case by showing that a young Ghanaian boy was indeed the son of the woman who claimed to be his mother; as a result the child was allowed back into the UK.
37
Jeffreys’s technique soon proved more flexible and simple than the previous method for identifying genetic variants, which involved snipping bits of DNA at defined locations, using special proteins called restriction enzymes. If the population being studied contained variability for the length of DNA between the two sites where the restriction enzymes acted, then those variants could be detected on an electrophoresis gel. This was first demonstrated in 1980 by David Botstein and his colleagues, who were working on the human genetic disorder Huntington’s disease.
38
Outside of medicine, the use of restriction enzymes proved invaluable for mapping genes and for the development of recombinant DNA biotechnology.
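The principle of detecting variants by the length of DNA between two restriction sites can be illustrated with a small Python sketch. Everything here is a made-up example: the two short 'alleles' are invented sequences, the `digest` function is hypothetical, and for simplicity it cuts immediately before each recognition site rather than at the exact position within the site where a real enzyme cuts. GAATTC is used as the site because it is the sequence recognised by the enzyme EcoRI.

```python
def digest(sequence, site):
    """Cut `sequence` just before every occurrence of `site`; return fragment lengths."""
    fragments = []
    start = 0
    position = sequence.find(site)
    while position != -1:
        fragments.append(position - start)
        start = position
        position = sequence.find(site, position + 1)
    fragments.append(len(sequence) - start)  # final piece after the last cut
    return fragments

site = "GAATTC"
# Two invented variants that differ only in the length of DNA between the two sites.
allele_a = "AAAA" + site + "CCCC" + site + "TT"
allele_b = "AAAA" + site + "CCCCCCCCCC" + site + "TT"

print(digest(allele_a, site))  # [4, 10, 8]
print(digest(allele_b, site))  # [4, 16, 8]
```

The middle fragment is 10 bases long in one variant and 16 in the other, and it is this difference in length that would show up as bands at different positions on a gel.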
All around the world, DNA fingerprinting is now routinely employed by the judicial system to convict criminals and to prove the innocence of the wrongly accused. The routine collection of DNA samples by the police, and the existence of databases permitting the identification of individuals, has led to a continuing ethical debate about the conflict between liberty and justice, with state forces arguing that only the guilty have something to hide, whereas more libertarian arguments underline the potential dangers.
By the late 1980s, machines were able to read DNA sequences, using a system based on fluorescence rather than radioactivity, but still using Sanger’s sequencing method.
39
Sequences were now detected in tiny capillary tubes rather than on huge heavy gels, opening the possibility of simultaneously carrying out many parallel reads, and the sequence could be read in real time, as the reaction took place, rather than waiting for the gel to run and then detecting the radioactive products using a photographic plate. At the beginning of the 1990s, these technical developments led to the creation of a series of projects for sequencing the genomes of multicellular organisms, with the ultimate objective being the sequencing of the human genome. The first animal genome to be completed, in 1998, was that of the nematode worm, Caenorhabditis elegans, closely followed by that of the tiny vinegar fly, Drosophila melanogaster, in 2000. These projects provided vital information about two widely used laboratory organisms and were testing-grounds for different technical and commercial approaches to genome sequencing. The C. elegans genome project, led by John Sulston, was entirely funded by public money, whereas the Drosophila genome was a joint effort between publicly funded researchers and a company called Celera Genomics, led by Craig Venter, a molecular biologist turned entrepreneur.
Despite the very different motivations of the public and private researchers, the Drosophila genome project was a success. In contrast, the Human Genome Project, which took place in parallel, was the focus of clashes of scientific and commercial outlook as well as of personality.
40
The human genome contains around 3 billion base pairs, far more than that of C. elegans (100 million base pairs) or Drosophila (140 million base pairs). The size of the human genome and the large stretches of repetitive sequences it contains posed new difficulties that were exacerbated by the very different approaches taken by the public and private researchers.
The publicly funded International Human Genome Sequencing Consortium, led first by Jim Watson and then by Francis Collins, had been working since 1990 to produce a full sequence of every base in the genome, and its members were resolutely hostile to the idea of patenting genes. In contrast, Craig Venter and Celera initially focused on sequencing only genes that were known to be expressed in certain tissues or under certain conditions, with the hope of finding patentable products. They did this by collecting mature mRNA that was present in the cell or tissue of interest, transcribing that back into what is known as complementary DNA (cDNA) and then sequencing this cDNA molecule.
This approach had the great advantage of focusing on genes that were apparently important in a given tissue and meant that researchers did not waste time sequencing the millions of bases in the huge non-coding regions that can be found between genes, or even sequencing the introns of the gene of interest, which had been stripped out by the cellular machinery during the synthesis of RNA from the genomic DNA. Using this method, Venter showed that it was possible to identify genes involved in vital processes with the tantalising possibility of gaining insight into novel medical treatments. While this was extremely productive and held the promise of financial gain, it was at odds with the aim of the publicly funded project, which was to sequence every base in the human genome.
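Why a cDNA is so much shorter than the stretch of genome it comes from can be shown with a toy example in Python. The 'gene' below is an invented 23-base sequence with a single intron written in lower case, and the function names and exon coordinates are assumptions made purely for illustration; real genes, splicing and reverse transcription are far more complicated.

```python
def splice(gene, exons):
    """Join the exon intervals (start, end), end-exclusive, dropping the introns."""
    return "".join(gene[start:end] for start, end in exons)

def transcribe(dna):
    """Write a DNA coding sequence as mRNA by replacing T with U."""
    return dna.upper().replace("T", "U")

def first_strand_cdna(mrna):
    """Return the DNA strand complementary to the mRNA, read 5' to 3'."""
    pairing = {"A": "T", "U": "A", "G": "C", "C": "G"}
    return "".join(pairing[base] for base in reversed(mrna))

# Invented example: two exons (upper case) separated by one intron (lower case).
gene = "ATGGCC" + "gtaagtttcag" + "AAATGA"
exons = [(0, 6), (17, 23)]

mature_mrna = transcribe(splice(gene, exons))
cdna = first_strand_cdna(mature_mrna)

print(len(gene), len(cdna))  # 23 bases of genomic DNA, 12 bases of cDNA
print(mature_mrna, cdna)
```

Only the exon bases survive into the mature mRNA, so sequencing the cDNA skips the intron entirely; this is the saving in effort that made the approach attractive.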
