Life's Greatest Secret (38 page)

Authors: Matthew Cobb

Some parts of our genome seem to be pure selfish DNA – sequences that apparently have no function beyond their own survival.
91
Some of these genetic elements, which riddle our genome, are the remnants of what are effectively genetic parasites – transposons. Transposons are sequences of DNA that can move about the genome, jumping from one location to another. They probably originated as RNA retroviruses that copied themselves into DNA and then became trapped in our genomes. They no longer produce viral RNA but retain the ability to move from place to place in our genome by producing an enzyme called transposase, which effectively unglues them from the DNA strand. Occasionally, the transposon may land in or next to a protein-encoding gene that then hijacks the transposon and converts its activity into a product that is delivered into the cell, thereby leading to the evolution of a new gene.
92
Transposons, and their potential regulatory functions, were first identified by Barbara McClintock at Cold Spring Harbor in the 1940s, to widespread disbelief in the scientific community. Not only was she right, but in 1983 she was awarded the Nobel Prize in Physiology or Medicine for her discovery – the only woman to be the sole recipient of this prize.
Over evolutionary time, transposon sequences accumulate mutations in the part of their genome that codes for transposase, and cease to be able to move. They become frozen in our DNA, recognisable but immobile. The remnants of these invasive DNA sequences make up an astonishing 45 per cent of the human genome, with one element, known as Alu, leaving genetic traces that make up to 10 per cent of your DNA. Sometimes bits of these pseudogenes can even be transcribed, producing short bits of RNA that can regulate the activity of genes.
93
Apart from their potential transformation into transposons, retroviruses can play a direct evolutionary role – it is probable that the origin of the placenta in mammals was due to our distant ancestors being infected with a retrovirus that produced a protein called syncytin that is now essential for the development of the placenta. This infection seems to have occurred several times in the mammalian lineage, perhaps explaining the varying forms of this organ in different mammals.
94
The long stretches of non-coding DNA lie at the heart of the most mysterious result that has been discovered since the beginning of widespread genomic sequencing. Different species can have substantial differences in the size of their genomes, which do not seem to be related to anything in their ecology or degree of apparent physiological complexity. For example, the genome of the ‘primitive’ lungfish is 350 times larger than that of the pufferfish. No one has been able to come up with any explanation for why this might be. This problem is called the ‘C-value paradox’ or ‘C-value enigma’ – ‘C’ is the amount of DNA in a genome.
95
Some of these differences may be due to a well-known phenomenon: chunks of genomes can be duplicated during evolution, particularly in plants, which can double their genome size in one generation when chromosome duplication goes slightly awry. Because of factors such as duplication, the variation in genomic size that we see between species resists any overall functional explanation. This is highlighted by what is known jocularly as the onion test: the onion genome contains around 16 billion base pairs, or five times that of a human. It is hard to explain this in terms of the contrasting physiology and behaviour of the two organisms, or to imagine that every one of these bases is necessary to the onion.
96
In the late 1950s and early 1960s, researchers began to use the term ‘junk DNA’ to describe DNA that had no apparent function.
97
In 1972, Susumu Ohno defined junk DNA as a sequence that cannot be affected by a deleterious mutation. According to this definition, junk DNA is a sequence that, if it were changed, would have no effect on the organism’s fitness (that is, on its success in passing its genes on to the next generation). Both pseudogenes and the remnants of transposon activity would seem to be junk DNA, but scientists argue about this term, and some dispute whether any DNA can truly be considered junk.
In September 2012 this rather arcane debate erupted onto the pages of the press and on the Internet, focused on the question of what the human genome actually does. This was prompted by the publication of the findings of a large-scale project to study the cellular activity of the whole of the human genome, called ENCODE (Encyclopaedia of DNA Elements). The results of the ENCODE project were published in an unprecedented wave of thirty papers, signed by 442 authors, backed up by a web site and an iPad app. The leaders of the project claimed that 80 per cent of the human genome could be assigned a ‘biochemical function’; the coordinator, Ewan Birney, went on to claim that the final figure would ‘likely go to 100%’.
98
This led to great excitement in the press: Science proclaimed that ENCODE had written the ‘eulogy’ for junk DNA, the New York Times stated that ENCODE had shown that 80 per cent of the human genome was ‘critical’ and ‘needed’, while The Guardian trumpeted ‘Breakthrough study overturns theory of “junk DNA” in genome’.
99
This hyperbole led to a backlash on the Internet and in scientific publications as scientists who had not been involved in the project disagreed with the suggestion that there was no ‘junk DNA’, or that 80 per cent of our genome is ‘functional’.
100
The argument turned on the meaning of the word ‘function’. The ENCODE project deliberately cast its net wide by looking for a ‘reproducible biochemical signature’, which they defined as any consistent biochemical reaction induced by a given stretch of DNA, from mRNA production to protein binding.
101
That was where the 80 per cent figure came from. The computational biologist Sean Eddy pointed out that the ENCODE study lacked what scientists call a ‘negative control’ – a set of DNA sequences that did not have any function, by any definition, and should therefore not have been identified as functional by the biochemical criteria used by ENCODE.
102
Shortly afterwards, a paper appeared in which researchers carried out this experiment: they randomly generated 1,300 DNA sequences and found that most of these artificial sequences were ‘functional’ according to the ENCODE criteria. This suggested that the ENCODE definition could not systematically discriminate between random bases and DNA that has some kind of biochemical role in the cell.
103
The lead author of the study, Mike White, wrote:
most DNA will look functional at the biochemical level. The inside of a cell nucleus is a chemically active place. The real puzzle is this: how does functional DNA manage to distinguish itself from the vast excess of dead transposable elements, pseudogenes, and other accumulated junk?
104
That question remains unanswered.
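The logic of a negative control can be illustrated with a toy sketch. The Python below is only an illustration under loudly stated assumptions: the 1,000-base sequence length, the `TATA` motif and the `permissive_assay` function are hypothetical stand-ins, not the criteria used by ENCODE and not the method of the study described above. Only the figure of 1,300 random sequences comes from the text.

```python
import random

BASES = "ACGT"


def random_sequence(length, rng):
    """Return a random DNA string; by construction it has no evolved function."""
    return "".join(rng.choice(BASES) for _ in range(length))


def permissive_assay(seq, motif="TATA"):
    """Hypothetical, deliberately loose criterion (an assumption for this sketch):
    a sequence is flagged if it contains one short motif anywhere."""
    return motif in seq


def fraction_flagged(sequences, assay):
    """Share of sequences that the assay flags."""
    return sum(1 for s in sequences if assay(s)) / len(sequences)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
# 1,300 random sequences as in the text; the length of 1,000 bases is assumed.
controls = [random_sequence(1000, rng) for _ in range(1300)]
rate = fraction_flagged(controls, permissive_assay)
print(f"Random sequences flagged by the loose assay: {rate:.0%}")
```

If a loose test flags most of the random controls, as this one should, then a positive result from that test says little about whether a real stretch of DNA does anything useful.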
In 2014, the ENCODE consortium published a second wave of papers and seemed to back away from their earlier headline claim of 80 per cent function, admitting that ‘it is not at all simple to establish what fraction of the biochemically annotated genome should be regarded as functional’. Instead, they emphasised their indisputable finding that an important part of the human genome seems to induce reliable biochemical activity of some kind:
The major contribution of ENCODE to date has been high-resolution, highly-reproducible maps of DNA segments with biochemical signatures associated with diverse molecular functions. We believe that this public resource is far more important than any interim estimate of the fraction of the human genome that is functional.
105
For the moment, despite the initial claims of ENCODE, and despite the fact that much of the genome seems to be transcribed into RNA in one form or another, a substantial proportion of our DNA, and that of other organisms, seems to have no discernible role in our existence and could be deleted without causing any selective disadvantage. Future discoveries may change this view, but in those strict terms, much of our DNA still appears to be ‘junk’.
* In 1988 the historian Jan Witkowski pointed out that there had been no historical study of this period. Although that might not have been unusual in 1988, it is surprising that nearly 40 years after the discovery there has still been no detailed historical analysis of the revolution and its implications.
* In his Nobel Prize address, Mullis said that when he realised what he had dreamt up, he stopped his car at mile marker 46.7 on Highway 128 and scribbled down the essential elements of the technique.
* The answer, surprisingly, is whale.
* In 2004, the remains of another archaic human, Homo floresiensis, were discovered in a tropical cave on the island of Flores in Indonesia (Brown et al., 2004; Morwood et al., 2004). H. floresiensis, popularly known as ‘the hobbit’, inhabited the cave until about 18,000 years ago. For the moment, it is not possible to extract DNA from bones preserved in such bacteria-rich conditions, but this may change (Callaway, 2014a).
–     THIRTEEN     –
THE CENTRAL DOGMA REVISITED
In his 1957 lecture, Francis Crick outlined what he called the central dogma of molecular genetics:
once information (meaning here the determination of a sequence of units) has been passed into a protein molecule it cannot get out again, either to form a copy of the molecule or to affect the blueprint of a nucleic acid.
1
Information can get out of DNA into RNA to determine the structure of a protein, but proteins cannot specify the sequence of new proteins, and the information in proteins cannot make the reverse journey back into your genes – your DNA cannot be rewritten by a protein. The central dogma has been the focus of repeated criticism over the past sixty years, partly because of the discovery of new facts, and partly because the unfortunate term ‘dogma’ tends to be a lightning rod for debate.
In 1970, Nature magazine trumpeted ‘Central dogma reversed’ when it was discovered that information can flow from RNA into DNA. Nature’s claim was prompted by a discovery that explained how RNA viruses can infect healthy cells and transform them into cancerous cells that produce viruses. In 1964, Howard Temin, a 30-year-old cancer researcher at the University of Wisconsin-Madison, had boldly suggested that the basis of this effect was that RNA viruses turned their RNA code back into DNA, which was then integrated into the host’s chromosome, where it de-regulated cell growth and produced more virus RNA. There was no known mechanism whereby such a ‘reverse transcription’ could take place, so Temin was forced to hypothesise the existence of an enzyme that could carry out that task, transcribing the RNA virus into DNA. In 1970, Temin was proved right, when he, along with 32-year-old David Baltimore at MIT, reported the existence of an enzyme in RNA viruses that copies RNA into DNA. This enzyme, now called reverse transcriptase, enables information to flow from RNA back to DNA.
Nature magazine, which published Baltimore’s paper, editorialised somewhat pompously:
The central dogma, enunciated by Crick in 1958 and the keystone of molecular biology ever since, is likely to prove a considerable over-simplification.
2
Piqued by the tone of the editorial, Crick replied in the pages of the journal, graciously acknowledging Temin and Baltimore’s ‘very important work’ and setting the record straight with regard to what he had argued thirteen years earlier. Crick’s original hypothesis explored all the possible transfers of information between nucleic acids and proteins, and prohibited only those, such as protein → DNA, that either had been excluded experimentally or for which there was no conceivable mechanism. In subsequent years, this rich view tended to be replaced by the cruder DNA → RNA → protein, as summarised in Jim Watson’s influential 1965 textbook, Molecular Biology of the Gene.
3
In 1957, Crick considered that the RNA → DNA step was ‘rare or absent’, but not impossible. As Crick pointed out in 1970, there was no ‘good theoretical reason why the transfer RNA → DNA should not sometimes be used. I have never suggested that it cannot occur, nor, as far as I know, have any of my colleagues.’
4
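Crick’s scheme of possible transfers can be laid out as a small lookup table. The Python below is an illustrative sketch, not a quotation: the function name `classify` and the labels ‘general’, ‘special’ and ‘prohibited’ are chosen here for convenience, and the placing of RNA → RNA and DNA → protein among the rare ‘special’ transfers follows the grouping in Crick’s 1970 article rather than anything stated in the passage above. What the passage does state is that transfers out of protein are ruled out and that RNA → DNA is rare but not impossible.

```python
from itertools import product

MOLECULES = ("DNA", "RNA", "protein")

# Transfers assumed to occur in all cells.
GENERAL = {("DNA", "DNA"), ("DNA", "RNA"), ("RNA", "protein")}
# Transfers that are rare or limited to special circumstances.
SPECIAL = {("RNA", "RNA"), ("RNA", "DNA"), ("DNA", "protein")}


def classify(source, target):
    """Label one of the nine possible sequence-information transfers."""
    if source not in MOLECULES or target not in MOLECULES:
        raise ValueError(f"unknown molecule: {source!r} or {target!r}")
    if source == "protein":
        # Once information is in a protein it cannot get out again.
        return "prohibited"
    if (source, target) in GENERAL:
        return "general"
    return "special"


for source, target in product(MOLECULES, repeat=2):
    print(f"{source} -> {target}: {classify(source, target)}")
```

Seen this way, the discovery of reverse transcriptase moved RNA → DNA from a possibility to an observed case, and left the three prohibited transfers untouched.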
Crick’s view of the significance of Temin and Baltimore’s discovery was that the RNA → DNA transfer probably did not occur in most cells but might take place in special circumstances such as some viral infections. Temin was not so restrained, and within a year he was arguing that RNA → DNA information transfers were a fundamental part of normal development in the somatic cell line (that is, in all cells except the egg and sperm). As a result, he claimed, ‘new DNA sequences are formed by this process during the lifetime of a single organism’.
5
According to Temin, reverse transcription was an everyday process, helping to shape how our cells develop – except that it is not, and reverse transcriptase is only ever found in cells infected by a particular class of RNA viruses called retroviruses. In this respect, Crick was right, and Temin – and the editorial writers at Nature – were wrong.
