Design in DNA: Dual Coding Found in Nearly All Genes
by Rich Deem


The information content in human DNA is enormous, but we are just beginning to understand how efficiently the DNA is encoded. Scientists had originally speculated that the human genome contained up to 100,000 genes. However, the Human Genome Project showed that it contained only one quarter that number, mostly because each gene can code for multiple transcripts. Scientists also thought that only the protein coding DNA, comprising just 3% of the DNA, was useful. The other 97% was thought to be junk. However, the last few decades of research have shown that the vast majority (>80%) of non-coding DNA is functional. Much of the non-coding DNA is involved in the regulation of transcription (the intermediate step in which mRNA is generated, from which the protein is translated). Now scientists have discovered that some of the protein coding DNA not only codes for the protein sequence, but simultaneously codes for sequences that bind transcription factors (proteins that regulate the transcription and expression of genes). These dual coding sequences have been termed "duons."

How the study was done

The scientists who authored the study used a naturally occurring enzyme called DNase I, which digests DNA. It turns out that the enzyme will only degrade DNA that is not bound to proteins. Since transcription factors are proteins that bind DNA, any stretch of DNA that is bound by a transcription factor when it is isolated is protected from digestion by DNase I. The scientists isolated DNA from 81 different cell types and sequenced the fragments that were preserved by binding to transcription factors. They had to use different cell types because different cells express different genes and transcription factors, depending upon their particular function. An example dual coding region, shown in the accompanying figure, comes from the gene CELSR2, found on chromosome 1. The gene consists of 34 exons (coding regions), with the ninth exon containing a binding site for the transcription factor CTCF, which is known to regulate the transcription of numerous genes. It is interesting to note that this short transcription factor binding site within the exon encodes two arginine residues, which are coded using two very different codons (AGG and CGC) in order to match the sequence to which CTCF binds. Although most genes consist of multiple exons, the vast majority of duon sequences occur in the first exon, which is what would be expected if the sequences were involved in the regulation of gene expression.
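The point about the two arginine codons can be checked against the standard genetic code. The following is a minimal sketch, not code from the study: it builds the standard codon table (written for the DNA coding strand, so T appears instead of U) and confirms that AGG and CGC both specify arginine even though they differ at two of three positions. The function names are illustrative assumptions.

```python
from itertools import product

# Standard genetic code for the DNA coding strand (T rather than U).
# Codons are enumerated in TCAG order; '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_codons(amino_acid):
    """Return every codon that encodes the given one-letter amino acid."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


def base_differences(codon_a, codon_b):
    """Count the positions at which two codons differ."""
    return sum(x != y for x, y in zip(codon_a, codon_b))


arginine_codons = synonymous_codons("R")
print(arginine_codons)                          # all codons for arginine
print(CODON_TABLE["AGG"], CODON_TABLE["CGC"])   # amino acid for each codon
print(base_differences("AGG", "CGC"))           # number of differing bases
```

Because arginine is one of the amino acids with six synonymous codons, the DNA sequence at that position can vary considerably without changing the protein, which is the flexibility a binding site within an exon relies on.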

Astounding levels of duons

The scientists had originally expected to find a few genes that simultaneously coded for both proteins and transcription factor binding. However, what they found was that 14% of coding sequence space was made up of duons (which represents over 400 million base pairs). An astounding 86% of all genes contained at least one duon sequence. Scientists already knew that intronic sequences within the DNA coded for transcription factor binding in order to regulate gene expression. However, since exon coding regions are constrained by their need to code for specific amino acids, it was never imagined that such regions of DNA could simultaneously code for the binding of transcription factors. The finding shows the amazing efficiency of DNA sequences in complex organisms. Although the authors of the study recognized the obvious optimization of the code, they attributed such optimization to natural selection, rather than design:

"Our results indicate that simultaneous encoding of amino acid and regulatory information within exons is a major functional feature of complex genomes. The information architecture of the received genetic code is optimized for superimposition of additional information (34, 35), and this intrinsic flexibility has been extensively exploited by natural selection."

However, they failed to account for how selection could simultaneously select for two diverse functions in the same, overlapping sequence of DNA code.
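The constraint discussed in this section, that an exon must keep specifying the same amino acids while also carrying a binding sequence, can be illustrated with a small sketch. The code below enumerates every synonymous DNA encoding of a short peptide and keeps those that also contain a given motif. Both the peptide (RLR) and the motif (GGCT) are hypothetical values chosen for illustration; neither is taken from the study, and GGCT is not a real transcription factor binding motif. Only the arginine and leucine entries of the standard genetic code are included.

```python
from itertools import product

# Subset of the standard genetic code (DNA coding strand): the six codons
# for arginine (R) and the six codons for leucine (L).
SYNONYMS = {
    "R": ["CGT", "CGC", "CGA", "CGG", "AGA", "AGG"],
    "L": ["TTA", "TTG", "CTT", "CTC", "CTA", "CTG"],
}


def synonymous_encodings(peptide):
    """List every DNA sequence that encodes the peptide unchanged."""
    options = [SYNONYMS[aa] for aa in peptide]
    return ["".join(combo) for combo in product(*options)]


def encodings_with_motif(peptide, motif):
    """Keep only the encodings that also contain the motif."""
    return [seq for seq in synonymous_encodings(peptide) if motif in seq]


PEPTIDE = "RLR"  # hypothetical three-residue peptide
MOTIF = "GGCT"   # hypothetical motif, for illustration only

all_encodings = synonymous_encodings(PEPTIDE)
matching = encodings_with_motif(PEPTIDE, MOTIF)
print(f"{len(matching)} of {len(all_encodings)} encodings contain {MOTIF}")
```

In this toy case only a fraction of the possible encodings carry the motif, so a sequence that satisfies both requirements has its codon choice restricted at those positions, which is the sense in which the study's title says exonic transcription factor binding "directs codon choice."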

Conclusion

Scientists have discovered that regulation of gene expression, originally thought to occur only in non-coding DNA sequences, is, in fact, also dual coded into the actual sequence of DNA that defines protein composition. Transcription factors, which bind to specific short sequences of DNA, regulate how the genes are expressed. The fact that these transcription factor binding sequences overlap protein coding sequences suggests that both sequences were designed together, in order to optimize the efficiency of the DNA code. As we learn more and more about DNA structure and function, it is apparent that the code was not just cobbled together by the trial and error method of natural selection, but that it was specifically designed to provide optimal efficiency and function.

Related Materials

The Cell's Design: How Chemistry Reveals the Creator's Artistry by Fazale Rana

Reasons To Believe's Fazale Rana has written The Cell's Design, a comprehensive examination of the biochemistry of the cell from a layman's perspective. Even so, the text does not gloss over the significant details of how the cell works. As a scientist myself, I see the design within the cell as much more beautiful than even the most wonderful sunset. The cell's design certainly does reveal the artistry of the Creator.

The Edge of Evolution: The Search for the Limits of Darwinism by Michael Behe

Darwin's Black Box author Michael Behe takes on the limits of evolution through an examination of specific genetic examples. Behe finds that mutation and natural selection are capable of generating only trivial examples of evolutionary change. Although he concludes that descent with modification has occurred throughout biological history, he argues that the molecular devices found throughout nature cannot be accounted for through natural selection and mutation. Behe's book claims to develop a framework for testing intelligent design by defining the principles by which Darwinian evolution can be distinguished from design.

References

  1. Andrew B. Stergachis, Eric Haugen, Anthony Shafer, Wenqing Fu, Benjamin Vernot, Alex Reynolds, Anthony Raubitschek, Steven Ziegler, Emily M. LeProust, Joshua M. Akey, and John A. Stamatoyannopoulos. 2013. Exonic Transcription Factor Binding Directs Codon Choice and Affects Protein Evolution. Science 342: 1367-1372.
Last Modified March 31, 2014

