Tuesday, May 22, 2007

Overlapping Protein Coding Regions

A research article entitled "A First Look at ARFome: Dual-Coding Genes in Mammalian Genomes," authored by Wen-Yu Chung (1), Samir Wadhawan (1), Radek Szklarczyk (2), Sergei Kosakovsky Pond (3), and Anton Nekrutenko (1), contained this interesting opening paragraph:

Coding of multiple proteins by overlapping reading frames is not a feature one would associate with eukaryotic genes. Indeed, codependency between codons of overlapping protein-coding regions imposes a unique set of evolutionary constraints, making it a costly arrangement. Yet in cases of tightly coexpressed interacting proteins, dual coding may be advantageous. Here we show that although dual coding is nearly impossible by chance, a number of human transcripts contain overlapping coding regions. Using newly developed statistical techniques, we identified 40 candidate genes with evolutionarily conserved overlapping coding regions. Because our approach is conservative, we expect mammals to possess more dual-coding genes. Our results emphasize that the skepticism surrounding eukaryotic dual coding is unwarranted: rather than being artifacts, overlapping reading frames are often hallmarks of fascinating biology.

Coding of multiple proteins by overlapping reading frames is indeed an unexpected outcome of a selection process whose options are stochastically generated. The authors allude to the obvious when they observe that dual coding imposes constraints on amino acid possibilities. Note the following paragraph from the article and its introductory headline. It contains some helpful definitions:

Dual Coding Is Virtually Impossible by Chance

Before describing our analyses, we define terms used in this paper. A dual-coding gene contains two frames read in the same direction: canonical (annotated as protein coding in literature and/or databases) and alternative. The alternative reading frame (ARF) is shifted forward one or two nucleotides relative to the canonical frame (+1 and +2 ARFs, respectively). To identify dual-coding genes, we used a comparative genomics strategy, because all presently known alternative reading frames are conserved in multiple species. For example, ARFs in Gnas1, XBP1, and INK4A are conserved in all sequenced mammals [8,10,12].
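The frame definitions quoted above can be illustrated with a short Python sketch. It is not taken from the paper: the function name translate_frame and the nine-nucleotide example sequence are made up for illustration, and the only thing assumed from biology is the standard genetic code. It shows how the same nucleotides read as different codons, and so different amino acids, once the frame is shifted forward by one or two positions.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_frame(seq, shift):
    """Translate seq starting `shift` nucleotides in (0 = canonical, 1 = +1 ARF, 2 = +2 ARF).

    Hypothetical helper for illustration only. Any trailing partial codon
    is dropped, and '*' marks a stop codon.
    """
    seq = seq.upper()[shift:]
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


# Made-up example sequence, not a real gene.
example = "ATGGCCTAA"
for shift in (0, 1, 2):
    print(f"frame +{shift}: {translate_frame(example, shift)}")
```

Every nucleotide takes part in a codon of each frame, which is why a change that is harmless in one frame can alter the protein encoded by the other.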


It is highly improbable that the encoding sequences for two distinct proteins would align exactly as required to confer function to both. The article's conclusion relies on that improbability to infer that a conserved ARF is likely to be functional.

Maintenance of dual-coding regions is evolutionarily costly and their occurrence by chance is statistically improbable. Therefore, an ARF that is conserved in multiple species is highly likely to be functional.
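To give a rough sense of why a long open alternative frame is unlikely by chance, here is a back-of-the-envelope Python sketch. It is a toy model and not the statistical technique the authors developed: it assumes every codon in the shifted frame is drawn independently and uniformly from the 64 possible codons, so each one has a 61/64 chance of not being a stop. The function name open_frame_probability is hypothetical.

```python
def open_frame_probability(n_codons):
    """Chance that n_codons random codons contain no stop codon.

    Toy assumption: codons are independent and uniformly distributed,
    so 61 of the 64 possible codons are non-stop at every position.
    """
    return (61 / 64) ** n_codons


for n in (10, 50, 100, 200):
    print(f"{n:>3} codons: {open_frame_probability(n):.6f}")
```

Under these simplified assumptions, a stop-free stretch of 100 codons already turns up less than 1% of the time, and the odds keep shrinking geometrically with length. Real sequences are not uniform, and the alternative frame is constrained by the canonical one, so this is only an intuition aid.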


(1) Center for Comparative Genomics and Bioinformatics, The Pennsylvania State University, University Park, Pennsylvania, United States of America
(2) Integrative Bioinformatics Institute, Vrije Universiteit, Amsterdam, The Netherlands
(3) Antiviral Research Center, University of California San Diego, La Jolla, California, United States of America

2 Comments:

At 8:28 AM, Blogger dobson said...

So is it virtually impossible or totally impossible? You seem to be agreeing with the authors of the paper, who claim that it is merely statistically improbable, based on the observation that the overwhelming majority of genetic information is either non-coding or codes for single proteins.

The authors of this paper make an interesting point: genetic information that codes for two (or possibly more) proteins might be very unlikely indeed, which explains why we find it so infrequently. However, if the two proteins that it codes for are important for survival, then we would expect it to be very strongly conserved, which explains why these obscure phenomena persist, apparently across species.

None of this is evidence of intelligent design or evolutionary intervention. It is, however, yet another example of how relatively small evolutionary changes can lead to quite big differences, in this case one gene coding for twice the number of expected proteins.

It is fascinating research, so I thank you for bringing it to my attention.

:-)

 
At 10:32 AM, Blogger Paul said...

You're welcome, Dobson. I enjoyed it too.

 
