Opportunities for Sequence and Structural Analysis of Non-Coding Regions

Gabriele Varani
Laboratory of Molecular Biology
Medical Research Council
MRC Centre, Hills Road
Cambridge CB2 2QH
E-Mail: gv1@mrc-lmb.cam.ac.uk

The imminent completion of the human genome sequence, and of the sequences of other eukaryotic genomes, provides unprecedented opportunities for dissecting evolutionary relationships not only at the level of protein-coding sequences but also at the level of the regulatory regions that precede or follow open reading frames (ORFs). Perhaps surprisingly, the regions that immediately precede or follow ORFs often show a higher level of sequence conservation than the ORFs themselves, implying that the sequences and/or three-dimensional structures of these regions, and the proteins that bind to them, are under strong evolutionary pressure. Some sequences even function at the RNA level, without ever being translated. Gene expression is a primary functional attribute, and the untranslated regions of genes control protein production. A major task of a post-genomic approach to gene function is therefore to find ways to use comparative genomics effectively to dissect the functional properties of the non-coding regions of genes.

The systematic exploration of sequence/structure relationships and comparative sequence analysis, guided by ongoing efforts in structure determination of RNA and RNA-protein complexes, provides the opportunity to dissect structure/function relationships within the regulatory regions of genes. The hierarchical nature of the RNA folding problem (the secondary structure forms first, followed by the establishment of weaker tertiary interactions that lead to the overall fold) makes it possible to use sequence information very effectively. At present, patterns of sequence conservation or variation can be interpreted in terms of RNA structure more clearly than is possible for proteins. When conservation is at the level of primary sequence, a functionally significant protein-binding or RNA-binding site is likely to have been identified. One challenge is to develop generic computational and experimental methods to seek proteins that bind to a given RNA site. The experience gathered in the last few years in understanding protein-RNA interactions is vital, yet truly generic experimental and computational methods for studying protein-RNA and RNA-RNA interactions would be of great value in understanding gene expression.
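
As a rough illustration of how patterns of conservation and variation might be read in terms of RNA structure, the following minimal Python sketch scores each alignment column by Shannon entropy (zero for a column conserved at the primary-sequence level) and flags pairs of columns that vary together while remaining base-paired, which is commonly taken as evidence of a conserved helix. The alignment, the function names, the mutual-information threshold, and the minimum column separation are all assumptions made up for this example; they are not taken from the work described here, and a real analysis would need a much larger alignment and a proper statistical treatment.

```python
from collections import Counter
from math import log2

# Toy alignment of a hypothetical hairpin (sequences invented for illustration).
ALIGNMENT = [
    "GGCAUUCGCC",
    "GCCAUUCGGC",
    "GACAUUCGUC",
    "GUCAUUCGAC",
]

# Watson-Crick and G-U wobble pairs accepted as "base-paired".
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def column(alignment, i):
    """Return the residues found at alignment column i (0-based)."""
    return [seq[i] for seq in alignment]


def entropy(symbols):
    """Shannon entropy in bits; 0 means the column is fully conserved."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum((c / n) * log2(c / n) for c in counts.values())


def mutual_information(alignment, i, j):
    """Mutual information between columns i and j, in bits."""
    xi = column(alignment, i)
    xj = column(alignment, j)
    return entropy(xi) + entropy(xj) - entropy(list(zip(xi, xj)))


def covarying_pairs(alignment, min_mi=1.0, min_separation=4):
    """List (i, j, MI) for column pairs that covary and stay base-paired.

    min_mi and min_separation are arbitrary illustrative thresholds.
    """
    length = len(alignment[0])
    hits = []
    for i in range(length):
        for j in range(i + min_separation, length):
            mi = mutual_information(alignment, i, j)
            paired = all(p in PAIRS for p in zip(column(alignment, i), column(alignment, j)))
            if mi >= min_mi and paired:
                hits.append((i, j, mi))
    return hits


if __name__ == "__main__":
    for i, j, mi in covarying_pairs(ALIGNMENT):
        print(f"columns {i} and {j} covary (MI = {mi:.2f} bits)")
```

In this toy alignment, columns 1 and 8 change in every sequence yet always remain complementary, so they are reported as a candidate base pair, while the invariant columns carry no covariation signal and would instead be candidates for a site conserved at the level of primary sequence.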
