RNA research is expanding very quickly, and
a public resource for these extremely valuable datasets has been long overdue.
Some 30 years ago, scientists realised that
RNA was not just an intermediary between DNA and protein (with a couple of
functions on the side), but a polymer that could fold into complex shapes and
catalyse countless reactions. The importance of RNA was cemented when the structure of the ribosome was determined (work for which Venki Ramakrishnan, Ada E. Yonath and Tom Steitz won the Nobel Prize; see, for example, Venki's Nobel lecture) and it was confirmed that the core function of ribosomes – making a peptide bond between two amino
acids – was catalysed by ribosomal RNA and not by proteins. It’s also likely
that RNA – not protein, not DNA – was the first active biomolecule in the primordial
soup that gave rise to Life. Indeed, one could easily see DNA as an efficient
storage scheme for RNA information, and proteins as an extension of
single-stranded RNA’s catalytic capabilities, enabled by the monstrous enzyme, ribosomal
RNA.
Even focusing on RNA’s established role as the
cell’s information carrier, the textbook mRNA, RNA-based interactions are widely recognised as
being important. A real insight was the discovery of microRNA (miRNA): small
RNAs whose actions lead to the down-regulation of transcripts by suppressing translation
efficiency and cleaving mRNAs. MicroRNA has brought to life a whole new world
of other small RNAs, many of which are involved in suppressing “genome
parasites” – repeat sequences that every organism needs to manage.
And then there are the long RNAs in mammalian genomes that do not
encode proteins (i.e. long non-coding RNA, or lincRNA). They have long been recognised as having
some significance – but what do they do? Some are clearly important, like the non-coding
RNA poster child Xist, which inactivates one of the X chromosomes in female mammals
to ensure the correct dosage of gene products. Others are involved in
imprinting/epigenetic processes, for example the curiously named HOTAIR, which
influences transcription on a different chromosome.
RNA: something missing
Discoveries in RNA biology have expanded
the molecular biologist’s toolkit considerably in recent years. For instance, the
cleavage systems from small RNAs can be used (as siRNAs and shRNAs) to knock
down genes at the transcript level. The current “wow” technology, CRISPR/Cas9,
is a bacterial phage defence system that uses an RNA-based component to adapt
to new phages easily. This system has been repurposed for gene editing in
(seemingly) all species – every genetics grant written these days probably has
a CRISPR/Cas9 component.
And yet in terms of bioinformatics, RNA data
was – until this past September – rather uncoordinated. There wasn’t a good way
to talk consistently about well-known RNAs across all types, although this was
sometimes coordinated in sub-fields such as Sam Griffiths-Jones’ excellent
miRBase for miRNAs, or Todd Lowe’s GtRNAdb resource for tRNAs. But because
RNA data was mostly handled in one-off schemes, researchers working in this
area were hindered. Computational research couldn’t progress to the next
stages, for example capturing molecular function and process terms with GO or
collecting protein–RNA interactions in a consistent way.
RNAcentral in the bioinformatics toolkit
So I’m delighted to see the RNAcentral
project emerge (http://rnacentral.org/). RNAcentral is coordinating the
excellent individual developments emerging in different RNA subdisciplines:
miRNAs, piRNAs, lincRNAs, rRNAs, tRNAs and many more besides. It provides a
common way to talk about RNA, which in turn allows other resources – such as
the Gene Ontology or drug interaction databases – to slot in, usually
precisely in the same “place” as the protein identifier.
Alex Bateman, who leads the RNAcentral
project, has been exploring a more federated approach, quite deliberately gathering
the hard-earned, community-driven expertise of member databases in specific,
specialised areas of RNA biology.
RNAs were, potentially, the first things on
our planet that could be considered “alive”. They are critical components in
biology, not just volatile intermediaries. In terms of bioinformatics, giving
RNA the same love, care and attention as proteins is long overdue, and I look
forward to seeing RNAcentral provide the cohesion and stability this area of
science so richly deserves.