Publication date:
2008
Abstract:
Motivation: The huge amount of data produced by genome sequencing projects
has made it possible to describe the genetic content of many organisms in
the form of lists of the genes they can express. Although necessary, this knowledge is not
sufficient to understand the mechanisms regulating many events underlying life
(e.g., cell growth, differentiation, development). It is therefore crucial to decipher
the control mechanisms that govern genome expression in time and space. To
address this problem, we have developed a bioinformatic approach based on
data mining techniques to detect frequent associations of regulatory motifs in
untranslated regions (UTRs) of transcripts in Metazoa. The idea is to mine
frequent combinations of translation regulatory motifs, since their significant co-occurrences
could reveal functional relationships important for the post-transcriptional
control of genome expression.
Methods: The experiments were carried out using, as a test case, UTR
sequences extracted from the MitoRes database, annotated with information
available in the UTRef and UTRsite databases, and collected in a relational database
named UTRminer, which supports the pattern mining procedure. The mining
approach has two steps. First, patterns of regulatory motifs are extracted and
annotated as sequences of motifs, with information on their location in the
sequence and their mutual distances (spacers). Second, the mutual distances are discretized
and the most frequent sequences of motifs and spacers are discovered by means
of a sequential pattern mining algorithm. Frequent sequences have a support
greater than a user-specified threshold, and the procedure for generating
frequent sequences is guaranteed to be complete.
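The two steps described in the Methods can be illustrated with a minimal Python sketch. It is not the algorithm used in UTRminer: it is a simplified stand-in that only counts contiguous motif/spacer patterns, and the spacer bins, the function names and the example coordinates are all assumptions made for illustration (IRE and uORF are used merely as example motif names).

```python
from collections import Counter

# Hypothetical spacer bins in nucleotides; the abstract does not state the real ones.
BINS = [(0, 10, "short"), (11, 50, "medium")]

def discretize(distance):
    """Map a spacer length to a symbolic label."""
    for low, high, label in BINS:
        if low <= distance <= high:
            return label
    return "long"

def to_sequence(motifs):
    """Turn [(name, start, end), ...] into alternating motif/spacer tokens."""
    ordered = sorted(motifs, key=lambda m: m[1])
    tokens = []
    for prev, cur in zip(ordered, ordered[1:]):
        tokens.append(prev[0])
        # Overlapping motifs are treated as having a spacer of length 0.
        tokens.append(discretize(max(0, cur[1] - prev[2])))
    if ordered:
        tokens.append(ordered[-1][0])
    return tokens

def frequent_patterns(sequences, min_support):
    """Return contiguous patterns whose support is greater than min_support."""
    counts = Counter()
    for seq in sequences:
        seen = set()
        # Motifs sit at even indices, so every window starts and ends on a motif.
        for i in range(0, len(seq), 2):
            for j in range(i + 2, len(seq), 2):
                seen.add(tuple(seq[i:j + 1]))
        counts.update(seen)  # each UTR contributes at most once per pattern
    n = len(sequences)
    return {p: c / n for p, c in counts.items() if c / n > min_support}

# Made-up annotations for three UTRs: (motif name, start, end).
utrs = [
    [("IRE", 5, 30), ("uORF", 35, 80)],
    [("IRE", 10, 35), ("uORF", 42, 90)],
    [("IRE", 3, 28), ("uORF", 100, 150)],
]
patterns = frequent_patterns([to_sequence(u) for u in utrs], min_support=0.5)
```

With these made-up inputs, only the pattern IRE, short spacer, uORF clears the 0.5 threshold, since it occurs in two of the three UTRs. Because every contiguous window is enumerated, this sketch is complete for contiguous patterns only; a general sequential pattern miner would also consider patterns with gaps.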
Results: The UTR sequences analysed come from ten different species. A total
of 3896 sequences were analysed, of which 1944 are 5'UTRs and 1952 are 3'UTRs.
The frequent motif patterns generated in the first step have a complexity
(number of distinct motifs detected on the same UTR) ranging from 2 to 3 in 5'UTRs and from 2 to 5
in 3'UTRs. Preliminary results, based on observation and comparative analysis of
the discovered sequential patterns, add new insights to our knowledge of the post-transcriptional
regulatory mechanisms controlling genome expression, and
demonstrate the effectiveness of the presented bioinformatics approach in
supporting the discovery of motif patterns.
CRIS type:
01.05 Journal abstract
Keywords:
Bioinformatics; Frequent Pattern Mining; UTR; Regulatory Motifs; Translation
Authors:
D'Elia, Domenica; Grillo, Giorgio
Link to the full record:
Published in: