Publication Date:
1997
Abstract:
Understanding the mechanisms of gene expression is one of the most relevant problems in genome analysis.
Experimental sequencing has revealed the existence of control domains such as promoters, enhancers and silencers.
Promoter signals are well documented in various databases. These regions contain a number of characteristic
sequences involved in the interactions between the DNA and the transcriptional complex.
Different approaches have been applied to detect these features in genomic sequences.
In the present paper we investigate the statistical properties of a subset of the Eukaryotic Promoter Database (EPD), using entropies as a measure of complexity.
An efficient computational approach, suitable for coarse-grain parallelization, is used for fast processing of the symbolic sequence.
The results show a large deviation from randomness in the distribution of words in the EPD.
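As an illustration of the kind of measure the abstract refers to, the following is a minimal sketch, not the authors' implementation, of a word-frequency (block) entropy compared against the value expected for a uniformly random sequence. The function names, the choice of overlapping words, and the four-letter alphabet are assumptions made for this example. Counting is done per sequence and merged afterwards, which is the property that would make a coarse-grain parallel version straightforward.

```python
import math
from collections import Counter


def count_words(sequence, k):
    """Count overlapping words of length k in a single sequence."""
    seq = sequence.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def merge_counts(partial_counts):
    """Merge per-sequence counts; each partial count can be computed independently."""
    total = Counter()
    for counts in partial_counts:
        total.update(counts)
    return total


def block_entropy(counts):
    """Shannon entropy (in bits) of the empirical word distribution."""
    n = sum(counts.values())
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def entropy_deficit(sequences, k, alphabet_size=4):
    """Difference between the maximum entropy k*log2(alphabet_size) and the observed one.

    A value near zero suggests a near-uniform word distribution; a larger
    value indicates a deviation from randomness. Assumption: no correction
    for finite-sample bias is applied in this sketch.
    """
    counts = merge_counts(count_words(s, k) for s in sequences)
    h_max = k * math.log2(alphabet_size)
    return h_max - block_entropy(counts)
```

For short sequences and long words the empirical entropy is biased downwards, so the deficit should be interpreted with the sample size in mind.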
Iris type:
04.01 Contribution in conference proceedings
Keywords:
genome sequencing; Eukaryotic Promoter Database; entropies; word frequencies; coarse-grain parallelization; fast sequence analysis
List of contributors:
Arrigo, Patrizio; Milanesi, Luciano; Corana, Angelo
Book title:
Molecular Bioinformatics: Sequence Analysis - The Human Genome Project