Max Planck Institute for Molecular Genetics
Max Planck Institute for Molecular Genetics - Ihnestraße 73 - 14195 Berlin - Germany - Phone: (+49 30) 8413 0 - Fax: (+49 30) 8413 1388

   Evolutionary Genomics Group        Department of Computational Molecular Biology



Projects

Comparative Analysis of Nucleotide Substitutions - Mammalian genomes show large-scale regional variations of GC-content (the isochores), but the substitution processes at the origin of this structure are poorly understood. It has been shown that meiotic recombination has a major impact on substitution patterns in humans, driving the evolution of GC-content.

Other cellular processes also influence nucleotide substitutions. A regional analysis of nucleotide substitution rates along human genes and their flanking regions allowed us to quantify the effect of mutational mechanisms associated with transcription in germline cells. Our analysis revealed three distinct patterns of substitution rates. First, a sharp decline in the deamination rate of methylated CpG dinucleotides in the vicinity of the 5' end of genes. Second, a strand asymmetry in complementary substitution rates, associated with transcription-coupled repair, which extends from the 5' end to 1 kbp downstream of the 3' end. Finally, a localized strand asymmetry: an excess of C->T over G->A substitutions in the non-template strand, confined to the first 1-2 kbp downstream of the 5' end of genes. We hypothesize that higher exposure of the non-template strand near the 5' end of genes leads to a higher cytosine deamination rate.
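As an illustration of what a strand asymmetry in complementary substitution rates means, the sketch below estimates per-site substitution rates from an aligned pair of ancestral and derived sequences and compares a substitution with its complement. The function names, the normalized asymmetry measure, and the toy sequences are assumptions made for this example; they are not the estimation procedure used in the analysis described above.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def substitution_rates(ancestral, derived):
    """Per-site rate of each substitution X->Y from an aligned sequence pair.

    Both sequences are read on the same strand (e.g. the non-template strand).
    Columns containing anything other than A, C, G, T are skipped.
    """
    if len(ancestral) != len(derived):
        raise ValueError("sequences must be aligned to the same length")
    base_counts = {base: 0 for base in "ACGT"}
    sub_counts = {}
    for a, d in zip(ancestral, derived):
        if a not in base_counts or d not in base_counts:
            continue
        base_counts[a] += 1
        if a != d:
            sub_counts[(a, d)] = sub_counts.get((a, d), 0) + 1
    # Normalize by the number of ancestral sites carrying the source base.
    return {pair: n / base_counts[pair[0]] for pair, n in sub_counts.items()}


def strand_asymmetry(rates, src, dst):
    """Normalized difference between src->dst and its complementary substitution.

    Returns a value in [-1, 1]; 0 means both strands behave identically.
    This normalization is an illustrative choice, not a published definition.
    """
    forward = rates.get((src, dst), 0.0)
    reverse = rates.get((COMPLEMENT[src], COMPLEMENT[dst]), 0.0)
    total = forward + reverse
    return 0.0 if total == 0 else (forward - reverse) / total


# Toy data: 2 of 4 C sites became T, 1 of 4 G sites became A.
rates = substitution_rates("CCCCGGGGAATT", "TTCCAGGGAATT")
asymmetry = strand_asymmetry(rates, "C", "T")
```

A positive value for `strand_asymmetry(rates, "C", "T")` corresponds to an excess of C->T over G->A on the strand being read.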

Recently we also established that the presence of a CpG island has an asymmetric influence on nucleotide substitution upstream and downstream, indicative of a cellular process that starts at a CpG island and moves outwards beyond its 5' and 3' ends.

Models of Genome Evolution - It has recently become clear that, besides nucleotide substitutions, insertions and deletions of short pieces of DNA as well as insertions of repetitive elements have a substantial influence on the evolution of GC isochores in mammals.

We have shown that simple expansion randomization systems (ERSs) are able to generate long-range correlations of the GC content, which are one of the hallmarks of isochores. A wide range of such ERSs fall within one universality class, and the characteristic decay exponent of the correlation function can easily be calculated from the rates of the underlying processes. This result also gives us a simple method to simulate long-range correlated sequences, and recently we were able to quantify the influence of such correlations on the alignment statistics of sequences, which turned out to be quite substantial. Corresponding corrections should be taken into account when calculating p-values for the alignment of genomic sequences. At the basis of expansion randomization systems are processes that duplicate, insert, or delete segments of a sequence.
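A minimal sketch of one possible expansion randomization system is given below, assuming a two-letter alphabet (1 for G/C, 0 for A/T), single-site duplications as the only expansion process, and uniform redrawing of a site as the randomization process. The function names, the parameters `dup_rate` and `mut_rate`, and the event scheme are assumptions chosen for illustration; the models studied by the group may differ in their details.

```python
import random


def simulate_ers(target_length, dup_rate, mut_rate, seed=0):
    """Grow a binary sequence by site duplication and random mutation.

    Every site duplicates at rate dup_rate and is redrawn uniformly from
    {0, 1} at rate mut_rate, so each event picks a uniformly random site and
    is a duplication with probability dup_rate / (dup_rate + mut_rate).
    """
    if dup_rate <= 0:
        raise ValueError("dup_rate must be positive for the sequence to grow")
    rng = random.Random(seed)
    seq = [rng.randint(0, 1)]
    p_dup = dup_rate / (dup_rate + mut_rate)
    while len(seq) < target_length:
        i = rng.randrange(len(seq))
        if rng.random() < p_dup:
            seq.insert(i, seq[i])       # expansion: copy the site next to itself
        else:
            seq[i] = rng.randint(0, 1)  # randomization: forget the old letter
    return seq


def gc_correlation(seq, lag):
    """Covariance of the GC indicator between sites separated by `lag`."""
    n = len(seq) - lag
    mean = sum(seq) / len(seq)
    return sum((seq[i] - mean) * (seq[i + lag] - mean) for i in range(n)) / n


sequence = simulate_ers(2000, dup_rate=1.0, mut_rate=0.1, seed=1)
correlations = [gc_correlation(sequence, lag) for lag in (1, 2, 4, 8, 16, 32)]
```

Plotting `correlations` against the lag on a log-log scale, averaged over many seeds, is one way to inspect how the correlation decays for a given ratio of the two rates.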

Comparing the human genome to those of its closest relatives, the chimpanzee and the rhesus monkey, gave us the opportunity to investigate instances of nucleotide insertions and deletions on small scales and to quantify their rates in the genomic context. In the future we want to extend our analysis to include the insertion of repetitive elements into vertebrate genomes. In particular we want to understand the quantitative differences in variations of the GC-content between hominoids and rodents. Ultimately we will generate a much richer null model of genomic evolution, especially for the evolution of promoter regions, which often include multiple binding sites for the same transcription factors.

In Vitro Selection - Advances in next-generation sequencing technologies give us a novel tool for the quantitative analysis of Systematic Evolution of Ligands by Exponential Enrichment (SELEX) experiments. Such experiments are conducted in close collaboration with the Glökler group (Dept. Lehrach). Starting from a highly diverse pool of DNA sequences, ligands to particular molecules, e.g. transcription factors or other molecules of cellular relevance, are enriched through successive rounds of selection. In-house sequencing capabilities give us the opportunity to sequence the DNA pools after each round of selection. This way we are going to study the dynamics of selection for strongly binding ligands against a highly diverse background of unspecific ligands. Since very high diversities can be charted using Illumina sequencing, we will also be able to study non-dominant secondary clones and follow the dynamics of their frequency in the population over the rounds of selection. New approaches to cluster and analyze the clonal structure of synthetic sequence pools have to be developed.
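Following clone frequencies over rounds of selection can be sketched as below. This is a deliberately simplified illustration: it treats every distinct read as its own clone (exact matching), whereas real pools would need the clustering approaches mentioned above to handle sequencing errors and near-identical variants. The function names and the toy reads are hypothetical.

```python
from collections import Counter


def clone_frequencies(reads):
    """Relative frequency of each distinct sequence in one sequenced pool."""
    counts = Counter(reads)
    total = sum(counts.values())
    return {seq: n / total for seq, n in counts.items()}


def frequency_trajectories(rounds):
    """Map each sequence to its frequency in every round (0.0 when absent).

    `rounds` is a list of read lists, one per selection round, in order.
    """
    per_round = [clone_frequencies(reads) for reads in rounds]
    clones = set().union(*per_round)
    return {seq: [freqs.get(seq, 0.0) for freqs in per_round] for seq in clones}


# Toy pools: "AAA" takes over, "CCC" persists as a secondary clone.
rounds = [
    ["AAA", "CCC", "GGG", "TTT"],
    ["AAA", "AAA", "CCC", "GGG"],
    ["AAA", "AAA", "AAA", "CCC"],
]
trajectories = frequency_trajectories(rounds)
```

In this toy example `trajectories["AAA"]` rises from round to round while `trajectories["TTT"]` drops to zero after the first round.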

Mathematics of Evolutionary Models - Markov models describing the evolution of the nucleotide substitution process, widely used in phylogeny reconstruction, usually assume stationarity and time reversibility. Although these models give meaningful results when applied to biological data, it is not clear whether the two assumptions hold and, if not, how much sequence evolution processes deviate from them. To this end, we introduced two sets of indices that can be calculated from the nucleotide distribution and the substitution rates. The stationarity indices (STIs) can be used to test the validity of the equilibrium assumption. The irreversibility indices (IRIs) are derived from the Kolmogorov cycle conditions for time reversibility and quantify the degree of non-time-reversibility of a process. Computations of these indices for genomic nucleotide substitutions in Drosophila simulans and Homo sapiens reveal statistically significant deviations from the ideal case of a process that has reached stationarity and is time reversible.
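The Kolmogorov cycle conditions state that a Markov process is time reversible exactly when, around every closed loop of states, the product of the rates in one direction equals the product in the opposite direction. The sketch below checks this for a substitution rate matrix by looking at all three-state cycles, which is sufficient when all off-diagonal rates are positive. The function `max_cycle_imbalance` and its log-ratio measure are assumptions made for illustration; they are not the IRIs defined by the group.

```python
import math
from itertools import combinations


def max_cycle_imbalance(q):
    """Largest |log(forward / backward)| over all three-state cycles of q.

    q[i][j] is the rate of substitution from state i to state j; diagonal
    entries are ignored and all off-diagonal rates are assumed positive.
    The result is 0 (up to rounding) for a time-reversible process.
    """
    worst = 0.0
    for i, j, k in combinations(range(len(q)), 3):
        forward = q[i][j] * q[j][k] * q[k][i]
        backward = q[i][k] * q[k][j] * q[j][i]
        worst = max(worst, abs(math.log(forward / backward)))
    return worst


# A reversible example: symmetric exchangeabilities times target frequencies.
pi = [0.1, 0.2, 0.3, 0.4]
s = [[0, 1, 2, 3], [1, 0, 4, 5], [2, 4, 0, 6], [3, 5, 6, 0]]
reversible = [[s[i][j] * pi[j] for j in range(4)] for i in range(4)]

# An irreversible example: one rate is doubled without its reverse partner.
irreversible = [[1.0] * 4 for _ in range(4)]
irreversible[0][1] = 2.0

imbalance_rev = max_cycle_imbalance(reversible)
imbalance_irr = max_cycle_imbalance(irreversible)
```

Here `imbalance_rev` is zero up to floating-point rounding, while `imbalance_irr` equals log 2 because the cycles through the doubled rate are twice as likely in one direction as in the other.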

Phenotypic Mutations - Recent studies have hinted at the importance of phenotypic mutations (errors made in transcription and translation) in molecular evolution. These are thought to facilitate positive selection for adaptations that require multiple substitutions, but the generality of this phenomenon has yet to be explored.

Our research in this area focuses on the importance of phenotypic mutations to negative selection and to the maintenance of genomic robustness by selective constraint. We initially approached this in the context of nonsense-mediated decay (NMD)-based surveillance of human gene transcription. We have discovered a pattern of codon usage in human genes that compensates for the variable NMD efficiency by minimizing nonsense errors during transcription. Our future work will focus on whether phenotypic mutations due to other types of mis-transcription constitute a similar selective force.
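To illustrate how codon choice can affect the chance of a nonsense error, the sketch below counts, for each codon, how many of its nine single-nucleotide changes produce a stop codon in the standard genetic code, and averages this over a coding sequence. It assumes all single-nucleotide errors are equally likely; the function names and the averaging are illustrative choices, not the measure used in the study described above.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code, DNA alphabet


def nonsense_neighbors(codon):
    """Number of single-nucleotide changes that turn `codon` into a stop codon."""
    count = 0
    for pos in range(3):
        for base in "ACGT":
            if base == codon[pos]:
                continue
            mutant = codon[:pos] + base + codon[pos + 1:]
            if mutant in STOP_CODONS:
                count += 1
    return count


def mean_nonsense_exposure(cds):
    """Average nonsense_neighbors over the sense codons of an in-frame sequence."""
    codons = [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]
    sense = [c for c in codons if c not in STOP_CODONS]
    if not sense:
        return 0.0
    return sum(nonsense_neighbors(c) for c in sense) / len(sense)


# Synonymous arginine codons differ: CGA is one error away from TGA, CGC is not.
exposure_cga = nonsense_neighbors("CGA")
exposure_cgc = nonsense_neighbors("CGC")
```

Because synonymous codons such as CGA and CGC differ in this count, two sequences encoding the same protein can differ in how often a single transcription error introduces a premature stop.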