Vsevolod Makeev, Engel'hardt Institute of Molecular Biology, Moscow
Segmentation of DNA sequences into blocks with uniform composition
We consider a new, straightforward approach to segmenting DNA into compositionally homogeneous blocks. The exact optimal segmentation is found by dynamic programming. The homogeneity measure is obtained from a Bayesian estimator, which is applicable to short blocks and yields the Jensen-Shannon information score in the long-block limit. Once segmentation is complete, the hierarchy of segments can be analyzed by filtering the boundaries using a partition function approach. We present the results of segmenting complete eukaryotic chromosome sequences and demonstrate how different genetic elements (repeats, genes, etc.) are revealed.
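To illustrate the dynamic programming step, here is a minimal sketch and not the authors' implementation. It assumes a specific block score that the abstract does not spell out: the log marginal likelihood of a block under a uniform Dirichlet prior over the four nucleotide frequencies. This is one possible Bayesian estimator that stays well defined for short blocks and whose gain from splitting a long block approaches, to leading order, the block length times the Jensen-Shannon divergence. The names `prefix_counts`, `block_score`, `segment`, and the optional `boundary_penalty` parameter are hypothetical.

```python
from math import lgamma

ALPHABET = "ACGT"  # assumption: input contains only these four symbols


def prefix_counts(seq):
    """counts[j][a] = occurrences of ALPHABET[a] in seq[:j]."""
    counts = [[0] * len(ALPHABET)]
    for ch in seq:
        row = counts[-1][:]
        row[ALPHABET.index(ch)] += 1  # raises ValueError on other symbols
        counts.append(row)
    return counts


def block_score(counts, i, j):
    """Log marginal likelihood of seq[i:j] under a uniform Dirichlet prior
    (assumed score, chosen for illustration)."""
    k = len(ALPHABET)
    score = lgamma(k) - lgamma((j - i) + k)
    for a in range(k):
        score += lgamma(counts[j][a] - counts[i][a] + 1)
    return score


def segment(seq, boundary_penalty=0.0):
    """Exact optimal segmentation by O(n^2) dynamic programming.

    Returns (sorted boundary positions, total score). A boundary at
    position p separates seq[:p] from seq[p:].
    """
    n = len(seq)
    counts = prefix_counts(seq)
    best = [0.0] + [float("-inf")] * n  # best[j]: best score for seq[:j]
    back = [0] * (n + 1)                # back[j]: start of the last block
    for j in range(1, n + 1):
        for i in range(j):
            penalty = boundary_penalty if i > 0 else 0.0
            s = best[i] + block_score(counts, i, j) - penalty
            if s > best[j]:
                best[j] = s
                back[j] = i
    # Trace back the block starts to recover the boundaries.
    bounds = []
    j = n
    while j > 0:
        i = back[j]
        if i > 0:
            bounds.append(i)
        j = i
    return sorted(bounds), best[n]
```

For example, `segment("A" * 30 + "C" * 30)` places a single boundary at position 30, while a uniform sequence such as `"A" * 20` is left unsplit. The quadratic running time makes this sketch suitable only for short sequences; applying the idea to whole chromosomes would need additional restrictions or optimizations that are not shown here.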