
Comparative testing of DNA segmentation algorithms using benchmark simulations

E. Elhaik, D. Graur and K. Josić

Numerous segmentation methods for the detection of compositionally homogeneous domains within genomic sequences have been proposed. Unfortunately, these methods yield inconsistent results. To address this problem, we present a benchmark consisting of two sets of simulated genomic sequences for testing the performance of segmentation algorithms. Sequences in the first set are composed of fixed-sized homogeneous domains, distinct in their between-domain GC content variability. The chromosomal sequences in the second set are composed of a mosaic of many short domains and few long domains, distinguished by sharp GC content boundaries between neighboring ones. We use these benchmarks to test the performance of seven previously proposed segmentation algorithms. Our results show that recursive segmentation algorithms based on the Jensen-Shannon divergence outperform all other algorithms. However, even these algorithms perform poorly in certain instances because of the arbitrary choice of a segmentation stopping criterion.

Contact me for a reprint

______________________________________________________________________________________________

Current Address: Department of Mathematics, PGH Building, University of Houston, Houston, Texas 77204-3008
Phone: (713) 743-3500 - Fax: (713) 743-3505


Image designed by Graham Johnson, Graham Johnson Medical Media, Boulder, Colorado