Abstract

Sequence comparison in molecular biology is at the beginning of a major paradigm shift: a shift from gene comparison based on local mutations (i.e., insertions, deletions, and substitutions of nucleotides) to chromosome comparison based on global rearrangements (i.e., inversions and transpositions of fragments). The classical methods of sequence comparison do not work for global rearrangements, and little is known in computer science about the edit distance between sequences if global rearrangements are allowed. In its simplest form, the problem of gene rearrangements corresponds to sorting by reversals, i.e., sorting an array using reversals of arbitrary fragments. Recently, Kececioglu and Sankoff gave the first approximation algorithm for sorting by reversals with guaranteed error bound 2 and identified open problems related to chromosome rearrangements. One of these problems is Gollan’s conjecture on the reversal diameter of the symmetric group. This paper proves the conjecture. Further, the problem of expected reversal distance between two random permutations is investigated. The reversal distance between two random permutations is shown to be very close to the reversal diameter, thereby indicating that reversal distance provides a good separation between related and nonrelated sequences in molecular evolution studies. The gene rearrangement problem forces us to consider reversals of signed permutations, as the genes in DNA could be positively or negatively oriented. An approximation algorithm for signed permutations is presented, which provides a performance guarantee of $\tfrac{3}{2}$. Finally, using the signed permutations approach, an approximation algorithm for sorting by reversals is described which achieves a performance guarantee of $\tfrac{7}{4}$.
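
To make the terms above concrete, the following is a minimal sketch of a reversal, the reversal distance, and the reversal diameter for very small unsigned permutations. It uses exhaustive breadth-first search, which is an assumption made purely for illustration: it is not one of the approximation algorithms the abstract describes, and it is feasible only for short permutations. The function names (`reverse_segment`, `reversal_distances`, `reversal_distance`, `reversal_diameter`) are hypothetical.

```python
from collections import deque


def reverse_segment(perm, i, j):
    """Return a copy of perm with the fragment perm[i..j] (inclusive) reversed."""
    return perm[:i] + perm[i:j + 1][::-1] + perm[j + 1:]


def reversal_distances(n):
    """Map every permutation of 1..n to its reversal distance from the identity.

    Brute-force breadth-first search over all n! permutations; illustration only.
    """
    identity = tuple(range(1, n + 1))
    dist = {identity: 0}
    queue = deque([identity])
    while queue:
        perm = queue.popleft()
        # Try every reversal of a fragment of length at least 2.
        for i in range(n):
            for j in range(i + 1, n):
                nxt = reverse_segment(perm, i, j)
                if nxt not in dist:
                    dist[nxt] = dist[perm] + 1
                    queue.append(nxt)
    return dist


def reversal_distance(perm):
    """Minimum number of reversals that sort perm (a permutation of 1..n)."""
    return reversal_distances(len(perm))[tuple(perm)]


def reversal_diameter(n):
    """Largest reversal distance over all permutations of 1..n."""
    return max(reversal_distances(n).values())
```

Because each reversal undoes itself, the distance from the identity to a permutation equals the number of reversals needed to sort it, so a single search from the identity suffices. For example, `reversal_distance([3, 2, 1])` is 1, since reversing the whole array sorts it.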

MSC codes

  1. 68Q25
  2. 68Q05

Keywords

  1. computational molecular biology
  2. sorting by reversals
  3. genome rearrangements

References

AW87.
Martin Aigner, Douglas B. West, Sorting by insertion of leading elements, J. Combin. Theory Ser. A, 45 (1987), 306–309
ABSR89.
Nancy Amato, Manuel Blum, Sandra Irani, Ronitt Rubinfeld, Reversing trains: a turn of the century sorting problem, J. Algorithms, 10 (1989), 413–428
BP95.
V. Bafna, P. Pevzner, Sorting by reversals: Genome rearrangements in plant organelles and evolutionary history of X chromosome, Molec. Biol. Evolution, 12 (1995), 239–246
CLR90.
Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, Introduction to algorithms, The MIT Electrical Engineering and Computer Science Series, MIT Press, Cambridge, MA, 1990, xx+1028 pp.
EG81.
S. Even, O. Goldreich, The minimum-length generator sequence problem is NP-hard, J. Algorithms, 2 (1981), 311–313
FH89.
C. Fauron, M. Havlik, The maize mitochondrial genome of the normal type and the cytoplasmic male sterile type T have very different organization, Current Genetics, 15 (1989), 149–154
GP79.
William H. Gates, Christos H. Papadimitriou, Bounds for sorting by prefix reversal, Discrete Math., 27 (1979), 47–57
GT78.
E. Gyori, G. Turan, Stack of pancakes, Studia Sci. Math. Hungar., 13 (1978), 133–137 (1981)
HCKP95.
S. Hannenhalli, C. Chappey, E. Koonin, P. Pevzner, Genome sequence comparison and scenarios for genome rearrangement: A test case, Genomics (1995)
HBB92.
R. J. Hoffmann, J. L. Boore, W. M. Brown, A novel mitochondrial genome organization for the blue mussel, Mytilus edulis, Genetics, 131 (1992), 397–412
HK73.
John E. Hopcroft, R. Karp, An $n^{5/2}$ algorithm for maximum matchings in bipartite graphs, SIAM J. Comput., 2 (1973), 225–231
HP94.
S. B. Hoot, J. D. Palmer, Structural rearrangements including parallel inversions within the chloroplast genome of anemone and related genera, J. Molec. Evolution, 38 (1994), 274–281
J85.
Mark R. Jerrum, The complexity of finding minimum-length generator sequences, Theoret. Comput. Sci., 36 (1985), 265–289
KMS94.
S. Karlin, E. S. Mocarski, G. A. Schachtel, Molecular evolution of herpesviruses: Genomic and protein sequence comparisons, J. Virology, 68 (1994), 1886–1902
KS93.
John Kececioglu, David Sankoff, Exact and approximation algorithms for the inversion distance between two chromosomes, in Combinatorial pattern matching (Padova, 1993), A. Apostolico, M. Crochemore, Z. Galil, U. Manber, eds., Lecture Notes in Comput. Sci., Vol. 684, Springer, Berlin, 1993, 87–105, Proc. 4th Annual Symposium, Italy
J. Kececioglu, D. Sankoff, Exact and approximation algorithms for sorting by reversals, with application to genome rearrangement, Algorithmica, 13 (1995), 180–210
KS94.
John Kececioglu, David Sankoff, Efficient bounds for oriented chromosome inversion distance, in Combinatorial pattern matching (Asilomar, CA, 1994), M. Crochemore, D. Gusfield, eds., Lecture Notes in Comput. Sci., Vol. 807, Springer, Berlin, 1994, 307–325, Proc. 5th Annual Symposium
KDP93.
E. B. Knox, S. R. Downie, J. D. Palmer, Chloroplast genome rearrangements and evolution of giant lobelias from herbaceous ancestors, Molec. Biol. Evolution, 10 (1993), 414–430
K93.
E. V. Koonin, V. V. Dolja, Evolution and taxonomy of positive-strand RNA viruses: Implications of comparative analysis of amino acid sequences, Crit. Rev. Biochem. Molec. Biol., 28 (1993), 375–430
K68.
Anton Kotzig, Moves without forbidden transitions in a graph, Mat. Časopis Sloven. Akad. Vied, 18 (1968), 76–80
L88.
M. F. Lyon, X-chromosome inactivation and the location and expression of X-linked genes, Amer. J. Human Genetics, 42 (1988), 8–16
PH87.
J. D. Palmer, L. Herbon, Unicircular structure of the Brassica hirta mitochondrial genome, Current Genetics, 11 (1987), 565–570
PH88.
J. D. Palmer, L. Herbon, Plant mitochondrial DNA evolves rapidly in structure, but slowly in sequence, J. Molec. Evolution, 27 (1988), 65–74
P94.
P. Pevzner, DNA physical mapping and alternating Eulerian cycles in colored graphs, Algorithmica, 13 (1995), 77–105
RJ92.
L. A. Raubeson, R. K. Jansen, Chloroplast DNA evidence on the ancient evolutionary split in vascular land plants, Science, 255 (1992), 1697–1699
SLA92.
D. Sankoff, G. Leduc, N. Antoine, B. Paquin, B. F. Lang, R. Cedergren, Gene order comparisons for phylogenetic inference: Evolution of the mitochondrial genome, Proc. Natl. Acad. Sci. U.S.A., 89 (1992), 6575–6579
S93.
D. Sankoff, Analytical approaches to genomic evolution, Biochimie, 75 (1993), 409–413
WPFJ89.
J. Whiting, M. Pliley, J. Farmer, D. Jeffery, In situ hybridization analysis of chromosomal homologies in Drosophila melanogaster and Drosophila virilis, Genetics, 122 (1989), 99–109
WEHM82.
G. A. Watterson, W. J. Ewens, T. E. Hall, A. Morgan, The chromosome inversion problem, J. Theoret. Biol., 99 (1982), 1–7

Published In

SIAM Journal on Computing
Pages: 272–289
ISSN (online): 1095-7111

History

Submitted: 17 June 1993
Accepted: 22 August 1994
Published online: 13 July 2006
