Whole-genome sequencing of many species has given us the opportunity to deduce the evolutionary relationships among the individual nucleotides of their genomes. Determining all such relationships is the problem of multiple whole-genome alignment. Most previous work on whole-genome alignment has focused on the pairwise case and on the string pattern-matching aspect of the problem. However, completely describing and determining the evolution of nucleotides in multiple genomes requires refined definitions as well as algorithms that go beyond pattern matching. This thesis addresses these issues by introducing new evolutionary terms and describing novel methods for alignment at both the whole-genome and nucleotide levels.

Precise definitions of the evolutionary relationships between nucleotides, presented at the beginning of this work, provide the framework within which our methods for genome alignment are described. We explain alignment polytopes and how they can be used to ascertain the sensitivity of alignments to parameter values. For the problem of aligning multiple whole genomes, this work presents a method for constructing orthology maps, high-level mappings between genomes that can be used to guide nucleotide-level alignments. Combining our methods for orthology mapping and alignment polytope determination, we construct a parametric alignment of two whole fruit fly genomes, which describes the alignment of the two genomes for all possible parameter values. We demonstrate the usefulness of whole-genome and parametric alignments in comparative genomics through studies of cis-regulatory element evolution and phylogenetic tree reconstruction.




