Combinatorics of genome rearrangements and phylogeny

Author: Niklas Eriksen; Kth; []

Keywords: ;

Abstract: This thesis deals with combinatorial problems taken frombioinformatics. In particular, we study the problem ofinferring distances between bacterial species by looking attheir respective gene orders. We regard one of the gene ordersas a permutation of the other. Given a set of valid operations,we seek the most parsimonious way to sort this permutation. Wealso look at the more complex problem of combining a set ofspecies into a phylogenetic tree, which shows the relationshipsbetween all species.The computer program Derange II by Blanchette andSanko® uses a greedy algorithm to estimate theevolutionary distance between two species. The success dependson a set of weights, which may be specified by the user. Wehave examined which weights are optimal, and also the qualityof this program using optimal weights.Derange II has been extended to solve the medianproblem, that is finding the permutation that is closest tothree other permutations. We then use this new version to buildphylogenetic trees directly from gene order permutations. Insome situations, this new method works much better thanprevious methods.There is an analytical expression for the evolutionarydistance between two species if the set of allowed operationsincludes only inversions (reversing a segment of genes).Allowing transpositions (swapping two adjacent segments) aswell, we have found a (1+")-approximation for this distance,where we have weighted the di®erent operations accordingto our results on the Derange II weights.