Numerical Algorithms for Mapping of Multiple Quantitative Trait Loci in Experimental Populations

University dissertation from Uppsala : Acta Universitatis Upsaliensis

Abstract: Most traits of medical or economic importance are quantitative, i.e., they can be measured on a continuous scale. Strong biological evidence indicates that quantitative traits are governed by a complex interplay between the environment and multiple quantitative trait loci (QTL) in the genome. Nonlinear interactions make it necessary to search for several QTL simultaneously. This thesis concerns numerical methods for QTL search in experimental populations. The core computational problem in a statistical analysis of such a population is a multidimensional global optimization problem with many local optima. A simultaneous search for d QTL involves solving a d-dimensional problem, where each evaluation of the objective function requires solving one or several least squares problems with special structure. With standard software, even a two-dimensional search is costly, and searches in higher dimensions are prohibitively slow.

Three efficient algorithms for evaluating the most common forms of the objective function are presented. First, the computing time for the linear regression method is reduced by up to one order of magnitude for real data examples by using a new scheme based on updated QR factorizations. Second, the objective function for the interval mapping method is evaluated using an updating technique and an efficient iterative method, which results in a 50 percent reduction in computing time. Finally, a third algorithm, applicable to the imputation and weighted linear mixture model methods, is presented. It reduces the computing time by between one and two orders of magnitude.

The global search problem is also investigated. Standard software techniques for finding the global optimum of the objective function are compared with a new approach based on the DIRECT algorithm. The new method is more accurate than the previously fastest scheme and locates the optimum in one to two orders of magnitude less time. The method is further developed by coupling DIRECT to a local optimization algorithm for accelerated convergence, which reduces the computing time further by up to a factor of eight. A parallel grid computing implementation of exhaustive search is also presented. It is suitable, e.g., for verifying global optima when developing efficient optimization algorithms tailored for the QTL mapping problem.

Using the algorithms presented in this thesis, simultaneous search for at least six QTL can be performed routinely. The decrease in overall computing time is several orders of magnitude. The results imply that computations which were earlier considered impossible are no longer difficult, and that genetic researchers are thus free to focus on model selection and other central genetic issues.
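
To illustrate the structure of the search problem described in the abstract, the following is a minimal sketch of an exhaustive search in which every evaluation of the objective function is a least squares fit. It is not the thesis implementation: the function names `rss` and `exhaustive_search`, the genotype indicator matrix `G`, and the synthetic data are all assumptions made for illustration, and the model contains no interaction terms. None of the speedups described above (updated QR factorizations, DIRECT) are used here.

```python
import itertools
import numpy as np

def rss(y, columns):
    """Residual sum of squares of a least squares fit of y
    on an intercept plus the given regressor columns."""
    X = np.column_stack([np.ones(len(y))] + list(columns))
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ coef
    return float(r @ r)

def exhaustive_search(y, G, d):
    """Evaluate every combination of d distinct candidate positions
    (columns of G) and return the combination with the smallest RSS."""
    best_rss, best_pos = np.inf, None
    for pos in itertools.combinations(range(G.shape[1]), d):
        value = rss(y, [G[:, j] for j in pos])
        if value < best_rss:
            best_rss, best_pos = value, pos
    return best_pos, best_rss

# Synthetic data (assumption): 200 individuals, 12 candidate positions,
# with the trait generated from positions 3 and 8 plus small noise.
rng = np.random.default_rng(0)
n, p = 200, 12
G = rng.integers(0, 2, size=(n, p)).astype(float)
y = 1.5 * G[:, 3] - 2.0 * G[:, 8] + 0.1 * rng.standard_normal(n)

best_pos, best_rss = exhaustive_search(y, G, d=2)
print(best_pos, best_rss)
```

The number of objective evaluations in this sketch grows combinatorially with d, and each evaluation solves a full least squares problem from scratch. This is the cost that makes higher-dimensional searches with standard software so slow.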