Inferring Evolutionary Processes of Humans

University dissertation from Uppsala : Acta Universitatis Upsaliensis

Abstract: Improvements in DNA sequencing technologies have made ever more human genomic data available in recent years. These data provide abundant information on genetic variation, an important resource for understanding the evolutionary history of humans. In this thesis I first evaluated the performance of the Approximate Bayesian Computation (ABC) approach for inferring demographic parameters from large-scale population genomic data. Based on simulation results, I conclude that the ABC approach will continue to be a useful tool for analysing realistic genome-wide population-genetic data in the post-genomic era. Secondly, I used the ABC approach to estimate the prehistoric events connected with the “Bantu expansion”, the spread of peoples from West Africa. The analysis, based on genetic data with a large number of loci, supports rapid population growth in West Africans, which led to their concomitant spread to southern and eastern Africa. Contrary to hypotheses based on language studies, I found that Bantu-speakers in southern Africa likely migrated directly from West Africa, and not from East Africa. Thirdly, I evaluated Thomson's estimator of the time to the most recent common ancestor (TMRCA). It is robust to different recombination rates and is the least biased compared to other commonly used approaches. I used the Thomson estimator to infer the genome-wide distribution of TMRCA for complete human genome sequence data in various populations from across the world and compared the results to simulated data. Finally, I investigated the effects of selection and demography on patterns of genetic polymorphism. In particular, I could detect a clear signal of selection in the distribution of TMRCA for a constant-size population. However, if the population is growing, the signal of selection can be difficult to detect under some circumstances. I also discuss and offer a few suggestions that might lead to a more realistic path toward successfully identifying genes targeted by selection in large-scale genomic data.
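
To make the Thomson estimator mentioned in the abstract concrete, the following is a minimal sketch and not the implementation used in the thesis. It assumes the usual form of the estimator: the average number of derived mutations carried by each sampled sequence, divided by the mutation rate of the whole locus. The function name `thomson_tmrca`, its parameters, and the example values are hypothetical. The sketch also assumes that derived mutations have already been counted against an inferred ancestral sequence (for example by using an outgroup) and that the mutation rate is given per site per generation, so the result is in generations.

```python
from typing import Sequence


def thomson_tmrca(derived_counts: Sequence[int],
                  mu_per_site: float,
                  n_sites: int) -> float:
    """Estimate the TMRCA (in generations) of a non-recombining locus.

    derived_counts: for each sampled sequence, the number of sites at which
        it carries a derived (non-ancestral) allele. Hypothetical input; the
        ancestral state is assumed to be known.
    mu_per_site: assumed mutation rate per site per generation.
    n_sites: number of sites in the locus.
    """
    if len(derived_counts) == 0:
        raise ValueError("at least one sequence is required")
    locus_rate = mu_per_site * n_sites  # expected mutations per generation
    if locus_rate <= 0:
        raise ValueError("the locus mutation rate must be positive")
    mean_derived = sum(derived_counts) / len(derived_counts)
    return mean_derived / locus_rate


# Illustrative values only (not data from the thesis): four sequences,
# a 10 kb locus, and an assumed rate of 1.25e-8 per site per generation.
estimate = thomson_tmrca([3, 5, 4, 4], mu_per_site=1.25e-8, n_sites=10_000)
print(f"Estimated TMRCA: {estimate:.0f} generations")
```

Applying such a function to many loci along the genome would give a distribution of TMRCA estimates, which is the kind of genome-wide summary the abstract describes.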
