Probabilistic Programming for Birth-Death Models of Evolution
Abstract: Phylogenetic birth-death models constitute a family of generative models of evolution. In these models, an evolutionary process starts with a single species at a certain time in the past, and speciations (the splitting of one species into two descendant species) and extinctions are modeled as events of non-homogeneous Poisson processes. Different birth-death models admit different types of changes to the speciation and extinction rates.

The result of an evolutionary process is a binary tree called a phylogenetic tree, or phylogeny, in which the root represents the single species at the origin, internal nodes represent speciation events, and leaves represent either currently living (extant) species at the present time or extinction events in the past. Usually only a part of this tree, corresponding to the evolution of the extant species and their ancestors, is known, via reconstruction from, e.g., genomic sequences of these extant species.

The task of interest is to estimate the parameters of birth-death models given this reconstructed tree as the observation. While encoding generative birth-death models as computer programs is straightforward, developing and implementing bespoke inference algorithms is not. This complicates prototyping, development, and deployment of new birth-death models.

Probabilistic programming is a new approach in which generative models are encoded as computer programs in languages that include support for random variables, conditioning on observed data, and automatic inference. This thesis is based on a collection of papers in which we demonstrate how to use probabilistic programming to solve the above-mentioned task of parameter inference in birth-death models. We show how these models can be implemented as simple programs in probabilistic programming languages. Our contribution also includes general improvements of the automatic inference methods.
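To illustrate the claim that the generative side is easy to encode, here is a minimal sketch in plain Python. It is not taken from the thesis. It assumes the simplest special case, in which the speciation and extinction rates are constant over time (a homogeneous process), and it only simulates forward; it performs no inference. The names `simulate_birth_death`, `count_nodes`, `birth_rate`, and `death_rate` are hypothetical and chosen for this example only.

```python
import random


def simulate_birth_death(age, birth_rate, death_rate, rng):
    """Simulate one lineage from `age` time units before the present to time 0.

    Assumption: constant rates (the homogeneous special case).
    Returns a nested dict describing a binary tree:
      {"kind": "extant", "time": 0.0}           leaf alive at the present
      {"kind": "extinct", "time": t}            leaf that died t units ago
      {"kind": "speciation", "time": t, "children": [left, right]}
    """
    total_rate = birth_rate + death_rate
    # The waiting time to the next event is exponential with the total rate.
    event_time = age - rng.expovariate(total_rate)
    if event_time <= 0:
        # No event happened before the present: the species is extant.
        return {"kind": "extant", "time": 0.0}
    # The event is a speciation with probability birth_rate / total_rate.
    if rng.random() < birth_rate / total_rate:
        children = [
            simulate_birth_death(event_time, birth_rate, death_rate, rng)
            for _ in range(2)
        ]
        return {"kind": "speciation", "time": event_time, "children": children}
    return {"kind": "extinct", "time": event_time}


def count_nodes(tree, kind):
    """Count the nodes of the given kind in a simulated tree."""
    own = 1 if tree["kind"] == kind else 0
    return own + sum(count_nodes(c, kind) for c in tree.get("children", []))


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is reproducible
    tree = simulate_birth_death(5.0, birth_rate=1.0, death_rate=0.5, rng=rng)
    for kind in ("speciation", "extant", "extinct"):
        print(kind, count_nodes(tree, kind))
```

The simulated tree is the complete phylogeny, including extinct lineages. The observation described in the abstract is only the part relating the extant species, so in a probabilistic programming language the same kind of program would additionally condition on the reconstructed tree and leave the inference of the rates to the language's automatic inference.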