Exact inference in Bayesian networks and applications in forensic statistics

Abstract: Bayesian networks (BNs) are commonly used when describing and analyzing relationships between interacting variables. Approximate methods for performing calculations on BNs are widely used and well developed. Methods for performing exact calculations on BNs also exist but are not always considered, partly because these methods demand strong restrictions on the structure of the BN. Part of this thesis focuses on developing methods for exact calculations in order to make them applicable to larger classes of BNs. More specifically, we study the variable elimination (VE) algorithm, which traditionally can only be applied to finite BNs, Gaussian BNs, and combinations of these two types. We argue that, when implementing the VE algorithm, it is important to properly define a set of factors that represents the conditional probability distributions of the BN in a suitable way. Furthermore, one should strive to define this factor set in such a way that it is closed under the local operations performed by the algorithm: reduction, multiplication, and marginalization. For situations where this is not possible, we suggest a new version of the VE algorithm, which is recursive and makes use of numerical integration. We exemplify the use of this new version by implementing it for $\Gamma$-Gaussian BNs, i.e., Gaussian BNs in which the precision of Gaussian variables can be modeled with gamma-distributed variables.

Bayesian networks are widely used within forensic statistics, especially within familial relationship inference. In this field, one uses DNA data and knowledge about genetic inheritance to calculate probabilities of familial relationships. Doing this requires not only DNA from the people under investigation, but also database information about population allele frequencies. The possibility of mutations makes these calculations harder, and it is important to employ a reasonable mutation model to make the calculations precise. We argue that many existing mutation models alter the population frequencies, which is both a mathematical nuisance and a potential problem when results are interpreted. As a solution to this, we suggest several methods for stabilizing mutation models, i.e., tuning them so that they no longer alter the population frequencies.
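
To make the notion of a mutation model "altering the population frequencies" concrete, the following minimal sketch checks whether a vector of allele frequencies is left unchanged by one generation of mutation. It is an illustration under stated assumptions, not one of the stabilization methods proposed in the thesis: the function names, the three-allele frequencies, and the mutation rate 0.01 are all hypothetical, and the two matrices are simple textbook-style constructions (sometimes referred to as "equal" and "proportional" models).

```python
def equal_model(n, rate):
    """Each allele mutates with probability `rate`, spread evenly over the other n-1 alleles."""
    off = rate / (n - 1)
    return [[1.0 - rate if i == j else off for j in range(n)] for i in range(n)]


def proportional_model(freqs, k):
    """Off-diagonal entry (i, j) is k * freqs[j]; the diagonal makes each row sum to 1."""
    n = len(freqs)
    return [
        [1.0 - k * (1.0 - freqs[i]) if i == j else k * freqs[j] for j in range(n)]
        for i in range(n)
    ]


def next_generation(freqs, m):
    """Allele frequencies after one round of mutation: the row vector freqs times m."""
    n = len(freqs)
    return [sum(freqs[i] * m[i][j] for i in range(n)) for j in range(n)]


def is_stationary(freqs, m, tol=1e-12):
    """True if the mutation matrix m leaves the frequencies unchanged (up to tol)."""
    return all(abs(a - b) <= tol for a, b in zip(freqs, next_generation(freqs, m)))


# Hypothetical allele frequencies for a locus with three alleles.
freqs = [0.6, 0.3, 0.1]

print(is_stationary(freqs, equal_model(3, 0.01)))
print(is_stationary(freqs, proportional_model(freqs, 0.01)))
```

With these illustrative numbers, the evenly spread model shifts the frequencies toward uniformity (the first allele drops from 0.6 to 0.596), so the check fails, whereas the frequency-weighted construction leaves them unchanged. A stabilized model, in the sense described above, is one for which this check passes.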
