On Bicompositional Correlation

Abstract: A composition is a vector of positive components summing to a constant, usually taken to be 1. Research on compositional correlation has hitherto focused mainly on the correlation between the components of a single composition. This thesis is concerned with modelling the correlation between two compositions. We introduce a generalization of the Dirichlet distribution that describes two compositions simultaneously, i.e. a bicompositional Dirichlet distribution. The covariation between the two compositions is modelled by a parameter γ. If γ = 0, the two compositions are independent. For compositions with two components, we prove for which values of γ the distribution exists. We also give expressions for the normalization constant and for other properties, such as moments and marginal and conditional distributions. For compositions with more than two components, we present expressions for the normalization constant and other properties for all non-negative integer values of γ. We also present a method, based on the rejection method, for generating random numbers from the distribution for all γ ≥ 0 and, if the compositions have two components, for some γ < 0. We use this bicompositional distribution and a general measure of correlation based on the concept of information gain to calculate a measure of correlation between two compositions for a large number of models. Finally, we present an estimator of the general measure of correlation and compare two proposed confidence intervals for it.
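
To illustrate the rejection method mentioned in the abstract, the following is a minimal sketch and not the thesis's own algorithm. It rests on an assumption that the abstract does not state: that the joint density of the two compositions x and y is proportional to the product of two Dirichlet kernels, with parameter vectors here called alpha and beta, multiplied by the inner product of x and y raised to the power of the covariation parameter (written gamma in the code). Under that assumption, and for gamma >= 0 only, the extra factor lies between 0 and 1, so independent Dirichlet draws can serve as proposals and the factor itself as the acceptance probability. The function names and parameters are hypothetical.

```python
import random


def dirichlet(alpha, rng):
    """Draw one Dirichlet(alpha) vector by normalizing gamma variates."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def sample_bicompositional(alpha, beta, gamma, n, seed=None):
    """Rejection-sampling sketch for gamma >= 0.

    ASSUMPTION (not taken from the abstract): the target density is
    proportional to Dirichlet(x; alpha) * Dirichlet(y; beta) * (x . y)**gamma.
    """
    if gamma < 0:
        raise ValueError("this sketch only covers gamma >= 0")
    if len(alpha) != len(beta):
        raise ValueError("both compositions need the same number of components")
    rng = random.Random(seed)
    pairs = []
    while len(pairs) < n:
        # Propose two independent compositions.
        x = dirichlet(alpha, rng)
        y = dirichlet(beta, rng)
        # The inner product of two compositions lies in [0, 1], so
        # (x . y)**gamma is a valid acceptance probability when gamma >= 0.
        inner = sum(xi * yi for xi, yi in zip(x, y))
        if rng.random() <= inner ** gamma:
            pairs.append((x, y))
    return pairs


if __name__ == "__main__":
    for x, y in sample_bicompositional([2.0, 3.0], [1.5, 2.5], 2.0, 3, seed=1):
        print(x, y)
```

With gamma = 0 every proposal is accepted, which matches the independence case described in the abstract. Larger values of gamma lower the acceptance rate, so this sketch can become slow when gamma is large or the compositions have many components.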
