Neural Network Ensembles and Combinatorial Optimization with Applications in Medicine

University dissertation from Department of Theoretical Physics

Abstract: Artificial neural network (ANN) and combinatorial optimization algorithms are developed and applied to the medical domain. A novel method for training an ensemble of ANNs is presented, based on random weight updates alternated with replication of networks with low error. The evolution of the ensemble is explored, with particular emphasis on its diversity and internal correlation. The performance is tested on three datasets and is found to be comparable to that of a Bayesian algorithm and better than that of a Bagging ensemble. Hermite decomposition of ECGs is performed, and the coefficients are used as inputs to ANNs for predicting myocardial infarction, with good results. A case-based method for explaining the operation of the ANN is presented, based on perturbing a small number of inputs within a limited interval so as to maximize the change in output. A cost function for maximizing this change while satisfying the constraints is defined in a Potts spin formulation. After optimization using mean field annealing, a perturbed ECG is reconstructed from the perturbed Hermite coefficients. The perturbed ECG leads are found to match those deemed critical by a human expert in half of the cases. The question of which input features to use when training ANNs to interpret myocardial perfusion SPECT images is also studied. It is concluded that using additional clinical data as ANN input does not improve the predictive performance. A novel approach for multiple structure alignment of proteins is presented, based on fuzzy pairwise alignments of each protein to a virtual consensus chain. These alignments are alternated with translations and rotations of the proteins onto the consensus structure, and with updates of the consensus chain. The pairwise alignments use mean field annealing of fuzzy alignment variables, based on a cost expressed in terms of distances between aligned atoms and of gaps. The approach is tested against a set of protein families from the HOMSTRAD database, and against another algorithm based on Monte Carlo, with good results.
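
To illustrate the Hermite decomposition step mentioned in the abstract, here is a minimal sketch of how a sampled signal could be expanded in Hermite basis functions and reconstructed from the coefficients. The abstract does not give the details, so everything specific here is an assumption: the orthonormal physicists' Hermite functions, the width parameter sigma, the least-squares fit, and the function names hermite_basis, hermite_decompose and hermite_reconstruct are illustrative choices, not the method used in the dissertation.

```python
import math

import numpy as np
from numpy.polynomial.hermite import hermval  # physicists' Hermite polynomials


def hermite_basis(t, n_funcs, sigma):
    """Return a (len(t), n_funcs) matrix of width-scaled Hermite functions.

    Column n is H_n(t/sigma) * exp(-(t/sigma)^2 / 2), normalized so that the
    continuous functions are orthonormal. sigma is an assumed width parameter.
    """
    x = np.asarray(t, dtype=float) / sigma
    columns = []
    for n in range(n_funcs):
        select = np.zeros(n + 1)
        select[n] = 1.0  # coefficient vector that picks out H_n
        norm = math.sqrt(sigma * (2.0 ** n) * math.factorial(n) * math.sqrt(math.pi))
        columns.append(hermval(x, select) * np.exp(-x ** 2 / 2.0) / norm)
    return np.column_stack(columns)


def hermite_decompose(signal, t, n_funcs, sigma):
    """Least-squares Hermite coefficients of a sampled signal (assumed fit method)."""
    basis = hermite_basis(t, n_funcs, sigma)
    coeffs, *_ = np.linalg.lstsq(basis, np.asarray(signal, dtype=float), rcond=None)
    return coeffs


def hermite_reconstruct(coeffs, t, sigma):
    """Rebuild a signal on the grid t from its Hermite coefficients."""
    return hermite_basis(t, len(coeffs), sigma) @ np.asarray(coeffs, dtype=float)


if __name__ == "__main__":
    # Synthetic stand-in for one beat of one ECG lead (not real data).
    t = np.linspace(-0.2, 0.2, 401)
    beat = np.exp(-(t / 0.02) ** 2) - 0.3 * np.exp(-((t - 0.05) / 0.03) ** 2)
    coeffs = hermite_decompose(beat, t, n_funcs=12, sigma=0.03)
    residual = beat - hermite_reconstruct(coeffs, t, sigma=0.03)
    print("coefficients:", np.round(coeffs, 4))
    print("max reconstruction error:", float(np.max(np.abs(residual))))
```

In a pipeline like the one the abstract describes, the coefficient vector would serve as the ANN input, and a perturbed coefficient vector could be passed to hermite_reconstruct to obtain a perturbed signal for inspection.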
