Enhancing prediction and causal inference in metabolic dyshomeostasis

Abstract: This thesis focuses on two globally prevalent diseases, non-alcoholic fatty liver disease (NAFLD) and type 2 diabetes (T2D), with the overall aim of improving prediction and causal inference in the context of these conditions. The projects were mainly conducted using the IMI DIRECT and UK Biobank datasets, which include multi-omics data, extensive environmental exposures, and biological intermediates.

In paper I, we used structural equation modeling to test the 'twin-cycle' hypothesis concerning interactions between the liver and the pancreas in the etiology of T2D. We also investigated the association of physical activity with glycemic control within the twin-cycle hypothesis. The results showed that physical activity was associated with several metabolic traits and factors. Moreover, a mediation effect of basal insulin secretion rate, insulin sensitivity, and liver fat was identified on the path from physical activity to glucose regulation.

In paper II, we developed a series of machine learning-based models for the diagnosis of fatty liver, using different combinations of complex clinical and omics input data, to screen at-risk populations for NAFLD. Beta-cell function and insulin sensitivity appeared to be the most informative predictors in the developed diagnostic models. Furthermore, the feature-importance lists derived for each data set (clinical, genetic, transcriptomic, proteomic, and metabolomic) corroborated previous findings and suggested potential molecular features of NAFLD etiology.

In paper III, Bayesian network and Mendelian randomization approaches were deployed to examine a range of putative causal associations underlying the development of fatty liver. Our analyses identified basal insulin secretion rate and visceral fat as two key drivers. In addition, a sensitivity analysis stratified by diabetes status identified a network mostly dominated by dysglycemia in the presence of T2D, whereas in the absence of T2D the network was mainly controlled by excess adiposity.

In paper IV, genotype-based recall (GBR) clinical trials, in which the genetic burden of individuals is used to recruit two groups of participants with high and low genetic risk scores, were simulated and compared with conventional randomized controlled trials (RCTs) in terms of statistical power and required sample size. The analysis showed that GBR trials are, under several diverse scenarios, more powerful than conventional RCTs for testing gene-treatment interactions.
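The comparison in paper IV can be illustrated with a small simulation. The sketch below is not the thesis's simulation design, which the abstract does not specify. It is a minimal illustration under stated assumptions: a standard-normal genetic risk score, a linear outcome model with a gene-treatment interaction, recall of the top and bottom 10% of a genotyped pool for the GBR design, and a normal-approximation test on the difference in score slopes between trial arms. The function names, effect sizes, pool size, and sample size are all hypothetical.

```python
import math
import random


def slope_and_se(x, y):
    """Least-squares slope of y on x and the slope's standard error."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    return slope, math.sqrt(rss / (n - 2) / sxx)


def recruit(rng, n, design, tail=0.1, pool_size=4000):
    """Draw n genetic risk scores under the chosen recruitment design."""
    if design == "rct":
        # Conventional RCT: participants are unselected on genotype.
        return [rng.gauss(0, 1) for _ in range(n)]
    # GBR (assumed scheme): genotype a pool, recall both tails equally.
    pool = sorted(rng.gauss(0, 1) for _ in range(pool_size))
    k = int(pool_size * tail)
    return rng.sample(pool[:k], n // 2) + rng.sample(pool[-k:], n - n // 2)


def detects_interaction(rng, scores, b_t, b_g, b_gt, z_crit=1.96):
    """Randomize treatment 1:1 and test whether the score slope differs by arm."""
    shuffled = scores[:]
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    fits = []
    for t, xs in ((1, shuffled[:half]), (0, shuffled[half:])):
        # Assumed outcome model: treatment, score, interaction, unit noise.
        ys = [b_t * t + b_g * g + b_gt * g * t + rng.gauss(0, 1) for g in xs]
        fits.append(slope_and_se(xs, ys))
    (s1, se1), (s0, se0) = fits
    # The interaction equals the slope difference between the two arms.
    z = (s1 - s0) / math.sqrt(se1 ** 2 + se0 ** 2)
    return abs(z) > z_crit


def simulate_power(design, n=200, b_t=0.5, b_g=0.3, b_gt=0.3,
                   n_sims=300, seed=1):
    """Fraction of simulated trials that detect the interaction."""
    rng = random.Random(seed)
    hits = sum(
        detects_interaction(rng, recruit(rng, n, design), b_t, b_g, b_gt)
        for _ in range(n_sims)
    )
    return hits / n_sims


if __name__ == "__main__":
    for design in ("rct", "gbr"):
        print(design, simulate_power(design))
```

In this toy setting, recalling participants from the tails widens the spread of the genetic risk score among those enrolled, which shrinks the standard error of the interaction estimate at a fixed sample size. Real GBR designs involve further considerations (recall rates, score accuracy, generalizability) that the sketch ignores.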
