Understanding and improving microbial cell factories through Large Scale Data-approaches
Abstract: Since the advent of high-throughput genome sequencing methods in the mid-2000s, molecular biology has rapidly transitioned towards data-intensive science. Recent technological developments have made omics experiments more accessible by lowering their cost, while the concurrent design of new algorithms has improved the computational workflow needed to analyse the large datasets generated. This has enabled the long-standing idea of a systems approach to the cell, where molecular phenomena are no longer observed in isolation, but as parts of a tightly regulated cell-wide system. However, large data biology is not without its challenges, many of which are directly related to how to store, handle and analyse ome-wide datasets.
The present thesis examines large data microbiology from a middle ground between metabolic engineering and in silico data management. The work was performed in the context of applied microbial lignocellulose valorisation, with the end goal of generating improved cell factories for the production of value-added chemicals from renewable plant biomass. Three different challenges related to this feedstock were investigated from a large-data point of view: bacterial catabolism of lignin and its derived aromatic compounds; tolerance of baker’s yeast Saccharomyces cerevisiae to inhibitory compounds in lignocellulose hydrolysate; and the non-fermentable response to xylose in S. cerevisiae engineered for growth on this pentose sugar.
The bibliome of microbial lignin catabolism is vast and consists of a long-standing cohort of fundamental microbiology and a more recent cohort of applied lignin biovalorisation. Here, an online database was created with the long-term ambition of closing the gap between the two and making new connections that can fuel the generation of new knowledge.
Whole-genome sequencing was used to investigate the genetic basis for observed phenotypes in bacterial isolates capable of growing on different kinds of lignin-derived aromatics. A whole-genome approach was also used to identify key sequence variants in the genotype of an industrial S. cerevisiae strain evolved for improved tolerance to inhibitors and high temperature. Finally, assessment of the sugar signalome of S. cerevisiae was enabled by the design and validation of a panel of in vivo fluorescent biosensors for single-cell cytometric analysis. It was found that xylose triggered a signal similar to that of low glucose in yeast cells engineered with xylose utilization pathways, and that introducing deletions previously related to improved xylose utilization shifted the signal towards that of high glucose.
Taken together, the present thesis illustrates how omics approaches can aid the design of laboratory experiments to increase the knowledge and understanding of microorganisms, and demonstrates the need for combined knowledge of molecular and computational biology in large-scale data microbiology.