Computational Methods for Image-Based Spatial Transcriptomics

Abstract: Why does cancer develop, spread, grow, and lead to mortality? To answer these questions, one must study the fundamental building blocks of all living organisms: cells. Like a well-calibrated manufacturing unit, cells follow precise instructions, carried by gene expression, to initiate the synthesis of proteins, the workforce that drives the biochemical processes of life. Recently, researchers have developed techniques for imaging the expression of hundreds of unique genes within tissue samples. This information is extremely valuable for understanding the cellular activities behind cancer-related diseases. These methods, collectively known as image-based spatial transcriptomics (IST) techniques, use fluorescence microscopy to combinatorially label mRNA species (corresponding to expressed genes) in tissue samples. Automatic image analysis is then required to locate the fluorescence signals and decode the combinatorial code. This process yields large quantities of points, each marking the location of an expressed gene. These new data formats pose several challenges for visualization and automated analysis. This thesis presents several computational methods and applications for data generated by IST methods.
Key contributions include: (i) a decoding method that jointly optimizes the detection and decoding of signals, particularly beneficial in scenarios with low signal-to-noise ratios or densely packed signals; (ii) a computational method for automatically delineating regions with similar gene compositions that is efficient, interactive, and scalable for exploring patterns across different scales; (iii) software enabling interactive visualization of millions of gene markers atop terapixel-sized images (TissUUmaps); (iv) a tool that uses signed-graph partitioning to identify cells automatically, independent of the complementary nuclear stain; (v) a fast, analytical expression for a score that quantifies co-localization between spatial points (such as located genes); (vi) a demonstration that gene expression markers can be used to train deep-learning models to classify tissue morphology. In the final contribution (vii), an IST technique is applied in a clinical study to spatially map the molecular diversity within tumors from patients with colorectal liver metastases, specifically those exhibiting a desmoplastic growth pattern. The study reveals novel molecular patterns characterizing cellular diversity in the transitional region between healthy liver tissue and the tumor. While a direct answer to the initial questions remains elusive, this study offers insights into the growth dynamics of colorectal cancer liver metastases, bringing us closer to understanding the journey from development to mortality in cancer.
