Bayesian Models for Spatiotemporal Data from Transportation Networks

Abstract: Urbanization has caused a historic transformation at a global scale, and humanity is moving towards a fully connected society where cities will concentrate population, infrastructure and economic activity. A key element of a city’s infrastructure is its transportation system, which facilitates the mobility of people and goods. Transportation systems constantly generate data from, e.g., GPS, sensors and cameras, and statistical modeling of these data is challenging due to the complex structure and dynamics of the system and its inherent uncertainty. In this thesis, we develop Bayesian models with applications to transportation. We specifically focus on models that can be trained on spatiotemporal data from transport networks to predict, e.g., bus delays or the actual network topology. Special attention is given to model scalability and uncertainty quantification. We use real-world data from transportation systems in every study to keep a balance between statistical rigor, novelty, and applicability. The thesis consists of four papers. The first paper presents a state-of-the-art probabilistic latent network model to forecast multilayer dynamic graphs. The model uses stochastic blockmodeling to reduce the computational burden, and is illustrated on a 10-year sample of data from four major airlines within the US air transportation system. In the second paper, we develop a robust model for real-time bus travel time prediction that departs from Gaussian assumptions by using Student-t errors, and show how Bayesian inference naturally allows for predictive uncertainty quantification in a highly stochastic environment. Experiments are performed using data from high-frequency buses in Stockholm, Sweden. The third paper shows the potential of multi-output Gaussian processes to tackle network-wide travel time prediction in an urban area. We develop a responsive online model based on a coregionalized covariance and test its accuracy on real data from GPS-equipped taxis. Finally, in the fourth paper, we propose a novel regularization strategy for the vector autoregressive model that is based on a graphical spike-and-slab prior, and present a case study with real airline delay data to assess its predictive performance and analyze network patterns related to the propagation of delays across airports.
