Methods for Travel Pattern Analysis Using Large-Scale Passive Data

Abstract: Comprehensive knowledge of travel patterns is crucial for planning a more efficient traffic system that accommodates human mobility demand. Currently, this knowledge comes mainly from traffic models built on relatively small samples of observations collected from travel surveys and traffic counts. Such data is expensive to collect and provides only partial observations of travel patterns. With the rise of new technology, new large-scale passive data sources can be used to analyse travel patterns. This thesis aims to expand the knowledge of how to use cellular network data collected by cellular network operators and smart-card data from public transit systems to analyse travel patterns. The focus is particularly on the data processing methods needed to extract travel patterns. The thesis’s contributions include new methods for extracting trips, estimating travel demand, and inferring routes and travel mode choice from cellular network data, as well as a method to extract travel behaviour changes from smart-card data. Three approaches are proposed to evaluate the methods: validation using experimental data, validation using other available data sources, and comparison of results obtained using different methods. The findings show that methods for extracting travel patterns from large-scale passive data need to account for the data’s characteristics. Paper II illustrates that inferring routes from Call Detail Records by strictly following the locations of the cell towers used is problematic due to the noise and low resolution of the data. Both rule-based and machine learning methods can be used to extract travel patterns. Paper I shows that a rule-based stop detection algorithm can reliably extract longer trips from cellular network data. On the other hand, Paper III shows that for travel mode classification of trips extracted from cellular network data, supervised classification can outperform rule-based methods. 
Unsupervised machine learning can be used to find patterns without prior specification. Paper V shows how clustering of smart-card data can be used to group public transit users by travel behaviour in order to understand the effects of a disruption. Supervised machine learning requires training data. When little or no training data is available, semi-supervised learning is a promising approach, as demonstrated in Paper IV. In the studies of this thesis, real-world, large-scale passive datasets have been used to demonstrate how the extraction of travel patterns works under realistic circumstances. This has exposed limitations arising from the characteristics of the data sources and from possible sample bias. At the same time, the studies show the potential of large-scale passive data. Changes in travel patterns can be identified quickly because new data can be collected continuously. Owing to the large sample size, the data makes it possible to understand travel patterns from observations rather than relying on the assumptions underlying traffic models. 
