Enhancing Short-Term Traffic Prediction for Large-Scale Transport Networks by Spatio-Temporal Clustering
Abstract: Congestion in large cities causes extra travel time, noise, air pollution, CO2 emissions, and more. Transport is recognized as one of the main contributors to global warming and climate change, which are receiving increasing attention from authorities and societies around the world. The European Commission recognizes Intelligent Transport Systems (ITS) and digital technologies, which make better use of existing resources, as having enormous potential to lower the negative impacts associated with high traffic volumes in urban areas.
The main focus of this work is short-term traffic prediction, an essential tool in ITS. Combined with information provision, it enables proactive decisions that reduce the severity of congestion, whether it occurs regularly or is caused by incidents. The main contribution of this work is a methodological framework, together with a demonstration that it enhances short-term prediction in large-scale transport networks. It is expected to contribute to more robust and accurate predictions by ITS in traffic management centers.
Traffic patterns in large-scale networks, including urban streets, can be heterogeneous within a day and from day to day. This work investigates spatio-temporal clustering of heterogeneous data sets into smaller, more homogeneous subsets. This is expected to produce more robust, accurate, scalable, and cost-effective prediction models.
This thesis is a collection of five papers that contribute to enhancing short-term traffic prediction in this context. Clustering is found to boost prediction performance in Papers II, III, IV, and V. Paper II considers network partitioning, and the last three papers study day clustering. The prediction models used across the included papers are naive historical mean prediction models and more advanced prediction models such as probabilistic principal component analysis (PPCA) and exponential smoothing.
Paper I considers and facilitates the use of floating car data (FCD) as a cost-effective, opportunistic source of speed and travel time data with extensive network coverage.
Common practice in determining the number of clusters is to rely on internal evaluation indices, which are very efficient but isolated from the application. Paper IV tests this practice by also considering performance in the short-term prediction application. Our results show that relying on these indices can lead to a loss of prediction accuracy of about 20%, depending on the prediction model considered. Dimensionality reduction has a minimal effect on the resulting prediction performance, but with it clustering needs 20 times less computational time and only 0.1% of the original information.
Finally, in Paper V, we look at similarities between the representative day clusters identified from speeds and from flows. Furthermore, interchanging speed day-type centroids with flow day-type centroids when predicting speeds has proven to be robust, which is not the case when predicting flows from speed day-type centroids and observations.
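As an illustration of the day-clustering idea summarized above, the following minimal Python sketch groups daily speed profiles into day types and then predicts the next interval of a partially observed day from the historical mean (centroid) of its closest day type. It is not the implementation used in the papers. The synthetic hourly profiles, the plain k-means with farthest-first initialization, and all function and variable names are assumptions made for this example only.

```python
import random

def sq_dist(a, b):
    # Squared Euclidean distance; zip stops at the shorter input, so a
    # partially observed day is compared only on the intervals seen so far.
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean_profile(days):
    # Interval-wise historical mean over a set of daily profiles.
    return [sum(col) / len(days) for col in zip(*days)]

def nearest(profile, centroids):
    # Index of the centroid closest to the (possibly partial) profile.
    return min(range(len(centroids)), key=lambda c: sq_dist(profile, centroids[c]))

def kmeans(days, k, iters=20):
    # Farthest-first initialization (an assumption, chosen for determinism).
    centroids = [list(days[0])]
    while len(centroids) < k:
        far = max(days, key=lambda d: min(sq_dist(d, c) for c in centroids))
        centroids.append(list(far))
    # Lloyd iterations: assign days to centroids, then recompute the means.
    for _ in range(iters):
        labels = [nearest(d, centroids) for d in days]
        for c in range(k):
            members = [d for d, lab in zip(days, labels) if lab == c]
            if members:
                centroids[c] = mean_profile(members)
    return centroids, [nearest(d, centroids) for d in days]

def predict_next(observed, centroids):
    # Match the observed part of the day to a day type and return that
    # day type's historical mean for the next interval.
    c = nearest(observed, centroids)
    return centroids[c][len(observed)]

# Synthetic data (hypothetical): 24 hourly mean speeds in km/h.
WEEKDAY = [20.0 if h in (7, 8, 9, 16, 17, 18) else 50.0 for h in range(24)]
WEEKEND = [45.0 if h in (12, 13, 14, 15) else 50.0 for h in range(24)]
rng = random.Random(1)

def noisy(base):
    return [v + rng.gauss(0, 2) for v in base]

days = [noisy(WEEKDAY) for _ in range(20)] + [noisy(WEEKEND) for _ in range(8)]

centroids, labels = kmeans(days, k=2)

# A new weekday observed up to and including hour 7; predict hour 8.
observed = WEEKDAY[:8]
truth = WEEKDAY[8]
cluster_pred = predict_next(observed, centroids)  # day-type historical mean
global_pred = mean_profile(days)[8]               # unclustered historical mean
print(abs(cluster_pred - truth), abs(global_pred - truth))
```

On this toy data the unclustered historical mean blends congested and free-flow days, whereas the day-type mean follows the matching pattern. The number of clusters is fixed at two here; choosing it by prediction performance rather than by an internal index is the question Paper IV addresses.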