Automated Gravel Road Condition Assessment: A Case Study of Assessing Loose Gravel Using Audio Data

Abstract: Gravel roads connect sparse populations and provide routes for agriculture and the transport of forest goods. They are an economical choice where traffic volume is low. In Sweden, 21% of all public roads are state-owned gravel roads, covering over 20,200 km. In addition, there are some 74,000 km of gravel roads and 210,000 km of forest roads that are owned by the private sector. The Swedish Transport Administration (Trafikverket) rates the condition of gravel roads according to the severity of irregularities (e.g. corrugations and potholes), dust, loose gravel, and gravel cross-sections. This assessment is carried out during the summertime, when roads are free of snow. Loose gravel is one of the essential parameters in gravel road assessment, because it can cause a tire to slip, leading to a loss of driver control. At present, gravel roads are assessed subjectively by taking images of road sections and adding textual notes. A cost-effective, intelligent, and objective method for road assessment is lacking. Expensive methods, such as laser profiler trucks, are available and can offer road profiling with high accuracy. However, these methods are not applied to gravel roads because they are not cost-efficient. In this thesis, we explored the idea that, in addition to machine vision, machine hearing could be used to classify the condition of gravel roads in relation to loose gravel. Several classical supervised learning algorithms and convolutional neural networks (CNNs) were tested. When people drive on gravel roads, they can make sense of the road condition by listening to the gravel hitting the bottom of the car. The more gravel we hear hitting the bottom of the car, the more loose gravel there is likely to be and, therefore, the more likely the road is to be in bad condition. Based on this idea, we hypothesized that machines could also perform such a classification when trained with labeled sound data.
The results indicate that machines can distinguish gravel from non-gravel sounds. In this thesis, we used traditional machine learning algorithms, such as support vector machines (SVM), decision trees, and ensemble classification methods. We also explored CNNs for classifying spectrograms of audio recordings and images of gravel roads. Both classical supervised learning and CNNs were applied, and their results were compared. Among the classical algorithms, ensemble bagged tree (EBT) classifiers performed best at classifying gravel and non-gravel sounds. EBT was also effective in reducing the misclassification of non-gravel sounds. The CNN achieved an accuracy of 97.91%. Using a CNN makes the classification process more intuitive because the network itself takes responsibility for selecting the relevant training features. Furthermore, the classification results can be visualized on road maps, which can help road monitoring agencies assess road conditions and schedule maintenance activities for a particular road.
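
To illustrate the idea behind ensemble bagged trees mentioned above, the following is a minimal sketch in plain Python, not the implementation used in the thesis. It trains several depth-one trees (decision stumps, used here as a simplified stand-in for full decision trees) on bootstrap resamples of the training data and combines them by majority vote. The two features per sample (imagined here as a normalized energy value and a normalized zero-crossing rate), the toy values, and all function names are assumptions made for illustration only.

```python
import random
from collections import Counter

def fit_stump(X, y):
    """Return the (feature, threshold, left_label, right_label) split
    with the fewest training errors."""
    best = None
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X}):
            for left, right in ((0, 1), (1, 0)):
                errors = sum(
                    (left if row[f] <= t else right) != label
                    for row, label in zip(X, y)
                )
                if best is None or errors < best[0]:
                    best = (errors, f, t, left, right)
    return best[1:]

def predict_stump(stump, row):
    f, t, left, right = stump
    return left if row[f] <= t else right

def fit_bagged(X, y, n_estimators=25, seed=0):
    """Train each stump on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    n = len(X)
    models = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return models

def predict_bagged(models, row):
    """Majority vote over all stumps in the ensemble."""
    votes = Counter(predict_stump(m, row) for m in models)
    return votes.most_common(1)[0][0]

# Hypothetical toy data: label 1 = gravel sound, label 0 = non-gravel sound.
X = [[0.80, 0.60], [0.90, 0.70], [0.85, 0.65], [0.75, 0.70],
     [0.20, 0.30], [0.10, 0.20], [0.15, 0.25], [0.25, 0.20]]
y = [1, 1, 1, 1, 0, 0, 0, 0]

models = fit_bagged(X, y)
gravel_pred = predict_bagged(models, [0.82, 0.66])
non_gravel_pred = predict_bagged(models, [0.05, 0.10])
```

Because each stump sees a different resample, individual errors tend to cancel out in the vote, which is the general motivation for bagging; real audio features and full trees would of course behave less cleanly than this toy example.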