Toward Accident Prevention Through Machine Learning Analysis of Accident Reports

Abstract: Occupational safety remains a topic of interest in the construction sector. The frequency of accidents in Sweden has decreased, but only to a level that has remained constant over the last ten years. Although Sweden performs better than other European countries, the construction industry continues to account for a fifth of fatal accidents in Europe. This situation creates a need to reduce the frequency of, and fatalities from, occupational accidents in the construction sector. In the Swedish context, several initiatives have been established for accident prevention and frequency reduction. However, risk analysis models and causal links have been found to be rare in this context. The continuous reporting of accidents and near-misses creates large datasets with potentially useful information about accidents and their causes. In addition, research interest in analysing this data through machine learning (ML) has increased. State-of-the-art research efforts include applying ML to analyse the textual data in accumulated accident reports, identify contributing factors, and extract accident information. However, solutions created by ML models can lead to changes for a company and the industry. ML modelling includes prototype development that is accompanied by the requirements of the industry and domain experts. The aim of this thesis is to investigate how ML-based methods and techniques could be used to develop a research-based prototype for occupational accident prevention in a contracting company. The thesis focuses on exploring a development process that bridges the technical side of ML data analysis with the context of safety in a contracting company. The thesis builds on accident causation models (ACMs) and ML methods, utilising the Cross-Industry Standard Process for Data Mining (CRISP-DM) as a method.
These were employed to interpret and understand the empirical material of accident reports and interviews within the health and safety (H&S) unit. The results of the thesis showed that analysing accident reports via ML can lead to the discovery of knowledge about accidents. However, several challenges were found to hinder the extraction of knowledge and the application of ML. The identified challenges mainly related to the standardization of the development process and the feasibility of implementation and evaluation. Moreover, the tendency of the ML-related literature to focus on predicting severity was found to be incompatible both with the function of ML analysis and with the findings of the accident causation literature, which considers severity a stochastic element. The analysis further concluded that ACMs seem to have reached a mature stage, where a new approach is needed to understand the rules that govern the relationships between emergent new risks, rather than the systemization of risks themselves. The analysis of accident reports by ML needs further research into systemized methods for such analysis in the domain of construction and in the context of contracting companies, as only a few research efforts have focused on this area regarding ML evaluation metrics and data pre-processing.
