Big Data Analytics for eMaintenance: Modeling of high-dimensional data streams

University dissertation from Luleå tekniska universitet

Abstract: Big Data analytics has recently attracted intense interest from both academia and industry for its attempt to extract information, knowledge and wisdom from Big Data. In industry, with the development of sensor technology and Information & Communication Technologies (ICT), reams of high-dimensional data streams are being collected and curated by enterprises to support their decision-making. Fault detection from these data is one of the important applications in eMaintenance solutions, with the aim of supporting maintenance decision-making. Early discovery of system faults can help ensure the reliability and safety of industrial systems and reduce the risk of unplanned breakdowns. Both high dimensionality and the properties of data streams impose stringent challenges on fault detection applications. From the data modeling point of view, high dimensionality may cause the notorious “curse of dimensionality” and degrade the accuracy of fault detection algorithms. On the other hand, fast-flowing data streams require fault detection algorithms to have low computational complexity and to give real-time or near real-time responses upon the arrival of new samples. Most existing fault detection models work on relatively low-dimensional spaces. Theoretical studies on high-dimensional fault detection mainly focus on detecting anomalies on subspace projections of the original space. However, these models are either arbitrary in selecting subspaces or computationally intensive. To meet the requirements of fast-flowing data streams, several strategies have been proposed to adapt existing fault detection models to an online mode so that they can be applied in stream data mining. Nevertheless, few studies have simultaneously tackled the challenges associated with high dimensionality and data streams. In this research, an Angle-based Subspace Anomaly Detection (ABSAD) approach to fault detection from high-dimensional data is developed.
Both an analytical study and a numerical illustration demonstrated the efficacy of the proposed ABSAD approach. Based on the sliding window strategy, the approach is further extended to an online mode with the aim of detecting faults from high-dimensional data streams. Experiments on synthetic datasets showed that the online ABSAD algorithm can adapt to the time-varying behavior of the monitored system and is hence applicable to dynamic fault detection.
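
The sliding window strategy mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the ABSAD algorithm: the anomaly score used here (Euclidean distance to the window centroid) is a simple stand-in, and the class name, the parameters `window_size` and `threshold`, and the rule that only unflagged samples enter the window are all assumptions made for illustration. The only point is to show how a bounded window of recent samples lets a detector follow a system whose normal behavior changes over time.

```python
from collections import deque
import math
import statistics


class SlidingWindowDetector:
    """Hypothetical online detector that keeps only the most recent samples."""

    def __init__(self, window_size=50, threshold=3.0):
        # deque with maxlen discards the oldest sample automatically
        self.window = deque(maxlen=window_size)
        self.threshold = threshold

    def update(self, sample):
        """Score one new sample; return True if it is flagged as a fault."""
        sample = tuple(sample)
        if len(self.window) < self.window.maxlen:
            # Warm-up phase: fill the window before making any decision
            self.window.append(sample)
            return False

        dims = len(sample)
        centroid = [
            statistics.fmean(s[d] for s in self.window) for d in range(dims)
        ]
        # Reference scores of the samples currently in the window
        ref = [math.dist(s, centroid) for s in self.window]
        limit = statistics.fmean(ref) + self.threshold * statistics.pstdev(ref)

        is_fault = math.dist(sample, centroid) > limit
        if not is_fault:
            # Assumption: only samples judged normal update the window,
            # so the reference profile tracks gradual drift
            self.window.append(sample)
        return is_fault


detector = SlidingWindowDetector(window_size=4, threshold=3.0)
for point in [(0, 0), (1, 0), (0, 1), (1, 1)]:
    detector.update(point)          # warm-up, nothing is flagged
print(detector.update((10, 10)))    # far from the window centroid
print(detector.update((0.5, 0.5)))  # at the window centroid
```

Each update costs time proportional to the window size times the number of dimensions, independent of how long the stream has been running, which is the property a near real-time detector needs.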
