Unsupervised feature learning applied to condition monitoring

Abstract: Improving the reliability and efficiency of rotating machinery is a central problem in many application domains, such as energy production and transportation. This requires efficient condition monitoring methods, including the analytics needed to predict and detect faults and to manage the high volume and velocity of data. Rolling element bearings are essential components of rotating machines, and they are particularly important to monitor due to the high requirements on the operational conditions. Bearings are also located near the rotating parts of the machines and thereby near the signal sources that characterize faults and abnormal operational conditions. Thus, bearings with embedded sensing, analysis and communication capabilities are being developed. However, the analysis of signals from bearings and the surrounding components is a challenging problem due to the high variability and complexity of the systems. For example, machines evolve over time due to wear and maintenance, and the operational conditions typically also vary over time. Furthermore, the variety of fault signatures and failure mechanisms makes it difficult to derive generally useful and accurate models that enable early detection of faults at reasonable cost. Therefore, investigations of machine learning methods that avoid some of these difficulties by automated online adaptation of the signal model are motivated. In particular, can unsupervised feature learning methods be used to automatically derive useful information about the state and operational conditions of a rotating machine? What additional methods are needed to recognize normal operational conditions and detect abnormal conditions, for example in terms of learned features or changes of model parameters? Condition monitoring systems are typically based on condition indicators that are pre-defined by experts, such as the amplitudes in certain frequency bands of a vibration signal, or the temperature of a bearing.
Condition indicators are used to define alarms in terms of thresholds; when the indicator is above (or below) the threshold, an alarm indicating a fault condition is generated, without further information about the root cause of the fault. Similarly, machine learning methods and labeled datasets are used to train classifiers that can be used for the detection of faults. The accuracy and reliability of such condition monitoring methods depend on the type of condition indicators used and the data considered when determining the model parameters. Hence, this approach can be challenging to apply in the field, where machines and sensor systems are different and change over time, and parameters have different meanings depending on the conditions. Adaptation of the model parameters to each condition monitoring application and operational condition is also difficult due to the need for labeled training data representing all relevant conditions, and the high cost of manual configuration. Therefore, neither of these solutions is viable in general. In this thesis I investigate unsupervised methods for feature learning and anomaly detection, which can operate online without pre-training with labeled datasets. Concepts and methods for validation of normal operational conditions and detection of abnormal operational conditions based on automatically learned features are proposed and studied. In particular, dictionary learning is applied to vibration and acoustic emission signals obtained from laboratory experiments and condition monitoring systems. The methodology is based on the assumption that signals can be described as a linear superposition of noise and learned atomic waveforms of arbitrary shape, amplitude and position. Greedy sparse coding algorithms and probabilistic gradient methods are used to learn dictionaries of atomic waveforms enabling sparse representation of the vibration and acoustic emission signals.
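To illustrate the signal model described above (a signal as a superposition of shifted, scaled atomic waveforms plus noise), the following is a minimal sketch of greedy sparse coding by matching pursuit. It is not the implementation used in the thesis: the function names `normalize` and `matching_pursuit`, the fixed number of terms `n_terms`, and the toy atoms are assumptions made for illustration, and the dictionary is held fixed rather than learned.

```python
import math

def normalize(atom):
    """Scale an atom to unit Euclidean norm."""
    norm = math.sqrt(sum(v * v for v in atom))
    return [v / norm for v in atom]

def matching_pursuit(signal, atoms, n_terms):
    """Greedy sparse coding with shiftable, unit-norm atoms (illustrative).

    Returns (events, residual); each event is (atom_index, position, amplitude).
    """
    residual = list(signal)
    events = []
    for _ in range(n_terms):
        best = None  # (|correlation|, atom index, position, amplitude)
        for k, atom in enumerate(atoms):
            for pos in range(len(residual) - len(atom) + 1):
                # Inner product of the atom with the residual at this shift.
                amp = sum(a * residual[pos + i] for i, a in enumerate(atom))
                if best is None or abs(amp) > best[0]:
                    best = (abs(amp), k, pos, amp)
        _, k, pos, amp = best
        # Subtract the best-matching atom from the residual.
        for i, a in enumerate(atoms[k]):
            residual[pos + i] -= amp * a
        events.append((k, pos, amp))
    return events, residual

# Toy signal: two non-overlapping atom occurrences, no noise (assumed data).
atoms = [normalize([1.0, -1.0]), normalize([1.0, 2.0, 1.0])]
signal = [0.0] * 16
for i, a in enumerate(atoms[1]):
    signal[3 + i] += 2.0 * a
for i, a in enumerate(atoms[0]):
    signal[10 + i] += -1.5 * a

events, residual = matching_pursuit(signal, atoms, n_terms=2)
```

In this toy case the two events recover the atom index, position and amplitude of each occurrence, so the signal is represented by two triples instead of sixteen samples. In a dictionary learning setting the atoms themselves would additionally be updated from the data, which this sketch omits.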
As a result, the model can adapt automatically to different machine configurations, and environmental and operational conditions with a minimum of initial configuration. In addition, sparse coding results in reduced data rates that can simplify the processing and communication of information in resource-constrained systems. Measures that can be used to detect anomalies in a rotating machine are introduced and studied, such as the dictionary distance between an online propagated dictionary and a set of dictionaries learned when the machine is known to operate in healthy conditions. In addition, the possibility of generalizing a dictionary learned from the vibration signal in one machine to another similar machine is studied in the case of wind turbines. The main contributions of this thesis are the extension of unsupervised dictionary learning to condition monitoring for anomaly detection purposes, and the related case studies demonstrating that the learned features can be used to obtain information about the condition. The case studies include vibration signals from controlled ball bearing experiments and wind turbines, and acoustic emission signals from controlled tensile strength tests and bearing contamination experiments. It is found that the dictionary distance between an online propagated dictionary and a baseline dictionary trained in healthy conditions can increase up to three times when a fault appears, without reference to kinematic information such as defect frequencies. Furthermore, it is found that in the presence of a bearing defect, impulse-like waveforms with center frequencies that are about two times higher than in the healthy condition are learned. In the case of acoustic emission analysis, it is shown that the representations of signals of different strain stages of stainless steel appear as distinct clusters.
Furthermore, the repetition rates of learned acoustic emission waveforms are found to be markedly different for a bearing with and without particles in the lubricant, especially at high rotational speeds above 1000 rpm, where particle contaminants are difficult to detect using conventional methods. Different hyperparameters are investigated and it is found that the model is useful for anomaly detection with as little as 2.5 % preserved coefficients.
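The abstract does not define the dictionary distance, so the following is only a sketch of one plausible choice, given as an assumption: each atom is matched to its most similar atom in the other dictionary by absolute cosine similarity, and the mismatch is averaged in both directions. The names `dictionary_distance` and `distance_to_baseline`, the equal-length atoms, and the toy dictionaries are hypothetical, and shifts between atoms are ignored.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length atoms."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def one_sided(dict_a, dict_b):
    """Mean mismatch of each atom in dict_a to its best match in dict_b."""
    total = sum(1.0 - max(abs(cosine(a, b)) for b in dict_b) for a in dict_a)
    return total / len(dict_a)

def dictionary_distance(dict_a, dict_b):
    """Symmetric distance (assumed definition): 0 for dictionaries that are
    identical up to atom order, sign and scale; larger when atoms differ."""
    return 0.5 * (one_sided(dict_a, dict_b) + one_sided(dict_b, dict_a))

def distance_to_baseline(current, baselines):
    """Distance from the current dictionary to the closest healthy baseline."""
    return min(dictionary_distance(current, b) for b in baselines)

# Toy dictionaries (assumed data): one healthy baseline, two later snapshots.
baseline = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
unchanged = [[0.0, -2.0, 0.0], [1.0, 0.0, 0.0]]  # reordered, rescaled, sign-flipped
drifted = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]     # one atom has changed shape

d_same = distance_to_baseline(unchanged, [baseline])
d_drift = distance_to_baseline(drifted, [baseline])
```

Under this assumed definition, a dictionary that only differs by atom order, sign or scale stays at distance zero from the baseline, while a dictionary in which an atom has changed shape moves away from it. An anomaly indicator could then compare the current distance with the values observed during healthy operation.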
