Technical Language Supervision for Intelligent Fault Diagnosis

Abstract: Condition Monitoring (CM) is widely used in industry to meet sustainability, safety, and equipment efficiency requirements. Intelligent Fault Diagnosis (IFD) research focuses on automating CM data analysis tasks, to detect and prevent machine faults, and provide decision support. IFD enables trained analysts to focus their efforts on advanced tasks such as fault severity estimation and preventive maintenance optimization, instead of performing routine tasks. Industry datasets are rarely labelled, and IFD models are therefore typically trained on labelled data generated in laboratory environments with artificial or accelerated fault development. In the process industry, fault characteristics are often context-dependent and difficult to predict in sufficient detail due to the heterogeneous environment of machine parts. Furthermore, fault development is non-linear and measurements are subject to varying background noise. Thus, IFD models trained on lab data are not expected to transfer well to process industry environments, and require on-site pre-training or fine-tuning to facilitate accurate and advanced fault diagnosis. While ground truth labels are absent in industrial CM datasets, analysts sometimes write annotations of faults and maintenance work orders that describe the fault characteristics and required actions. These annotations deviate from typical natural language due to the technical language used, characterised by a high frequency of technical terms and abbreviations. Recent advances in natural language processing have enabled simultaneous learning from unlabelled pairs of images and captions through Natural Language Supervision (NLS). In this thesis, opportunities to enable weakly supervised IFD using annotated but otherwise unlabelled CM data are investigated. This thesis proposes novel machine learning methods for joint representation learning for IFD directly on annotated CM data. 
The main contributions are: (1) the introduction and implementation of technical language supervision to merge advances in natural language processing and IFD, including a literature survey; (2) the creation of a method to improve technical language processing by substituting out-of-vocabulary technical words with natural language descriptions, and to evaluate language model performance without explicit labels or downstream tasks; (3) the creation of a method for small-data language-based fault classification using human-centric visualisation and clustering. Preliminary results for sensor and cable fault detection show an accuracy of over 90%. These results imply a considerable increase in the value of annotated CM datasets through the implementation of IFD models directly on industry data, e.g. for improving decision support to avoid unplanned stops.
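
The substitution idea in contribution (2) can be illustrated with a minimal sketch. This is not the thesis implementation: the function name `substitute_oov`, the glossary entries, and the toy vocabulary are all hypothetical and chosen only for illustration. The sketch assumes a language model vocabulary given as a set of known tokens and a hand-written glossary that maps technical abbreviations to natural language descriptions.

```python
import re

# Hypothetical glossary (assumption): technical abbreviations mapped to
# natural language descriptions. The entries are illustrative only.
GLOSSARY = {
    "bpfo": "ball pass frequency outer race",
    "de": "drive end",
    "wo": "work order",
}


def substitute_oov(text, vocabulary, glossary):
    """Replace out-of-vocabulary tokens that have a glossary entry.

    Tokens already in the vocabulary are kept unchanged. Tokens that are
    neither in the vocabulary nor in the glossary are also kept unchanged.
    """
    # Lowercase, then split into word tokens and single punctuation marks.
    tokens = re.findall(r"\w+|[^\w\s]", text.lower())
    substituted = []
    for token in tokens:
        if token in vocabulary:
            substituted.append(token)
        else:
            # Fall back to the original token when no description exists.
            substituted.append(glossary.get(token, token))
    return " ".join(substituted)


# Toy vocabulary (assumption) standing in for a language model's vocabulary.
vocabulary = {"fault", "on", "bearing", "replace", "sensor"}
annotation = "BPFO fault on DE bearing"
expanded = substitute_oov(annotation, vocabulary, GLOSSARY)
```

In this sketch the annotation is expanded so that a general-purpose language model sees only words it is likely to have encountered during pre-training; how the glossary is built and how the resulting embeddings are evaluated are outside the scope of the example.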
