Journeys in vector space: Using deep neural network representations to aid automotive software engineering

Abstract: Context - The automotive industry is in the midst of a transformation in which software is becoming the primary tool for delivering value to customers. While this has vastly improved product offerings, vehicle manufacturers face an urgent need to continuously develop, test, and deliver functionality while maintaining high levels of quality. Increasing digitalization over the past decade opens an interesting avenue for addressing this need: data. With engineering and vehicle-operation activities increasingly recorded as data, and with rapid advances in machine learning, this work takes a data-driven, deep learning approach to solving tasks in automotive software engineering. Scope - This work focuses on two automotive software engineering tasks: (1) assessing whether embedded software complies with specified design guidelines, and (2) generating realistic stimuli to test embedded software in virtual rigs. Contributions - First, as the main tool for solving the design compliance task, we train tasnet, a language model of automotive software. We then introduce DECO, a rule-based algorithm that assesses the compliance of query programs with the Controller-Handler automotive software design pattern. Exploiting the property of semantic regularity in language models, DECO conducts this assessment by comparing the geometric alignment between query and benchmark programs in tasnet's representation space. Second, focusing on stimulus generation, we train logan, a deep generative model of in-vehicle behavior. We then introduce MLERP, a rule-based algorithm that takes user-specified test conditions and samples logan to generate realistic test stimuli that adhere to those conditions. Using the property that interpolation in representation space yields semantic combination, MLERP generates novel stimuli within the boundaries of the specification.
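To make the two representation-space operations mentioned so far concrete, the following minimal Python sketch shows cosine alignment between two offset vectors and linear interpolation between two latent vectors. It is not the DECO or MLERP algorithm, whose details are not given in this abstract; the function names `cosine_alignment` and `lerp` and all vector values are assumptions made purely for illustration.

```python
import numpy as np

def cosine_alignment(u, v):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def lerp(z_a, z_b, t):
    """Linear interpolation between two latent vectors, with t in [0, 1]."""
    z_a = np.asarray(z_a, dtype=float)
    z_b = np.asarray(z_b, dtype=float)
    return (1.0 - t) * z_a + t * z_b

# Hypothetical offsets between paired program embeddings (toy values,
# not taken from tasnet).
benchmark_offset = np.array([1.0, 0.0, 1.0])
query_offset = np.array([2.0, 0.0, 2.0])
alignment = cosine_alignment(query_offset, benchmark_offset)

# Hypothetical latent codes for two test conditions (toy values,
# not taken from logan); z_mid lies halfway between them.
z_mid = lerp([0.0, 2.0], [4.0, 0.0], 0.5)
```

In this sketch, an alignment close to 1 would indicate that the query offset points in the same direction as the benchmark offset, and decoding an interpolated code such as `z_mid` with a generative model would be expected to blend the two conditions.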
Third, staying with the testing use case, we improve upon logan to train silgan, which simplifies the specification of test conditions. Then, noting that sampling a generative model can be inefficient, we introduce GRADES, a rule-based algorithm that uses a specially constructed objective to search for stimuli. GRADES builds on the fact that the neural networks in silgan are differentiable, so that, given an appropriate objective, a gradient descent-based search in the model's representation space efficiently yields suitable stimuli. Fourth, we note that our recipe for solving automotive software engineering tasks consistently pairs a self-supervised foundation model with a rule-based algorithm operating in the model's representation space. This paradigm for building predictive models, which we refer to as 'pre-train and calculate', not only extracts nuanced predictions without any supervision but is also relatively transparent. Fifth, because our predictive approach relies heavily on properties of abstract representation space, we develop techniques that explain and characterize selected high-dimensional vector spaces. Overall, by taking a data-driven deep learning approach, the techniques we introduce reduce the manual effort involved in two crucial engineering tasks. This directly improves the cadence of automotive software engineering without compromising the quality of delivery.
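The idea of searching a differentiable model's representation space by gradient descent can be sketched as follows. This is a toy illustration only, not GRADES itself: the generator is replaced by a hypothetical linear map `W`, the objective is an assumed squared distance to a target output, and the names `latent_search` and `objective` are invented for this example.

```python
import numpy as np

def objective(W, z, target):
    """Squared distance between the generated output W @ z and the target."""
    return float(np.sum((W @ z - target) ** 2))

def latent_search(W, target, steps=5000):
    """Gradient descent over the latent vector z to minimise the objective."""
    # Step size derived from the largest singular value of W, which keeps
    # the descent stable for this quadratic objective.
    lr = 0.5 / np.linalg.norm(W, 2) ** 2
    z = np.zeros(W.shape[1])
    for _ in range(steps):
        residual = W @ z - target
        grad = 2.0 * W.T @ residual  # analytic gradient with respect to z
        z -= lr * grad
    return z

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 3))              # toy linear "generator" (assumption)
target = W @ np.array([1.0, -2.0, 0.5])  # an output the generator can reach
z_found = latent_search(W, target)
```

With a real neural generator, the analytic gradient would be replaced by automatic differentiation, and the objective would encode the test conditions rather than a fixed target output.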