Learning of Timed Systems

University dissertation from Uppsala : Acta Universitatis Upsaliensis

Abstract: Regular inference is a research direction in machine learning. The goal of regular inference is to construct a representation of a regular language in the form of deterministic finite automaton (DFA) based on the set of positive and negative examples. DFAs take strings of symbols (words) as input, and produce a binary classification as output, indicating whether the word belongs to the language or not. There are two types of learning algorithms for DFAs: passive and active learning algorithms. In passive learning, the set of positive and negative examples is given and not chosen by inference algorithm. In contrast, in active learning, the learning algorithm chooses examples from which a model is constructed.Active learning was introduced in 1987 by Dana Angluin. She presented the L' algorithm for learning DFAs by asking membership and equivalence queries to a teacher who knows the regular language accepted by DFA to be learned. A membership query checks whether a word belongs to the language or not. An equivalence query checks whether a hypothesized model is equivalent to the DFA to be learned.The L' algorithm has been found to be useful in different areas, including black box checking, compositional verification and integration testing. There are also other algorithms similar to L' for regular inference. However, the learning of timed systems has not been studied before. This thesis presents algorithms for learning timed systems in an active learning framework.As a model of timed system we choose event-recording automata (ERAs), a determinizable subclass of the widely used timed automata. The advantages of ERA in comparison with timed automata, is that it is known priori the set of clocks of an ERA and when clocks are reset. The contribution of this thesis is four algorithms for learning deterministic event-recording automaton (DERA). Two algorithms learn a subclass of DERA, called event-deterministic ERA (EDERA) and two algorithms learn general DERA.The problem with DERAs that they do not have canonical form. Therefore we focus on subclass of DERAs that have canonical representation, EDERA, and apply the L' algorithm to learn EDERAs. The L' algorithm in timed setting requires a procedure that learns clock guards of DERAs. This approach constructs EDERAs which are exponentially bigger than automaton to be learned. Another procedure can be used to lean smaller EDERAs, but it requires to solve NP-hard problem.We also use the L' algorithm to learn general DERA. One drawback of this approach that inferred DERAs have a form of region graph and there is blow-up in the number of transitions. Therefore we introduce an algorithm for learning DERA which uses a new data structure for organising results of queries, called a timed decision tree, and avoids region graph construction. Theoretically this algorithm can construct bigger DERA than the L' algorithm, but in the average case we expect better performance.

  CLICK HERE TO DOWNLOAD THE WHOLE DISSERTATION. (in PDF format)