On Solving String Constraints

Abstract: Software systems are deeply involved in diverse human activities, as everyone uses a variety of software systems on a daily basis. It is essential to guarantee that these systems work correctly. Two popular methods for finding failures in software systems are testing and model checking. Many efficient testing and model checking approaches are satisfiability-based: at their core, Satisfiability Modulo Theories (SMT) solvers decide path feasibility and/or reachability problems. The significant growth of string-manipulating programs in modern programming languages, including Python and JavaScript, demands SMT solvers capable of analysing string constraints. This thesis proposes two frameworks for checking the satisfiability of extensive classes of string constraints, discovers a new decidable fragment of string constraints, and introduces efficient solvers for string constraints.

The first framework for checking the satisfiability of string constraints is based on the Counter-Example Guided Abstraction Refinement (CEGAR) procedure and is applicable to diverse classes of string constraints. Notably, the framework is the first to support both context-free membership constraints and transducer constraints. It has two components: an under-approximation and an over-approximation. The under-approximation uses flat automata to restrict the search for a solution to strings generated by a flat automaton. The over-approximation abstracts the input constraints and produces a counter-example of the abstraction.

In the second framework for checking the satisfiability of string constraints, the under-approximation uses parametric flat automata to restrict the domain of variables, which yields better performance. Furthermore, the second framework is capable of solving string-number conversion constraints. This is a crucial capability, since string-number conversion is part of the core semantics of numerous programming languages such as Python and JavaScript.

The thesis introduces a new decidable fragment of string constraints, called weakly-chaining. This fragment pushes the borders of decidability of string constraints by generalising both the existing straight-line fragment and the acyclic fragment of the string logic. The new fragment is practically useful: string solvers provide no termination guarantee on string constraints in general, and the fragment lets them guarantee termination in many more cases.

The thesis also presents three efficient solvers for string constraints, called Trau, Trau+, and Z3-Trau. Trau uses the first framework presented above and is capable of solving a large class of constraints, including transducer and context-free membership constraints. Trau+ is a later version of Trau and implements the decision procedure for the weakly-chaining fragment in the over-approximation. Z3-Trau follows the second framework above and uses parametric flat automata to under-approximate the domain of variables. The three solvers are evaluated on both existing and newly generated benchmarks. The evaluation results show that they significantly outperform other state-of-the-art string solvers.
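
As a rough illustration of the under-approximation idea mentioned in the abstract, the following toy Python sketch restricts each string variable to words of the flat shape w1^k1 w2^k2 ... wn^kn and searches only that restricted domain. Everything here is an assumption made for illustration: the function names, the bound `max_iter`, the example constraint, and the brute-force enumeration are hypothetical and are not taken from Trau, Trau+, or Z3-Trau.

```python
from itertools import product


def flat_words(loops, max_iter):
    """Yield every word w1^k1 w2^k2 ... wn^kn with 0 <= ki <= max_iter."""
    for counts in product(range(max_iter + 1), repeat=len(loops)):
        yield "".join(word * k for word, k in zip(loops, counts))


def solve_under_approx(constraint, variables, loops, max_iter):
    """Look for a satisfying assignment where each variable is a flat word.

    Returns a dict on success. Returns None when the restricted domain holds
    no solution, which says nothing about satisfiability in general.
    """
    # Deduplicate and order candidates by length so short solutions come first.
    domain = sorted(set(flat_words(loops, max_iter)), key=lambda w: (len(w), w))
    for values in product(domain, repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if constraint(assignment):
            return assignment
    return None


def example_constraint(assignment):
    """Hypothetical constraint: x . "ab" = "ab" . x  and  |x| >= 4."""
    x = assignment["x"]
    return x + "ab" == "ab" + x and len(x) >= 4


if __name__ == "__main__":
    # Restricting x to (ab)^k finds a solution; restricting it to a^k does not.
    print(solve_under_approx(example_constraint, ["x"], ["ab"], 3))
    print(solve_under_approx(example_constraint, ["x"], ["a"], 3))
```

A failed search over one flat shape only rules out that shape, so in a CEGAR-style loop such a result would typically prompt a refinement, for example trying a richer shape or consulting an over-approximation, instead of reporting unsatisfiability.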
