Strings are a fundamental data type in virtually all programming languages. Their generic nature can, however, lead to many subtle programming bugs, some with security consequences, e.g., cross-site scripting (XSS), which is among the OWASP Top 10 Application Security Risks. One effective automatic testing method for identifying subtle programming errors is based on symbolic execution.
Symbolic execution of string manipulating programs is essentially to solve the feasibility of the execution paths in those programs. A path in string manipulating programs is given as a finite sequence of assignments and assertions (i.e. without loops), and checking its feasibility amounts to solving the satisfiability of string constraints obtained from the assignments and assertions in the path. Modern programming languages (e.g. JavaScript, PHP, and Python) support many complex string operations, and strings are also often implicitly modified during a computation in some intricate fashion (e.g. by some autoescaping mechanisms). Therefore, supporting complex string operations in string constraint solvers is desirable in order to smooth the symbolic execution of string manipulating programs.
Dr. WU Zhilin, an associate research professor from State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences, conducted a study on decision procedures for path feasibility of string manipulating programs with complex string operations.
Dr. WU and his co-authors proposed two general semantic conditions ensuring the decidability of path feasibility problem, and designed one conceptually simple and generic automata-based decision procedure for string manipulating programs satisfying the two semantic conditions. Moreover, they provided an implementation of the decision procedure and developed a novel string constraint solver called OSTRICH. The OSTRICH solver, on the one hand, is expressive, since it has built-in support for complex string operations like concatenation, reverse, functional transducers, and replace All function, and on the other hand, is efficient, since its performance is comparable to the current main-stream string constraint solvers when solving the widely-used string constraint benchmarks. Furthermore, OSTRICH is extensible since the generic decision procedure allows users to introduce their own new string operations satisfying the semantic conditions, while still preserving the completeness.
The results demonstrate that in string constraint solving, it is possible to achieve completeness without sacrificing efficacy, in contrast to the state of the art that almost all string constraint solvers resort to heuristics without completeness guarantee, especially when involving complex string operations.
The study entitled “Decision Procedures for Path Feasibility of String-Manipulating Programs with Complex Operations” has been published in the 46th ACM SIGPLAN Symposium on Principles of Programming Languages (POPL 2019).