Sunday, May 31, 2015

Automatic Test Case Generation: Part 1

While discussing software test with a co-worker, we thought it would be interesting to see if we could automatically generate inputs to an arbitrary function to see if we can stimulate every branch and every conditional in an application.

Academic papers and commercial tools exist to do this, but I would like the experience of doing this myself. Additionally, furthering my understanding of programming language techniques, building a virtual machine, and working with Constraint Problem Solvers sounds very interesting.

There will be multiple steps to this process:
  1. Define a simplified programming language to target.
  2. Develop lexer/parser
  3. Develop an interpreter/VM that can:
    1. Evaluate a given function
    2. Run test cases from a file
    3. Provide Statement, MC/DC Coverage statistics
    4. Display in a human readable format
  4. Write a constraint solver. After sufficiently toiling with my custom constraint solver, use a open source constraint solver.
This post will cover Steps 1 and 2. 
First, defining the language. The design factors for this language are:
  • Easy to write & parse
  • Limited types
    • boolean, uint32, int32, float32
    • Types can be expanded as needed, but we will start with a limited set. 
  • Limited operators
    •  'NOT', 'AND', 'OR', 'XOR', 'GT', 'LT', 'EQ'
    • Can also be expanded as needed. 
  • No local variables
    • Support for this could be added, but for now we will skip it. 
  • Limited control flow
    • Allow if/else
    • No loops (for, while). See below rule.
    • Recursion is allowed. See above rule.
With these limited requirements, a sample test script can be created:

Using ANTLR4, a G4 grammar definition can also be created. I am relatively unfamiliar with this tool. I suspect the grammar could be cleaner, but this should be adequate for now. The executionStatement -> evaluationStatement -> operatorStatement part of the tree seemingly should be simplified to operatorStatement, but I don't know how to do this in ANTLR4, yet.

Running this grammar file against our sample, ANTLR4 provides us with a reasonable representation.


In the next part, I will develop a virtual machine to execute this script.  

No comments:

Post a Comment