MODELING AND MARSHALING: MAKING TESTS FROM MODEL CHECKER COUNTEREXAMPLES

Paul E. Black
National Institute of Standards and Technology
Gaithersburg, Maryland
paul.black@nist.gov

ABSTRACT

Recently model checkers have been applied to software areas such as analyzing protocols and algorithms, measuring test adequacy, and generating abstract tests from formal models. When model checkers are used to generate tests, the generated tests are execution traces of the models. Thus the type, occurrence, and order of variables, calls, and events in the execution traces are intimately tied to the choice of modeling representation. We briefly review how to use a model checker to generate tests from a high-level representation, such as MATLAB, UML, or SCR. Since the model checker uses a particular general model, the analyst has choices about how a piece of software may be modeled. We list some choices and discuss their advantages and disadvantages. We also describe a program to marshal model variables from the resulting model checker traces and translate them into function calls, program variables, and other software artifacts.

INTRODUCTION

A model checker is a formal evaluation tool whose inputs are a state machine and a set of temporal logic predicates about possible executions of the state machine. If a predicate is not true, that is, if it is inconsistent with the state machine, the model checker reports the inconsistency and, if possible, gives a counterexample: an execution trace of the state machine showing how the predicate fails. Among the popular model checkers are SMV [1] and Spin [2].

Figure 1 is an example of a simple state machine specification in SMV. One variable, pressure, is an input; that is, its value may change arbitrarily. The other variable, mode, is just a discrete summary of pressure.

VAR
    mode : {tooLow, okay, tooHigh};
    pressure : 0..300;
ASSIGN
    mode := case
        pressure > 150 : tooHigh;
        pressure >= 50 : okay;
        1 : tooLow;
    esac;

Figure 1. A State Machine Specification in SMV

In addition to the state machine, we can declare predicates quantified over possible alternative executions or over future states. For instance, we may say that in every state, the mode may eventually become too high, or that in every state, if the pressure is less than 50, the mode is too low.

SPEC AG EF mode = tooHigh
SPEC AG (pressure < 50 -> mode = tooLow)

The temporal operator AG means that for all (A) alternative executions, and for all states along (G) those paths beginning at the current one, the predicate holds. The operator EF means that there exists (E) some alternative execution in which there exists some future (F) state where the predicate holds. Briefly, the first predicate says, "In every state, the mode may become too high."

Although model checking began as a method for verifying hardware, it has been used successfully to analyze software, such as air traffic flight rules, protocols, operating systems, and security. Its advantage over other formal methods is that checking predicates is completely automatic, although some models may take exponential time.

GENERATING TESTS FROM SPECIFICATIONS

In this section we outline a relatively new method to automatically generate complete test cases from specifications [3]. The method is automatic in that no human input is needed once the model is created.

Before describing the method, let us define a few terms. A complete test case includes expected output as well as input. A test criterion is an overall strategic goal, or a judgment about which aspects of the software one wishes to test. Some criteria are code branch coverage, boundary testing, random testing, use cases, and mutation adequacy. A test objective is a specific, tactical goal. Here are some typical test objectives.

* Execute the program so branch #27 is not taken.
* Choose inputs just within the x= ...
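Following the approach of [3], a test objective can be phrased as a temporal logic claim that the objective is never satisfied; the model checker refutes the claim, and the counterexample is precisely the execution trace we want. The sketch below applies this idea to the Figure 1 model; the particular claim is ours, for illustration.

-- Trap property: claim that the mode never becomes tooHigh.  The claim
-- is false, so the model checker produces a counterexample: a trace in
-- which pressure rises above 150.  The trace supplies both the inputs
-- (pressure) and the expected results (mode) of a complete test case.
SPEC AG !(mode = tooHigh)

Writing one such claim for each test objective yields a set of abstract tests, which must then be translated back into concrete software artifacts.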
One modeling choice concerns how values are skewed across simulation steps. Consider a model of an operation in which a process at one level tries to set a file's level to a new level: the operation succeeds, and the status is one, only if the process level is at least the file level. Suppose both the next status and the next file level are computed in the same step, each in terms of the other:

next(file_level) := case
    next(status) : next(new_level);
    1 : file_level;
esac;
next(status) := proc_level >= next(file_level);

Consider the possible results if the process level is three, the file level is five, and the new level is two. Since the process level is less than the file level, the operation fails in the actual system. SMV could consistently make the status zero (fail) and the file level five (unchanged). However, since SMV has parallel assignment and constraint satisfaction, it could also consistently make the status one (succeed) and the file level two (the new level), which is wrong!

Splitting the computation between steps prevents a dependency loop between the variables. The preceding example may be corrected by referring to the status in the current step, not the next step.

next(file_level) := case
    status : next(new_level);
    1 : file_level;
esac;
status := proc_level >= file_level;

Figure 5 diagrams the two-phase update that prevents a dependency loop. We leave out the new level to make the diagram clearer. Note that the status is computed from the values in the same simulation step, and the file level is computed from the previous status.

(Figure not available)
Figure 5. Two-Phase Update Prevents Loops.

The configuration file for this model reflects the skew in the variables. The process level is carried from one step to the next, where it is used with the new level to compute the resulting file level. The status is not reported because it is an artifact of modeling, not a value in the software. We want to report the new level and check the file level every step, whether they change or not, so they are marked accordingly.

variableSkew = {
    proc_level STEP
    new_level, EVERY
    file_level, EVERY
}

In the variant diagrammed in Figure 6, both the status computation and the state update are delayed until the next step. The change to the SMV is to refer to next(status) both in the computation of the next file level and on the left-hand side of the status computation. The diagram in Figure 6 makes the difference quite apparent.

(Figure not available)
Figure 6. Two-Phase Update, Delayed.

Implementation Skew

The final example we show is implementation skew. Some software is event, or edge, sensitive instead of value, or level, sensitive; that is, some history is needed to determine the next state. In SMV this history must be kept in new variables that hold previous values of other variables. For instance, the automobile cruise control mode [3] changes from cruise to override when the brake is pressed. When the brake is released, the mode remains override until Resume goes true. Even if Resume were already true when the brake is released, the mode would stay override: there must be a Resume event, that is, Resume was false and becomes true, for the mode to return to cruise. Figure 7 shows the skew of the variables holding previous values, and a short SMV sketch follows below.

(Figure not available)
Figure 7. Implementation Skew.
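As a concrete sketch of such a history variable, the following SMV fragment detects a Resume event. The variable names, the initial value, and the omission of the rest of the cruise control logic are our simplifications for illustration.

VAR
    Resume     : boolean;  -- input: the Resume lever is held
    prevResume : boolean;  -- history variable: Resume one step earlier
ASSIGN
    init(prevResume) := 0;       -- assume Resume starts out released
    next(prevResume) := Resume;  -- carry the current value into the next step
DEFINE
    -- a Resume event occurs only when Resume was false and becomes true
    resumeEvent := Resume & !prevResume;

Like the status in the earlier example, prevResume is an artifact of modeling, so the marshaling configuration would not report it in the resulting tests.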
FUTURE WORK

We are recruiting industrial and academic groups to develop an open-source reader for Matlab Stateflow and Simulink models. In addition to making the models available to other analysis tools, we will use the reader as the basis of a tool to convert Matlab models to SMV. We are also working on a UML Statechart to SMV converter.

Along with researchers at UMBC and GMU, we are working on identifying and grading additional test criteria, such as mutation adequacy and transition-pair coverage. We hope to be able to suggest which criteria are best for different specification styles and testing needs. Finally, we are applying these techniques to large, real specifications from industry to validate them further and make them practically useful. The marshaling program must also become more flexible, intuitive, and complete.

CONCLUSIONS

We explained a recent technique to automatically generate complete tests from formal specifications, and we showed how modeling decisions affect the resulting counterexamples. This paper also outlined a table-driven program to reverse the modeling transformations so that the tests correspond to the original software. Eventually all designers should benefit from having a "bag of tricks" that includes state machine modeling and variable marshaling techniques.

REFERENCES

[1] McMillan, Kenneth L., 1993, Symbolic Model Checking, Kluwer Academic Publishers.

[2] Holzmann, Gerald J., 1997, The Model Checker SPIN, IEEE Trans. on Software Eng., 23(5), pp. 279-295.

[3] Ammann, Paul A., Paul E. Black, and William Majurski, 1998, Using Model Checking to Generate Tests from Specifications, Proc. 2nd IEEE Intern'l Conf. on Formal Eng. Methods, IEEE, pp. 46-54.

[4] Atlee, Joanne M., and M. A. Buckley, 1996, A Logic-Model Semantics for SCR Software Requirements, Proc. 1996 Intern'l Symp. on Software Testing and Analysis, pp. 280-292.

[5] Ammann, Paul A., and Paul E. Black, 1999, Abstracting Formal Specifications to Generate Software Tests via Model Checking, Proc. 18th Digital Avionics Systems Conf., IEEE, section 10.A.6.

[6] Majurski, William J., 2000, Issues in Software Component Testing, to appear in Computing Surveys, ACM.