U2.T2 |
Provides T2's main APIs; this document also explains how the T2 engine works.
Note: this is prototype version 0.3. Its engine works very differently from
the previous versions.
Random testing is fully automatic, though one must keep in mind
that the resulting code coverage is very limited. It should be used as
a complement to other testing methods.
The units subjected to testing by T2 are Java classes. T2 checks
the class invariant as well as the specifications of methods, if they
are provided. T2 does not use a special specification language. All
specifications are in plain Java.
How T2 tests
Testing a class is done by a test engine, which is implemented by the
class {@link U2.T2.RndEngine RndEngine}. An instance of this class is
what we mean by a T2 test engine.
Given a target class C, a test engine tests C by generating
random executions. Each execution starts by creating an
instance of C, which takes the role of the target
object. During each step of the execution, the test engine either
randomly updates a field of the target object, or randomly calls a
method of C. When a method m is called, the engine either passes the
target object as the receiver of m, or as a parameter. After each
step, the target object will be checked against the class invariant of
C, if one is specified. Furthermore, if the step calls a method,
internal errors and runtime exceptions are checked. If the method has
a specification, it will be checked as well. See also the doc of
{@link U2.T2.RndEngine RndEngine}.
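The loop described above can be sketched roughly as follows. This is an illustration only, not the code of RndEngine. The names Counter, RandomExecutionSketch and runExecution are hypothetical, field updates and method parameters are left out, and the class invariant is passed in as a plain predicate rather than read from the target class.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Random;
import java.util.function.Predicate;

// Hypothetical target class; dec() can break the invariant "value >= 0".
class Counter {
    int value = 0;
    public void inc() { value++; }
    public void dec() { value--; }
}

final class RandomExecutionSketch {

    /**
     * Runs one random execution on the given target object.
     * Returns the index of the first violating step, or -1 if none was found.
     */
    static <T> int runExecution(T target, Predicate<T> invariant, int maxSteps, long seed) {
        Random rnd = new Random(seed);

        // Simplification: only public, non-static, parameterless methods are candidates.
        List<Method> candidates = new ArrayList<>();
        for (Method m : target.getClass().getDeclaredMethods()) {
            int mod = m.getModifiers();
            if (Modifier.isPublic(mod) && !Modifier.isStatic(mod) && m.getParameterCount() == 0) {
                m.setAccessible(true);
                candidates.add(m);
            }
        }
        // getDeclaredMethods() has no guaranteed order; sort so that a seed is reproducible.
        candidates.sort(Comparator.comparing(Method::getName));
        if (candidates.isEmpty()) return -1;

        for (int step = 0; step < maxSteps; step++) {
            Method m = candidates.get(rnd.nextInt(candidates.size()));
            try {
                m.invoke(target);                       // the target object is the receiver
            } catch (InvocationTargetException e) {
                Throwable cause = e.getCause();
                // Runtime exceptions and errors thrown by the method count as violations.
                if (cause instanceof RuntimeException || cause instanceof Error) return step;
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
            if (!invariant.test(target)) return step;   // class invariant violated
        }
        return -1;
    }

    public static void main(String[] args) {
        Counter c = new Counter();
        int step = runExecution(c, x -> x.value >= 0, 100, 42L);
        System.out.println(step < 0 ? "no violation found" : "violation at step " + step);
    }
}
```

The invariant is checked after every step, so an execution stops at the first step that breaks it; in the Counter example that is the first moment value drops to -1.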
When a violation is found, the execution will be reported. To
report an execution we will need to print the state of involved
objects at each step of the execution. This means that we need to be
able to replay an execution. To be able to do this we maintain a meta
representation of the ongoing execution. This meta representation is
called {@link U2.T2.Trace execution trace} (though internally it is
actually a tree rather than a linear trace). The important thing about
this execution trace is that it allows us to replay the actual
execution, exactly as it was. Furthermore, execution traces can be
saved, so they can be used for regression testing.
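To make the idea of a replayable meta representation concrete, here is a minimal sketch. It is not the actual {@link U2.T2.Trace} class: the names Account and TraceSketch are hypothetical, the trace is a plain list rather than a tree, and only parameterless method calls are recorded.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

// Hypothetical target class, used only for this illustration.
class Account {
    int balance = 0;
    public void deposit() { balance += 10; }
    public void withdraw() { balance -= 3; }
}

final class TraceSketch {
    // A linear trace: the names of the parameterless methods called, in order.
    // Method names are plain strings, so such a trace is easy to save.
    private final List<String> steps = new ArrayList<>();

    /** Calls the named method on the target and records the step. */
    void callAndRecord(Object target, String methodName) {
        invoke(target, methodName);
        steps.add(methodName);
    }

    /** Replays every recorded step, in order, on a fresh target object. */
    void replay(Object freshTarget) {
        for (String name : steps) {
            invoke(freshTarget, name);
        }
    }

    private static void invoke(Object target, String name) {
        try {
            Method m = target.getClass().getDeclaredMethod(name);
            m.setAccessible(true);
            m.invoke(target);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot call " + name, e);
        }
    }
}
```

Because the trace stores what was done rather than the resulting objects, replaying it on a fresh instance reproduces the same sequence of states.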
During an execution the test engine will need to generate objects,
e.g. to be passed as parameters when methods are called. Since in real
execution objects may be linked to each other, the test engine has to
be able to reuse old objects rather than keep generating fresh
objects. To facilitate this the test engine maintains an {@link
U2.T2.Pool object pool}. Whenever objects are created during an
execution, they are put in the pool. When an execution needs an
object, the engine can decide to just pick one (of the right type)
from the pool rather than creating a fresh one. Each object in the
pool also receives a unique integer ID. When an object from the pool is
reused, its ID is recorded in the execution trace, so that when the
trace is replayed we know exactly which object was used, e.g. as an
argument to a method call.
Whenever a new execution is started, the pool has to be
reset. This ensures that we start from a fresh pool, free from the side
effects of the previous execution.
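A pool with integer IDs and a reset operation could look roughly like the sketch below. This is not the actual {@link U2.T2.Pool} class: PoolSketch and its method names are hypothetical, and using the list index as the ID is an assumption made for brevity.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

final class PoolSketch {
    private final List<Object> objects = new ArrayList<>();

    /** Adds a newly created object and returns its ID (here: its index in the pool). */
    int put(Object o) {
        objects.add(o);
        return objects.size() - 1;
    }

    /** Looks up an object by ID, as needed when a trace is replayed. */
    Object get(int id) {
        return objects.get(id);
    }

    /** Picks a random pooled object of the requested type; returns its ID, or -1 if none. */
    int pickRandom(Class<?> type, Random rnd) {
        List<Integer> ids = new ArrayList<>();
        for (int i = 0; i < objects.size(); i++) {
            if (type.isInstance(objects.get(i))) ids.add(i);
        }
        return ids.isEmpty() ? -1 : ids.get(rnd.nextInt(ids.size()));
    }

    /** Empties the pool before a new execution starts. */
    void reset() {
        objects.clear();
    }
}
```

Since a replayed execution creates its objects in the same order as the original one, the same object receives the same ID both times, which is what makes an ID stored in a trace meaningful.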
The algorithm for 'generating' objects is actually a bit more
complicated than described above. Suppose the engine has to generate an object
of class E. The algorithm is as follows:
- Since E can be an interface or an abstract class, and hence
has no constructor, the engine first consults an {@link
U2.T2.InterfaceMap interface map}. This map can tell the engine to
generate an instance of another class E' instead, which should be a
concrete implementation of E. Currently we only provide a very small
and incomplete interface map, which is hardwired in the engine; it
should be made parameterizable in the future. If you implement
your own interface map, it is the map's responsibility to provide a
consistent mapping.
- Next the engine tries to find an instance of E (or E') in a
{@link U2.T2.BaseDomain base domain}. The domain is passed to the
engine upon creation. A base domain is essentially just a set of
objects. When the engine can find an instance of E in the domain, it
will clone it and use the clone.
Because the base domain is always checked first we can use it to
limit the base value space from which the engine generates objects.
For example, if the only integers in the base domain are -1 and 1,
then these will be the only integers the engine generates whenever it
needs one. Alternatively, we can choose to use a base domain that can
supply a random integer from the entire range of int values.
Because of the cloning, objects in the base domain are also safe
from the engine's side effects (in contrast, objects in the pool are
not, and should not be, protected from the engine's side effects). The
cloning relies on serialization, so we should only put serializable
objects in a base domain. The cloning step is not done by the base
domain itself; it is the responsibility of the engine.
When looking for an instance of E in a base domain we do
not consider instances of subclasses of E.
- Only when it cannot find an instance of E (or E') in the base
domain does the engine either pick one from the pool or create a
fresh one, as described before.
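The three steps above can be sketched as follows. This is only an illustration under several assumptions: GenerationSketch and all of its members are hypothetical names, the interface map is a plain Map, the base domain and the pool are plain lists, the choice between reusing and creating is a coin flip, and fresh objects are created through a no-argument constructor. The real engine may differ on every one of these points.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

final class GenerationSketch {
    final Map<Class<?>, Class<?>> interfaceMap = new HashMap<>(); // E -> E'
    final List<Serializable> baseDomain = new ArrayList<>();
    final List<Object> pool = new ArrayList<>();
    final Random rnd = new Random(0);

    Object generate(Class<?> requested) {
        // Step 1: replace an interface or abstract class by a concrete implementation.
        Class<?> concrete = interfaceMap.getOrDefault(requested, requested);

        // Step 2: look in the base domain; exact class only, subclasses are ignored.
        List<Serializable> matches = new ArrayList<>();
        for (Serializable s : baseDomain) {
            if (s.getClass() == concrete) matches.add(s);
        }
        if (!matches.isEmpty()) {
            // The engine, not the domain, clones the chosen object.
            return cloneBySerialization(matches.get(rnd.nextInt(matches.size())));
        }

        // Step 3: reuse a pooled object or create a fresh one (assumed: coin flip).
        List<Object> reusable = new ArrayList<>();
        for (Object o : pool) {
            if (concrete.isInstance(o)) reusable.add(o);
        }
        if (!reusable.isEmpty() && rnd.nextBoolean()) {
            return reusable.get(rnd.nextInt(reusable.size()));
        }
        Object fresh = newInstance(concrete);
        pool.add(fresh);
        return fresh;
    }

    /** Deep-copies an object by writing it to a byte array and reading it back. */
    static Object cloneBySerialization(Serializable original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("cannot clone " + original.getClass(), e);
        }
    }

    static Object newInstance(Class<?> c) {
        try {
            return c.getDeclaredConstructor().newInstance(); // assumes a no-argument constructor
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot instantiate " + c, e);
        }
    }
}
```

With -1 and 1 as the only Integer objects in baseDomain, generate(Integer.class) only ever returns a copy of one of those two values, which is the restriction of the value space described above.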
T2 Class Coverage
When testing a class C, T2 essentially calls methods of C at random and
checks each call against the specification provided in C. But which
methods should be tested? Obviously we should test public methods. But
how about e.g. private and protected methods? How about methods from
the super and subclasses of C?
All methods and fields covered when testing C are called the
interface points of C. Having more interface points obviously
makes the testing more complete, but it also multiplies the number of
possible behaviors. Certain types of methods, e.g. private methods,
are arguably less relevant to test, so we exclude them.
This is currently the class coverage setting in T2:
--------------------------
Own class
   Constructor
      private     NO
      default     YES
      protected   YES
      public      YES
   Field
      private     NO
      static      NO
      final       NO
      ELSE IF:
      default     YES
      protected   YES
      public      YES
   Method
      private     NO
      ELSE IF:
      default     YES, if either non-static or can accept CUT-object as parameter
      protected   YES, if either non-static or can accept CUT-object as parameter
      public      YES, if either non-static or can accept CUT-object as parameter
      static      YES, if it can accept CUT as parameter
--------------------------
Superclass
   Field
      private     NO
      static      NO
      final       NO
      default     NO
      protected   NO
      ELSE IF:
      public      YES
   Method
      private     NO
      default     NO
      protected   NO
      ELSE IF:
      public      YES, if either non-static or can accept CUT-object as parameter
      static      YES, if it can accept CUT as parameter
--------------------------
Subclass          NO
--------------------------
Default and protected members of superclasses are currently excluded
simply because it is a bit more work to get to them.
The above coverage is currently hardwired. We plan to make it customizable.
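As an illustration of how the "Own class" rules for fields and methods in the table could be evaluated, here is a sketch based on java.lang.reflect. It is not T2's implementation: Sample and CoverageSketch are hypothetical names, and "can accept CUT-object as parameter" is interpreted here as "has at least one parameter type to which the class under test is assignable", which is an assumption.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical class under test (CUT).
class Sample {
    private int hidden;
    static int shared;
    final int fixed = 1;
    int open;

    private void secret() {}
    void step() {}
    static void helper(int n) {}
    static void merge(Sample a, Sample b) {}
}

final class CoverageSketch {

    /** Own-class fields: anything that is not private, static or final. */
    static Set<String> coveredFields(Class<?> cut) {
        Set<String> names = new TreeSet<>();
        for (Field f : cut.getDeclaredFields()) {
            int mod = f.getModifiers();
            if (f.isSynthetic()) continue;  // skip compiler- or tool-generated fields
            if (!Modifier.isPrivate(mod) && !Modifier.isStatic(mod) && !Modifier.isFinal(mod)) {
                names.add(f.getName());
            }
        }
        return names;
    }

    /** Own-class methods: non-private, and either non-static or able to take a CUT object. */
    static Set<String> coveredMethods(Class<?> cut) {
        Set<String> names = new TreeSet<>();
        for (Method m : cut.getDeclaredMethods()) {
            int mod = m.getModifiers();
            if (m.isSynthetic() || Modifier.isPrivate(mod)) continue;
            if (!Modifier.isStatic(mod) || acceptsCut(m, cut)) {
                names.add(m.getName());
            }
        }
        return names;
    }

    private static boolean acceptsCut(Method m, Class<?> cut) {
        for (Class<?> p : m.getParameterTypes()) {
            if (p.isAssignableFrom(cut)) return true;
        }
        return false;
    }
}
```

For the Sample class above, only the field open and the methods step and merge qualify: helper is static and has no parameter that could receive a Sample object.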
Replaying Executions
The main API of {@link U2.T2.RndEngine RndEngine} saves the violating
executions it finds. After fixing the target class we can replay the
saved executions to see if they still cause a violation. This process is
called regressing. The class {@link U2.T2.Regression Regression}
provides the main API to do this task.
Ownership Property
When C defines aggregate objects, its specifications, e.g. the class
invariant, usually specify some relation between subobjects. When
testing C, T2 only covers the methods of C. In particular, T2 does not
check the effect of executing the subobjects' own methods. We could do
so, but this would blow up the number of possible executions. We have
not decided what to do here; this is future work.
Concurrency
T2's model of execution is only suitable for testing properties in a
purely single-threaded setup. (Since the order of method calls in each
execution is random, the model is also good enough for multi-threaded
execution when all methods within its coverage space are synchronized
and do not spawn new threads.) Testing fully concurrent multi-threaded
executions cannot be done in T2. Running multiple threads is not the
problem; the problem is reproducing a violating (concurrent)
execution. T2's meta representation of an execution cannot represent
actions whose atomicity is finer than method calls (or field updates).
Testing Temporal Properties?
T2's model of execution also allows temporal properties to be tested.
For example, the method below expresses a safety property: once the
sponsor field is set to a non-null value, that value never
changes.
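public static boolean temporalProp(C told, C tnew) {
    // C is the target class; told and tnew are its old and new states.
    // If sponsor was non-null in the old state, it must be unchanged in the new state.
    return told.sponsor == null || tnew.sponsor == told.sponsor;
}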
Not implemented yet.
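Since this feature does not exist yet, the following is purely a sketch of how such a two-state property might be checked against the successive states of an execution. All names (Member, TemporalSketch, firstViolation, sponsorIsStable) are hypothetical, and the states are assumed to be available as a list of snapshots.

```java
import java.util.List;
import java.util.function.BiPredicate;

// Hypothetical target class with a sponsor field.
class Member {
    String sponsor;
    Member(String sponsor) { this.sponsor = sponsor; }
}

final class TemporalSketch {

    /**
     * Checks a two-state property on every pair of consecutive snapshots.
     * Returns the index of the first snapshot that violates it, or -1.
     */
    static <T> int firstViolation(List<T> snapshots, BiPredicate<T, T> prop) {
        for (int i = 1; i < snapshots.size(); i++) {
            if (!prop.test(snapshots.get(i - 1), snapshots.get(i))) return i;
        }
        return -1;
    }

    /** Once sponsor is non-null it must keep referring to the same object. */
    static boolean sponsorIsStable(Member told, Member tnew) {
        return told.sponsor == null || tnew.sponsor == told.sponsor;
    }
}
```

Checking only consecutive pairs is enough for this property: if the sponsor never changes from one step to the next once it is set, it never changes at all.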
|