Contract Language Operator Specification
Contract Operators are executable Object predicates, based upon
Cougaar UnaryPredicates, that can be dynamically compiled,
executed, visualized, and statically analyzed.
Introduction:
This document introduces the existing COUGAAR filtering predicate,
UnaryPredicate, and the new language for
expressing these filters as comparable Operators.
Examples are given, followed by a BNF
specification.
An introduction to UnaryPredicates
A UnaryPredicate is a simple filtering predicate defined in Java:
public interface UnaryPredicate {
public boolean execute(Object o);
}
For example, one can write a predicate that matches any
String:
public class MyStringPred implements UnaryPredicate {
public boolean execute(Object o) {
return (o instanceof String);
}
}
One can then use MyStringPred to test any Java
Object:
{
MyStringPred strPred = new MyStringPred();
boolean b1 = strPred.execute("someString"); // true
boolean b2 = strPred.execute(null); // false
boolean b3 = strPred.execute(Integer.valueOf(3)); // false
// etc..
}
A simple use of a UnaryPredicate is to write an
"assertion" test in code, acting as a sanity check on method
parameters, fields, and so on. Another use is to filter a
Collection down to the subset that matches the predicate.
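The two uses just described can be sketched in a few lines. This is only an illustration: the class name FilterSketch and the helpers filter and require are hypothetical names chosen for this example, and the nested UnaryPredicate is a local copy of the interface shown above rather than the COUGAAR class itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

final class FilterSketch {
    /** Local copy of the one-method interface shown above (not the COUGAAR class). */
    interface UnaryPredicate {
        boolean execute(Object o);
    }

    /** Returns the elements of items that the predicate accepts, in iteration order. */
    static List<Object> filter(Collection<?> items, UnaryPredicate pred) {
        List<Object> matches = new ArrayList<>();
        for (Object item : items) {
            if (pred.execute(item)) {
                matches.add(item);
            }
        }
        return matches;
    }

    /** "Assertion" style use: rejects a value that fails the predicate. */
    static void require(Object value, UnaryPredicate pred) {
        if (!pred.execute(value)) {
            throw new IllegalArgumentException("predicate rejected: " + value);
        }
    }

    public static void main(String[] args) {
        UnaryPredicate isString = o -> o instanceof String;
        System.out.println(filter(Arrays.asList("a", 1, "b", null), isString));
        require("someString", isString); // passes silently
    }
}
```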
COUGAAR uses UnaryPredicates to define Subscriptions, where
a COUGAAR Plugin specifies interest in Objects
contained in a shared blackboard (LogPlan). A Plugin is then
notified whenever an Object matching the Subscription's UnaryPredicate
is added to, modified in, or removed from the LogPlan.
One drawback to the pure-Java definition of UnaryPredicates is
their "black-box" behavior -- one is unable to view the original
source or reason about what the UnaryPredicate is "trying" to
select. Additionally the use of Java as the language requires a
full Java compiler (i.e. UnaryPredicates are often statically compiled
with the Plugin code) and allows excessive UnaryPredicate code
complexity.
An introduction to Operators
An Operator is an extension of UnaryPredicate:
public interface Operator extends UnaryPredicate {
// .. more methods here defined below ..
}
Unlike the basic UnaryPredicate interface, Operators come with an
implementation, a "language" and a simple parser.
The Operator parser will dynamically compile an Operator from either
a String representation (XML or Lisp-style as defined below) or from
an XML DOM tree. This flexibility will allow for run-time
definition and use of Operators in Plugin Subscriptions and queries.
The goal of the language is to expose a tree-like structure of the
filtering predicates. One benefit of this representation is the
ability to pretty-print the behavior of the predicate as
the tree-structure in XML or Lisp-style parenthesis form.
The tree-structure can then be used to compare the behavior of one
Operator with that of another. Two comparisons can be performed,
"implies" and "allows":
For any two Operators, A and B:
if "A implies B" then, for all Objects "o":
if (A.execute(o) == true) then (B.execute(o) == true).
For any two Operators, A and B:
if "A allows B" then there exists an Object "o" such that:
((A.execute(o) == true) and (B.execute(o) == true)).
The comparison algorithm is basically a tree comparison of the two
Operators, plus some knowledge of the class hierarchies.
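To make the two definitions concrete, here is a small sketch that checks them empirically over a finite list of sample Objects. It is not the tree comparison described above: a sample-based check can refute "A implies B" or witness "A allows B", but it cannot prove "implies" for all Objects. The names CompareSketch, impliesOnSamples and allowsOnSamples are hypothetical, and the nested UnaryPredicate is a local stand-in.

```java
import java.util.Arrays;
import java.util.List;

final class CompareSketch {
    /** Local stand-in for the one-method predicate interface. */
    interface UnaryPredicate {
        boolean execute(Object o);
    }

    /** False as soon as one sample satisfies a but not b; true if no sample refutes it. */
    static boolean impliesOnSamples(UnaryPredicate a, UnaryPredicate b, List<?> samples) {
        for (Object o : samples) {
            if (a.execute(o) && !b.execute(o)) {
                return false;
            }
        }
        return true;
    }

    /** True as soon as one sample satisfies both a and b. */
    static boolean allowsOnSamples(UnaryPredicate a, UnaryPredicate b, List<?> samples) {
        for (Object o : samples) {
            if (a.execute(o) && b.execute(o)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        UnaryPredicate isString = o -> o instanceof String;
        UnaryPredicate isCharSequence = o -> o instanceof CharSequence;
        List<Object> samples = Arrays.asList("x", 3, null, new StringBuilder("y"));
        System.out.println(impliesOnSamples(isString, isCharSequence, samples));
        System.out.println(impliesOnSamples(isCharSequence, isString, samples));
    }
}
```

A structural comparison can answer the same questions without samples, for example by noticing that String is a subtype of CharSequence in the class hierarchy.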
These analysis features will be used to specify the input/output
behavior of Plugins and determine when the output (publish) of one
Plugin matches the input (subscription) of another Plugin or Plugins.
The language supports type checks (instanceof), basic logical operators
(and/or/not), some set operators (exists/all/empty), and
Java reflection-based method calls on an arbitrary Object (e.g., for
String, one can call "equals", "toLowerCase", etc).
Further syntactic details are defined below in BNF form.
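As a rough illustration of the type-check and logical primitives, the sketch below builds them as plain Java combinators. The class LogicSketch and its methods is, not, and, or are hypothetical names for this example only and are not the Operator implementation. Following the language specification, "and" and "or" evaluate their arguments lazily from left to right.

```java
final class LogicSketch {
    /** Local stand-in for the one-method predicate interface. */
    interface UnaryPredicate {
        boolean execute(Object o);
    }

    /** Type check, like "is:classname". */
    static UnaryPredicate is(Class<?> type) {
        return o -> type.isInstance(o);
    }

    static UnaryPredicate not(UnaryPredicate p) {
        return o -> !p.execute(o);
    }

    /** Lazy left-to-right AND: stops at the first part that returns false. */
    static UnaryPredicate and(UnaryPredicate... parts) {
        return o -> {
            for (UnaryPredicate p : parts) {
                if (!p.execute(o)) {
                    return false;
                }
            }
            return true;
        };
    }

    /** Lazy left-to-right OR: stops at the first part that returns true. */
    static UnaryPredicate or(UnaryPredicate... parts) {
        return o -> {
            for (UnaryPredicate p : parts) {
                if (p.execute(o)) {
                    return true;
                }
            }
            return false;
        };
    }

    public static void main(String[] args) {
        // "A Number that is not an Integer"
        UnaryPredicate p = and(is(Number.class), not(is(Integer.class)));
        System.out.println(p.execute(2.5));
        System.out.println(p.execute(3));
    }
}
```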
The choice of language primitives was partially guided by the analysis
goal -- the Operator language is not a "complete" programming language.
While the exclusion of some language constructs might limit the
expressibility of the Operators (i.e., no if/while/variables/functions),
this mini-language lends itself to better automated reasoning and can
always defer back to Java by using reflection.
List of Operator language primitives:
The Operator language defines its primitives as Operators themselves.
Many of these primitives can contain other primitives as arguments,
forming a tree structure. Examples and a full
BNF specification of the language are given in
later sections of this document.
The language primitives/keywords are:
Logical Operators:
true
false
not
and
or
List Operators:
all
empty
exists
Reflective Operators:
is:[Not:][Null|classname]
-- e.g. is:Null, is:Not:Null, is:List, is:Not:String, etc.
[classname:]fieldname
-- e.g. Integer:MAX_VALUE, java.awt.Dimension:width, etc.
[classname:]methodname[-method_argument_decl]
-- e.g. equals, toLowerCase, etc. One rarely needs to specify
the optional type-declarations for the arguments; see the
notes on method resolution.
apply
-- typically only used by the parser.
Constant Operators:
"constant"
-- e.g. "SomeTestString", "5", etc.
const
-- allows one to define a constant of a non-String type.
This is rarely used in practice.
get
-- fetch an Operator.setConst(String name, Object value) constant,
which allows the Operator to act as a simple template.
In practice the primitives "..methodname..", "is:..", and "and" are
used most often. Frequency statistics are detailed in the section
on Statistics.
Examples:
A full BNF specification is provided later in this
document -- this section provides illustrative examples of the predicate
language, many based on ALP data
structures. These examples are formatted in XML, one of many
representation formats.
"Always true"
<true/>
-- note the Operator.execute(Object) method will always
return true for this Operator; it is equivalent to:
public class MyTruePred implements UnaryPredicate {
public boolean execute(Object o) {
return true;
}
}
"A Task"
<is:Task/>
-- note the short-hand use of a default package prefix
for "Task", which is short for "org.cougaar.planning.ldm.plan.Task".
"A URL"
<is:java.net.URL/>
-- note "java.net" is not one of the default package prefixes,
so one must specify the full classname.
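The default-package lookup can be pictured with the following sketch. It is an assumption about how such a lookup might work, not the parser's actual code: the class ClassNameSketch and its resolve method are hypothetical, and only the two JDK prefixes are tried, since the org.cougaar packages are not assumed to be on the classpath here.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

final class ClassNameSketch {
    // Only the JDK entries of the default list; the org.cougaar.* prefixes are left out.
    static final List<String> DEFAULT_PREFIXES = Arrays.asList("java.lang.", "java.util.");

    /** Tries each default prefix for a short name, then the name exactly as given. */
    static Optional<Class<?>> resolve(String name) {
        if (name.indexOf('.') < 0) {
            for (String prefix : DEFAULT_PREFIXES) {
                Optional<Class<?>> found = load(prefix + name);
                if (found.isPresent()) {
                    return found;
                }
            }
        }
        return load(name);
    }

    private static Optional<Class<?>> load(String fullName) {
        try {
            return Optional.<Class<?>>of(Class.forName(fullName));
        } catch (ClassNotFoundException | LinkageError e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(resolve("List"));
        System.out.println(resolve("java.net.URL"));
        System.out.println(resolve("URL"));
    }
}
```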
"An Asset that is not an Organization"
<and>
<is:Asset/>
<is:Not:Organization/>
</and>
-- note these instance checks are done in sequence
(left-to-right) and lazily (if not an Asset, the Organization check
is not performed). If both instance checks are true, then the top-level
"and" returns true.
"A Task with a non-null Verb"
<and>
<is:Task/>
<getVerb>
<is:Not:Null/>
</getVerb>
</and>
-- note once the Object has been cast to a
Task, the method "[Task.]getVerb" is called. The
result, a Verb, is then passed down to the "is:Not:Null"
check. As explained in the notes on method resolution
and the "apply" primitive, this is actually equivalent to:
<and>
<is:Task/>
<apply>
<getVerb/>
<is:Not:Null/>
</apply>
</and>
but the short-hand helps keep the predicates terse.
Another way to visualize this predicate is to imagine writing
a Java UnaryPredicate without using any variables:
public class MyPred implements UnaryPredicate {
public boolean execute(Object o) {
return
((o instanceof Task) &&
(((Task)o).getVerb() != null));
}
}
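The "apply" step can also be sketched with Java reflection. This is a hedged illustration rather than the Operator implementation: ApplySketch and isThenApply are hypothetical names, and java.lang.Exception with its getMessage method stands in for Task and getVerb so that the example runs without the COUGAAR classes.

```java
import java.lang.reflect.Method;

final class ApplySketch {
    /** Local stand-in for the one-method predicate interface. */
    interface UnaryPredicate {
        boolean execute(Object o);
    }

    /**
     * Rough equivalent of "(and (is:type) (apply (methodName) next))":
     * checks the type, calls the zero-argument method reflectively,
     * and passes the result to the next predicate.
     */
    static UnaryPredicate isThenApply(Class<?> type, String methodName, UnaryPredicate next) {
        return o -> {
            if (!type.isInstance(o)) {
                return false;
            }
            try {
                Method m = type.getMethod(methodName); // public zero-argument method
                return next.execute(m.invoke(o));
            } catch (ReflectiveOperationException e) {
                return false; // assumption: a failed lookup or call counts as "no match"
            }
        };
    }

    public static void main(String[] args) {
        UnaryPredicate isNotNull = o -> o != null;
        UnaryPredicate hasMessage = isThenApply(Exception.class, "getMessage", isNotNull);
        System.out.println(hasMessage.execute(new Exception("boom")));
        System.out.println(hasMessage.execute(new Exception()));
    }
}
```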
"An Allocation to an Organization"
<and>
<is:Allocation/>
<getAsset>
<is:Organization/>
</getAsset>
</and>
-- note this is very similar to the prior example.
"A String equal to 'TEST'"
<and>
<is:String/>
<equals>
<const value="TEST"/>
</equals>
</and>
-- note this is different than the prior example!
Once the Object has been cast to String, the call to
"equals" is resolved to:
In class String:
"public boolean equals(String s) {..}"
Note that the parser selected the "equals(String)" method, not
the less-specific:
In class Object:
"public boolean equals(Object o) {..}"
In this example the "const" operator was used explicitly; one
could also write <equals>TEST</equals> since
this constant is a String with no leading/trailing
whitespace.
Here's a look at the equivalent Java UnaryPredicate:
public class MyPred implements UnaryPredicate {
public boolean execute(Object o) {
return
((o instanceof String) &&
((String)o).equals("TEST"));
}
}
"An Asset with a type-id of 'MAINTENANCE'"
<and>
<is:Asset/>
<getTypeIdentificationPG>
<getTypeIdentification>
<equals>
<const value="MAINTENANCE"/>
</equals>
</getTypeIdentification>
</getTypeIdentificationPG>
</and>
-- note the use of nested method calls, each separately defined
to take zero arguments:
In class Asset:
"public TypeIdentificationPG getTypeIdentificationPG() {..}"
In class TypeIdentificationPG:
"public String getTypeIdentification() {..}"
then the final call to "equals", which is similar to the previous
example's resolution. A key difference here is that there is no
"equals(String)" method in TypeIdentification, so the basic:
In class Object:
"public boolean equals(Object o) {..}"
is used.
"An AssetTransfer of a Task with a verb matching the field-constant ReportForDuty"
<and>
<is:AssetTransfer/>
<getTask>
<getVerb>
<equals>
<org.cougaar.glm.Constants.Verb:ReportForDuty/>
</equals>
</getVerb>
</getTask>
</and>
-- note this is very similar to the prior example, but with a
static field reference to "ReportForDuty". Also note that this
class is not in a default package, so the full classname must
be specified.
"A task which, in its workflow, contains a subtask with a null plan element"
<and>
<is:Task/>
<getWorkflow>
<getTasks>
<exists>
<and>
<is:Task/>
<getPlanElement>
<is:Null/>
</getPlanElement>
</and>
</exists>
</getTasks>
</getWorkflow>
</and>
-- note that "getTasks" returns an Enumeration, and the
"exists" operator will test each element of the Enumeration to
see if it's "a subtask with a null plan element".
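The list primitives can be sketched as follows. This is an illustration under stated assumptions: ListOpsSketch and its methods are hypothetical names, any Java array is accepted, and an Object that is not list-like simply fails the test (the specification only says the primitives are applicable to list-like Objects).

```java
import java.lang.reflect.Array;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.Enumeration;
import java.util.Iterator;
import java.util.List;

final class ListOpsSketch {
    /** Local stand-in for the one-method predicate interface. */
    interface UnaryPredicate {
        boolean execute(Object o);
    }

    /** Returns an Iterator over a list-like Object, or null if o is not list-like. */
    static Iterator<?> elements(Object o) {
        if (o instanceof Collection) return ((Collection<?>) o).iterator();
        if (o instanceof Iterator) return (Iterator<?>) o;
        if (o instanceof Enumeration) return Collections.list((Enumeration<?>) o).iterator();
        if (o != null && o.getClass().isArray()) {
            List<Object> copy = new ArrayList<>();
            for (int i = 0; i < Array.getLength(o); i++) {
                copy.add(Array.get(o, i)); // boxes primitive elements
            }
            return copy.iterator();
        }
        return null;
    }

    /** TRUE if the predicate accepts any element. */
    static UnaryPredicate exists(UnaryPredicate p) {
        return o -> {
            Iterator<?> it = elements(o);
            if (it == null) return false; // assumption: non-lists never match
            while (it.hasNext()) {
                if (p.execute(it.next())) return true;
            }
            return false;
        };
    }

    /** TRUE if the predicate accepts every element. */
    static UnaryPredicate all(UnaryPredicate p) {
        return o -> {
            Iterator<?> it = elements(o);
            if (it == null) return false;
            while (it.hasNext()) {
                if (!p.execute(it.next())) return false;
            }
            return true;
        };
    }

    /** TRUE if the list has zero elements. */
    static UnaryPredicate empty() {
        return o -> {
            Iterator<?> it = elements(o);
            return it != null && !it.hasNext();
        };
    }
}
```

Note that passing an Iterator or Enumeration consumes it, so such an Object can only be tested once.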
Statistics
Sample statistics for the average Operator size (in terms of the
number of primitives used) and primitive usage frequency are detailed
in the tables below. This sampling was gathered from predicates used
for a contract-analysis tool to examine the subscribe/publish behavior
of the COUGAAR Plugins in ALP's "MiniTestConfig" society. Details can be
found on the ALPINE web site.
Many of these predicates are similar to the ones defined in the
examples section above.
totals:
lines of formatted XML: 811
total number of "composed" Operators: 95
total number of primitives used: 522
average primitives per Operator: 5.5
frequency of primitives:
primitive        occurrences (of 522)   notes
---------------------------------------------------------------------
..methodname..   204 (39%)              173 (33%) "get"-ers
                                        54 (10%) "equals"
is:..            173 (33%)              31 ( 6%) "is:Null"
and              74 (14%)
..fieldname..    45 ( 8%)               all static-finals
exists           16 ( 3%)
const            9 ( 2%)
not              1 ( 0.2%)
XML and parenthesized representations
Operators can contain other Operators as arguments, constructing a
tree. This tree can be represented in XML or by using Lisp-style
parentheses, and the Operator parser will accept either format.
This previously seen example:
<and>
<is:Task/>
<getVerb>
<is:Not:Null/>
</getVerb>
</and>
is equivalent to this Lisp-style parenthesis expression:
(and (is:Task) (getVerb (is:Not:Null)))
It's easy to convert from XML to parenthesis -- use these rules:
"<?>" becomes " (?"
"</?>" becomes ")"
"<?/>" becomes " (?)"
For example, "<a><b/><c><d/></c></a>"
becomes " (a (b) (c (d)))". The only special case is when an XML tag
has text, such as "<a>sometext</a>", which should be expressed
in parenthesis-form as "(a (\"sometext\"))".
Some users might prefer the XML representation, some the parenthesis
representation -- they are equivalent.
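The conversion rules above are simple enough to sketch with a few string replacements. The class XmlToParensSketch and its toParens method are hypothetical names for this example; the sketch assumes well-formed input and does not handle tags that carry attributes (such as a "const" tag with a value attribute), comments, or text mixed with child tags.

```java
final class XmlToParensSketch {
    /** Converts attribute-free Operator XML into the Lisp-style parenthesized form. */
    static String toParens(String xml) {
        // Drop whitespace that only separates tags.
        String s = xml.trim().replaceAll(">\\s+<", "><");
        // Special case: tag text becomes a quoted constant, e.g. <a>t</a> -> (a ("t")).
        s = s.replaceAll(">([^<]+)</", "> (\"$1\")</");
        // "<?/>" becomes " (?)"
        s = s.replaceAll("<([^<>/]+)/>", " ($1)");
        // "</?>" becomes ")"
        s = s.replaceAll("</[^<>]+>", ")");
        // "<?>" becomes " (?"
        s = s.replaceAll("<([^<>/]+)>", " ($1");
        return s.trim();
    }

    public static void main(String[] args) {
        String xml = "<and>\n<is:Task/>\n<getVerb>\n<is:Not:Null/>\n</getVerb>\n</and>";
        System.out.println(toParens(xml));
    }
}
```

A real converter would more likely walk an XML DOM tree; the regular expressions here only mirror the three textual rules.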
Language specification:
XML is an awkward format for defining a language BNF, so the
parenthesized format is used here. These representations are
equivalent.
Operators may contain other Operators as children, generating a tree
structure with arbitrary branching. The result is somewhat similar
to a decision tree that only specifies the TRUE leaves, considering
everything else to be FALSE.
Here are the BNF rules with detailed notes:
-- start here at S
S := OPERATOR
OPERATOR := (logicalOp | listOp | reflectOp | constantOp)
logicalOp := (trueOp | falseOp | andOp | orOp | notOp)
listOp := (allOp | emptyOp | existsOp)
reflectOp := (instanceOp | methodOp | fieldOp | applyOp)
constantOp := (constOp | getOp)
-- these are definitions of OPERATOR that return a boolean:
boolOp := (logicalOp | listOp | boolReflectOp | boolConstantOp)
boolReflectOp := (instanceOp | boolFieldOp | boolMethodOp | applyOp)
boolConstantOp := (boolConstOp | boolGetOp)
boolFieldOp := fieldOp -- where the field type is "boolean"
boolMethodOp := methodOp -- where the method return type is "boolean"
boolConstOp := constOp -- where the type is "boolean"
boolGetOp := getOp -- where the type is "boolean"
-- these are the operators:
trueOp := "(true)" -- returns boolean TRUE
falseOp := "(false)" -- returns boolean FALSE
andOp := "(and" ((" " boolOp)+) ")"
-- returns boolean logical-AND of each
boolOp, evaluated lazily from left to right
orOp := "(or" ((" " boolOp)+) ")"
-- returns boolean logical-OR of each
boolOp, evaluated lazily from left to right
allOp := "(all " boolOp ")" -- only applicable when the passed Object
is a "ConceptualList"; an instance of:
java.util.Collection
java.util.Iterator
java.util.Enumeration
Java_primitive_array
Returns the boolean TRUE if the
boolOp.execute(Object) returns TRUE
for all elements in the list.
emptyOp := "(empty)" -- only applicable when the passed Object
is a "ConceptualList", as defined in allOp.
Returns the boolean TRUE if the list has
zero elements.
existsOp := "(exists " boolOp ")"
-- only applicable when the passed Object
is a "ConceptualList", as defined in allOp.
Returns the boolean TRUE if the
boolOp.execute(Object) returns TRUE
for any element in the list.
instanceOp := "(is:" ["Not:"] classname ")"
-- returns boolean TRUE if the passed
Object is/is-not
an instance of the classname. If the
given Object passes the
instance test then this type information
can be used by a later methodOp or fieldOp.
fieldOp := "(" [classname ":"] fieldname ")"
-- if the field is non-static then
the passed Object must
be cast by a prior instanceOp to a
class that defines this field. The fact
that this is a field and not a method is
determined by Java reflection; in the
case that both a field and method with
the same name exist, the method is taken.
methodOp := "(" [classname ":"] methodname [methodargdecls] OPERATOR* ")"
-- if the method is non-static then
the passed Object must
be cast by a prior instanceOp to a
class that defines this method. If
the method is defined to take zero
arguments but a single argument is
specified then this is shorthand for
"(apply " ..method.. " " OPERATOR ")"
otherwise the number of specified arguments
must match the method declaration. See
the notes on method resolution.
applyOp := "(apply " OPERATOR " " boolOp ")"
-- takes the results of the OPERATOR and
passes it to the boolOp; the result
is the boolOp's boolean. Note that
methodOp uses shorthand for the applyOp.
constOp := (("\"" value "\"") | ("(const " [constOp " "] constOp ")"))
-- the value defaults to a
String. In the "(const " format,
if the first optional argument is given then
the value is cast to that type. For example,
"(const \"int\" 5)"
In XML this can be formatted as
"<const type=\"int\">5</const>"
getOp := "(get " [constOp " "] constOp ")"
-- similar to a regular constOp, but the
last constOp argument is used as an identifier
"variable", as set by the external Operator
user. For example, one can set the value using:
Operator.setConst(String key, Object value)
and use this to make the Operator act as a
template.
-- these are some Java-related class specifications
classname := [packagename "."] name_of_a_Java_class
-- for example, "List" or "java.net.URL".
The default packages "imported" by the parser
are currently:
java.lang
java.util
org.cougaar.planning.ldm.plan
org.cougaar.planning.ldm.asset
org.cougaar.planning.ldm.measure
org.cougaar.glm.asset
references to other packages must
use the full classname, such as
"(is:java.net.URL)"
packagename := name_of_a_Java_package
-- for example, "java.net"
methodname := name_of_a_method_in_the_current_Object
-- the method
type := (classname | Java_primitive)
-- for example, "List" or "int"
methodargdecls := "-" ((type ":")*)
-- rarely used to clear up
methodOp/applyOp ambiguity; see the
notes on method resolution.
-- these are approximations of the Java specifications:
char := ("a" | .. | "z" | "A" | .. | "Z" | "0" | .. | "9")
value := (char | "." | "-" | "_" | ":" | "\" | "\"")*
name_of_a_Java_class := (char | "_")+
name_of_a_Java_package := char (["."] char)*
name_of_a_method_in_the_current_Object := char (["_"] char)*
Java_primitive := ("boolean" | "byte" | "char" | "short" |
"int" | "float" | "double" | "long")
-- this is ambiguous with the classname rule,
but you get the idea...
-- I'm not so sure this BNF would be accepted by a parser tool like YACC,
but it is close.
Notes on method resolution
First, note that this explanation uses the parenthesized format,
matching the formal BNF specification.
If the methodOp has N arguments and a corresponding method of N
type-compatible parameters exists, this is the match. If there are
multiple matches then the call is ambiguous and requires the full
method definition with the method-type-declarations.
If a multi-parameter method is found, then the current Object
is passed to each of the argument Operators.
The methodOp quietly uses applyOp when the resolved method has zero
arguments but one argument is specified. For example,
"(and (is:Exception) (getMessage (is:Null)))"
actually means
"(and (is:Exception) (apply (getMessage) (is:Null)))"
This is a useful shorthand that saves many "(apply " wrappings.
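The decision between a direct call and the "apply" shorthand can be sketched with reflection. This is a simplified illustration, not the parser's resolution code: MethodResolutionSketch, classify and the nested class X (with overloads getY() and getY(boolean)) are hypothetical, and the sketch compares parameter counts only, ignoring the type-compatibility check. It also reports the case where both readings are possible, which is where the shorthand becomes ambiguous.

```java
import java.lang.reflect.Method;

final class MethodResolutionSketch {
    enum Kind { DIRECT_CALL, APPLY_SHORTHAND, AMBIGUOUS, NO_MATCH }

    /** Hypothetical class with a zero-argument and a one-argument overload. */
    static final class X {
        public boolean getY() { return true; }
        public boolean getY(boolean flag) { return flag; }
    }

    /** Classifies "(name arg1 .. argN)" against the public methods of type. */
    static Kind classify(Class<?> type, String name, int specifiedArgs) {
        boolean sameArity = false; // a method taking exactly the specified arguments
        boolean zeroArity = false; // a zero-argument method of the same name
        for (Method m : type.getMethods()) {
            if (!m.getName().equals(name)) continue;
            if (m.getParameterCount() == specifiedArgs) sameArity = true;
            if (m.getParameterCount() == 0) zeroArity = true;
        }
        // One specified argument on a zero-argument method reads as "(apply (name) arg)".
        boolean shorthand = specifiedArgs == 1 && zeroArity;
        if (sameArity && shorthand) return Kind.AMBIGUOUS;
        if (sameArity) return Kind.DIRECT_CALL;
        if (shorthand) return Kind.APPLY_SHORTHAND;
        return Kind.NO_MATCH;
    }

    public static void main(String[] args) {
        System.out.println(classify(Exception.class, "getMessage", 1));
        System.out.println(classify(String.class, "equals", 1));
        System.out.println(classify(X.class, "getY", 1));
    }
}
```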
The ambiguity arises with method overloading and is rare in practice.
For example, suppose class "X" had two methods:
"boolean getY()" and "boolean getY(boolean)".
The Operator
"(and (is:X) (getY (is:Null)))"
is ambiguous without specifying if this means:
"(and (is:X) (apply (getY-) (is:Null)))"
or
"(and (is:X) (getY-boolean (is:Null)))"