xtc.typical

Java Source Code / Java Documentation

1.	6.0 JDK Core
2.	6.0 JDK Modules
3.	6.0 JDK Modules com.sun
4.	6.0 JDK Modules com.sun.java
5.	6.0 JDK Modules sun
6.	6.0 JDK Platform
7.	Ajax
8.	Apache Harmony Java SE
9.	Aspect oriented
10.	Authentication Authorization
11.	Blogger System
12.	Build
13.	Byte Code
14.	Cache
15.	Chart
16.	Chat
17.	Code Analyzer
18.	Collaboration
19.	Content Management System
20.	Database Client
21.	Database DBMS
22.	Database JDBC Connection Pool
23.	Database ORM
24.	Development
25.	EJB Server geronimo
26.	EJB Server GlassFish
27.	EJB Server JBoss 4.2.1
28.	EJB Server resin 3.1.5
29.	ERP CRM Financial
30.	ESB
31.	Forum
32.	GIS
33.	Graphic Library
34.	Groupware
35.	HTML Parser
36.	IDE
37.	IDE Eclipse
38.	IDE Netbeans
39.	Installer
40.	Internationalization Localization
41.	Inversion of Control
42.	Issue Tracking
43.	J2EE
44.	JBoss
45.	JMS
46.	JMX
47.	Library
48.	Mail Clients
49.	Net
50.	Parser
51.	PDF
52.	Portal
53.	Profiler
54.	Project Management
55.	Report
56.	RSS RDF
57.	Rule Engine
58.	Science
59.	Scripting
60.	Search Engine
61.	Security
62.	Sevlet Container
63.	Source Control
64.	Swing Library
65.	Template Engine
66.	Test Coverage
67.	Testing
68.	UML
69.	Web Crawler
70.	Web Framework
71.	Web Mail
72.	Web Server
73.	Web Services
74.	Web Services apache cxf 2.0.1
75.	Web Services AXIS2
76.	Wiki Engine
77.	Workflow Engines
78.	XML
79.	XML UI

Java

Java Tutorial

Illustrator Tutorials

GIMP Tutorials

C# / C Sharp

C# / CSharp Tutorial

C# / CSharp Open Source

SQL Server / T-SQL Tutorial

Oracle PL / SQL

Oracle PL/SQL Tutorial

Flash / Flex / ActionScript

VBA / Excel / Access / Word

XML

XML Tutorial

Microsoft Office PowerPoint 2007 Tutorial

Microsoft Office Excel 2007 Tutorial

Microsoft Office Word 2007 Tutorial

Java Source Code / Java Documentation » Parser » Rats Parser Generators » xtc.typical

xtc.typical
The Typical type checker generator. This package implements Typical, a language and compiler for specification and generation of type checkers. Typical is a domain specific language based on ML, and currently supports Java as a target language. The compiler's main class is {@link xtc.typical.Typical}. The rest of this document covers the following topics: ML Ancestry Features Library Usage and example ML Ancestry Typical is a domain specific language built on the functional core of the O'Caml dialect of ML. Notably; Typical has the following syntactic differences from O'Caml: `";"` instead of `";;"` for terminators "," instead of ";" for list separators "mltype" instead of "type" "mlvalue" instead of "value" no global "rec" since recursion is automatically detected Features Typical also provides the following built-in constructs, types, and features specifically for expressing type checker specifications. Type Representation Typical represents source language types using the variant type `raw_type`. For example, the type system for an implementation of the Simply Typed Lambda Calculus with Integer and String values can be expressed as: mltype raw_type = IntegerT \| StringT \| FunctionT of type * type ; The raw_type variant is sufficient to represent the type system of some languages such as the Simply Typed Lambda Calculus, ML and Typical itself. For other languages, however, we need to capture information about attributes; for example, C's qualifiers and storage classes. For representing attributed type systems, Typical provides the `attribute` and `eqattribute` constructs. For example, C's type system could be expressed as mltype raw_type = VoidT \| IntT \| PointerT of type \| ... ; mltype storage_class = Auto \| Extern \| Register \| Static \| Typedef ; mltype qualifier = Const \| Volatile \| Restrict ; attribute storage : storage_class ; attribute qualifiers : qualifier list; The Typical compiler collects all attributes and combines them with the `raw_type` variant to create a record representation of types which will be used by the typechecker. mltype type = { type : raw_type; storage : storage_class; qualifiers = qualifer list; ... } ; By default, Typical evaluates type equality as structural equality ignoring attributes introduced with `attribute` as opposed to `eqattribute`. This default behavior may be overridden for variant types by using the `equality` construct which specifies which constructor arguments to compare. For example, consider the following definition of C�s structure type: mltype raw_type = ... \| StructT of int * string * member list ; Consistent with the C standard, it uses its integer argument as a nonce that distinguishes between structures that have the same name and members but are declared in different scopes. The corresponding equality declaration (meaning two `StructT`s compare equal if they have equal first arguments) reads: equality raw_type = ... \| StructT (n, _, _) ; Name Management Features Typical supports name management by providing (1) a symbol table that is implicitly available to all programs, (2) the `scope` construct to declare which AST nodes introduce which scopes, and (3) the `namespace` construct to declare individual namespaces, the types of their values, and which AST nodes specify which names. Typical provides the following constructs for symbol table operations: `define`, `redefine`, `lookup`, and `lookup_locally`. The constructs have the structures shown: define no type [error\|warn "error message"] [at loc] redefine no type lookup no [error\|warn "error message"] [at loc] lookup_locally no [error\|warn "error message"] [at loc] Both `define` and `redefine` take a node argument (no), a value argument (type), and optional error messages and source location nodes. The snippet below illustrates the use of `define` and `lookup`. \| Abstraction (id, type, body) -> let param = analyze type in let _ = define id param in let res = analyze body in { type = FunctionT (param, res) } \| Identifier _ as id -> let t = lookup id in t In this snippet, `define` adds the name derived from the node `id` to the symbol table with the value type. `lookup` looks up the name derived from the node id. The symbol table operations have built in error management. If a name is previously defined (in the case of define) or not defined (in the case of lookup) the operations report default error messages and return an error. The source location for the error is also automatically derived from the current position in the ast. The default source location can be overridden by using the `at loc` construct where loc is the name of an ast node. The default error messages can be overridden by explicitly adding an error clause to the operations; for example: \| Abstraction (id, type, body) -> let param = analyze type in let _ = define id param error "previously defined identifier" in let res = analyze body in { type = FunctionT (param, res) } \| Identifier _ as id -> let t = lookup id error "identifier undefined" in t The `redefine` operator is used for overriding previously defined symbol table entries without reporting errors. `lookup_locally` is used to retrieve entries that are only visibly within the current scope. The symbol table operations depend on Typical's namespace construct to introduce new namespaces and their types and also how to extract names from AST nodes. The namespace construct hast the structure shown: namespace "default"\|identifier : value = PatternMatching [and "default"\|identifier : value = PatternMatching]* Here, "default" and identifier identify the namespace, value indicates the type of values stored in this namespace, and PatternMatching is a matching from AST nodes to a Name constructor. For example the namespace declaration: namespace default : type = Identifier (id) -> SimpleName(id); indicates that there's a single default namespace, where all values have type type. Identifiers are matched to names by extracting the string child from an Identifier node and using it to create a new SimpleName. Names can be simple, omitting the scope, or qualified, explicitly specifying a scope according the built-in variant type declaration: mltype name = \| SimpleName of string \| QualifiedName of string list ; The symbol table operations also depend on the scope construct to manage scopes while traversing the AST. The scope construct (non-exhaustively) maps AST nodes to scopes; each right-hand side must evaluate to a Scope constructor value defined by the built-in variant declarations: mltype scope_kind = \| Named of name \| Anonymous of string \| Temporary of string ; mltype scope = \| Scope of scope_kind * node list ; A scope spans the specified list of nodes. It can be one of three kinds. First, a named scope is introduced by a function or class. Second, an anonymous scope is introduced by a block or compound statement. Third, a temporary scope, unlike a named or anonymous scope, is deleted after the AST traversal has left the scope�s AST nodes. It is necessary for typing C or C++ function declarations, which may contain parameter names different from the corresponding definitions. Named scopes assume the specified name, while anonymous and temporary scopes receive a fresh name based on the specified string. An implementation of the scope construct depends on the ability to accurately trace a Typical program�s AST traversal. To this end, we restrict AST node patterns: they may specify several (recursively) nested nodes but only use variable or alias patterns for arguments of a single node constructor. With this restriction in place, each right-hand side of a pattern match has a unique parent. As a result, we can easily model the path representing the AST traversal�s dynamic state through a stack: traversal_path : node list Typical�s runtime updates the stack with the parent and any matched ancestors before evaluating a pattern match�s right-hand side and restores the stack before returning the right-hand side�s value. Whenever the runtime pushes nodes on the stack, it evaluates the scope construct�s pattern match against each node being added to the stack. On a match, it updates the current scope as specified by the corresponding Scope constructor value. Independent of a match, it annotates the node with the result. It uses the annotations to restore previous scopes when popping nodes of the stack. It also uses the annotations to avoid re-evaluation of the scope construct�s pattern match for repeated passes over the same AST. Error Management, Detection, and Reporting. For error management, Typical provides a system-wide no-information monad. All types are automatically injected into this monad and implicitly represent the union with the `bottom` type; that type�s only value is `bottom`. Typical's `bottom` value can be compared with Java's `null` value. In fact, in the generated Java code, `bottom` is represented by `null`. Built-in constructs and primitives generally return `bottom` if any of their arguments are `bottom`. Furthermore, all pattern matches return `bottom` due to an implicit clause mapping `bottom` to itself; this default can be overridden by explicitly matching `bottom`. However, type constructors such as those for tuples, lists, and variants treat `bottom` just like any other value. It allows for marking, say, a type attribute as having no information without forcing the entire type record to `bottom`. Finally, the `is_bottom` primitive enables the detection of `bottom` values, since the `=` and `!=` operators yield `bottom` when either operand is `bottom`. To actually detect and report error conditions, Typical uses `require` and `guard`. The require construct enforces one or more boolean constraints on an expression. For example, consider : require param = tr error "argument type mismatch" in res If the constraints, here "param = tr", evaluate to true, `require` evaluates the expression, here "res" and returns the corresponding value. If the constraints evaluate to false, it reports an error, here "argument type mismatch", and returns `bottom`. Unless explicitly specified, the traversal_path�s top node supplies the error�s source location. However, if the constraints evaluate to `bottom`, reduce silently returns `bottom`. This feature avoids cascading errors since constraints that evaluate to `bottom` means that an error was previously generated (and reported). List Reduction. C-like languages rely on lists of specifiers or modifiers to express the properties of declarations, including types and their attributes. When processing such lists, a type checker needs to (1) map AST nodes to the corresponding internal representation, (2) enforce semantic constraints on the number and combination of specifiers, and (3) provide comprehensive error reporting. Typical addresses these needs using the `reduce` construct. The reduce construct has the following structure: reduce to ["singleton" \| "list" \| "set" \| "optional" \| "required"] tag with PatternMatching As illustrated in Figure 1 using C�s type specifiers as an example, the reduce construct selects values from a list, while also mapping them to different values and enforcing constraints on their number and combination. It uses an extension of ML�s pattern syntax, the set pattern `{ ... }`, to denote that the elements may appear in any order in the list; alternatively, a regular list pattern can be used to indicate that the elements must appear in the specified order. While mapping the (non-exhaustive) pattern match over a list, reduce ignores elements that do not match any patterns; we assume that lists, in particular those generated by the parser, only contain legal elements. While mapping, reduce also enforces the reduction constraints. The singleton constraint in the example indicates that the pattern match may be triggered at most once, while set and list constraints allow for multiple triggers (with duplicate triggers being ignored for set constraints). A further optional constraint specifies that the pattern match need not match any elements. The design of the reduce construct reflects our analysis of how to type check C specifiers or Java modifiers and supports a generalization of the corresponding requirements. For example, a list of C specifiers must include a single type specifier such as int, reducing the list to a "singleton". It may optionally include one storage class specifier such as register, reducing to an "optional singleton". It may also include one or more qualifiers such as const, which may be duplicated and thus reduce to an "optional set". Furthermore, Figure 5 illustrates how reduce combines several type specifiers into one internal value; as an added benefit, the code directly reflects the constraints specified in the C99 standard. Like require and guard, reduce integrates error checking and reporting, basing any message on a string `tag` that describes the kind of values being selected such as "type" or "access modifier". For example, if processing the C declaration `long int float foo = 5;`, the reduce construct shown below would generate a "multiple types detected" error and indicate the source location. (** Reduce a list of declaration specifiers to a type. ) mlvalue extractType = reduce to required singleton "type" with [Unsigned _, Long _, Long _, Int _] -> {type = ULongLongT} \| [Unsigned _, Long _, Long _] -> {type = ULongLongT} \| [Signed _, Long _, Long _, Int _] -> {type = LongLongT} \| [Signed _, Long _, Long _] -> {type = LongLongT} \| [Unsigned _, Long _, Int _] -> {type = ULongT} \| [Signed _, Long _, Int _] -> {type = LongT} \| [Signed _, Short _, Int _] -> {type = ShortT} \| [Unsigned _, Short _, Int _] -> {type = UShortT} \| [Long _, Double _, Complex _] -> {type = LongDoubleComplexT} \| [Long _, Long _, Int _] -> {type = LongLongT} \| [Unsigned _, Long _ ] -> {type = ULongT} \| [Signed _, Short _] -> {type = ShortT} \| [Short _, Int _] -> {type = ShortT} \| [Unsigned _, Short _] -> {type = UShortT} \| [Signed _, Char _] -> {type = SCharT} \| [Unsigned _, Char _] -> {type = UCharT} \| [Float _, Complex _] -> {type = FloatComplexT} \| [Double _, Complex _] -> {type = DoubleComplexT} \| [Signed _, Long _] -> {type = LongT} \| [Long _, Int _] -> {type = LongT} \| [Long _, Double _] -> {type = LongDoubleT} \| [Long _, Long _] -> {type = LongLongT} \| [Unsigned _, Int _] -> {type = UIntT} \| [Signed _, Int _] -> {type = IntT} \| [Unsigned _] -> {type = UIntT} \| [Signed _] -> {type = IntT} \| [Long _] -> {type = LongT} \| [Int _] -> {type = IntT} \| [VoidTypeSpecifier _] -> {type = VoidT} \| [Char _] -> {type = CharT} \| [Float _] -> {type = FloatT} \| [Double _] -> {type = DoubleT} \| [Short _] -> {type = ShortT} \| [Bool _] -> {type = IntT} \| [VarArgListSpecifier _] -> {type = VarArgT} \| [(StructureTypeDefinition _ ) as me] -> analyze me \| _ (Similar for enum, union and typedef ) ; Figure 1. reduce pattern to process C declaration specifiers* Module System. Library Typical provides a library of primitive operations. The Java implementations of these operations can be found in Primitives.java. The operations, their signatures and descriptions are listed below: Map operations get : 'a -> 'b put : 'a -> 'b Prelude operations: abs_float : float -> float Affirm a float abs_int : int -> int Affirm an int and_bits : int -> int -> int Perform logical and on two integers ancestor : 'a -> node Get the ancestor of the top node of the traversal path if it matches a specified pattern annotate : node -> string -> 'a -> 'a Annotate a node with a named value annotate_list node list -> string -> 'a -> 'a Annotate a list of nodes ftoi : float -> int Convert a float to an int get_annotation : node -> string -> 'a Get the named annotation from a node has_annotation : node -> string -> bool Test if a node has an annotation with the specified name is_bottom : 'a -> bool Test a value for bottom is_empty : 'a list -> bool Test a list for emptiness is_not_bottom : 'a -> bool Test if a value is not bottom negate_bits : int -> int Compute the bitwise negation of an integer negate_float : float -> float Negate a float value negate_int : int -> int Negate an int value node_name : node -> string Get the name of a node nth : 'a list -> int -> 'a Get a list's nth element or_bits : int -> int Compute the bitwise or of an int parent : 'a -> node Get the parent of the top node on the traversal path if it matches a specified pattern shift_left : int -> int -> int Left shift and int by the specified number of bits shift_right : int -> int -> int Right shift and int by the specified number of bits trace : 'a -> 'a Debugging function to print a value to the standard output trace2 : 'a -> string -> 'a Debugging function to print a prefied value to the standard output xor_bits : int -> int -> int Compute the exclusive or of two integers List operations append : 'a list -> 'a list -> 'a list Append a list to another cons : 'a -> 'a list Cons a new list from an value and an existing list contains : 'a -> 'a list -> bool Test if a list contains an element exists : (fn : 'a -> bool) -> 'a list Test if a list element satisfies a predicate foldl : (fn: 'a -> 'a -> 'a) -> 'a list Fold a list head : 'a list -> 'a Get the head of a list intersection : 'a list -> 'a list 'a list Compute the intersection of two lists iter : (fn: 'a -> 'b) -> 'a list Iterate a function over a list length : 'a list -> int Compute the length of a list subtraction : 'a list -> 'a list -> 'a list Get the set subtraction of two lists tail : 'a list -> 'a list Get the tail of a list union : 'a list -> 'a list -> 'a list Compute the set union of two lists String operations concat : string -> string -> string Concatenate two strings ends_with : string -> string -> bool Test if a string ends with a substring ends_withi : string -> string -> bool Test if a string ends with a substring - case insensitive join_strings : string list -> string Combine a list of strings into a single string ssize : string -> int Compute the size of a string starts_with : string -> string -> bool Test if a string begins with a specified substring starts_withi : string -> string -> bool Test if a string begins with a specified substring - case insensitive stof : string -> float Convert a string to a float stoi : string -> int Convert a string to an integer substring : string -> int Get the substring starting at the specified index substring2 : string -> int -> int Get the substring starting at the specified range Usage and Example The Typical system encompasses three different programming languages and their compilers. The source language, the meta-language, and the bootstrap language. The source language is the language whose type-checker is being implemented. The meta-language is the language in which the type checker specification is written; i.e. Typical. The Typical compiler will generate code in the bootstrap language, which in the current implementation is Java. As an introduction to Typical the following example presents the implementation of a complete type checker using the Simply Typed Lambda Calculus as the source language. In this example, the calculus is treated not as a formal system, but as a programming language whose front end we wish to implement. Below we have the calculus' syntax. The corresponding grammar (found in lang/TypedLambda.rats) is written for the Rats! parser generator and largely follows the syntactic specification shown. Expression <- Application EOF Application <- Application BasicExpression / BasicExpression Abstraction <- LAMBDA Identifier COLON FunctionType DOT Application BasicExpression <- Abstraction / Identifier / IntegerConstant / StringConstant / OPEN Application CLOSE Identifier <- [a-zA-Z]+ IntegerConstant <- [1-9] [0-9]* StringConstant <- ["] (!["] _)* ["]" FunctionType <- BasicType ARROW FunctionType / BasicType BasicType <- IntegerType / StringType / OPEN FunctionType CLOSE IntegerType <- "int" StringType <- "string" LAMBDA <- "\\" COLON <- ":" DOT <- "." ARROW <- "->" OPEN <- "(" CLOSE <- ")" Figure 2. Next, we use the Typical compiler and the complete TypedLambda.tpcl specification below in Figure 3 to create the type checker. 1. (** 2. * Typing rules for the simply typed lambda calculus. 3. * 4. * @author Robert Grimm 5. * @version $Revision: 1.14 $ 6. ) 7. module xtc.lang.TypedLambda; 8. 9. mltype node = 10. \| Application of node node 11. \| Abstraction of node * node * node 12. \| Identifier of string 13. \| IntegerConstant of string 14. \| StringConstant of string 15. \| FunctionType of node * node 16. \| IntegerType 17. \| StringType 18. ; 19. 20. mltype raw_type = IntegerT \| StringT \| FunctionT of type * type ; 21. 22. scope Abstraction (id, _, body) -> Scope(Anonymous("lambda"), [id, body]) ; 23. 24. namespace default : type = Identifier (id) -> SimpleName(id) ; 25. 26. mlvalue analyze = function 27. \| Application (lambda, expr) -> 28. let tl = analyze lambda 29. and tr = analyze expr in 30. require (predicate FunctionT _) tl.type 31. error "application of non-function" in begin 32. match tl.type, tr with 33. \| FunctionT (param, res), param -> res 34. \| _ -> error "argument/parameter type mismatch" 35. end 36. \| Abstraction (id, type, body) -> 37. let param = analyze type in 38. let _ = define id param in 39. let res = analyze body in 40. { type = FunctionT (param, res) } 41. \| Identifier _ as id -> 42. let t = lookup id in 43. require t != bottom error (get_name id) ^ " undefined" in t 44. \| IntegerConstant _ -> 45. { type = IntegerT } 46. \| StringConstant _ -> 47. { type = StringT } 48. \| FunctionType (parameter, result) -> 49. { type = FunctionT (analyze parameter, analyze result) } 50. \| IntegerType -> 51. { type = IntegerT } 52. \| StringType -> 53. { type = StringT } 54. ; 55. 56. mlvalue get_name = function 57. \| Identifier (name) -> name 58. \| _ -> bottom 59. ; 60. Figure 3. Type checker specification for the Simply Typed Lambda Calculus The specification begins with the module declaration on line 7. Lines 9-17 declare the variant type node to represent the STLC's abstract syntax tree structure. The variants of the node type indicate the names and the type of the children of all the nodes that may appear in the STLC AST to be type checked. Note that this specification need not be written by hand, but can automatically generated from a Rats! grammar specification using the `-ast` option. Line 20 declares the variant type `raw_type` to represent the type structure of our STLC. In this case we have three types: an integer type, a string type, and a function type. Line 22 uses Typical's scope construct to declaratively specify scope management for the STLC type checker. The declaration can be interpreted as follows: If an `Abstraction` node is encountered while traversing the AST the type checker will enter a new Anonymous scope with a name derived from "lambda", and the first and third children of the `Abstraction` node will belong to the new scope. Line 24 uses Typical's scope construct to declaratively specify the STLC's namespace management. The declaration can be interpreted as follows: All STLC names belong to a single default namespace and have value 'type'. Symbol table names are obtained by extracting the string value from an `Identifier` node, and using this string to create a new SimpleName. Lines 26-55 define the type rules of our STLC as pattern matches from nodes to type. Lines 27-34 typecheck an `Application` node by first typechecking each child of the application, then using the `require` construct to enforce that the first child has function type, and the `error` construct to report failure. Finally the `error` construct is used to report an error on type/argument mismatch. Lines 36-40 process an abstraction node by analysing the second and third children, and adding an symbol table entry using the `define` construct. Since the define construct has built in error management, an error will be reported if the identifier was previously defined. Successful processing of the `Abstraction` node returns the appropriate FunctionType. Line 41 types an `Identifier` via a symbol table lookup on the node. Lines 44-54 are straightforward mapping of the`IntegerConstant, StringConstant, FunctionType, IntegerType, and StringType` nodes to their corresponding types. Finally the specification can be compiled using the command `java xtc.typical.Typical TypedLambda.tpcl`. This generates three files: TypedLambdaTypes.java which contains definitions for all the types used by the STCL type checker, TypedLambdaSupport.java which contains general supporting infrastructure, and TypedLambdaAnalyzer.java which is the STLC typechecker itself. TypedLambdaAnalyzer can be incorporated into a Compiler or Driver (for example, TypedLambda.java) for processing STLC programs.
Java Source File Name	Type	Comment
Analyzer.java	Class	The base class of all Typical-generated type checkers.
FlagSetter.java	Class	Visitor to set flags indicating which liraries are used.
FreeVariableCollector.java	Class	Visitor to add all find all free identifiers in an guard expression.
Name.java	Class	The base class of all NameSpace names.
Primitives.java	Class	The primitives for Typical.
Record.java	Interface	Marker interface for record types.
Reduction.java	Class
Scope.java	Class	The predefined scope structure.
ScopeKind.java	Class	The base class of all scope kinds.
Transformer.java	Class	Visitor to translate Typical ASTs into Java ASTs.
TreeFactory.java	Class	Node factory `xtc.typical.TreeFactory`.
Tuple.java	Class	The superclass of all anonymous tuples.
TypeCollector.java	Class	Visitor to add all types reachable from a given type definition.
TypeMapper.java	Class	Type to Java type mapper.
Typical.java	Class	The Typical compiler.
TypicalAnalyzer.java	Class	Type checker for Typical.
TypicalParser.java	Class	Packrat parser for grammar `xtc.typical.Typical`.
TypicalSupport.java	Class	Helper functionality for Typical.
TypicalTypes.java	Class	Types for Typical.
Variant.java	Class	The superclass of all variants.

www.java2java.com | Contact Us

All other trademarks are property of their respective owners.

xtc.typical

The Typical type checker generator.