Java Doc for BaseRecognizer.java in  » Parser » antlr-3.0.1 » org » antlr » runtime » Java Source Code / Java DocumentationJava Source Code and Java Documentation

Java Source Code / Java Documentation
1. 6.0 JDK Core
2. 6.0 JDK Modules
3. 6.0 JDK Modules com.sun
4. 6.0 JDK Modules com.sun.java
5. 6.0 JDK Modules sun
6. 6.0 JDK Platform
7. Ajax
8. Apache Harmony Java SE
9. Aspect oriented
10. Authentication Authorization
11. Blogger System
12. Build
13. Byte Code
14. Cache
15. Chart
16. Chat
17. Code Analyzer
18. Collaboration
19. Content Management System
20. Database Client
21. Database DBMS
22. Database JDBC Connection Pool
23. Database ORM
24. Development
25. EJB Server geronimo
26. EJB Server GlassFish
27. EJB Server JBoss 4.2.1
28. EJB Server resin 3.1.5
29. ERP CRM Financial
30. ESB
31. Forum
32. GIS
33. Graphic Library
34. Groupware
35. HTML Parser
36. IDE
37. IDE Eclipse
38. IDE Netbeans
39. Installer
40. Internationalization Localization
41. Inversion of Control
42. Issue Tracking
43. J2EE
44. JBoss
45. JMS
46. JMX
47. Library
48. Mail Clients
49. Net
50. Parser
51. PDF
52. Portal
53. Profiler
54. Project Management
55. Report
56. RSS RDF
57. Rule Engine
58. Science
59. Scripting
60. Search Engine
61. Security
62. Sevlet Container
63. Source Control
64. Swing Library
65. Template Engine
66. Test Coverage
67. Testing
68. UML
69. Web Crawler
70. Web Framework
71. Web Mail
72. Web Server
73. Web Services
74. Web Services apache cxf 2.0.1
75. Web Services AXIS2
76. Wiki Engine
77. Workflow Engines
78. XML
79. XML UI
Java
Java Tutorial
Java Open Source
Jar File Download
Java Articles
Java Products
Java by API
Photoshop Tutorials
Maya Tutorials
Flash Tutorials
3ds-Max Tutorials
Illustrator Tutorials
GIMP Tutorials
C# / C Sharp
C# / CSharp Tutorial
C# / CSharp Open Source
ASP.Net
ASP.NET Tutorial
JavaScript DHTML
JavaScript Tutorial
JavaScript Reference
HTML / CSS
HTML CSS Reference
C / ANSI-C
C Tutorial
C++
C++ Tutorial
Ruby
PHP
Python
Python Tutorial
Python Open Source
SQL Server / T-SQL
SQL Server / T-SQL Tutorial
Oracle PL / SQL
Oracle PL/SQL Tutorial
PostgreSQL
SQL / MySQL
MySQL Tutorial
VB.Net
VB.Net Tutorial
Flash / Flex / ActionScript
VBA / Excel / Access / Word
XML
XML Tutorial
Microsoft Office PowerPoint 2007 Tutorial
Microsoft Office Excel 2007 Tutorial
Microsoft Office Word 2007 Tutorial
Java Source Code / Java Documentation » Parser » antlr 3.0.1 » org.antlr.runtime 
Source Cross Reference  Class Diagram Java Document (Java Doc) 


java.lang.Object
   org.antlr.runtime.BaseRecognizer

All known Subclasses:   org.antlr.runtime.Lexer,  org.antlr.runtime.tree.TreeParser,  org.antlr.runtime.Parser,
BaseRecognizer
abstract public class BaseRecognizer (Code)
A generic recognizer that can handle recognizers generated from lexer, parser, and tree grammars. This is all the parsing support code essentially; most of it is error recovery stuff and backtracking.


Field Summary
final public static  intDEFAULT_TOKEN_CHANNEL
    
final public static  intHIDDEN
    
final public static  intINITIAL_FOLLOW_STACK_SIZE
    
final public static  intMEMO_RULE_FAILED
    
final public static  IntegerMEMO_RULE_FAILED_I
    
final public static  intMEMO_RULE_UNKNOWN
    
final public static  StringNEXT_TOKEN_RULE_NAME
    
protected  int_fsp
    
protected  intbacktracking
     If 0, no backtracking is going on.
protected  booleanerrorRecovery
     This is true when we see an error and before having successfully matched a token.
protected  booleanfailed
     In lieu of a return value, this indicates that a rule or token has failed to match.
protected  BitSet[]following
     Track the set of token types that can follow any rule invocation. Stack grows upwards.
protected  intlastErrorIndex
     The index into the input stream where the last error occurred. This is used to prevent infinite loops where an error is found but no token is consumed during recovery...another error is found, ad naseum.
protected  Map[]ruleMemo
     An array[size num rules] of Map that tracks the stop token index for each rule.


Method Summary
public  booleanalreadyParsedRule(IntStream input, int ruleIndex)
     Has this rule already parsed input at the current index in the input stream? Return the stop token index or MEMO_RULE_UNKNOWN.
public  voidbeginResync()
     A hook to listen in on the token consumption during error recovery.
protected  BitSetcombineFollows(boolean exact)
    
protected  BitSetcomputeContextSensitiveRuleFOLLOW()
     Compute the context-sensitive FOLLOW set for current rule. This is set of token types that can follow a specific rule reference given a specific call chain.
protected  BitSetcomputeErrorRecoverySet()
    
public  voidconsumeUntil(IntStream input, int tokenType)
    
public  voidconsumeUntil(IntStream input, BitSet set)
    
public  voiddisplayRecognitionError(String[] tokenNames, RecognitionException e)
    
public  voidemitErrorMessage(String msg)
    
public  voidendResync()
    
public  intgetBacktrackingLevel()
    
public  StringgetErrorHeader(RecognitionException e)
    
public  StringgetErrorMessage(RecognitionException e, String[] tokenNames)
     What error message should be generated for the various exception types? Not very object-oriented code, but I like having all error message generation within one method rather than spread among all of the exception classes.
public  StringgetGrammarFileName()
     For debugging and other purposes, might want the grammar name.
public  ListgetRuleInvocationStack()
     Return List of the rules in your parser instance leading up to a call to this method.
public static  ListgetRuleInvocationStack(Throwable e, String recognizerClassName)
     A more general version of getRuleInvocationStack where you can pass in, for example, a RecognitionException to get it's rule stack trace.
public  intgetRuleMemoization(int ruleIndex, int ruleStartIndex)
     Given a rule number and a start token index number, return MEMO_RULE_UNKNOWN if the rule has not parsed input starting from start index.
public  intgetRuleMemoizationCacheSize()
     return how many rule/input-index pairs there are in total. TODO: this includes synpreds.
public  StringgetTokenErrorDisplay(Token t)
     How should a token be displayed in an error message? The default is to display just the text, but during development you might want to have a lot of information spit out.
public  String[]getTokenNames()
     Used to print out token names like ID during debugging and error reporting.
public  voidmatch(IntStream input, int ttype, BitSet follow)
     Match current input symbol against ttype.
public  voidmatchAny(IntStream input)
    
public  voidmemoize(IntStream input, int ruleIndex, int ruleStartIndex)
     Record whether or not this rule parsed the input at this position successfully.
protected  voidmismatch(IntStream input, int ttype, BitSet follow)
     factor out what to do upon token mismatch so tree parsers can behave differently.
protected  voidpushFollow(BitSet fset)
    
public  voidrecover(IntStream input, RecognitionException re)
     Recover from an error found on the input stream.
protected  booleanrecoverFromMismatchedElement(IntStream input, RecognitionException e, BitSet follow)
     This code is factored out from mismatched token and mismatched set recovery.
public  voidrecoverFromMismatchedSet(IntStream input, RecognitionException e, BitSet follow)
    
public  voidrecoverFromMismatchedToken(IntStream input, RecognitionException e, int ttype, BitSet follow)
     Attempt to recover from a single missing or extra token. EXTRA TOKEN LA(1) is not what we are looking for.
public  voidreportError(RecognitionException e)
     Report a recognition problem. This method sets errorRecovery to indicate the parser is recovering not parsing.
public  voidreset()
    
public  ListtoStrings(List tokens)
     A convenience method for use most often with template rewrites.
public  voidtraceIn(String ruleName, int ruleIndex, Object inputSymbol)
    
public  voidtraceOut(String ruleName, int ruleIndex, Object inputSymbol)
    

Field Detail
DEFAULT_TOKEN_CHANNEL
final public static int DEFAULT_TOKEN_CHANNEL(Code)



HIDDEN
final public static int HIDDEN(Code)



INITIAL_FOLLOW_STACK_SIZE
final public static int INITIAL_FOLLOW_STACK_SIZE(Code)



MEMO_RULE_FAILED
final public static int MEMO_RULE_FAILED(Code)



MEMO_RULE_FAILED_I
final public static Integer MEMO_RULE_FAILED_I(Code)



MEMO_RULE_UNKNOWN
final public static int MEMO_RULE_UNKNOWN(Code)



NEXT_TOKEN_RULE_NAME
final public static String NEXT_TOKEN_RULE_NAME(Code)



_fsp
protected int _fsp(Code)



backtracking
protected int backtracking(Code)
If 0, no backtracking is going on. Safe to exec actions etc... If >0 then it's the level of backtracking.



errorRecovery
protected boolean errorRecovery(Code)
This is true when we see an error and before having successfully matched a token. Prevents generation of more than one error message per error.



failed
protected boolean failed(Code)
In lieu of a return value, this indicates that a rule or token has failed to match. Reset to false upon valid token match.



following
protected BitSet[] following(Code)
Track the set of token types that can follow any rule invocation. Stack grows upwards. When it hits the max, it grows 2x in size and keeps going.



lastErrorIndex
protected int lastErrorIndex(Code)
The index into the input stream where the last error occurred. This is used to prevent infinite loops where an error is found but no token is consumed during recovery...another error is found, ad naseum. This is a failsafe mechanism to guarantee that at least one token/tree node is consumed for two errors.



ruleMemo
protected Map[] ruleMemo(Code)
An array[size num rules] of Map that tracks the stop token index for each rule. ruleMemo[ruleIndex] is the memoization table for ruleIndex. For key ruleStartIndex, you get back the stop token for associated rule or MEMO_RULE_FAILED. This is only used if rule memoization is on (which it is by default).





Method Detail
alreadyParsedRule
public boolean alreadyParsedRule(IntStream input, int ruleIndex)(Code)
Has this rule already parsed input at the current index in the input stream? Return the stop token index or MEMO_RULE_UNKNOWN. If we attempted but failed to parse properly before, return MEMO_RULE_FAILED. This method has a side-effect: if we have seen this input for this rule and successfully parsed before, then seek ahead to 1 past the stop token matched for this rule last time.



beginResync
public void beginResync()(Code)
A hook to listen in on the token consumption during error recovery. The DebugParser subclasses this to fire events to the listenter.



combineFollows
protected BitSet combineFollows(boolean exact)(Code)



computeContextSensitiveRuleFOLLOW
protected BitSet computeContextSensitiveRuleFOLLOW()(Code)
Compute the context-sensitive FOLLOW set for current rule. This is set of token types that can follow a specific rule reference given a specific call chain. You get the set of viable tokens that can possibly come next (lookahead depth 1) given the current call chain. Contrast this with the definition of plain FOLLOW for rule r: FOLLOW(r)={x | S=>*alpha r beta in G and x in FIRST(beta)} where x in T* and alpha, beta in V*; T is set of terminals and V is the set of terminals and nonterminals. In other words, FOLLOW(r) is the set of all tokens that can possibly follow references to r in *any* sentential form (context). At runtime, however, we know precisely which context applies as we have the call chain. We may compute the exact (rather than covering superset) set of following tokens. For example, consider grammar: stat : ID '=' expr ';' // FOLLOW(stat)=={EOF} | "return" expr '.' ; expr : atom ('+' atom)* ; // FOLLOW(expr)=={';','.',')'} atom : INT // FOLLOW(atom)=={'+',')',';','.'} | '(' expr ')' ; The FOLLOW sets are all inclusive whereas context-sensitive FOLLOW sets are precisely what could follow a rule reference. For input input "i=(3);", here is the derivation: stat => ID '=' expr ';' => ID '=' atom ('+' atom)* ';' => ID '=' '(' expr ')' ('+' atom)* ';' => ID '=' '(' atom ')' ('+' atom)* ';' => ID '=' '(' INT ')' ('+' atom)* ';' => ID '=' '(' INT ')' ';' At the "3" token, you'd have a call chain of stat -> expr -> atom -> expr -> atom What can follow that specific nested ref to atom? Exactly ')' as you can see by looking at the derivation of this specific input. Contrast this with the FOLLOW(atom)={'+',')',';','.'}. You want the exact viable token set when recovering from a token mismatch. Upon token mismatch, if LA(1) is member of the viable next token set, then you know there is most likely a missing token in the input stream. "Insert" one by just not throwing an exception.



computeErrorRecoverySet
protected BitSet computeErrorRecoverySet()(Code)



consumeUntil
public void consumeUntil(IntStream input, int tokenType)(Code)



consumeUntil
public void consumeUntil(IntStream input, BitSet set)(Code)
Consume tokens until one matches the given token set



displayRecognitionError
public void displayRecognitionError(String[] tokenNames, RecognitionException e)(Code)



emitErrorMessage
public void emitErrorMessage(String msg)(Code)
Override this method to change where error messages go



endResync
public void endResync()(Code)



getBacktrackingLevel
public int getBacktrackingLevel()(Code)



getErrorHeader
public String getErrorHeader(RecognitionException e)(Code)
What is the error header, normally line/character position information?



getErrorMessage
public String getErrorMessage(RecognitionException e, String[] tokenNames)(Code)
What error message should be generated for the various exception types? Not very object-oriented code, but I like having all error message generation within one method rather than spread among all of the exception classes. This also makes it much easier for the exception handling because the exception classes do not have to have pointers back to this object to access utility routines and so on. Also, changing the message for an exception type would be difficult because you would have to subclassing exception, but then somehow get ANTLR to make those kinds of exception objects instead of the default. This looks weird, but trust me--it makes the most sense in terms of flexibility. For grammar debugging, you will want to override this to add more information such as the stack frame with getRuleInvocationStack(e, this.getClass().getName()) and, for no viable alts, the decision description and state etc... Override this to change the message generated for one or more exception types.



getGrammarFileName
public String getGrammarFileName()(Code)
For debugging and other purposes, might want the grammar name. Have ANTLR generate an implementation for this method.



getRuleInvocationStack
public List getRuleInvocationStack()(Code)
Return List of the rules in your parser instance leading up to a call to this method. You could override if you want more details such as the file/line info of where in the parser java code a rule is invoked. This is very useful for error messages and for context-sensitive error recovery.



getRuleInvocationStack
public static List getRuleInvocationStack(Throwable e, String recognizerClassName)(Code)
A more general version of getRuleInvocationStack where you can pass in, for example, a RecognitionException to get it's rule stack trace. This routine is shared with all recognizers, hence, static. TODO: move to a utility class or something; weird having lexer call this



getRuleMemoization
public int getRuleMemoization(int ruleIndex, int ruleStartIndex)(Code)
Given a rule number and a start token index number, return MEMO_RULE_UNKNOWN if the rule has not parsed input starting from start index. If this rule has parsed input starting from the start index before, then return where the rule stopped parsing. It returns the index of the last token matched by the rule. For now we use a hashtable and just the slow Object-based one. Later, we can make a special one for ints and also one that tosses out data after we commit past input position i.



getRuleMemoizationCacheSize
public int getRuleMemoizationCacheSize()(Code)
return how many rule/input-index pairs there are in total. TODO: this includes synpreds. :(



getTokenErrorDisplay
public String getTokenErrorDisplay(Token t)(Code)
How should a token be displayed in an error message? The default is to display just the text, but during development you might want to have a lot of information spit out. Override in that case to use t.toString() (which, for CommonToken, dumps everything about the token). This is better than forcing you to override a method in your token objects because you don't have to go modify your lexer so that it creates a new Java type.



getTokenNames
public String[] getTokenNames()(Code)
Used to print out token names like ID during debugging and error reporting. The generated parsers implement a method that overrides this to point to their String[] tokenNames.



match
public void match(IntStream input, int ttype, BitSet follow) throws RecognitionException(Code)
Match current input symbol against ttype. Upon error, do one token insertion or deletion if possible. You can override to not recover here and bail out of the current production to the normal error exception catch (at the end of the method) by just throwing MismatchedTokenException upon input.LA(1)!=ttype.



matchAny
public void matchAny(IntStream input)(Code)



memoize
public void memoize(IntStream input, int ruleIndex, int ruleStartIndex)(Code)
Record whether or not this rule parsed the input at this position successfully. Use a standard java hashtable for now.



mismatch
protected void mismatch(IntStream input, int ttype, BitSet follow) throws RecognitionException(Code)
factor out what to do upon token mismatch so tree parsers can behave differently. Override this method in your parser to do things like bailing out after the first error; just throw the mte object instead of calling the recovery method.



pushFollow
protected void pushFollow(BitSet fset)(Code)
Push a rule's follow set using our own hardcoded stack



recover
public void recover(IntStream input, RecognitionException re)(Code)
Recover from an error found on the input stream. Mostly this is NoViableAlt exceptions, but could be a mismatched token that the match() routine could not recover from.



recoverFromMismatchedElement
protected boolean recoverFromMismatchedElement(IntStream input, RecognitionException e, BitSet follow)(Code)
This code is factored out from mismatched token and mismatched set recovery. It handles "single token insertion" error recovery for both. No tokens are consumed to recover from insertions. Return true if recovery was possible else return false.



recoverFromMismatchedSet
public void recoverFromMismatchedSet(IntStream input, RecognitionException e, BitSet follow) throws RecognitionException(Code)



recoverFromMismatchedToken
public void recoverFromMismatchedToken(IntStream input, RecognitionException e, int ttype, BitSet follow) throws RecognitionException(Code)
Attempt to recover from a single missing or extra token. EXTRA TOKEN LA(1) is not what we are looking for. If LA(2) has the right token, however, then assume LA(1) is some extra spurious token. Delete it and LA(2) as if we were doing a normal match(), which advances the input. MISSING TOKEN If current token is consistent with what could come after ttype then it is ok to "insert" the missing token, else throw exception For example, Input "i=(3;" is clearly missing the ')'. When the parser returns from the nested call to expr, it will have call chain: stat -> expr -> atom and it will be trying to match the ')' at this point in the derivation: => ID '=' '(' INT ')' ('+' atom)* ';' ^ match() will see that ';' doesn't match ')' and report a mismatched token error. To recover, it sees that LA(1)==';' is in the set of tokens that can follow the ')' token reference in rule atom. It can assume that you forgot the ')'.



reportError
public void reportError(RecognitionException e)(Code)
Report a recognition problem. This method sets errorRecovery to indicate the parser is recovering not parsing. Once in recovery mode, no errors are generated. To get out of recovery mode, the parser must successfully match a token (after a resync). So it will go: 1. error occurs 2. enter recovery mode, report error 3. consume until token found in resynch set 4. try to resume parsing 5. next match() will reset errorRecovery mode



reset
public void reset()(Code)
reset the parser's state; subclasses must rewinds the input stream



toStrings
public List toStrings(List tokens)(Code)
A convenience method for use most often with template rewrites. Convert a List to List



traceIn
public void traceIn(String ruleName, int ruleIndex, Object inputSymbol)(Code)



traceOut
public void traceOut(String ruleName, int ruleIndex, Object inputSymbol)(Code)



Methods inherited from java.lang.Object
native protected Object clone() throws CloneNotSupportedException(Code)(Java Doc)
public boolean equals(Object obj)(Code)(Java Doc)
protected void finalize() throws Throwable(Code)(Java Doc)
final native public Class getClass()(Code)(Java Doc)
native public int hashCode()(Code)(Java Doc)
final native public void notify()(Code)(Java Doc)
final native public void notifyAll()(Code)(Java Doc)
public String toString()(Code)(Java Doc)
final native public void wait(long timeout) throws InterruptedException(Code)(Java Doc)
final public void wait(long timeout, int nanos) throws InterruptedException(Code)(Java Doc)
final public void wait() throws InterruptedException(Code)(Java Doc)

www.java2java.com | Contact Us
Copyright 2009 - 12 Demo Source and Support. All rights reserved.
All other trademarks are property of their respective owners.