java.lang.Object
   org.antlr.tool.Grammar

Grammar | public class Grammar(Code) | | Represents a grammar in memory.
|
Inner Class : public static class Decision
Inner Class : public class LabelElementPair
Field Summary | |
public static String | ANTLRLiteralCharValueEscape Given a char, we need to be able to show as an ANTLR literal. | public static int | ANTLRLiteralEscapedCharValue When converting ANTLR char and string literals, here is the
value set of escape chars. | final public static String | ARTIFICIAL_TOKENS_RULENAME | final public static int | CHAR_LABEL | final public static int | COMBINED | public long | DFACreationWallClockTimeInMS How long in ms did it take to build DFAs for this grammar?
If this grammar is a combined grammar, it only records time for
the parser grammar component. | final public static String | FRAGMENT_RULE_MODIFIER | final public static String | GRAMMAR_FILE_EXTENSION | final public static String | IGNORE_STRING_IN_GRAMMAR_FILE_NAME | final public static int | INITIAL_DECISION_LIST_SIZE | final public static int | INVALID_RULE_INDEX | final public static int | LEXER | final public static String | LEXER_GRAMMAR_FILE_EXTENSION | public static String[] | LabelTypeToString | final public static int | PARSER | final public static int | RULE_LABEL | final public static int | RULE_LIST_LABEL | final public static String | SYNPREDGATE_ACTION_NAME | final public static String | SYNPRED_RULE_PREFIX | final public static int | TOKEN_LABEL | final public static int | TOKEN_LIST_LABEL | final public static int | TREE_PARSER | protected Map | actions Map a scope to a map of name:action pairs. | protected boolean | allDecisionDFACreated | public Set<GrammarAST> | blocksWithSemPreds Track decisions with syn preds specified for reporting. | public Set<GrammarAST> | blocksWithSynPreds Track decisions with syn preds specified for reporting. | protected boolean | builtFromString We need a way to detect when a lexer grammar is autogenerated from
another grammar or we are just sending in a string representing a
grammar. | protected IntSet | charVocabulary | protected int | decisionNumber | public Set | decisionsWhoseDFAsUsesSemPreds Track decisions that actually use the syn preds in the DFA. | public Set<DFA> | decisionsWhoseDFAsUsesSynPreds Track decisions that actually use the syn preds in the DFA. | final public static Map | defaultOptions | final public static Set | doNotCopyOptionsToLexer | protected boolean | externalAnalysisAbort An external tool requests that DFA analysis abort prematurely. | protected String | fileName | protected CodeGenerator | generator If non-null, this is the code generator we will use to generate
recognizers in the target language. | protected int | global_k Is there a global fixed lookahead set for this grammar?
If 0, nothing specified. | protected GrammarAST | grammarTree An AST that records entire input grammar with all rules. | final public static String[] | grammarTypeToFileNameSuffix | final public static String[] | grammarTypeToString | protected Grammar | importTokenVocabularyFromGrammar For interpreting and testing, you sometimes want to import token
definitions from another grammar (instead of reading token defs from
a file). | protected Vector | indexToDecision Each subrule/rule is a decision point and we must track them so we
can go back later and build DFA predictors for them. | protected Set | leftRecursiveRules A list of all rules that are in any left-recursive cycle. | final public static Set | legalOptions | protected StringTemplate | lexerGrammarST For merged lexer/parsers, we must construct a separate lexer spec.
This is the template for lexer; put the literals first then the
regular rules. | protected Set<String> | lexerRules If combined or lexer grammar, track the rules; Set. | Map | lineColumnToLookaheadDFAMap For ANTLRWorks, we want to be able to map a line:col to a specific
decision DFA so it can display DFA. | protected Set | lookBusy | protected int | maxTokenType Token names and literal tokens like "void" are uniquely indexed.
with -1 implying EOF. | public String | name | NameSpaceChecker | nameSpaceChecker | protected LinkedHashMap | nameToRuleMap | protected LinkedHashMap | nameToSynpredASTMap When we read in a grammar, we track the list of syntactic predicates
and build faux rules for them later. | protected NFA | nfa The NFA that represents the grammar with edges labelled with tokens
or epsilon. | public int | numberOfManualLookaheadOptions | public int | numberOfSemanticPredicates | protected Map | options A list of options specified at the grammar level such as language=Java.
The value can be an AST for complicated values such as character sets.
There may be code generator specific options in here. | protected int | ruleIndex | protected Vector | ruleIndexToRuleList Map a rule index to its name; use a Vector on purpose as new
collections stuff won't let me setSize and make it grow. | protected Set<antlr.Token> | ruleRefs The unique set of all rule references in any rule; set of Token
objects so two refs to same rule can exist but at different line/position. | GrammarSanity | sanity Factored out the sanity checking code; delegate to it. | protected Map | scopes Track the scopes defined outside of rules and the scopes associated
with all rules (even if empty). | public Set | setOfDFAWhoseConversionTerminatedEarly | public Set | setOfNondeterministicDecisionNumbers | public Set | setOfNondeterministicDecisionNumbersResolvedWithPredicates | protected Map | stringLiteralToTypeMap Map token literals like "while" to its token type. | public Set<String> | synPredNamesUsedInDFA Track names of preds so we can avoid generating preds that aren't used
Computed during NFA to DFA conversion. | protected TokenStreamRewriteEngine | tokenBuffer This is the buffer of *all* tokens found in the grammar file
including whitespace tokens etc... | protected Set<antlr.Token> | tokenIDRefs | protected Map | tokenIDToTypeMap | public Tool | tool | public int | type | protected Vector | typeToTokenList Map a token type to its token name. | protected Set | visitedDuringRecursionCheck The checkForLeftRecursion method needs to track what rules it has
visited to track infinite recursion. | protected boolean | watchNFAConversion |
Method Summary | |
public LookaheadSet | LOOK(NFAState s) From an NFA state, s, find the set of all labels reachable from s.
This computes FIRST, FOLLOW and any other lookahead computation
depending on where s is.
Record, with EOR_TOKEN_TYPE, if you hit the end of a rule so we can
know at runtime (when these sets are used) to start walking up the
follow chain to compute the real, correct follow set.
This routine will only be used on parser and tree parser grammars.
TODO: it does not properly handle a : b A ; where b is nullable;
it actually stops at the end of rules, returning EOR. | public boolean | NFAToDFAConversionExternallyAborted() | protected LookaheadSet | _LOOK(NFAState s) | public GrammarAST | addArtificialMatchTokensRule(GrammarAST grammarAST, List ruleNames, boolean filterMode) Parse a rule we add artificially that is a list of the other lexer
rules like this: "Tokens : ID | INT | SEMI ;" nextToken() will invoke
this to set the current token. | public boolean | allDecisionDFAHaveBeenCreated() | public void | altReferencesRule(String ruleName, GrammarAST refAST, int outerAltNum) Track a rule reference within an outermost alt of a rule. | public void | altReferencesTokenID(String ruleName, GrammarAST refAST, int outerAltNum) Track a token reference within an outermost alt of a rule. | public int | assignDecisionNumber(NFAState state) | public boolean | buildAST() | public boolean | buildTemplate() | public List | checkAllRulesForLeftRecursion() | public void | checkAllRulesForUselessLabels() Remove all labels on rule refs whose target rules have no return value. | public void | checkRuleReference(GrammarAST refAST, GrammarAST argsAST, String currentRuleName) | public IntSet | complement(IntSet set) For lexer grammars, return everything in unicode not in set. | public IntSet | complement(int atom) | public String | computeTokenNameFromLiteral(int tokenType, String literal) given a token type and the text of the literal, come up with a
decent token type label. | protected Decision | createDecision(int decision) | public void | createLookaheadDFA(int decision) | public void | createLookaheadDFAs() For each decision in this grammar, compute a single DFA using the
NFA states associated with the decision. | public void | createNFAs() Create the NFA that represents this grammar by walking the grammar AST and building NFA state clusters for each rule. | public AttributeScope | createParameterScope(String ruleName, Token argAction) | public AttributeScope | createReturnScope(String ruleName, Token retAction) | public AttributeScope | createRuleScope(String ruleName, Token scopeAction) | public AttributeScope | defineGlobalScope(String name, Token scopeAction) | protected void | defineLabel(Rule r, antlr.Token label, GrammarAST element, int type) Define a label defined in a rule r; check the validity then ask the
Rule object to actually define it. | public void | defineLexerRuleForAliasedStringLiteral(String tokenID, String literal, int tokenType) | public void | defineLexerRuleForStringLiteral(String literal, int tokenType) | public void | defineLexerRuleFoundInParser(antlr.Token ruleToken, GrammarAST ruleAST) | public void | defineNamedAction(GrammarAST ampersandAST, String scope, GrammarAST nameAST, GrammarAST actionAST) Given @scope::name {action} define it for this grammar. | public void | defineRule(antlr.Token ruleToken, String modifier, Map options, GrammarAST tree, GrammarAST argActionAST, int numAlts) Define a new rule. | public void | defineRuleListLabel(String ruleName, antlr.Token label, GrammarAST element) | public void | defineRuleRefLabel(String ruleName, antlr.Token label, GrammarAST ruleRef) | public String | defineSyntacticPredicate(GrammarAST blockAST, String currentRuleName) Define a new predicate and get back its name for use in building
a semantic predicate reference to the syn pred. | public void | defineToken(String text, int tokenType) Define a token at a particular token type value. | public void | defineTokenListLabel(String ruleName, antlr.Token label, GrammarAST element) | public void | defineTokenRefLabel(String ruleName, antlr.Token label, GrammarAST tokenRef) | protected void | examineAllExecutableActions() Before generating code, we examine all actions that can have
$x.y and $y stuff in them because some code generation depends on
Rule.referencedPredefinedRuleAttributes. | public void | externallyAbortNFAToDFAConversion() Terminate DFA creation (grammar analysis). | public static String | getANTLRCharLiteralForChar(int c) Return a string representing the escaped char for code c. | public Map | getActions() | public IntSet | getAllCharValues() If there is a char vocabulary, use it; else return min to max char
as defined by the target. | protected List | getArtificialRulesForSyntacticPredicates(ANTLRParser parser, LinkedHashMap nameToSynpredASTMap) for any syntactic predicates, we need to define rules for them; they will get
defined automatically like any other rule. | public static int | getCharValueFromGrammarCharLiteral(String literal) Given a literal like (the 3 char sequence with single quotes) 'a',
return the int value of 'a'. | public CodeGenerator | getCodeGenerator() | protected Decision | getDecision(int decision) | public GrammarAST | getDecisionBlockAST(int decision) | public NFAState | getDecisionNFAStartState(int decision) | public List | getDecisionNFAStartStateList() | public String | getDefaultActionScope(int grammarType) Given a grammar type, what should be the default action scope?
If I say @members in a COMBINED grammar, for example, the
default scope should be "parser". | public String | getFileName() | public AttributeScope | getGlobalScope(String name) | public Map | getGlobalScopes() | public int | getGrammarMaxLookahead() | public GrammarAST | getGrammarTree() | public String | getImplicitlyGeneratedLexerFileName() | public File | getImportedVocabFileName(String vocabName) | public Set<String> | getLabels(Set<GrammarAST> rewriteElements, int labelType) Given a set of all rewrite elements on right of ->, filter for
label types such as Grammar.TOKEN_LABEL, Grammar.TOKEN_LIST_LABEL, ... | public Set | getLeftRecursiveRules() Return a list of left-recursive rules; no analysis can be done
successfully on these. | public String | getLexerGrammar() If the grammar is a merged grammar, return the text of the implicit
lexer grammar. | public Map | getLineColumnToLookaheadDFAMap() | public DFA | getLookaheadDFA(int decision) | public List | getLookaheadDFAColumnsForLineInFile(int line) returns a list of column numbers for all decisions
on a particular line so ANTLRWorks choose the decision
depending on the location of the cursor (otherwise,
ANTLRWorks has to give the *exact* location which
is not easy from the user point of view). | public DFA | getLookaheadDFAFromPositionInFile(int line, int col) | public int | getMaxCharValue() What is the max char value possible for this grammar's target? Use
unicode max if no target defined. | public int | getMaxTokenType() | public NFAState | getNFAStateForAltOfDecision(NFAState decisionState, int alt) Get the ith alternative (1..n) from a decision; return null when
an invalid alt is requested. | public int | getNewTokenType() | public int | getNumberOfAltsForDecisionNFA(NFAState decisionState) Decisions are linked together with transition(1). | public int | getNumberOfCyclicDecisions() | public int | getNumberOfDecisions() | public Object | getOption(String key) | public Rule | getRule(String ruleName) | public int | getRuleIndex(String ruleName) | public String | getRuleModifier(String ruleName) | public String | getRuleName(int ruleIndex) | public NFAState | getRuleStartState(String ruleName) | public NFAState | getRuleStopState(String ruleName) | public Collection | getRules() | public IntSet | getSetFromRule(TreeToNFAConverter nfabuilder, String ruleName) Get the set equivalent (if any) of the indicated rule from this
grammar. | public Set | getStringLiterals() | public GrammarAST | getSyntacticPredicate(String name) | public LinkedHashMap | getSyntacticPredicates() | public String | getTokenDisplayName(int ttype) Given a token type, get a meaningful name for it such as the ID
or string literal. | public Set | getTokenDisplayNames() Get a list of all token IDs and literals that have an associated
token type. | public Set | getTokenIDs() | public int | getTokenType(String tokenName) | public IntSet | getTokenTypes() | public Collection | getTokenTypesWithoutID() Return an ordered integer list of token types that have no
corresponding token ID like INT or KEYWORD_BEGIN; for stuff
like 'begin'. | public Tool | getTool() | public static StringBuffer | getUnescapedStringFromGrammarStringLiteral(String literal) ANTLR does not convert escape sequences during the parse phase because
it could not know how to print String/char literals back out when
printing grammars etc... | public boolean | getWatchNFAConversion() | public String | grammarTreeToString(GrammarAST t) | public String | grammarTreeToString(GrammarAST t, boolean showActions) | public int | importTokenVocabulary(Grammar importFromGr) Pull your token definitions from an existing grammar in memory.
You must use Grammar() ctor then this method then setGrammarContent()
to make this work. | public int | importTokenVocabulary(String vocabName) Load a vocab file .tokens and return max token type found. | protected void | initTokenSymbolTables() | public boolean | isBuiltFromString() | public boolean | isEmptyRule(GrammarAST block) Rules like "a : ;" and "a : {...} ;" should not generate
try/catch blocks for RecognitionException. | public boolean | isValidSet(TreeToNFAConverter nfabuilder, GrammarAST t) | public boolean | optionIsValid(String key, Object value) | public void | printGrammar(PrintStream output) | public void | referenceRuleLabelPredefinedAttribute(String ruleName) To yield smaller, more readable code, track which rules have their
predefined attributes accessed. | protected void | removeUselessLabels(Map ruleToElementLabelPairMap) A label on a rule is useless if the rule has no return value, no
tree or template output, and it is not referenced in an action. | public void | setCodeGenerator(CodeGenerator generator) | public void | setDecisionBlockAST(int decision, GrammarAST blockAST) | public void | setDecisionNFA(int decision, NFAState state) | public void | setFileName(String fileName) | public void | setGrammarContent(String grammarString) | public void | setGrammarContent(Reader r) | public void | setLookaheadDFA(int decision, DFA lookaheadDFA) Set the lookahead DFA for a particular decision. | public void | setName(String name) | public String | setOption(String key, Object value, antlr.Token optionsStartToken) Save the option key/value pair and process it; return the key
or null if invalid option. | public void | setOptions(Map options, antlr.Token optionsStartToken) | public void | setRuleAST(String ruleName, GrammarAST t) | public void | setRuleStartState(String ruleName, NFAState startState) | public void | setRuleStopState(String ruleName, NFAState stopState) | public void | setTool(Tool tool) | public void | setWatchNFAConversion(boolean watchNFAConversion) | public void | synPredUsedInDFA(DFA dfa, SemanticContext semCtx) | public String | toString() |
ANTLRLiteralCharValueEscape | public static String ANTLRLiteralCharValueEscape(Code) | | Given a char, we need to be able to show as an ANTLR literal.
|
ANTLRLiteralEscapedCharValue | public static int ANTLRLiteralEscapedCharValue(Code) | | When converting ANTLR char and string literals, here is the
value set of escape chars.
|
ARTIFICIAL_TOKENS_RULENAME | final public static String ARTIFICIAL_TOKENS_RULENAME(Code) | | |
CHAR_LABEL | final public static int CHAR_LABEL(Code) | | |
COMBINED | final public static int COMBINED(Code) | | |
DFACreationWallClockTimeInMS | public long DFACreationWallClockTimeInMS(Code) | | How long in ms did it take to build DFAs for this grammar?
If this grammar is a combined grammar, it only records time for
the parser grammar component. This only records the time to
do the LL(*) work; NFA->DFA conversion.
|
FRAGMENT_RULE_MODIFIER | final public static String FRAGMENT_RULE_MODIFIER(Code) | | |
GRAMMAR_FILE_EXTENSION | final public static String GRAMMAR_FILE_EXTENSION(Code) | | |
IGNORE_STRING_IN_GRAMMAR_FILE_NAME | final public static String IGNORE_STRING_IN_GRAMMAR_FILE_NAME(Code) | | |
INITIAL_DECISION_LIST_SIZE | final public static int INITIAL_DECISION_LIST_SIZE(Code) | | |
INVALID_RULE_INDEX | final public static int INVALID_RULE_INDEX(Code) | | |
LEXER | final public static int LEXER(Code) | | |
LEXER_GRAMMAR_FILE_EXTENSION | final public static String LEXER_GRAMMAR_FILE_EXTENSION(Code) | | used for generating lexer temp files
|
LabelTypeToString | public static String[] LabelTypeToString(Code) | | |
PARSER | final public static int PARSER(Code) | | |
RULE_LABEL | final public static int RULE_LABEL(Code) | | |
RULE_LIST_LABEL | final public static int RULE_LIST_LABEL(Code) | | |
SYNPREDGATE_ACTION_NAME | final public static String SYNPREDGATE_ACTION_NAME(Code) | | |
SYNPRED_RULE_PREFIX | final public static String SYNPRED_RULE_PREFIX(Code) | | |
TOKEN_LABEL | final public static int TOKEN_LABEL(Code) | | |
TOKEN_LIST_LABEL | final public static int TOKEN_LIST_LABEL(Code) | | |
TREE_PARSER | final public static int TREE_PARSER(Code) | | |
actions | protected Map actions(Code) | | Map a scope to a map of name:action pairs.
Map<String, Map<String, GrammarAST>>
The code generator will use this to fill holes in the output files.
I track the AST node for the action in case I need the line number
for errors.
|
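Example (a minimal sketch of reading this table through the getActions() accessor listed above; the nested raw-Map layout and the String/GrammarAST element types are assumptions based on the description, since the extracted page does not show the generics):
    import java.util.Iterator;
    import java.util.Map;
    import org.antlr.tool.Grammar;

    // Hypothetical helper: list every named action, grouped by scope.
    public class ActionDump {
        public static void dump(Grammar g) {
            Map scopeToActions = g.getActions();    // scope name -> Map of action name -> action AST
            for (Iterator it = scopeToActions.keySet().iterator(); it.hasNext();) {
                String scope = (String) it.next();  // e.g. "parser", "lexer"
                Map nameToAction = (Map) scopeToActions.get(scope);
                System.out.println("@" + scope + " defines " + nameToAction.keySet());
            }
        }
    }
|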
allDecisionDFACreated | protected boolean allDecisionDFACreated(Code) | | |
blocksWithSemPreds | public Set<GrammarAST> blocksWithSemPreds(Code) | | Track decisions with semantic predicates specified, for reporting.
This is a set of BLOCK type AST nodes.
|
blocksWithSynPreds | public Set<GrammarAST> blocksWithSynPreds(Code) | | Track decisions with syn preds specified for reporting.
This is a set of BLOCK type AST nodes.
|
builtFromString | protected boolean builtFromString(Code) | | We need a way to detect when a lexer grammar is autogenerated from
another grammar or we are just sending in a string representing a
grammar. We don't want to generate a .tokens file, for example,
in such cases.
|
charVocabulary | protected IntSet charVocabulary(Code) | | TODO: hook this to the charVocabulary option
|
decisionNumber | protected int decisionNumber(Code) | | Be able to assign a number to every decision in grammar;
decisions in 1..n
|
decisionsWhoseDFAsUsesSemPreds | public Set decisionsWhoseDFAsUsesSemPreds(Code) | | Track decisions that actually use the semantic predicates in the DFA.
|
decisionsWhoseDFAsUsesSynPreds | public Set<DFA> decisionsWhoseDFAsUsesSynPreds(Code) | | Track decisions that actually use the syn preds in the DFA.
Computed during NFA to DFA conversion.
|
defaultOptions | final public static Map defaultOptions(Code) | | |
doNotCopyOptionsToLexer | final public static Set doNotCopyOptionsToLexer(Code) | | |
externalAnalysisAbort | protected boolean externalAnalysisAbort(Code) | | An external tool requests that DFA analysis abort prematurely. Stops
at DFA granularity; individual DFAs are also limited by a size and time
computation as a failsafe.
|
fileName | protected String fileName(Code) | | What file name holds this grammar?
|
generator | protected CodeGenerator generator(Code) | | If non-null, this is the code generator we will use to generate
recognizers in the target language.
|
global_k | protected int global_k(Code) | | Is there a global fixed lookahead set for this grammar?
If 0, nothing specified. -1 implies we have not looked at
the options table yet to set k.
|
grammarTree | protected GrammarAST grammarTree(Code) | | An AST that records entire input grammar with all rules. A simple
grammar with one rule, "grammar t; a : A | B ;", looks like:
( grammar t ( rule a ( BLOCK ( ALT A ) ( ALT B ) ) ) )
|
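Example (a minimal sketch that builds a Grammar from the one-rule grammar above and prints the recorded AST; it uses only the Grammar(String) constructor, getGrammarTree() and grammarTreeToString() documented on this page, and the printed form is only expected to be close to the tree shown above):
    import org.antlr.tool.Grammar;

    public class ShowTree {
        public static void main(String[] args) throws Exception {
            // Same toy grammar as in the comment above.
            Grammar g = new Grammar("grammar t;\na : A | B ;\n");
            // Expect something close to:
            // ( grammar t ( rule a ... ( BLOCK ( ALT A ) ( ALT B ) ) ... ) )
            System.out.println(g.grammarTreeToString(g.getGrammarTree()));
        }
    }
|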
grammarTypeToFileNameSuffix | final public static String[] grammarTypeToFileNameSuffix(Code) | | |
grammarTypeToString | final public static String[] grammarTypeToString(Code) | | |
importTokenVocabularyFromGrammar | protected Grammar importTokenVocabularyFromGrammar(Code) | | For interpreting and testing, you sometimes want to import token
definitions from another grammar (instead of reading token defs from
a file).
|
indexToDecision | protected Vector indexToDecision(Code) | | Each subrule/rule is a decision point and we must track them so we
can go back later and build DFA predictors for them. This includes
all the rules, subrules, optional blocks, ()+, ()* etc... The
elements in this list are NFAState objects.
|
leftRecursiveRules | protected Set leftRecursiveRules(Code) | | A list of all rules that are in any left-recursive cycle. There
could be multiple cycles, but this is a flat list of all problematic
rules.
|
legalOptions | final public static Set legalOptions(Code) | | |
lexerGrammarST | protected StringTemplate lexerGrammarST(Code) | | For merged lexer/parsers, we must construct a separate lexer spec.
This is the template for lexer; put the literals first then the
regular rules. We don't need to specify a token vocab import as
I make the new grammar import from the old all in memory; don't want
to force it to read from the disk. Lexer grammar will have same
name as original grammar but will be in different filename. Foo.g
with combined grammar will have FooParser.java generated and
Foo__.g with again Foo inside. It will however generate FooLexer.java
as it's a lexer grammar. A bit odd, but autogenerated. Can tweak
later if we want.
|
lexerRules | protected Set<String> lexerRules(Code) | | If combined or lexer grammar, track the rules; a Set<String> of rule names.
Track lexer rules so we can warn about undefined tokens.
|
lineColumnToLookaheadDFAMap | Map lineColumnToLookaheadDFAMap(Code) | | For ANTLRWorks, we want to be able to map a line:col to a specific
decision DFA so it can display DFA.
|
lookBusy | protected Set lookBusy(Code) | | Used during LOOK to detect computation cycles
|
maxTokenType | protected int maxTokenType(Code) | | Token names and literal tokens like "void" are uniquely indexed,
with -1 implying EOF. Characters are different; they go from
-1 (EOF) to \uFFFE. For example, 0 could be a binary byte you
want to lex. Labels of DFA/NFA transitions can be both tokens
and characters. I use negative numbers for bookkeeping labels
like EPSILON. Char/String literals and token types overlap in the same
space, however.
|
name | public String name(Code) | | What name did the user provide for this grammar?
|
nameToSynpredASTMap | protected LinkedHashMap nameToSynpredASTMap(Code) | | When we read in a grammar, we track the list of syntactic predicates
and build faux rules for them later. See my blog entry Dec 2, 2005:
http://www.antlr.org/blog/antlr3/lookahead.tml
This maps the name (we make up) for a pred to the AST grammar fragment.
|
nfa | protected NFA nfa(Code) | | The NFA that represents the grammar with edges labelled with tokens
or epsilon. It is more suitable to analysis than an AST representation.
|
numberOfManualLookaheadOptions | public int numberOfManualLookaheadOptions(Code) | | |
numberOfSemanticPredicates | public int numberOfSemanticPredicates(Code) | | |
options | protected Map options(Code) | | A list of options specified at the grammar level such as language=Java.
The value can be an AST for complicated values such as character sets.
There may be code generator specific options in here. I do no
interpretation of the key/value pairs...they are simply available for
who wants them.
|
ruleIndex | protected int ruleIndex(Code) | | Rules are uniquely labeled from 1..n
|
ruleIndexToRuleList | protected Vector ruleIndexToRuleList(Code) | | Map a rule index to its name; use a Vector on purpose as new
collections stuff won't let me setSize and make it grow. :(
I need a specific guaranteed index, which the Collections stuff
won't let me have.
|
ruleRefs | protected Set<antlr.Token> ruleRefs(Code) | | The unique set of all rule references in any rule; set of Token
objects so two refs to same rule can exist but at different line/position.
|
scopes | protected Map scopes(Code) | | Track the scopes defined outside of rules and the scopes associated
with all rules (even if empty).
|
setOfDFAWhoseConversionTerminatedEarly | public Set setOfDFAWhoseConversionTerminatedEarly(Code) | | |
setOfNondeterministicDecisionNumbers | public Set setOfNondeterministicDecisionNumbers(Code) | | |
setOfNondeterministicDecisionNumbersResolvedWithPredicates | public Set setOfNondeterministicDecisionNumbersResolvedWithPredicates(Code) | | |
stringLiteralToTypeMap | protected Map stringLiteralToTypeMap(Code) | | Map a token literal like "while" to its token type. It may be that
WHILE="while"=35, in which case both tokenNameToTypeMap and this
field will have entries both mapped to 35.
|
synPredNamesUsedInDFA | public Set<String> synPredNamesUsedInDFA(Code) | | Track names of preds so we can avoid generating preds that aren't used.
Computed during NFA to DFA conversion. Just walk accept states
and look for synpreds because that is the only state target whose
incident edges can have synpreds. Same is true for
decisionsWhoseDFAsUsesSynPreds.
|
tokenBuffer | protected TokenStreamRewriteEngine tokenBuffer(Code) | | This is the buffer of *all* tokens found in the grammar file
including whitespace tokens etc... I use this to extract
lexer rules from combined grammars.
|
tokenIDRefs | protected Set<antlr.Token> tokenIDRefs(Code) | | The unique set of all token ID references in any rule
|
tokenIDToTypeMap | protected Map tokenIDToTypeMap(Code) | | Map a token name like ID (but not a literal like "while") to its token type
|
type | public int type(Code) | | What type of grammar is this: lexer, parser, tree walker
|
typeToTokenList | protected Vector typeToTokenList(Code) | | Map a token type to its token name.
Must subtract MIN_TOKEN_TYPE from index.
|
visitedDuringRecursionCheck | protected Set visitedDuringRecursionCheck(Code) | | The checkForLeftRecursion method needs to track what rules it has
visited to track infinite recursion.
|
watchNFAConversion | protected boolean watchNFAConversion(Code) | | |
Grammar | public Grammar(String grammarString) throws antlr.RecognitionException, antlr.TokenStreamException(Code) | | |
Grammar | public Grammar(String fileName, String grammarString) throws antlr.RecognitionException, antlr.TokenStreamException(Code) | | |
Grammar | public Grammar(Tool tool, String fileName, Reader r) throws antlr.RecognitionException, antlr.TokenStreamException(Code) | | Create a grammar from a Reader. Parse the grammar, building a tree
and loading a symbol table of sorts here in Grammar. Then create
an NFA and associated factory. Walk the AST representing the grammar,
building the state clusters of the NFA.
|
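Example (a sketch of driving this constructor directly; the no-argument org.antlr.Tool constructor, the package locations, and the T.g file name are assumptions, everything else follows the signature above):
    import java.io.FileReader;
    import java.io.Reader;
    import org.antlr.Tool;
    import org.antlr.tool.Grammar;

    public class LoadGrammar {
        public static void main(String[] args) throws Exception {
            Tool antlr = new Tool();                  // assumed no-arg Tool constructor
            Reader r = new FileReader("T.g");         // hypothetical grammar file
            Grammar g = new Grammar(antlr, "T.g", r); // parse and build symbol table/NFA per the comment above
            r.close();
            System.out.println("loaded " + g.name + " with " + g.getRules().size() + " rules");
        }
    }
|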
LOOK | public LookaheadSet LOOK(NFAState s)(Code) | | From an NFA state, s, find the set of all labels reachable from s.
This computes FIRST, FOLLOW and any other lookahead computation
depending on where s is.
Record, with EOR_TOKEN_TYPE, if you hit the end of a rule so we can
know at runtime (when these sets are used) to start walking up the
follow chain to compute the real, correct follow set.
This routine will only be used on parser and tree parser grammars.
TODO: it does not properly handle a : b A ; where b is nullable;
it actually stops at the end of rules, returning EOR. Hmm...
should check for that and keep going.
|
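Example (a sketch of asking for the lookahead set at one decision point; it assumes LookaheadSet and NFAState live in org.antlr.analysis, that the NFA must be built with createNFAs() first, and that decision 1 exists for this toy grammar):
    import org.antlr.analysis.LookaheadSet;
    import org.antlr.analysis.NFAState;
    import org.antlr.tool.Grammar;

    public class FirstSetDemo {
        public static void main(String[] args) throws Exception {
            Grammar g = new Grammar("grammar t;\na : A | B ;\n");
            g.createNFAs();                             // LOOK walks NFA edges, so build the NFA first
            NFAState d = g.getDecisionNFAStartState(1); // decisions are numbered 1..n
            LookaheadSet look = g.LOOK(d);
            System.out.println("LOOK(decision 1) = " + look);
        }
    }
|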
NFAToDFAConversionExternallyAborted | public boolean NFAToDFAConversionExternallyAborted()(Code) | | |
addArtificialMatchTokensRule | public GrammarAST addArtificialMatchTokensRule(GrammarAST grammarAST, List ruleNames, boolean filterMode)(Code) | | Parse a rule we add artificially that is a list of the other lexer
rules like this: "Tokens : ID | INT | SEMI ;" nextToken() will invoke
this to set the current token. Add char literals before
the rule references.
If in filter mode, we want every alt to backtrack and we need to
do k=1 to force the "first token def wins" rule. Otherwise, the
longest-match rule comes into play with LL(*).
The ANTLRParser antlr.g file now invokes this when parsing a lexer
grammar, which I think is proper even though it peeks at the info
that later phases will compute. It gets a list of lexer rules
and builds a string representing the rule; then it creates a parser
and adds the resulting tree to the grammar's tree.
|
allDecisionDFAHaveBeenCreated | public boolean allDecisionDFAHaveBeenCreated()(Code) | | |
altReferencesRule | public void altReferencesRule(String ruleName, GrammarAST refAST, int outerAltNum)(Code) | | Track a rule reference within an outermost alt of a rule. Used
at the moment to decide if $ruleref refers to a unique rule ref in
the alt. Rewrite rules force tracking of all rule AST results.
This data is also used to verify that all rules have been defined.
|
altReferencesTokenID | public void altReferencesTokenID(String ruleName, GrammarAST refAST, int outerAltNum)(Code) | | Track a token reference within an outermost alt of a rule. Used
to decide if $tokenref refers to a unique token ref in
the alt. Does not track literals!
Rewrite rules force tracking of all tokens.
|
assignDecisionNumber | public int assignDecisionNumber(NFAState state)(Code) | | |
buildAST | public boolean buildAST()(Code) | | |
buildTemplate | public boolean buildTemplate()(Code) | | |
checkAllRulesForLeftRecursion | public List checkAllRulesForLeftRecursion()(Code) | | |
checkAllRulesForUselessLabels | public void checkAllRulesForUselessLabels()(Code) | | Remove all labels on rule refs whose target rules have no return value.
Do this for all rules in grammar.
|
complement | public IntSet complement(IntSet set)(Code) | | For lexer grammars, return everything in unicode not in set.
For parser and tree grammars, return everything in token space
from MIN_TOKEN_TYPE to last valid token type or char value.
|
computeTokenNameFromLiteral | public String computeTokenNameFromLiteral(int tokenType, String literal)(Code) | | Given a token type and the text of the literal, come up with a
decent token type label. For now it's just T followed by the token type (e.g., T73). Actually,
if there is an aliased name from tokens like PLUS='+', use it.
|
createDecision | protected Decision createDecision(int decision)(Code) | | |
createLookaheadDFA | public void createLookaheadDFA(int decision)(Code) | | |
createLookaheadDFAs | public void createLookaheadDFAs()(Code) | | For each decision in this grammar, compute a single DFA using the
NFA states associated with the decision. The DFA construction
determines whether or not the alternatives in the decision are
separable using a regular lookahead language.
Store the lookahead DFAs in the AST created from the user's grammar
so the code generator or whoever can easily access it.
This is a separate method because you might want to create a
Grammar without doing the expensive analysis.
|
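Example (a sketch of the pipeline implied above: build the NFAs, run the expensive DFA analysis, then pull out each decision's DFA; decision numbering 1..n and the org.antlr.analysis.DFA location are assumptions):
    import org.antlr.analysis.DFA;
    import org.antlr.tool.Grammar;

    public class AnalyzeGrammar {
        public static void main(String[] args) throws Exception {
            Grammar g = new Grammar("grammar t;\na : A B | A C ;\n");
            g.createNFAs();              // NFA first; the DFAs are computed from its states
            g.createLookaheadDFAs();     // the expensive LL(*) analysis step
            for (int d = 1; d <= g.getNumberOfDecisions(); d++) {
                DFA dfa = g.getLookaheadDFA(d);
                System.out.println("decision " + d + (dfa != null ? " has a lookahead DFA" : " has no DFA"));
            }
            System.out.println("DFA construction took " + g.DFACreationWallClockTimeInMS + " ms");
        }
    }
|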
createNFAs | public void createNFAs()(Code) | | Create the NFA that represents this grammar by walking the grammar AST
and building NFA state clusters for each rule.
|
defineLabel | protected void defineLabel(Rule r, antlr.Token label, GrammarAST element, int type)(Code) | | Define a label defined in a rule r; check the validity then ask the
Rule object to actually define it.
|
defineLexerRuleForAliasedStringLiteral | public void defineLexerRuleForAliasedStringLiteral(String tokenID, String literal, int tokenType)(Code) | | If someone does PLUS='+' in the parser, we must make sure we get
"PLUS : '+' ;" in lexer not "T73 : '+';"
|
defineLexerRuleForStringLiteral | public void defineLexerRuleForStringLiteral(String literal, int tokenType)(Code) | | |
defineLexerRuleFoundInParser | public void defineLexerRuleFoundInParser(antlr.Token ruleToken, GrammarAST ruleAST)(Code) | | |
defineNamedAction | public void defineNamedAction(GrammarAST ampersandAST, String scope, GrammarAST nameAST, GrammarAST actionAST)(Code) | | Given @scope::name {action} define it for this grammar. Later,
the code generator will ask for the actions table.
|
defineRule | public void defineRule(antlr.Token ruleToken, String modifier, Map options, GrammarAST tree, GrammarAST argActionAST, int numAlts)(Code) | | Define a new rule. A new rule index is created by incrementing
ruleIndex.
|
defineRuleListLabel | public void defineRuleListLabel(String ruleName, antlr.Token label, GrammarAST element)(Code) | | |
defineRuleRefLabel | public void defineRuleRefLabel(String ruleName, antlr.Token label, GrammarAST ruleRef)(Code) | | |
defineSyntacticPredicate | public String defineSyntacticPredicate(GrammarAST blockAST, String currentRuleName)(Code) | | Define a new predicate and get back its name for use in building
a semantic predicate reference to the syn pred.
|
defineToken | public void defineToken(String text, int tokenType)(Code) | | Define a token at a particular token type value. Blast an
old value with a new one. This is called directly during import vocab
operation to set up tokens with specific values.
|
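Example (a sketch of seeding the token table by hand, the way an import-vocab step would; the no-argument Grammar() constructor is the one mentioned under importTokenVocabulary below, and the specific type values are made up):
    import org.antlr.tool.Grammar;

    public class SeedTokens {
        public static void main(String[] args) throws Exception {
            Grammar g = new Grammar();      // no-arg ctor, as described for importTokenVocabulary below
            g.defineToken("ID", 5);         // pin ID to token type 5, blasting any old value
            g.defineToken("WHILE", 6);
            System.out.println(g.getTokenType("ID"));   // 5
            System.out.println(g.getMaxTokenType());    // at least 6 now
            System.out.println(g.getNewTokenType());    // next unused type in the token space
        }
    }
|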
defineTokenListLabel | public void defineTokenListLabel(String ruleName, antlr.Token label, GrammarAST element)(Code) | | |
defineTokenRefLabel | public void defineTokenRefLabel(String ruleName, antlr.Token label, GrammarAST tokenRef)(Code) | | |
examineAllExecutableActions | protected void examineAllExecutableActions()(Code) | | Before generating code, we examine all actions that can have
$x.y and $y stuff in them because some code generation depends on
Rule.referencedPredefinedRuleAttributes. I need to remove unused
rule labels for example.
|
externallyAbortNFAToDFAConversion | public void externallyAbortNFAToDFAConversion()(Code) | | Terminate DFA creation (grammar analysis).
|
getANTLRCharLiteralForChar | public static String getANTLRCharLiteralForChar(int c)(Code) | | Return a string representing the escaped char for code c. E.g., If c
has value 0x100, you will get "\u0100". ASCII gets the usual
char (non-hex) representation. Control characters are spit out
as unicode. While this is specially set up for returning Java strings,
it can be used by any language target that has the same syntax. :)
11/26/2005: I changed this to use double quotes, consistent with antlr.g
12/09/2005: I changed so everything is single quotes
|
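Example (a quick sketch exercising this static helper on the cases the comment mentions; the exact quoting of the output follows whichever of the 2005 changes above is current, so it is only printed, not asserted):
    import org.antlr.tool.Grammar;

    public class CharLiteralDemo {
        public static void main(String[] args) {
            System.out.println(Grammar.getANTLRCharLiteralForChar('a'));   // ordinary ASCII form
            System.out.println(Grammar.getANTLRCharLiteralForChar('\n'));  // control char comes out escaped
            System.out.println(Grammar.getANTLRCharLiteralForChar(0x100)); // \u0100 form per the comment
        }
    }
|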
getAllCharValues | public IntSet getAllCharValues()(Code) | | If there is a char vocabulary, use it; else return min to max char
as defined by the target. If no target, use max unicode char value.
|
getArtificialRulesForSyntacticPredicates | protected List getArtificialRulesForSyntacticPredicates(ANTLRParser parser, LinkedHashMap nameToSynpredASTMap)(Code) | | For any syntactic predicates, we need to define rules for them; they will
get defined automatically like any other rule. :)
|
getCharValueFromGrammarCharLiteral | public static int getCharValueFromGrammarCharLiteral(String literal)(Code) | | Given a literal like (the 3 char sequence with single quotes) 'a',
return the int value of 'a'. Convert escape sequences here also.
ANTLR's antlr.g parser does not convert escape sequences.
11/26/2005: I changed literals to always be '...' even for strings.
This routine still works though.
|
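Example (a sketch of the round trip between a grammar char literal and its int value; the escaped-newline case is an assumption about which escapes this routine converts):
    import org.antlr.tool.Grammar;

    public class CharValueDemo {
        public static void main(String[] args) {
            int a = Grammar.getCharValueFromGrammarCharLiteral("'a'");      // 97
            int nl = Grammar.getCharValueFromGrammarCharLiteral("'\\n'");   // assumed: escapes handled here too
            System.out.println(a + " " + nl);
            // And back the other way with the helper documented above:
            System.out.println(Grammar.getANTLRCharLiteralForChar(a));
        }
    }
|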
getDecision | protected Decision getDecision(int decision)(Code) | | |
getDecisionNFAStartState | public NFAState getDecisionNFAStartState(int decision)(Code) | | |
getDecisionNFAStartStateList | public List getDecisionNFAStartStateList()(Code) | | |
getDefaultActionScope | public String getDefaultActionScope(int grammarType)(Code) | | Given a grammar type, what should be the default action scope?
If I say @members in a COMBINED grammar, for example, the
default scope should be "parser".
|
getGlobalScopes | public Map getGlobalScopes()(Code) | | |
getGrammarMaxLookahead | public int getGrammarMaxLookahead()(Code) | | |
getImplicitlyGeneratedLexerFileName | public String getImplicitlyGeneratedLexerFileName()(Code) | | |
getImportedVocabFileName | public File getImportedVocabFileName(String vocabName)(Code) | | |
getLabels | public Set<String> getLabels(Set<GrammarAST> rewriteElements, int labelType)(Code) | | Given a set of all rewrite elements on right of ->, filter for
label types such as Grammar.TOKEN_LABEL, Grammar.TOKEN_LIST_LABEL, ...
Return a displayable token type name computed from the GrammarAST.
|
getLeftRecursiveRules | public Set getLeftRecursiveRules()(Code) | | Return a list of left-recursive rules; no analysis can be done
successfully on these. Useful to skip these rules then and also
for ANTLRWorks to highlight them.
|
getLexerGrammar | public String getLexerGrammar()(Code) | | If the grammar is a merged grammar, return the text of the implicit
lexer grammar.
|
getLineColumnToLookaheadDFAMap | public Map getLineColumnToLookaheadDFAMap()(Code) | | |
getLookaheadDFA | public DFA getLookaheadDFA(int decision)(Code) | | |
getLookaheadDFAColumnsForLineInFile | public List getLookaheadDFAColumnsForLineInFile(int line)(Code) | | Returns a list of column numbers for all decisions
on a particular line so ANTLRWorks can choose the decision
depending on the location of the cursor (otherwise,
ANTLRWorks has to give the *exact* location, which
is not easy from the user's point of view).
This is not particularly fast as it walks the entire line:col->DFA map
looking for a prefix of "line:".
|
getLookaheadDFAFromPositionInFile | public DFA getLookaheadDFAFromPositionInFile(int line, int col)(Code) | | Useful for ANTLRWorks to map position in file to the DFA for display
|
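Example (a sketch of the ANTLRWorks-style lookup described above: ask which columns on a line start decisions, then fetch the DFA at one of them; the Integer element type of the returned List is an assumption):
    import java.util.List;
    import org.antlr.analysis.DFA;
    import org.antlr.tool.Grammar;

    public class DecisionAtCursor {
        // Hypothetical helper: the DFA for the first decision that starts on a given line.
        public static DFA firstDFAOnLine(Grammar g, int line) {
            List cols = g.getLookaheadDFAColumnsForLineInFile(line);
            if (cols == null || cols.isEmpty()) {
                return null;                           // no decisions start on this line
            }
            int col = ((Integer) cols.get(0)).intValue();
            return g.getLookaheadDFAFromPositionInFile(line, col);
        }
    }
|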
getMaxCharValue | public int getMaxCharValue()(Code) | | What is the max char value possible for this grammar's target? Use
unicode max if no target defined.
|
getMaxTokenType | public int getMaxTokenType()(Code) | | How many token types have been allocated so far?
|
getNFAStateForAltOfDecision | public NFAState getNFAStateForAltOfDecision(NFAState decisionState, int alt)(Code) | | Get the ith alternative (1..n) from a decision; return null when
an invalid alt is requested. I must count in to find the right
alternative number. For (A|B), you get NFA structure (roughly):
o->o-A->o
|
o->o-B->o
This routine returns the leftmost state for each alt. So alt=1 returns
the upper-left-most state in this structure.
|
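Example (a sketch that walks the alternatives of one decision with the two methods above; alt numbering 1..n follows the comment, and the org.antlr.analysis.NFAState location is assumed):
    import org.antlr.analysis.NFAState;
    import org.antlr.tool.Grammar;

    public class ListAlts {
        public static void listAlts(Grammar g, int decision) {
            NFAState start = g.getDecisionNFAStartState(decision);
            int n = g.getNumberOfAltsForDecisionNFA(start);
            for (int alt = 1; alt <= n; alt++) {       // alts are numbered 1..n
                NFAState altState = g.getNFAStateForAltOfDecision(start, alt);
                System.out.println("decision " + decision + " alt " + alt + " starts at " + altState);
            }
        }
    }
|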
getNewTokenType | public int getNewTokenType()(Code) | | Return a new unique integer in the token type space
|
getNumberOfAltsForDecisionNFA | public int getNumberOfAltsForDecisionNFA(NFAState decisionState)(Code) | | Decisions are linked together with transition(1). Count how
many there are. This is here rather than in NFAState because
a grammar decides how NFAs are put together to form a decision.
|
getNumberOfCyclicDecisions | public int getNumberOfCyclicDecisions()(Code) | | |
getNumberOfDecisions | public int getNumberOfDecisions()(Code) | | |
getSetFromRule | public IntSet getSetFromRule(TreeToNFAConverter nfabuilder, String ruleName) throws RecognitionException(Code) | | Get the set equivalent (if any) of the indicated rule from this
grammar. Mostly used in the lexer to do ~T for some fragment rule
T. If the rule AST has a SET use that. If the rule is a single char
convert it to a set and return. If rule is not a simple set (w/o actions)
then return null.
Rules have AST form:
^( RULE ID modifier ARG RET SCOPE block EOR )
|
getStringLiterals | public Set getStringLiterals()(Code) | | Get the list of ANTLR String literals
|
getTokenDisplayName | public String getTokenDisplayName(int ttype)(Code) | | Given a token type, get a meaningful name for it such as the ID
or string literal. If this is a lexer and the ttype is in the
char vocabulary, compute an ANTLR-valid (possibly escaped) char literal.
|
getTokenDisplayNames | public Set getTokenDisplayNames()(Code) | | Get a list of all token IDs and literals that have an associated
token type.
|
getTokenIDs | public Set getTokenIDs()(Code) | | Get the list of tokens that are IDs like BLOCK and LPAREN
|
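Example (a sketch that maps every declared token ID to its type and back to a display name; the String element type of the raw Set is an assumption based on the description):
    import java.util.Iterator;
    import org.antlr.tool.Grammar;

    public class TokenTable {
        public static void print(Grammar g) {
            for (Iterator it = g.getTokenIDs().iterator(); it.hasNext();) {
                String id = (String) it.next();        // e.g. ID, WHILE
                int ttype = g.getTokenType(id);
                System.out.println(ttype + " = " + g.getTokenDisplayName(ttype));
            }
        }
    }
|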
getTokenTypes | public IntSet getTokenTypes()(Code) | | Return a set of all possible token or char types for this grammar
|
getTokenTypesWithoutID | public Collection getTokenTypesWithoutID()(Code) | | Return an ordered integer list of token types that have no
corresponding token ID like INT or KEYWORD_BEGIN; for stuff
like 'begin'.
|
getUnescapedStringFromGrammarStringLiteral | public static StringBuffer getUnescapedStringFromGrammarStringLiteral(String literal)(Code) | | ANTLR does not convert escape sequences during the parse phase because
it could not know how to print String/char literals back out when
printing grammars etc... Someone in China might use the real unicode
char in a literal as it will display on their screen; when printing
back out, I could not know whether to display or use a unicode escape.
This routine converts a string literal with possible escape sequences
into a pure string of 16-bit char values. Escapes and unicode \u0000
specs are converted to pure chars. The result is returned in a buffer; people may
want to walk/manipulate it further.
The NFA construction routine must know the actual char values.
|
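Example (a quick sketch of unescaping a grammar string literal into raw 16-bit chars; the sample literal and the expected 4-char result are assumptions about the escape handling described above):
    import org.antlr.tool.Grammar;

    public class UnescapeDemo {
        public static void main(String[] args) {
            // The grammar literal 'hi\n\u0041' (outer single quotes included in the argument).
            StringBuffer buf = Grammar.getUnescapedStringFromGrammarStringLiteral("'hi\\n\\u0041'");
            System.out.println(buf.length());   // expected 4: h, i, newline, A
            System.out.println(buf);
        }
    }
|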
getWatchNFAConversion | public boolean getWatchNFAConversion()(Code) | | |
importTokenVocabulary | public int importTokenVocabulary(Grammar importFromGr)(Code) | | Pull your token definitions from an existing grammar in memory.
You must use Grammar() ctor then this method then setGrammarContent()
to make this work. This is useful primarily for testing and
interpreting grammars. Return the max token type found.
|
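Example (a sketch of the ctor-then-import-then-setGrammarContent sequence prescribed above; the donor grammar text and rule bodies are made up for illustration, and it assumes token refs such as ID get types assigned when the donor grammar is processed):
    import org.antlr.tool.Grammar;

    public class ImportVocabDemo {
        public static void main(String[] args) throws Exception {
            // Donor grammar whose token types we want to reuse.
            Grammar donor = new Grammar("grammar D;\na : ID INT ;\n");

            // Documented sequence: Grammar(), importTokenVocabulary(), setGrammarContent().
            Grammar g = new Grammar();
            int maxType = g.importTokenVocabulary(donor);
            g.setGrammarContent("grammar T;\nb : ID ;\n");

            System.out.println("max imported token type = " + maxType);
            System.out.println("ID has the same type in both grammars: " +
                               (g.getTokenType("ID") == donor.getTokenType("ID")));
        }
    }
|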
importTokenVocabulary | public int importTokenVocabulary(String vocabName)(Code) | | Load a vocab file .tokens and return max token type found.
|
initTokenSymbolTables | protected void initTokenSymbolTables()(Code) | | |
isBuiltFromString | public boolean isBuiltFromString()(Code) | | |
isEmptyRule | public boolean isEmptyRule(GrammarAST block)(Code) | | Rules like "a : ;" and "a : {...} ;" should not generate
try/catch blocks for RecognitionException. To detect this
it's probably ok to just look for any reference to an atom
that can match some input. W/o that, the rule is unlikely to have
anything else.
|
isValidSet | public boolean isValidSet(TreeToNFAConverter nfabuilder, GrammarAST t)(Code) | | Given a set tree like ( SET A B ) in a lexer, check that A and B
are both valid sets themselves, else we must treat it like a BLOCK
|
referenceRuleLabelPredefinedAttribute | public void referenceRuleLabelPredefinedAttribute(String ruleName)(Code) | | To yield smaller, more readable code, track which rules have their
predefined attributes accessed. If the rule has no user-defined
return values, then don't generate the return value scope classes
etc... Make the rule have void return value. Don't track for lexer
rules.
|
removeUselessLabels | protected void removeUselessLabels(Map ruleToElementLabelPairMap)(Code) | | A label on a rule is useless if the rule has no return value, no
tree or template output, and it is not referenced in an action.
|
setDecisionBlockAST | public void setDecisionBlockAST(int decision, GrammarAST blockAST)(Code) | | |
setDecisionNFA | public void setDecisionNFA(int decision, NFAState state)(Code) | | |
setGrammarContent | public void setGrammarContent(String grammarString) throws antlr.RecognitionException, antlr.TokenStreamException(Code) | | |
setGrammarContent | public void setGrammarContent(Reader r) throws antlr.RecognitionException, antlr.TokenStreamException(Code) | | |
setLookaheadDFA | public void setLookaheadDFA(int decision, DFA lookaheadDFA)(Code) | | Set the lookahead DFA for a particular decision. This means
that the appropriate AST node must be updated to have the new lookahead
DFA. This method could be used to properly set the DFAs without
using the createLookaheadDFAs() method. You could do this
Grammar g = new Grammar("...");
g.setLookaheadDFA(1, dfa1);
g.setLookaheadDFA(2, dfa2);
...
|
setOption | public String setOption(String key, Object value, antlr.Token optionsStartToken)(Code) | | Save the option key/value pair and process it; return the key
or null if invalid option.
|
setOptions | public void setOptions(Map options, antlr.Token optionsStartToken)(Code) | | |
setWatchNFAConversion | public void setWatchNFAConversion(boolean watchNFAConversion)(Code) | | |
|
|