Source Code Cross Referenced for NFAToDFAConverter.java (antlr-3.0.1, org.antlr.analysis)



0001:        /*
0002:         [The "BSD licence"]
0003:         Copyright (c) 2005-2006 Terence Parr
0004:         All rights reserved.
0005:
0006:         Redistribution and use in source and binary forms, with or without
0007:         modification, are permitted provided that the following conditions
0008:         are met:
0009:         1. Redistributions of source code must retain the above copyright
0010:            notice, this list of conditions and the following disclaimer.
0011:         2. Redistributions in binary form must reproduce the above copyright
0012:            notice, this list of conditions and the following disclaimer in the
0013:            documentation and/or other materials provided with the distribution.
0014:         3. The name of the author may not be used to endorse or promote products
0015:            derived from this software without specific prior written permission.
0016:
0017:         THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
0018:         IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
0019:         OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
0020:         IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
0021:         INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
0022:         NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
0023:         DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
0024:         THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
0025:         (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
0026:         THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
0027:         */
0028:        package org.antlr.analysis;
0029:
0030:        import org.antlr.misc.IntSet;
0031:        import org.antlr.misc.OrderedHashSet;
0032:        import org.antlr.misc.Utils;
0033:
0034:        import java.util.*;
0035:
0036:        /** Code that embodies the NFA conversion to DFA. */
0037:        public class NFAToDFAConverter {
0038:            /** A list of DFA states we still need to process during NFA conversion */
0039:            protected List work = new LinkedList();
0040:
0041:            /** While converting NFA, we must track states that
0042:             *  reference other rules' NFAs so we know what to do
0043:             *  at the end of a rule.  We need to know what context invoked
0044:             *  this rule so we can know where to continue looking for NFA
0045:             *  states.  I'm tracking a context tree (record of rule invocation
0046:             *  stack trace) for each alternative that could be predicted.
0047:             */
0048:            protected NFAContext[] contextTrees;
0049:
0050:            /** We are converting which DFA? */
0051:            protected DFA dfa;
0052:
0053:            public static boolean debug = false;
0054:
0055:            /** Should ANTLR launch multiple threads to convert NFAs to DFAs?
0056:             *  With a 2-CPU box, I note that it's about the same whether single-
0057:             *  or multithreaded.  Both CPU meters are going even when single-threaded,
0058:             *  so I assume the GC is killing us.  Could be the compiler.  When I
0059:             *  run in java -Xint mode, I get about a 15% speed improvement with
0060:             *  multiple threads.
0061:             */
0062:            public static boolean SINGLE_THREADED_NFA_CONVERSION = true;
0063:
0064:            public NFAToDFAConverter(DFA dfa) {
0065:                this.dfa = dfa;
0066:                NFAState nfaStartState = dfa.getNFADecisionStartState();
0067:                int nAlts = dfa.getNumberOfAlts();
0068:                initContextTrees(nAlts);
0069:            }
0070:
0071:            public void convert() {
0072:                dfa.conversionStartTime = System.currentTimeMillis();
0073:
0074:                // create the DFA start state
0075:                dfa.startState = computeStartState();
0076:
0077:                // while more DFA states to check, process them
0078:                while (work.size() > 0
0079:                        && !dfa.probe.analysisAborted()
0080:                        && !dfa.probe.nonLLStarDecision
0081:                        && !dfa.nfa.grammar
0082:                                .NFAToDFAConversionExternallyAborted()) {
0083:                    DFAState d = (DFAState) work.get(0);
0084:                    if (dfa.nfa.grammar.getWatchNFAConversion()) {
0085:                        System.out.println("convert DFA state " + d.stateNumber
0086:                                + " (" + d.getNFAConfigurations().size()
0087:                                + " nfa states)");
0088:                    }
0089:                    int k = dfa.getUserMaxLookahead();
0090:                    if (k > 0 && k == d.getLookaheadDepth()) {
0091:                        // we've hit max lookahead, make this a stop state
0092:                        //System.out.println("stop state @k="+k+" (terminated early)");
0093:                        resolveNonDeterminisms(d);
0094:                        // Check to see if we need to add any semantic predicate transitions
0095:                        if (d.isResolvedWithPredicates()) {
0096:                            addPredicateTransitions(d);
0097:                        } else {
0098:                            d.setAcceptState(true); // must convert to accept state at k
0099:                        }
0100:                    } else {
0101:                        findNewDFAStatesAndAddDFATransitions(d);
0102:                    }
0103:                    work.remove(0); // done with it; remove from work list
0104:                }
0105:
0106:                // walk all accept states and find the synpreds
0107:                // I used to do this in the code generator, but that is too late.
0108:                // This converter tries to avoid computing DFA for decisions in
0109:                // syntactic predicates that are not ever used such as those
0110:                // created by autobacktrack mode.
0111:                int nAlts = dfa.getNumberOfAlts();
0112:                for (int i = 1; i <= nAlts; i++) {
0113:                    DFAState a = dfa.getAcceptState(i);
0114:                    if (a != null) {
0115:                        Set synpreds = a
0116:                                .getSyntacticPredicatesInNFAConfigurations();
0117:                        if (synpreds != null) {
0118:                            // add all the predicates we find (should be just one, right?)
0119:                            for (Iterator it = synpreds.iterator(); it
0120:                                    .hasNext();) {
0121:                                SemanticContext semctx = (SemanticContext) it
0122:                                        .next();
0123:                                // System.out.println("synpreds: "+semctx);
0124:                                dfa.nfa.grammar.synPredUsedInDFA(dfa, semctx);
0125:                            }
0126:                        }
0127:                    }
0128:                }
0129:
0130:            }
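            /* A hedged, illustrative note on the fixed-lookahead branch above (the rule
             * and token names below are invented, not from the original source): for a
             * decision that fixes its lookahead, e.g.
             *
             *    a options {k=1;} : A B | A C ;
             *
             * conversion stops growing the DFA once a state's lookahead depth reaches k.
             * Both alts look identical at depth 1 (each starts with A), so the k-limited
             * state is handed to resolveNonDeterminisms, which (absent predicates)
             * resolves in favor of the lowest alt number, a nondeterminism warning is
             * reported, and the state is converted to an accept state at depth k.
             */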
0131:
0132:            /** From this first NFA state of a decision, create a DFA.
0133:             *  Walk each alt in decision and compute closure from the start of that
0134:             *  rule, making sure that the closure does not include other alts within
0135:             *  that same decision.  The idea is to associate a specific alt number
0136:             *  with the starting closure so we can trace the alt number for all states
0137:             *  derived from this.  At a stop state in the DFA, we can return this alt
0138:             *  number, indicating which alt is predicted.
0139:             *
0140:             *  If this DFA is derived from a loopback NFA state, then the first
0141:             *  transition is actually the exit branch of the loop.  Rather than make
0142:             *  this alternative 1, let's make this alt n+1 where n is the number of
0143:             *  alts in this block.  This keeps the alts of the block 1..n, which
0144:             *  helps with error messages.  (See the illustrative sketch after this method.)
0145:             *
0146:             *  I handle nongreedy loops in findNewDFAStatesAndAddDFATransitions
0147:             *  when a nongreedy loop meets an EOT transition: the state with EOT
0148:             *  emanating from it becomes the accept state.
0149:             */
0150:            protected DFAState computeStartState() {
0151:                NFAState alt = dfa.decisionNFAStartState;
0152:                DFAState startState = dfa.newState();
0153:                int i = 0;
0154:                int altNum = 1;
0155:                while (alt != null) {
0156:                    // find the set of NFA states reachable without consuming
0157:                    // any input symbols for each alt.  Keep adding to same
0158:                    // overall closure that will represent the DFA start state,
0159:                    // but track the alt number
0160:                    NFAContext initialContext = contextTrees[i];
0161:                    // if first alt is derived from loopback/exit branch of loop,
0162:                    // make alt=n+1 for n alts instead of 1
0163:                    if (i == 0
0164:                            && dfa.getNFADecisionStartState().decisionStateType == NFAState.LOOPBACK) {
0165:                        int numAltsIncludingExitBranch = dfa.nfa.grammar
0166:                                .getNumberOfAltsForDecisionNFA(dfa.decisionNFAStartState);
0167:                        altNum = numAltsIncludingExitBranch;
0168:                        closure((NFAState) alt.transition(0).target, altNum,
0169:                                initialContext,
0170:                                SemanticContext.EMPTY_SEMANTIC_CONTEXT,
0171:                                startState, true);
0172:                        altNum = 1; // make next alt the first
0173:                    } else {
0174:                        closure((NFAState) alt.transition(0).target, altNum,
0175:                                initialContext,
0176:                                SemanticContext.EMPTY_SEMANTIC_CONTEXT,
0177:                                startState, true);
0178:                        altNum++;
0179:                    }
0180:                    i++;
0181:
0182:                    // move to next alternative
0183:                    if (alt.transition(1) == null) {
0184:                        break;
0185:                    }
0186:                    alt = (NFAState) alt.transition(1).target;
0187:                }
0188:
0189:                // now DFA start state has the complete closure for the decision
0190:                // but we have tracked which alt is associated with which
0191:                // NFA states.
0192:                dfa.addState(startState); // make sure dfa knows about this state
0193:                work.add(startState);
0194:                return startState;
0195:            }
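            /* A minimal illustrative sketch of start-state construction (grammar and
             * names invented, not from the original source): for a loop such as
             *
             *    a : ( B )* C ;
             *
             * the (...)* block has n=1 alternative inside, so the exit branch taken via
             * the loopback state's first transition is numbered n+1 = 2.  The start
             * state's closure then holds configurations tagged alt 1 (stay in the loop
             * on B) and alt 2 (exit toward C), and the resulting DFA predicts alt 1 on
             * B and alt 2 on C.
             */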
0196:
0197:            /** From this node, add a d--a-->t transition for all
0198:             *  labels 'a' where t is a DFA node created
0199:             *  from the set of NFA states reachable from any NFA
0200:             *  state in DFA state d.
0201:             */
0202:            protected void findNewDFAStatesAndAddDFATransitions(DFAState d) {
0203:                //System.out.println("work on DFA state "+d);
0204:                OrderedHashSet labels = d.getReachableLabels();
0205:                /*
0206:                System.out.println("reachable="+labels.toString());
0207:                System.out.println("|reachable|/|nfaconfigs|="+
0208:                		labels.size()+"/"+d.getNFAConfigurations().size()+"="+
0209:                		labels.size()/(float)d.getNFAConfigurations().size());
0210:                 */
0211:
0212:                // normally EOT is the "default" clause and decisions just
0213:                // choose that last clause when nothing else matches.  DFA conversion
0214:                // continues searching for a unique sequence that predicts the
0215:                // various alts or until it finds EOT.  So this rule
0216:                //
0217:                // DUH : ('x'|'y')* "xy!";
0218:                //
0219:                // does not need a greedy indicator.  The following rule works fine too
0220:                //
0221:                // A : ('x')+ ;
0222:                //
0223:                // When the follow branch could match what is in the loop, by default,
0224:                // the nondeterminism is resolved in favor of the loop.  You don't
0225:                // get a warning because the only way to get this condition is if
0226:                // the DFA conversion hits the end of the token.  In that case,
0227:                // we're not *sure* what will happen next, but it could be anything.
0228:                // Anyway, EOT is the default case which means it will never be matched
0229:                // as resolution goes to the lowest alt number.  Exit branches are
0230:                // always alt n+1 for n alts in a block.
0231:                //
0232:                // When a loop is nongreedy and we find an EOT transition, the DFA
0233:                // state should become an accept state, predicting exit of loop.  It's
0234:                // just reversing the resolution of ambiguity.
0235:                // TODO: should this be done in the resolveAmbig method?
0236:                Label EOTLabel = new Label(Label.EOT);
0237:                boolean containsEOT = labels.contains(EOTLabel);
0238:                if (!dfa.isGreedy() && containsEOT) {
0239:                    convertToEOTAcceptState(d);
0240:                    return; // no more work to do on this accept state
0241:                }
0242:
0243:                // if in filter mode for lexer, want to match shortest not longest
0244:                // string so if we see an EOT edge emanating from this state, then
0245:                // convert this state to an accept state.  This only counts for
0246:                // The Tokens rule as all other decisions must continue to look for
0247:                // longest match.
0248:                // [Taking back out a few days later on Jan 17, 2006.  This could
0249:                //  be an option for the future, but this was the wrong solution for
0250:                //  filtering.]
0251:                /*
0252:                if ( dfa.nfa.grammar.type==Grammar.LEXER && containsEOT ) {
0253:                	String filterOption = (String)dfa.nfa.grammar.getOption("filter");
0254:                	boolean filterMode = filterOption!=null && filterOption.equals("true");
0255:                	if ( filterMode && d.dfa.isTokensRuleDecision() ) {
0256:                		DFAState t = reach(d, EOTLabel);
0257:                		if ( t.getNFAConfigurations().size()>0 ) {
0258:                			convertToEOTAcceptState(d);
0259:                			//System.out.println("state "+d+" has EOT target "+t.stateNumber);
0260:                			return;
0261:                		}
0262:                	}
0263:                }
0264:                 */
0265:
0266:                int numberOfEdgesEmanating = 0;
0267:                Map targetToLabelMap = new HashMap();
0268:                // for each label that could possibly emanate from NFAStates of d
0269:                // (abort if we find any closure operation on a configuration of d
0270:                //  that finds multiple alts with recursion, non-LL(*), as we cannot
0271:                //  trust any reach operations from d since we are blind to some
0272:                //  paths.  Leave state a dead-end and try to resolve with preds)
0273:                for (int i = 0; !d.abortedDueToMultipleRecursiveAlts
0274:                        && i < labels.size(); i++) {
0275:                    Label label = (Label) labels.get(i);
0276:                    DFAState t = reach(d, label);
0277:                    if (debug) {
0278:                        System.out.println("DFA state after reach " + d + "-"
0279:                                + label.toString(dfa.nfa.grammar) + "->" + t);
0280:                    }
0281:                    if (t == null) {
0282:                        // nothing was reached by label due to conflict resolution
0283:                        // EOT also seems to be in here occasionally probably due
0284:                        // to an end-of-rule state seeing it even though we'll pop
0285:                        // an invoking state off the stack; don't bother to conflict
0286:                        // as this label set is a covering approximation only.
0287:                        continue;
0288:                    }
0289:                    if (t.getUniqueAlt() == NFA.INVALID_ALT_NUMBER) {
0290:                        // Only compute closure if a unique alt number is not known.
0291:                        // If a unique alternative is mentioned among all NFA
0292:                        // configurations then there is no possibility of needing to look
0293:                        // beyond this state; also no possibility of a nondeterminism.
0294:                        // This optimization May 22, 2006 just dropped -Xint time
0295:                        // for analysis of Java grammar from 11.5s to 2s!  Wow.
0296:                        closure(t); // add any NFA states reachable via epsilon
0297:                    }
0298:
0299:                    /*
0300:                    System.out.println("DFA state after closure "+d+"-"+
0301:                    				   label.toString(dfa.nfa.grammar)+
0302:                    				   "->"+t);
0303:                     */
0304:
0305:                    // add if not in DFA yet even if its closure aborts due to non-LL(*);
0306:                    // getting to the state is ok, we just can't see where to go next--it's
0307:                    // a blind alley.
0308:                    DFAState targetState = addDFAStateToWorkList(t);
0309:
0310:                    numberOfEdgesEmanating += addTransition(d, label,
0311:                            targetState, targetToLabelMap);
0312:
0313:                    // lookahead of target must be one larger than d's k
0314:                    targetState.setLookaheadDepth(d.getLookaheadDepth() + 1);
0315:
0316:                    // closure(t) might have aborted, but addDFAStateToWorkList will try
0317:                    // to resolve t with predicates.  If that fails, we must give an error.
0318:                    // Note: this is tested on the target of d, not on d itself.
0319:                    if (t.abortedDueToMultipleRecursiveAlts
0320:                            && !t.isResolvedWithPredicates()) {
0321:                        // no predicates to resolve non-LL(*) decision, report
0322:                        t.dfa.probe.reportNonLLStarDecision(t.dfa);
0323:                    }
0324:                }
0325:
0326:                //System.out.println("DFA after reach / closures:\n"+dfa);
0327:
0328:                if (!d.isResolvedWithPredicates()
0329:                        && numberOfEdgesEmanating == 0) {
0330:                    // TODO: can fixed lookahead hit a dangling state case?
0331:                    // TODO: yes, with left recursion
0332:                    // TODO: alter DANGLING err template to have input to that state
0333:                    //System.err.println("dangling state alts: "+d.getAltSet());
0334:                    dfa.probe.reportDanglingState(d);
0335:                    // turn off all configurations except for those associated with
0336:                    // min alt number; somebody has to win else some input will not
0337:                    // predict any alt.
0338:                    int minAlt = resolveByPickingMinAlt(d, null);
0339:                    convertToAcceptState(d, minAlt); // force it to be an accept state
0340:                }
0341:
0342:                // Check to see if we need to add any semantic predicate transitions
0343:                if (d.isResolvedWithPredicates()) {
0344:                    addPredicateTransitions(d);
0345:                }
0346:            }
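            /* A hedged example of the nongreedy/EOT handling above (lexer rule invented
             * for illustration; the comment delimiters are written as split literals so
             * this block stays a valid Java comment):
             *
             *    ML_COMMENT : '/' '*' ( options {greedy=false;} : . )* '*' '/' ;
             *
             * Because the inner loop is nongreedy, as soon as an EOT label shows up among
             * the reachable labels of a DFA state for the loop decision,
             * convertToEOTAcceptState turns that state into an accept state predicting
             * the alt of the configuration that reached end of token (the exit path
             * here).  The loop therefore stops at the first '*' '/' instead of consuming
             * as much input as possible.
             */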
0347:
0348:            /** Add a transition from state d to targetState with label in normal case.
0349:             *  If COLLAPSE_ALL_PARALLEL_EDGES is set, however, try to merge all edges from
0350:             *  d to targetState; this means merging their labels.  Another optimization
0351:             *  is to reduce to a single EOT edge any set of edges from d to targetState
0352:             *  where there exists an EOT state.  EOT is like the wildcard so don't
0353:             *  bother to test any other edges.  Example:
0354:             *
0355:             *  NUM_INT
0356:             *    : '1'..'9' ('0'..'9')* ('l'|'L')?
0357:             *    | '0' ('x'|'X') ('0'..'9'|'a'..'f'|'A'..'F')+ ('l'|'L')?
0358:             *    | '0' ('0'..'7')* ('l'|'L')?
0359:             *    ;
0360:             *
0361:             *  The normal decision to predict alts 1, 2, 3 is:
0362:             *
0363:             *  if ( (input.LA(1)>='1' && input.LA(1)<='9') ) {
0364:             *       alt7=1;
0365:             *  }
0366:             *  else if ( input.LA(1)=='0' ) {
0367:             *      if ( input.LA(2)=='X'||input.LA(2)=='x' ) {
0368:             *          alt7=2;
0369:             *      }
0370:             *      else if ( (input.LA(2)>='0' && input.LA(2)<='7') ) {
0371:             *           alt7=3;
0372:             *      }
0373:             *      else if ( input.LA(2)=='L'||input.LA(2)=='l' ) {
0374:             *           alt7=3;
0375:             *      }
0376:             *      else {
0377:             *           alt7=3;
0378:             *      }
0379:             *  }
0380:             *  else error
0381:             *
0382:             *  Clearly, alt 3 is predicted with extra work since it tests 0..7
0383:             *  and [lL] before finally realizing that any character is actually
0384:             *  ok at k=2.
0385:             *
0386:             *  A better decision is as follows:
0387:             *
0388:             *  if ( (input.LA(1)>='1' && input.LA(1)<='9') ) {
0389:             *      alt7=1;
0390:             *  }
0391:             *  else if ( input.LA(1)=='0' ) {
0392:             *      if ( input.LA(2)=='X'||input.LA(2)=='x' ) {
0393:             *          alt7=2;
0394:             *      }
0395:             *      else {
0396:             *          alt7=3;
0397:             *      }
0398:             *  }
0399:             *
0400:             *  The DFA originally has 3 edges going to the state that predicts alt 3,
0401:             *  but upon seeing the EOT edge (the "else"-clause), this method
0402:             *  replaces the old merged label (which would have (0..7|l|L)) with EOT.
0403:             *  The code generator then leaves alt 3 predicted with a simple else-
0404:             *  clause. :)
0405:             *
0406:             *  The only time the EOT optimization makes no sense is in the Tokens
0407:             *  rule.  We want EOT to truly mean you have matched an entire token
0408:             *  so don't bother actually rewinding to execute that rule unless there
0409:             *  are actions in that rule.  For now, since I am not preventing
0410:             *  backtracking from Tokens rule, I will simply allow the optimization.
0411:             */
0412:            protected static int addTransition(DFAState d, Label label,
0413:                    DFAState targetState, Map targetToLabelMap) {
0414:                //System.out.println(d.stateNumber+"-"+label.toString(dfa.nfa.grammar)+"->"+targetState.stateNumber);
0415:                int n = 0;
0416:                if (DFAOptimizer.COLLAPSE_ALL_PARALLEL_EDGES) {
0417:                    // track which targets we've hit
0418:                    Integer tI = Utils.integer(targetState.stateNumber);
0419:                    Transition oldTransition = (Transition) targetToLabelMap
0420:                            .get(tI);
0421:                    if (oldTransition != null) {
0422:                        //System.out.println("extra transition to "+tI+" upon "+label.toString(dfa.nfa.grammar));
0423:                        // already seen state d to target transition, just add label
0424:                        // to old label unless EOT
0425:                        if (label.getAtom() == Label.EOT) {
0426:                            // merge with EOT means old edge can go away
0427:                            oldTransition.label = new Label(Label.EOT);
0428:                        } else {
0429:                            // don't add anything to EOT, it's essentially the wildcard
0430:                            if (oldTransition.label.getAtom() != Label.EOT) {
0431:                                // ok, not EOT, add in this label to old label
0432:                                oldTransition.label.add(label);
0433:                            }
0434:                            //System.out.println("label updated to be "+oldTransition.label.toString(dfa.nfa.grammar));
0435:                        }
0436:                    } else {
0437:                        // make a transition from d to t upon 'a'
0438:                        n = 1;
0439:                        label = (Label) label.clone(); // clone in case we alter later
0440:                        int transitionIndex = d.addTransition(targetState,
0441:                                label);
0442:                        Transition trans = d.getTransition(transitionIndex);
0443:                        // track target/transition pairs
0444:                        targetToLabelMap.put(tI, trans);
0445:                    }
0446:                } else {
0447:                    n = 1;
0448:                    d.addTransition(targetState, label);
0449:                }
0450:                return n;
0451:            }
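            /* An illustrative note on the edge collapsing above (the labels 'a' and 'b'
             * are assumed examples, not from the original source): with
             * COLLAPSE_ALL_PARALLEL_EDGES on, if d already has an edge to target t on
             * 'a' and we later add another edge to the same t on 'b', the two are merged
             * into a single edge labeled 'a'|'b'.  If the new label is EOT, the merged
             * label is simply replaced by EOT (the "else" clause in generated code); if
             * the existing edge is already EOT, the new label is dropped.  Only a
             * genuinely new edge contributes 1 to the returned count.
             */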
0452:
0453:            /** For all NFA states (configurations) merged in d,
0454:             *  compute the epsilon closure; that is, find all NFA states reachable
0455:             *  from the NFA states in d via purely epsilon transitions.
0456:             */
0457:            public void closure(DFAState d) {
0458:                if (debug) {
0459:                    System.out.println("closure(" + d + ")");
0460:                }
0461:                Set configs = new HashSet();
0462:                // Because we are adding to the configurations in closure,
0463:                // we must clone the initial list so we know when to stop doing closure.
0464:                // TODO: expensive, try to get around this alloc / copy
0465:                configs.addAll(d.getNFAConfigurations());
0466:                // for each NFA configuration in d (abort if we detect non-LL(*) state)
0467:                Iterator iter = configs.iterator();
0468:                while (!d.abortedDueToMultipleRecursiveAlts && iter.hasNext()) {
0469:                    NFAConfiguration c = (NFAConfiguration) iter.next();
0470:                    if (c.singleAtomTransitionEmanating) {
0471:                        continue; // ignore NFA states w/o epsilon transitions
0472:                    }
0473:                    //System.out.println("go do reach for NFA state "+c.state);
0474:                    // figure out reachable NFA states from each of d's nfa states
0475:                    // via epsilon transitions
0476:                    closure(dfa.nfa.getState(c.state), c.alt, c.context,
0477:                            c.semanticContext, d, false);
0478:                }
0479:                d.closureBusy = null; // whack all that memory used during closure
0480:            }
0481:
0482:            /** Where can we get from NFA state p traversing only epsilon transitions?
0483:             *  Add new NFA states + context to DFA state d.  Also add semantic
0484:             *  predicates to semantic context if collectPredicates is set.  We only
0485:             *  collect predicates at hoisting depth 0, meaning before any token/char
0486:             *  has been recognized.  This corresponds, during analysis, to the
0487:             *  initial DFA start state construction closure() invocation.
0488:             *
0489:             *  There are four cases of interest (the last being the usual transition):
0490:             *
0491:             *   1. Traverse an edge that takes us to the start state of another
0492:             *      rule, r.  We must push this state so that if the DFA
0493:             *      conversion hits the end of rule r, then it knows to continue
0494:             *      the conversion at the state following the state that "invoked" r.  By
0495:             *      construction, there is a single transition emanating from a rule
0496:             *      ref node.
0497:             *
0498:             *   2. Reach an NFA state associated with the end of a rule, r, in the
0499:             *      grammar from which it was built.  We must add an implicit (i.e.,
0500:             *      don't actually add an epsilon transition) epsilon transition
0501:             *      from r's end state to the NFA state following the NFA state
0502:             *      that transitioned to rule r's start state.  Because there are
0503:             *      many states that could reach r, the context for a rule invocation
0504:             *      is part of a call tree not a simple stack.  When we fall off end
0505:             *      of rule, "pop" a state off the call tree and add that state's
0506:             *      "following" node to d's NFA configuration list.  The context
0507:             *      for this new addition will be the new "stack top" in the call tree.
0508:             *
0509:             *   3. Like case 2, we reach an NFA state associated with the end of a
0510:             *      rule, r, in the grammar from which NFA was built.  In this case,
0511:             *      however, we realize that during this NFA->DFA conversion, no state
0512:             *      invoked the current rule's NFA.  There is no choice but to add
0513:             *      all NFA states that follow references to r's start state.  This is
0514:             *      analogous to computing the FOLLOW(r) in the LL(k) world.  By
0515:             *      construction, every rule stop state has a chain of nodes emanating
0516:             *      from it that points to every possible following node.  This case
0517:             *      is then conveniently handled by the 4th case.
0518:             *
0519:             *   4. Normal case.  If p can reach another NFA state q, then add
0520:             *      q to d's configuration list, copying p's context for q's context.
0521:             *      If there is a semantic predicate on the transition, then AND it
0522:             *      with any existing semantic context.
0523:             *
0524:             *   Current state p is always added to d's configuration list as it's part
0525:             *   of the closure as well.
0526:             *
0527:             *  When is a closure operation in a cycle condition?  While it is
0528:             *  very possible to have the same NFA state mentioned twice
0529:             *  within the same DFA state, there are two situations that
0530:             *  would lead to nontermination of closure operation:
0531:             *
0532:             *  o   Whenever closure reaches a configuration where the same state
0533:             *      with the same or a suffix context already exists.  This catches
0534:             *      the IF-THEN-ELSE tail recursion cycle and things like
0535:             *
0536:             *      a : A a | B ;
0537:             *
0538:             *      the context will be $ (empty stack).
0539:             *
0540:             *      We have to check
0541:             *      larger context stacks because of (...)+ loops.  For
0542:             *      example, the context of a (...)+ can be nonempty if the
0543:             *      surrounding rule is invoked by another rule:
0544:             *
0545:             *      a : b A | X ;
0546:             *      b : (B|)+ ;  // nondeterministic by the way
0547:             *
0548:             *      The context of the (B|)+ loop is "invoked from item
0549:             *      a : . b A ;" and then the empty alt of the loop can reach back
0550:             *      to itself.  The context stack will have one "return
0551:             *      address" element and so we must check for same state, same
0552:             *      context for arbitrary context stacks.
0553:             *
0554:             *      Idea: If we've seen this configuration before during closure, stop.
0555:             *      We also need to avoid reaching same state with conflicting context.
0556:             *      Ultimately analysis would stop and we'd find the conflict, but we
0557:             *      should stop the computation.  Previously I only checked for
0558:             *      exact config.  Need to check for same state, suffix context,
0559:             *      not just exact context.
0560:             *
0561:             *  o   Whenever closure reaches a configuration where state p
0562:             *      is present in its own context stack.  This means that
0563:             *      p is a rule invocation state and the target rule has
0564:             *      been called before.  NFAContext.MAX_RECURSIVE_INVOCATIONS
0565:             *      (See the comment there also) determines how many times
0566:             *      it's possible to recurse; clearly we cannot recurse forever.
0567:             *      Some grammars such as the following actually require at
0568:             *      least one recursive call to correctly compute the lookahead:
0569:             *
0570:             *      a : L ID R
0571:             *        | b
0572:             *        ;
0573:             *      b : ID
0574:             *        | L a R
0575:             *        ;
0576:             *
0577:             *      Input L ID R is ambiguous but to figure this out, ANTLR
0578:             *      needs to go a->b->a->b to find the L ID sequence.
0579:             *
0580:             *      Do not allow closure to add a configuration that would
0581:             *      allow too much recursion.
0582:             *
0583:             *      This case also catches infinite left recursion.
0584:             */
0585:            public void closure(NFAState p, int alt, NFAContext context,
0586:                    SemanticContext semanticContext, DFAState d,
0587:                    boolean collectPredicates) {
0588:                if (debug) {
0589:                    System.out.println("closure at NFA state " + p.stateNumber
0590:                            + "|" + alt + " filling DFA state " + d.stateNumber
0591:                            + " with context " + context);
0592:                }
0593:
0594:                if (d.abortedDueToMultipleRecursiveAlts) {
0595:                    // keep walking back out, we're in the process of terminating
0596:                    // this closure operation on NFAState p contained with DFAState d
0597:                    return;
0598:                }
0599:
0600:                /* NOT SURE WE NEED THIS FAILSAFE NOW 11/8/2006 and it complicates
0601:                   MY ALGORITHM TO HAVE TO ABORT ENTIRE DFA CONVERSION
0602:                 */
0603:                if (DFA.MAX_TIME_PER_DFA_CREATION > 0
0604:                        && System.currentTimeMillis()
0605:                                - d.dfa.conversionStartTime >= DFA.MAX_TIME_PER_DFA_CREATION) {
0606:                    // report and back your way out; we've blown up somehow
0607:                    dfa.probe.reportEarlyTermination();
0608:                    return;
0609:                }
0610:
0611:                NFAConfiguration proposedNFAConfiguration = new NFAConfiguration(
0612:                        p.stateNumber, alt, context, semanticContext);
0613:
0614:                // Avoid infinite recursion
0615:                if (closureIsBusy(d, proposedNFAConfiguration)) {
0616:                    if (debug) {
0617:                        System.out
0618:                                .println("avoid visiting exact closure computation NFA config: "
0619:                                        + proposedNFAConfiguration);
0620:                        System.out.println("state is " + d.dfa.decisionNumber
0621:                                + "." + d);
0622:                    }
0623:                    return;
0624:                }
0625:
0626:                // set closure to be busy for this NFA configuration
0627:                d.closureBusy.add(proposedNFAConfiguration);
0628:
0629:                // p itself is always in closure
0630:                d.addNFAConfiguration(p, proposedNFAConfiguration);
0631:
0632:                // Case 1: are we a reference to another rule?
0633:                Transition transition0 = p.transition(0);
0634:                if (transition0 instanceof RuleClosureTransition) {
0635:                    int depth = context
0636:                            .recursionDepthEmanatingFromState(p.stateNumber);
0637:                    // Detect recursion by more than a single alt, which indicates
0638:                    // that the decision's lookahead language is non-regular; terminate
0639:                    if (depth == 1 && d.dfa.getUserMaxLookahead() == 0) { // k=* only
0640:                        //if ( depth >= NFAContext.MAX_SAME_RULE_INVOCATIONS_PER_NFA_CONFIG_STACK ) {
0641:                        d.dfa.recursiveAltSet.add(alt); // indicate that this alt is recursive
0642:                        if (d.dfa.recursiveAltSet.size() > 1) {
0643:                            //System.out.println("recursive alts: "+d.dfa.recursiveAltSet.toString());
0644:                            d.abortedDueToMultipleRecursiveAlts = true;
0645:                            return;
0646:                        }
0647:                        /*
0648:                        System.out.println("alt "+alt+" in rule "+p.enclosingRule+" dec "+d.dfa.decisionNumber+
0649:                        	" ctx: "+context);
0650:                        System.out.println("d="+d);
0651:                         */
0652:                    }
0653:                    // Detect an attempt to recurse too high
0654:                    // if this context has hit the max recursions for p.stateNumber,
0655:                    // don't allow it to enter p.stateNumber again
0656:                    if (depth >= NFAContext.MAX_SAME_RULE_INVOCATIONS_PER_NFA_CONFIG_STACK) {
0657:                        /*
0658:                        System.out.println("OVF state "+d);
0659:                        System.out.println("proposed "+proposedNFAConfiguration);
0660:                         */
0661:                        d.dfa.probe.reportRecursiveOverflow(d,
0662:                                proposedNFAConfiguration);
0663:                        d.abortedDueToRecursionOverflow = true;
0664:                        return;
0665:                    }
0666:
0667:                    // otherwise, it's cool to (re)enter target of this rule ref
0668:                    RuleClosureTransition ref = (RuleClosureTransition) transition0;
0669:                    // first create a new context and push onto call tree,
0670:                    // recording the fact that we are invoking a rule and
0671:                    // from which state (case 2 below will get the following state
0672:                    // via the RuleClosureTransition emanating from the invoking state
0673:                    // pushed on the stack).
0674:                    // Reset the context to reflect the fact we invoked rule
0675:                    NFAContext newContext = new NFAContext(context, p);
0676:                    // System.out.print("invoking rule "+nfa.getGrammar().getRuleName(ref.getRuleIndex()));
0677:                    // System.out.println(" context="+context);
0678:                    // traverse epsilon edge to new rule
0679:                    NFAState ruleTarget = (NFAState) ref.target;
0680:                    closure(ruleTarget, alt, newContext, semanticContext, d,
0681:                            collectPredicates);
0682:                }
0683:                // Case 2: end of rule state, context (i.e., an invoker) exists
0684:                else if (p.isAcceptState() && context.parent != null) {
0685:                    NFAState whichStateInvokedRule = context.invokingState;
0686:                    RuleClosureTransition edgeToRule = (RuleClosureTransition) whichStateInvokedRule
0687:                            .transition(0);
0688:                    NFAState continueState = edgeToRule.getFollowState();
0689:                    NFAContext newContext = context.parent; // "pop" invoking state
0690:                    closure(continueState, alt, newContext, semanticContext, d,
0691:                            collectPredicates);
0692:                }
0693:                /*
0694:                11/27/2005: I tried adding this but it highlighted that
0695:                lexer rules needed to be called from Tokens not just ref'd directly
0696:                so their contexts are different for F : I '.' ;  I : '0' ;  otherwise
0697:                we get an ambiguity.  The context of state following '0' has same
0698:                NFA state with [6 $] and [$] hence they conflict.  We need to get
0699:                the other stack call in there.
0700:                else if ( dfa.nfa.grammar.type == Grammar.LEXER &&
0701:                	      p.isAcceptState() &&
0702:                	context.invokingState.enclosingRule.equals("Tokens") )
0703:                {
0704:                	// hit the end of a lexer rule when no one has invoked that rule
0705:                	// (this will be the case if Tokens rule analysis reaches the
0706:                	// stop state of a token in its alt list).
0707:                	// Must not follow the FOLLOW links; must return
0708:                	return;
0709:                }
0710:                 */
0711:                // Case 3: end of rule state, nobody invoked this rule (no context)
0712:                //    Fall thru to be handled by case 4 automagically.
0713:                // Case 4: ordinary NFA->DFA conversion case: simple epsilon transition
0714:                else {
0715:                    // recurse down any epsilon transitions
0716:                    if (transition0 != null && transition0.isEpsilon()) {
0717:                        closure((NFAState) transition0.target, alt, context,
0718:                                semanticContext, d, collectPredicates);
0719:                    } else if (transition0 != null
0720:                            && transition0.isSemanticPredicate()) {
0721:                        // continue closure here too, but add the sem pred to ctx
0722:                        SemanticContext newSemanticContext = semanticContext;
0723:                        if (collectPredicates) {
0724:                            // AND the previous semantic context with new pred
0725:                            SemanticContext labelContext = transition0.label
0726:                                    .getSemanticContext();
0727:                            // do not hoist syn preds from other rules; only get if in
0728:                            // starting state's rule (i.e., context is empty)
0729:                            int walkAlt = dfa.decisionNFAStartState
0730:                                    .translateDisplayAltToWalkAlt(dfa, alt);
0731:                            NFAState altLeftEdge = dfa.nfa.grammar
0732:                                    .getNFAStateForAltOfDecision(
0733:                                            dfa.decisionNFAStartState, walkAlt);
0734:                            /*
0735:                            System.out.println("state "+p.stateNumber+" alt "+alt+" walkAlt "+walkAlt+" trans to "+transition0.target);
0736:                            System.out.println("DFA start state "+dfa.decisionNFAStartState.stateNumber);
0737:                            System.out.println("alt left edge "+altLeftEdge.stateNumber+
0738:                            	", epsilon target "+
0739:                            	altLeftEdge.transition(0).target.stateNumber);
0740:                             */
0741:                            if (!labelContext.isSyntacticPredicate()
0742:                                    || p == altLeftEdge.transition(0).target) {
0743:                                //System.out.println("&"+labelContext+" enclosingRule="+p.enclosingRule);
0744:                                newSemanticContext = SemanticContext.and(
0745:                                        semanticContext, labelContext);
0746:                            }
0747:                        }
0748:                        closure((NFAState) transition0.target, alt, context,
0749:                                newSemanticContext, d, collectPredicates);
0750:                    }
0751:                    Transition transition1 = p.transition(1);
0752:                    if (transition1 != null && transition1.isEpsilon()) {
0753:                        closure((NFAState) transition1.target, alt, context,
0754:                                semanticContext, d, collectPredicates);
0755:                    }
0756:                }
0757:
0758:                // don't remove "busy" flag as we want to prevent all
0759:                // references to same config of state|alt|ctx|semCtx even
0760:                // if resulting from another NFA state
0761:            }
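            /* A small worked example of closure cases 1 and 2 above (grammar invented
             * for illustration):
             *
             *    a : b X | Z ;
             *    b : Y ;
             *
             * Closure from alt 1's left edge hits the rule reference to b (case 1): the
             * invoking state is pushed onto a new NFAContext and closure continues at
             * b's start state.  When closure later reaches b's stop state with that
             * non-null context (case 2), the invoking state is popped and closure resumes
             * at the NFA state following the reference, i.e., just before X.  Had b's
             * stop state been reached with an empty context (case 3), the FOLLOW chain
             * built into the NFA would be traversed by the ordinary epsilon logic of
             * case 4.
             */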
0762:
0763:            /** A closure operation should abort if that computation has already
0764:             *  been done or a computation with a conflicting context has already
0765:             *  been done.  If the proposed NFA config's state and alt match an
0766:             *  existing config, there is potentially a problem.  If the stack context
0767:             *  is identical then clearly the exact same computation is proposed.  If a
0768:             *  context is a suffix of the other, then again the computation is in an
0769:             *  identical context; e.g., stacks x$ and yx$ count as the same stack.
0770:             *  We have to walk configurations linearly doing the comparison instead
0771:             *  of a set for exact matches.
0772:             *
0773:             *  We cannot use a set hash table for this lookup as contexts that are
0774:             *  suffixes could be !equal() but their hashCode()s would be different;
0775:             *  that's a problem for a HashSet.  This costs a lot actually, it
0776:             *  takes about 490ms vs 355ms for Java grammar's analysis phase when
0777:             *  I moved away from hash lookup.  Argh!  Still, it's small.  For newbie-
0778:             *  generated grammars, though, this really speeds things up because it
0779:             *  avoids chasing its tail during closure operations on highly left-
0780:             *  recursive grammars.
0781:             *
0782:             *  Ok, backing this out to use exact match again for speed.  We will
0783:             *  always detect the conflict later when checking for context suffixes...
0784:             *  I was just trying to prevent unnecessary closures for random crap
0785:             *  submitted by newbies.  Instead, I now check for left-recursive stuff
0786:             *  and terminate before analysis; that obviates the need to do this more
0787:             *  expensive computation.
0788:             *
0789:             *  If the semantic context is different, then allow new computation.
0790:             */
0791:            public static boolean closureIsBusy(DFAState d,
0792:                    NFAConfiguration proposedNFAConfiguration) {
0793:                // Check epsilon cycle (same state, same alt, same context)
0794:                return d.closureBusy.contains(proposedNFAConfiguration);
0795:                /*
0796:                // Uncomment to get all conflicts not just exact context matches
0797:                for (int i = 0; i < d.closureBusy.size(); i++) {
0798:                	NFAConfiguration c = (NFAConfiguration) d.closureBusy.get(i);
0799:                	if ( proposedNFAConfiguration.state==c.state &&
0800:                		 proposedNFAConfiguration.alt==c.alt &&
0801:                		 proposedNFAConfiguration.semanticContext.equals(c.semanticContext) &&
0802:                		 proposedNFAConfiguration.context.suffix(c.context) )
0803:                	{
0804:                		// if computing closure of start state, we tried to
0805:                		// recompute a closure, must be left recursion.  We got back
0806:                		// to the same computation.  After having consumed no input,
0807:                		// we're back.  Only track rule invocation states
0808:                		if ( (dfa.startState==null ||
0809:                			  d.stateNumber==dfa.startState.stateNumber) &&
0810:                			 p.transition(0) instanceof RuleClosureTransition )
0811:                		{
0812:                			d.dfa.probe.reportLeftRecursion(d, proposedNFAConfiguration);
0813:                		}
0814:                		return true;
0815:                	}
0816:                }
0817:                return false;
0818:                 */
0819:            }
0820:
0821:            /** Given the set of NFA states in DFA state d, find all NFA states
0822:             *  reachable traversing label arcs.  By definition, there can be
0823:             *  only one DFA state reachable by an atom from DFA state d so we must
0824:             *  find and merge all NFA states reachable via label.  Return a new
0825:             *  DFAState that has all of those NFA states with their context (i.e.,
0826:             *  which alt do they predict and where to return to if they fall off
0827:             *  the end of a rule).
0828:             *
0829:             *  Because we cannot jump to another rule nor fall off the end of a rule
0830:             *  via a non-epsilon transition, NFA states reachable from d have the
0831:             *  same configuration as the NFA state in d.  So if NFA state 7 in d's
0832:             *  configurations can reach NFA state 13 then 13 will be added to the
0833:             *  new DFAState (labelDFATarget) with the same configuration as state
0834:             *  7 had.
0835:             *
0836:             *  This method does not see EOT transitions off the end of token rule
0837:             *  accept states if the rule was invoked by somebody.
0838:             */
0839:            public DFAState reach(DFAState d, Label label) {
0840:                DFAState labelDFATarget = dfa.newState();
0841:                // for each NFA state in d, add in target states for label
0842:                int intLabel = label.getAtom();
0843:                IntSet setLabel = label.getSet();
0844:                Iterator iter = d.getNFAConfigurations().iterator();
0845:                while (iter.hasNext()) {
0846:                    NFAConfiguration c = (NFAConfiguration) iter.next();
0847:                    if (c.resolved || c.resolveWithPredicate) {
0848:                        continue; // the conflict resolver indicates we must leave alone
0849:                    }
0850:                    NFAState p = dfa.nfa.getState(c.state);
0851:                    // by design of the grammar->NFA conversion, only transition 0
0852:                    // may have a non-epsilon edge.
0853:                    Transition edge = p.transition(0);
0854:                    if (edge == null || !c.singleAtomTransitionEmanating) {
0855:                        continue;
0856:                    }
0857:                    Label edgeLabel = edge.label;
0858:
0859:                    // SPECIAL CASE
0860:                    // if it's an EOT transition on end of lexer rule, but context
0861:                    // stack is not empty, then don't see the EOT; the closure
0862:                    // will have added in the proper states following the reference
0863:                    // to this rule in the invoking rule.  In other words, if
0864:                    // somebody called this rule, don't see the EOT emanating from
0865:                    // this accept state.
0866:                    if (c.context.parent != null && edgeLabel.isAtom()
0867:                            && edgeLabel.getAtom() == Label.EOT) {
0868:                        continue;
0869:                    }
0870:
0871:                    // Labels not unique at this point (not until addReachableLabels)
0872:                    // so try simple int label match before general set intersection
0873:                    //System.out.println("comparing "+edgeLabel+" with "+label);
0874:                    boolean matched = (!label.isSet() && edgeLabel.getAtom() == intLabel)
0875:                            || (!edgeLabel.getSet().and(setLabel).isNil());
0876:                    if (matched) {
0877:                        // found a transition with label;
0878:                        // add NFA target to (potentially) new DFA state
0879:                        labelDFATarget.addNFAConfiguration(
0880:                                (NFAState) edge.target, c.alt, c.context,
0881:                                c.semanticContext);
0882:                    }
0883:                }
0884:                if (labelDFATarget.getNFAConfigurations().size() == 0) {
0885:                    // kill; it's empty
0886:                    dfa.setState(labelDFATarget.stateNumber, null);
0887:                    labelDFATarget = null;
0888:                }
0889:                return labelDFATarget;
0890:            }
0891:
0892:            /** Walk the configurations of this DFA state d looking for the
0893:             *  configuration, c, that has a transition on EOT.  State d should
0894:             *  be converted to an accept state predicting the c.alt.  Blast
0895:             *  d's current configuration set and make it just have config c.
0896:             *
0897:             *  TODO: can there be more than one config with EOT transition?
0898:             *  That would mean that two NFA configurations could reach the
0899:             *  end of the token with possibly different predicted alts.
0900:             *  Seems like that would be rare or impossible.  Perhaps convert
0901:             *  this routine to find all such configs and give error if >1.
0902:             */
0903:            protected void convertToEOTAcceptState(DFAState d) {
0904:                Label eot = new Label(Label.EOT);
0905:                Iterator iter = d.getNFAConfigurations().iterator();
0906:                while (iter.hasNext()) {
0907:                    NFAConfiguration c = (NFAConfiguration) iter.next();
0908:                    if (c.resolved || c.resolveWithPredicate) {
0909:                        continue; // the conflict resolver indicates we must leave alone
0910:                    }
0911:                    NFAState p = dfa.nfa.getState(c.state);
0912:                    Transition edge = p.transition(0);
0913:                    Label edgeLabel = edge.label;
0914:                    if (edgeLabel.equals(eot)) {
0915:                        //System.out.println("config with EOT: "+c);
0916:                        d.setAcceptState(true);
0917:                        //System.out.println("d goes from "+d);
0918:                        d.getNFAConfigurations().clear();
0919:                        d.addNFAConfiguration(p, c.alt, c.context,
0920:                                c.semanticContext);
0921:                        //System.out.println("to "+d);
0922:                        return; // assume only one EOT transition
0923:                    }
0924:                }
0925:            }
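
            // Illustrative note; not part of the original ANTLR source.  EOT accept
            // states arise in the lexer DFA: given token rules such as
            //
            //   INT : 'int' ;
            //   ID  : ('a'..'z')+ ;
            //
            // the input "int" can end either rule, so EOT edges from both rules'
            // accept states funnel into one DFA state predicting two alts.  The
            // nondeterminism is resolved elsewhere in favor of the first rule (see
            // the AMBIGUOUS TOKENS discussion below); this method then collapses the
            // state to a single accept configuration.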
0926:
0927:            /** Add a new DFA state to the DFA if not already present.
0928:             *  If the DFA state uniquely predicts a single alternative, it
0929:             *  becomes a stop state; don't add to work list.  Further, if
0930:             *  there exists an NFA state predicted by > 1 different alternatives
0931:             *  and with the same syn and sem context, the DFA is nondeterministic for
0932:             *  at least one input sequence reaching that NFA state.
0933:             */
0934:            protected DFAState addDFAStateToWorkList(DFAState d) {
0935:                DFAState existingState = dfa.addState(d);
0936:                if (d != existingState) {
0937:                    // already there...use/return the existing DFA state.
0938:                    // But also set the states[d.stateNumber] to the existing
0939:                    // DFA state because the closureIsBusy must report
0940:                    // infinite recursion on a state before it knows
0941:                    // whether or not the state will already be
0942:                    // found after closure on it finishes.  It could
0943:                    // refer to a state that will ultimately not make it
0944:                    // into the reachable state space and the error
0945:                    // reporting must be able to compute the path from
0946:                    // start to the error state with infinite recursion
0947:                    dfa.setState(d.stateNumber, existingState);
0948:                    return existingState;
0949:                }
0950:
0951:                // if not there, then examine new state.
0952:
0953:                // resolve syntactic conflicts by choosing a single alt or
0954:                // by using semantic predicates if present.
0955:                resolveNonDeterminisms(d);
0956:
0957:                // If deterministic, don't add this state; it's an accept state
0958:                // Just return as a valid DFA state
0959:                int alt = d.getUniquelyPredictedAlt();
0960:                if (alt != NFA.INVALID_ALT_NUMBER) { // uniquely predicts an alt?
0961:                    d = convertToAcceptState(d, alt);
0962:                    /*
0963:                    System.out.println("convert to accept; DFA "+d.dfa.decisionNumber+" state "+d.stateNumber+" uniquely predicts alt "+
0964:                    	d.getUniquelyPredictedAlt());
0965:                     */
0966:                } else {
0967:                    // unresolved, add to work list to continue NFA conversion
0968:                    work.add(d);
0969:                }
0970:                return d;
0971:            }
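
            // --- Illustrative sketch; not part of the original ANTLR source. ---
            // dfa.addState(d) above follows the usual "canonicalize through a map"
            // pattern: if an equal state was added before, the stored one comes back
            // and the caller detects reuse with the identity test d != existingState.
            // Roughly, with hypothetical names and plain java.util types:
            protected static Object canonicalStateSketch(Map uniqueStates, Object d) {
                Object existing = uniqueStates.get(d);
                if (existing != null) {
                    return existing; // caller sees a different object and reuses it
                }
                uniqueStates.put(d, d);
                return d; // first time seen; d is now the canonical copy
            }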
0972:
0973:            protected DFAState convertToAcceptState(DFAState d, int alt) {
0974:                // only merge stop states if they are deterministic and no
0975:                // recursion problems and only if they have the same gated pred
0976:                // context!
0977:                // Later, the error reporting may want to trace the path from
0978:                // the start state to the nondet state
0979:                if (DFAOptimizer.MERGE_STOP_STATES
0980:                        && d.getNonDeterministicAlts() == null
0981:                        && !d.abortedDueToRecursionOverflow
0982:                        && !d.abortedDueToMultipleRecursiveAlts) {
0983:                    // check to see if we already have an accept state for this alt
0984:                    // [must do this after we resolve nondeterminisms in general]
0985:                    DFAState acceptStateForAlt = dfa.getAcceptState(alt);
0986:                    if (acceptStateForAlt != null) {
0987:                        // we already have an accept state for alt;
0988:                        // Are their gated sem pred contexts the same?
0989:                        // For now we assume a braindead version: either neither state
0990:                        // has gated preds or both share exactly the same single gated pred.
0991:                        // The equals() method is only defined on Predicate contexts not
0992:                        // OR etc...
0993:                        SemanticContext gatedPreds = d
0994:                                .getGatedPredicatesInNFAConfigurations();
0995:                        SemanticContext existingStateGatedPreds = acceptStateForAlt
0996:                                .getGatedPredicatesInNFAConfigurations();
0997:                        if ((gatedPreds == null && existingStateGatedPreds == null)
0998:                                || ((gatedPreds != null && existingStateGatedPreds != null) && gatedPreds
0999:                                        .equals(existingStateGatedPreds))) {
1000:                            // make this d.statenumber point at old DFA state
1001:                            dfa.setState(d.stateNumber, acceptStateForAlt);
1002:                            dfa.removeState(d); // remove this state from unique DFA state set
1003:                            d = acceptStateForAlt; // use old accept state; throw this one out
1004:                            return d;
1005:                        }
1006:                        // else consider it a new accept state; fall through.
1007:                    }
1008:                    d.setAcceptState(true); // new accept state for alt
1009:                    dfa.setAcceptState(alt, d);
1010:                    return d;
1011:                }
1012:                d.setAcceptState(true); // new accept state for alt
1013:                dfa.setAcceptState(alt, d);
1014:                return d;
1015:            }
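
            // Illustrative note; not part of the original ANTLR source.  The merge
            // test above reuses an existing accept state only when the gated
            // predicate contexts agree: both states carry none, or both carry an
            // equal single predicate.  A gated predicate is the {...}?=> form, e.g.
            //
            //   a : {allowEnum}?=> 'enum' ID | ID ;
            //
            // where allowEnum is a hypothetical boolean supplied by the grammar's
            // host code.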
1016:
1017:            /** If > 1 NFA configurations within this DFA state have identical
1018:             *  NFA state and context, but differ in their predicted
1019:             *  alternative, then a single input sequence predicts multiple alts.
1020:             *  TODO: update for new context suffix stuff 3-9-2005
1021:             *  The NFA decision is therefore syntactically indistinguishable
1022:             *  from the left edge upon at least one input sequence.  We may
1023:             *  terminate the NFA to DFA conversion for these paths since no
1024:             *  paths emanating from those NFA states can possibly separate
1025:             *  these conjoined twins once intertwined to make things
1026:             *  deterministic (unless there are semantic predicates; see below).
1027:             *
1028:             *  Upon a nondeterministic set of NFA configurations, we should
1029:             *  report a problem to the grammar designer and resolve the issue
1030:             *  by arbitrarily picking the first alternative (this usually
1031:             *  ends up producing the most natural behavior).  Pick the lowest
1032:             *  alt number and just turn off all NFA configurations
1033:             *  associated with the other alts. Rather than remove conflicting
1034:             *  NFA configurations, I set the "resolved" bit so that future
1035:             *  computations will ignore them.  In this way, we maintain the
1036:             *  complete DFA state with all its configurations, but prevent
1037:             *  future DFA conversion operations from pursuing undesirable
1038:             *  paths.  Remember that we want to terminate DFA conversion as
1039:             *  soon as we know the decision is deterministic *or*
1040:             *  nondeterministic.
1041:             *
1042:             *  [BTW, I have convinced myself that there can be at most one
1043:             *  set of nondeterministic configurations in a DFA state.  Only NFA
1044:             *  configurations arising from the same input sequence can appear
1045:             *  in a DFA state.  There is no way to have another complete set
1046:             *  of nondeterministic NFA configurations without another input
1047:             *  sequence, which would reach a different DFA state.  Therefore,
1048:             *  the two nondeterministic NFA configuration sets cannot collide
1049:             *  in the same DFA state.]
1050:             *
1051:             *  Consider DFA state {(s|1),(s|2),(s|3),(t|3),(v|4)} where (s|a)
1052:             *  is state 's' and alternative 'a'.  Here, configuration set
1053:             *  {(s|1),(s|2),(s|3)} predicts 3 different alts.  Configurations
1054:             *  (s|2) and (s|3) are "resolved", leaving {(s|1),(t|3),(v|4)} as
1055:             *  items that must still be considered by the DFA conversion
1056:             *  algorithm in DFA.findNewDFAStatesAndAddDFATransitions().
1057:             *
1058:             *  Consider the following grammar where alts 1 and 2 are no
1059:             *  problem because of the 2nd lookahead symbol.  Alts 3 and 4 are
1060:             *  identical and will therefore reach the rule end NFA state but
1061:             *  predicting 2 different alts (no amount of future lookahead
1062:             *  will render them deterministic/separable):
1063:             *
1064:             *  a : A B
1065:             *    | A C
1066:             *    | A
1067:             *    | A
1068:             *    ;
1069:             *
1070:             *  Here is a (slightly reduced) NFA of this grammar:
1071:             *
1072:             *  (1)-A->(2)-B->(end)-EOF->(8)
1073:             *   |              ^
1074:             *  (2)-A->(3)-C----|
1075:             *   |              ^
1076:             *  (4)-A->(5)------|
1077:             *   |              ^
1078:             *  (6)-A->(7)------|
1079:             *
1080:             *  where (n) is NFA state n.  To begin DFA conversion, the start
1081:             *  state is created:
1082:             *
1083:             *  {(1|1),(2|2),(4|3),(6|4)}
1084:             *
1085:             *  Upon A, all NFA configurations lead to new NFA states yielding
1086:             *  new DFA state:
1087:             *
1088:             *  {(2|1),(3|2),(5|3),(7|4),(end|3),(end|4)}
1089:             *
1090:             *  where the configurations with state end in them are added
1091:             *  during the epsilon closure operation.  State end predicts both
1092:             *  alts 3 and 4.  An error is reported, and the latter configuration is
1093:             *  flagged as resolved, leaving the DFA state as:
1094:             *
1095:             *  {(2|1),(3|2),(5|3),(7|4|resolved),(end|3),(end|4|resolved)}
1096:             *
1097:             *  As NFA configurations are added to a DFA state during its
1098:             *  construction, the reachable set of labels is computed.  Here
1099:             *  reachable is {B,C,EOF} because there is at least one NFA state
1100:             *  in the DFA state that can transition upon those symbols.
1101:             *
1102:             *  The final DFA looks like:
1103:             *
1104:             *  {(1|1),(2|2),(4|3),(6|4)}
1105:             *              |
1106:             *              v
1107:             *  {(2|1),(3|2),(5|3),(7|4),(end|3),(end|4)} -B-> (end|1)
1108:             *              |                        |
1109:             *              C                        ----EOF-> (8,3)
1110:             *              |
1111:             *              v
1112:             *           (end|2)
1113:             *
1114:             *  Upon AB, alt 1 is predicted.  Upon AC, alt 2 is predicted.
1115:             *  Upon A EOF, alt 3 is predicted.  Alt 4 is not a viable
1116:             *  alternative.
1117:             *
1118:             *  The algorithm is essentially to walk all the configurations
1119:             *  looking for a conflict of the form (s|i) and (s|j) for i!=j.
1120:             *  Use a hash table to track state+context pairs for collisions
1121:             *  so that we have O(n) to walk the n configurations looking for
1122:             *  a conflict.  Upon every conflict, track the alt number so
1123:             *  we have a list of all nondeterministically predicted alts. Also
1124:             *  track the minimum alt.  Next go back over the configurations, setting
1125:             *  the "resolved" bit for any that have an alt that is a member of
1126:             *  the nondeterministic set.  This will effectively remove any alts
1127:             *  but the one we want from future consideration.
1128:             *
1129:             *  See resolveWithSemanticPredicates()
1130:             *
1131:             *  AMBIGUOUS TOKENS
1132:             *
1133:             *  With keywords and ID tokens, there is an inherent ambiguity in that
1134:             *  "int" can be matched by ID also.  Each lexer rule has an EOT
1135:             *  transition emanating from it which is used whenever the end of
1136:             *  a rule is reached and another token rule did not invoke it.  EOT
1137:             *  is the only thing that can be seen next.  If two rules are identical
1138:             *  like "int" and "int" then the 2nd def is unreachable and you'll get
1139:             *  a warning.  We prevent a warning though for the keyword/ID issue as
1140:             *  ID is still reachable.  This can be a bit weird: given a '+' rule
1141:             *  and then a '+'|'+=' rule, the 2nd rule will never match '+'.
1142:             *
1143:             *  If all NFA states in this DFA state are targets of EOT transitions,
1144:             *  (and there is more than one state and no unique alt is predicted)
1145:             *  then DFA conversion will leave this state as a dead state as nothing
1146:             *  can be reached from this state.  To resolve the ambiguity, just do
1147:             *  what flex and friends do: pick the first rule (alt in this case) to
1148:             *  win.  This means you should put keywords before the ID rule.
1149:             *  If the DFA state has only one NFA state then there is no issue:
1150:             *  it uniquely predicts one alt. :)  Problem
1151:             *  states will look like this during conversion:
1152:             *
1153:             *  DFA 1:{9|1, 19|2, 14|3, 20|2, 23|2, 24|2, ...}-<EOT>->5:{41|3, 42|2}
1154:             *
1155:             *  Worse, when you have two identical literal rules, you will see 3 alts
1156:             *  in the EOT state (one for ID and one each for the identical rules).
1157:             */
1158:            public void resolveNonDeterminisms(DFAState d) {
1159:                if (debug) {
1160:                    System.out
1161:                            .println("resolveNonDeterminisms " + d.toString());
1162:                }
1163:                boolean conflictingLexerRules = false;
1164:                Set nondeterministicAlts = d.getNonDeterministicAlts();
1165:                if (debug && nondeterministicAlts != null) {
1166:                    System.out.println("nondet alts=" + nondeterministicAlts);
1167:                }
1168:
1169:                // CHECK FOR AMBIGUOUS EOT (if |allAlts|>1 and EOT state, resolve)
1170:                // grab any config to see if EOT state; any other configs must
1171:                // transition on EOT to get to this DFA state as well so all
1172:                // states in d must be targets of EOT.  These are the end states
1173:                // created in NFAFactory.build_EOFState
1174:                NFAConfiguration anyConfig;
1175:                Iterator itr = d.nfaConfigurations.iterator();
1176:                anyConfig = (NFAConfiguration) itr.next();
1177:                NFAState anyState = dfa.nfa.getState(anyConfig.state);
1178:                // if d is target of EOT and more than one predicted alt
1179:                // indicate that d is nondeterministic on all alts otherwise
1180:                // it looks like state has no problem
1181:                if (anyState.isEOTTargetState()) {
1182:                    Set allAlts = d.getAltSet();
1183:                    // is more than 1 alt predicted?
1184:                    if (allAlts != null && allAlts.size() > 1) {
1185:                        nondeterministicAlts = allAlts;
1186:                        // track Tokens rule issues differently than other decisions
1187:                        if (d.dfa.isTokensRuleDecision()) {
1188:                            dfa.probe.reportLexerRuleNondeterminism(d, allAlts);
1189:                            //System.out.println("Tokens rule DFA state "+d+" nondeterministic");
1190:                            conflictingLexerRules = true;
1191:                        }
1192:                    }
1193:                }
1194:
1195:                // if no problems return unless we aborted work on d to avoid inf recursion
1196:                if (!d.abortedDueToRecursionOverflow
1197:                        && nondeterministicAlts == null) {
1198:                    return; // no problems, return
1199:                }
1200:
1201:                // if we're not a conflicting lexer rule and we didn't abort, report ambig
1202:                // We should get a report for abort so don't give another
1203:                if (!d.abortedDueToRecursionOverflow && !conflictingLexerRules) {
1204:                    // TODO: with k=x option set, this is called twice for same state
1205:                    dfa.probe.reportNondeterminism(d, nondeterministicAlts);
1206:                    // TODO: how to turn off when it's only the FOLLOW that is
1207:                    // conflicting.  This used to shut off even alts i,j < n
1208:                    // conflict warnings. :(
1209:                    /*
1210:                    if ( dfa.isGreedy() ) {
1211:                    	// if nongreedy then they have said to let it fall out of loop
1212:                    	// don't report the problem
1213:                    	dfa.probe.reportNondeterminism(d);
1214:                    }
1215:                    else {
1216:                    	// TODO: remove when sure it's cool
1217:                    	dfa.probe.reportNondeterminism(d);
1218:                    	System.out.println("temp warning: warning suppressed for nongreedy loop");
1219:                    }
1220:                     */
1221:                }
1222:
1223:                // ATTEMPT TO RESOLVE WITH SEMANTIC PREDICATES
1224:                boolean resolved = tryToResolveWithSemanticPredicates(d,
1225:                        nondeterministicAlts);
1226:                if (resolved) {
1227:                    d.resolvedWithPredicates = true;
1228:                    dfa.probe
1229:                            .reportNondeterminismResolvedWithSemanticPredicate(d);
1230:                    return;
1231:                }
1232:
1233:                // RESOLVE SYNTACTIC CONFLICT BY REMOVING ALL BUT ONE ALT
1234:                resolveByChoosingFirstAlt(d, nondeterministicAlts);
1235:
1236:                //System.out.println("state "+d.stateNumber+" resolved to alt "+winningAlt);
1237:            }
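
            // --- Illustrative sketch; not part of the original ANTLR source. ---
            // The conflict test described in the comment above ("(s|i) and (s|j)
            // for i!=j") can be done in one pass with a map from a state-plus-context
            // key to the first alt seen for it.  Configurations are reduced here to
            // hypothetical {Object stateAndContextKey, Integer alt} pairs:
            protected static Set conflictingAltsSketch(List configs) {
                Map firstAltSeen = new HashMap(); // key -> Integer alt
                Set conflicting = new HashSet();  // every alt involved in a conflict
                for (Iterator it = configs.iterator(); it.hasNext();) {
                    Object[] c = (Object[]) it.next();
                    Object key = c[0];
                    Integer alt = (Integer) c[1];
                    Integer prior = (Integer) firstAltSeen.get(key);
                    if (prior == null) {
                        firstAltSeen.put(key, alt);
                    } else if (!prior.equals(alt)) {
                        // same NFA state and context reached predicting two alts
                        conflicting.add(prior);
                        conflicting.add(alt);
                    }
                }
                return conflicting;
            }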
1238:
1239:            protected int resolveByChoosingFirstAlt(DFAState d,
1240:                    Set nondeterministicAlts) {
1241:                int winningAlt = 0;
1242:                if (dfa.isGreedy()) {
1243:                    winningAlt = resolveByPickingMinAlt(d, nondeterministicAlts);
1244:                } else {
1245:                    // If nongreedy, the exit alt should win, but only if it's
1246:                    // involved in the nondeterminism!
1247:                    /*
1248:                    System.out.println("resolving exit alt for decision="+
1249:                    	dfa.decisionNumber+" state="+d);
1250:                    System.out.println("nondet="+nondeterministicAlts);
1251:                    System.out.println("exit alt "+exitAlt);
1252:                     */
1253:                    int exitAlt = dfa.getNumberOfAlts();
1254:                    if (nondeterministicAlts.contains(Utils.integer(exitAlt))) {
1255:                        // if nongreedy and exit alt is one of those nondeterministic alts
1256:                        // predicted, resolve in favor of what follows block
1257:                        winningAlt = resolveByPickingExitAlt(d,
1258:                                nondeterministicAlts);
1259:                    } else {
1260:                        winningAlt = resolveByPickingMinAlt(d,
1261:                                nondeterministicAlts);
1262:                    }
1263:                }
1264:                return winningAlt;
1265:            }
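
            // Illustrative note; not part of the original ANTLR source.  The
            // nongreedy branch above matters for loops such as
            //
            //   ML_COMMENT : '/*' ( options {greedy=false;} : . )* '*/' ;
            //
            // where the exit alt (leaving the loop on '*/') must beat the '.' alt
            // in the conflict, rather than simply taking the lowest alt number as
            // the greedy case does.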
1266:
1267:            /** Turn off all configurations associated with the
1268:             *  set of incoming nondeterministic alts except the min alt number.
1269:             *  There may be many alts among the configurations but only turn off
1270:             *  the ones with problems (other than the min alt of course).
1271:             *
1272:             *  If nondeterministicAlts is null then turn off all configs except those
1273:             *  associated with the minimum alt.
1274:             *
1275:             *  Return the min alt found.
1276:             */
1277:            protected int resolveByPickingMinAlt(DFAState d,
1278:                    Set nondeterministicAlts) {
1279:                int min = Integer.MAX_VALUE;
1280:                if (nondeterministicAlts != null) {
1281:                    min = getMinAlt(nondeterministicAlts);
1282:                } else {
1283:                    // else walk the actual configurations to find the min
1284:                    min = getMinAlt(d);
1285:                }
1286:
1287:                turnOffOtherAlts(d, min, nondeterministicAlts);
1288:
1289:                return min;
1290:            }
1291:
1292:            /** Resolve state d by choosing exit alt, which is same value as the
1293:             *  number of alternatives.  Return that exit alt.
1294:             */
1295:            protected int resolveByPickingExitAlt(DFAState d,
1296:                    Set nondeterministicAlts) {
1297:                int exitAlt = dfa.getNumberOfAlts();
1298:                turnOffOtherAlts(d, exitAlt, nondeterministicAlts);
1299:                return exitAlt;
1300:            }
1301:
1302:            /** Turn off all configurations associated with alts other than the good one
1303:             *  (as long as they are one of the nondeterministic ones)
1304:             */
1305:            protected static void turnOffOtherAlts(DFAState d, int min,
1306:                    Set nondeterministicAlts) {
1307:                Iterator iter = d.nfaConfigurations.iterator();
1308:                NFAConfiguration configuration;
1309:                while (iter.hasNext()) {
1310:                    configuration = (NFAConfiguration) iter.next();
1311:                    if (configuration.alt != min) {
1312:                        if (nondeterministicAlts == null
1313:                                || nondeterministicAlts.contains(Utils
1314:                                        .integer(configuration.alt))) {
1315:                            configuration.resolved = true;
1316:                        }
1317:                    }
1318:                }
1319:            }
1320:
1321:            protected static int getMinAlt(DFAState d) {
1322:                int min = Integer.MAX_VALUE;
1323:                Iterator iter = d.nfaConfigurations.iterator();
1324:                NFAConfiguration configuration;
1325:                while (iter.hasNext()) {
1326:                    configuration = (NFAConfiguration) iter.next();
1327:                    if (configuration.alt < min) {
1328:                        min = configuration.alt;
1329:                    }
1330:                }
1331:                return min;
1332:            }
1333:
1334:            protected static int getMinAlt(Set nondeterministicAlts) {
1335:                int min = Integer.MAX_VALUE;
1336:                Iterator iter = nondeterministicAlts.iterator();
1337:                while (iter.hasNext()) {
1338:                    Integer altI = (Integer) iter.next();
1339:                    int alt = altI.intValue();
1340:                    if (alt < min) {
1341:                        min = alt;
1342:                    }
1343:                }
1344:                return min;
1345:            }
1346:
1347:            /** See if a set of nondeterministic alternatives can be disambiguated
1348:             *  with the semantic predicate contexts of the alternatives.
1349:             *
1350:             *  Without semantic predicates, syntactic conflicts are resolved
1351:             *  by simply choosing the first viable alternative.  In the
1352:             *  presence of semantic predicates, you can resolve the issue by
1353:             *  evaluating boolean expressions at run time.  During analysis,
1354:             *  this amounts to suppressing grammar error messages to the
1355:             *  developer.  NFA configurations are always marked as "to be
1356:             *  resolved with predicates" so that
1357:             *  DFA.findNewDFAStatesAndAddDFATransitions() will know to ignore
1358:             *  these configurations and add predicate transitions to the DFA
1359:             *  after adding token/char labels.
1360:             *
1361:             *  During analysis, we can simply make sure that for n
1362:             *  ambiguously predicted alternatives there are at least n-1
1363:             *  unique predicate sets.  The nth alternative can be predicted
1364:             *  with "not" the "or" of all other predicates.  NFA configurations without
1365:             *  predicates are assumed to have the default predicate of
1366:             *  "true" from a user point of view.  When true is combined via || with
1367:             *  another predicate, the predicate is a tautology and must be removed
1368:             *  from consideration for disambiguation:
1369:             *
1370:             *  a : b | B ; // hoisting p1||true out of rule b, yields no predicate
1371:             *  b : {p1}? B | B ;
1372:             *
1373:             *  This is done down in getPredicatesPerNonDeterministicAlt().
1374:             */
1375:            protected boolean tryToResolveWithSemanticPredicates(DFAState d,
1376:                    Set nondeterministicAlts) {
1377:                Map altToPredMap = getPredicatesPerNonDeterministicAlt(d,
1378:                        nondeterministicAlts);
1379:
1380:                if (altToPredMap.size() == 0) {
1381:                    return false;
1382:                }
1383:
1384:                //System.out.println("nondeterministic alts with predicates: "+altToPredMap);
1385:                dfa.probe.reportAltPredicateContext(d, altToPredMap);
1386:
1387:                if (nondeterministicAlts.size() - altToPredMap.size() > 1) {
1388:                    // too few predicates to resolve; just return
1389:                    // TODO: actually do we need to gen error here?
1390:                    return false;
1391:                }
1392:
1393:                // Handle case where 1 predicate is missing
1394:                // Case 1. Semantic predicates
1395:                // If the missing pred is on nth alt, !(union of other preds)==true
1396:                // so we can avoid that computation.  If naked alt is ith, then must
1397:                // test it with !(union) since semantic predicated alts are order
1398:                // independent
1399:                // Case 2: Syntactic predicates
1400:                // The naked alt is always assumed to be true as the order of
1401:                // alts is the order of precedence.  The naked alt will be a tautology
1402:                // anyway as it's !(union of other preds).  This implies
1403:                // that there is no such thing as noviable alt for synpred edges
1404:                // emanating from a DFA state.
1405:                if (altToPredMap.size() == nondeterministicAlts.size() - 1) {
1406:                    // if there are n-1 predicates for n nondeterministic alts, can fix
1407:                    org.antlr.misc.BitSet ndSet = org.antlr.misc.BitSet
1408:                            .of(nondeterministicAlts);
1409:                    org.antlr.misc.BitSet predSet = org.antlr.misc.BitSet
1410:                            .of(altToPredMap);
1411:                    int nakedAlt = ndSet.subtract(predSet).getSingleElement();
1412:                    SemanticContext nakedAltPred = null;
1413:                    if (nakedAlt == max(nondeterministicAlts)) {
1414:                        // the naked alt is the last nondet alt and will be the default clause
1415:                        nakedAltPred = new SemanticContext.TruePredicate();
1416:                    } else {
1417:                        // pretend naked alternative is covered with !(union other preds)
1418:                        // unless it's a synpred since those have precedence same
1419:                        // as alt order
1420:                        SemanticContext unionOfPredicatesFromAllAlts = getUnionOfPredicates(altToPredMap);
1421:                        //System.out.println("all predicates "+unionOfPredicatesFromAllAlts);
1422:                        if (unionOfPredicatesFromAllAlts.isSyntacticPredicate()) {
1423:                            nakedAltPred = new SemanticContext.TruePredicate();
1424:                        } else {
1425:                            nakedAltPred = SemanticContext
1426:                                    .not(unionOfPredicatesFromAllAlts);
1427:                        }
1428:                    }
1429:
1430:                    //System.out.println("covering naked alt="+nakedAlt+" with "+nakedAltPred);
1431:
1432:                    altToPredMap.put(Utils.integer(nakedAlt), nakedAltPred);
1433:                    // set all config with alt=nakedAlt to have the computed predicate
1434:                    Iterator iter = d.nfaConfigurations.iterator();
1435:                    NFAConfiguration configuration;
1436:                    while (iter.hasNext()) {
1437:                        configuration = (NFAConfiguration) iter.next();
1438:                        if (configuration.alt == nakedAlt) {
1439:                            configuration.semanticContext = nakedAltPred;
1440:                        }
1441:                    }
1442:                }
1443:
1444:                if (altToPredMap.size() == nondeterministicAlts.size()) {
1445:                    // RESOLVE CONFLICT by picking one NFA configuration for each alt
1446:                    // and setting its resolvedWithPredicate flag
1447:                    // First, prevent a recursion warning on this state due to
1448:                    // pred resolution
1449:                    if (d.abortedDueToRecursionOverflow) {
1450:                        d.dfa.probe.removeRecursiveOverflowState(d);
1451:                    }
1452:                    Iterator iter = d.nfaConfigurations.iterator();
1453:                    NFAConfiguration configuration;
1454:                    while (iter.hasNext()) {
1455:                        configuration = (NFAConfiguration) iter.next();
1456:                        SemanticContext semCtx = (SemanticContext) altToPredMap
1457:                                .get(Utils.integer(configuration.alt));
1458:                        if (semCtx != null) {
1459:                            // resolve (first found) with pred
1460:                            // and remove alt from problem list
1461:                            configuration.resolveWithPredicate = true;
1462:                            configuration.semanticContext = semCtx; // reset to combined
1463:                            altToPredMap.remove(Utils
1464:                                    .integer(configuration.alt));
1465:                            // notify grammar that we've used the preds contained in semCtx
1466:                            if (semCtx.isSyntacticPredicate()) {
1467:                                dfa.nfa.grammar.synPredUsedInDFA(dfa, semCtx);
1468:                            }
1469:                        } else if (nondeterministicAlts.contains(Utils
1470:                                .integer(configuration.alt))) {
1471:                            // resolve all configurations for nondeterministic alts
1472:                            // for which there is no predicate context by turning it off
1473:                            configuration.resolved = true;
1474:                        }
1475:                    }
1476:                    return true;
1477:                }
1478:
1479:                return false; // couldn't fix the problem with predicates
1480:            }
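
            // Illustrative note; not part of the original ANTLR source.  The
            // "n-1 predicates for n alts" rule above means a decision like
            //
            //   a : {p1}? ID | {p2}? ID | ID ;
            //
            // is still resolvable: alts 1 and 2 carry predicates, and the naked
            // alt 3, being the maximum conflicting alt, becomes the default (true)
            // clause computed above.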
1481:
1482:            /** Return a mapping from nondeterministic alt to combined list of predicates.
1483:             *  If both (s|i|semCtx1) and (t|i|semCtx2) exist, then the proper predicate
1484:             *  for alt i is semCtx1||semCtx2 because you have arrived at this single
1485:             *  DFA state via two NFA paths, both of which have semantic predicates.
1486:             *  We ignore deterministic alts because syntax alone is sufficient
1487:             *  to predict those.  Do not include their predicates.
1488:             *
1489:             *  Alts with no predicate are assumed to have {true}? pred.
1490:             *
1491:             *  When combining via || with "true", all predicates are removed from
1492:             *  consideration since the expression will always be true and hence
1493:             *  not tell us how to resolve anything.  So, if any NFA configuration
1494:             *  in this DFA state does not have a semantic context, the alt cannot
1495:             *  be resolved with a predicate.
1496:             */
1497:            protected Map getPredicatesPerNonDeterministicAlt(DFAState d,
1498:                    Set nondeterministicAlts) {
1499:                // map alt to combined SemanticContext
1500:                Map altToPredicateContextMap = new HashMap();
1501:                // init the alt to predicate set map
1502:                Map altToSetOfContextsMap = new HashMap();
1503:                for (Iterator it = nondeterministicAlts.iterator(); it
1504:                        .hasNext();) {
1505:                    Integer altI = (Integer) it.next();
1506:                    altToSetOfContextsMap.put(altI, new HashSet());
1507:                }
1508:                Set altToIncompletePredicateContextSet = new HashSet();
1509:                Iterator iter = d.nfaConfigurations.iterator();
1510:                NFAConfiguration configuration;
1511:                // for each configuration, create a unique set of predicates
1512:                // Also, track the alts with at least one uncovered configuration
1513:                // (one w/o a predicate); tracks tautologies like p1||true
1514:                while (iter.hasNext()) {
1515:                    configuration = (NFAConfiguration) iter.next();
1516:                    Integer altI = Utils.integer(configuration.alt);
1517:                    // if alt is nondeterministic, combine its predicates
1518:                    if (nondeterministicAlts.contains(altI)) {
1519:                        // if there is a predicate for this NFA configuration, OR in
1520:                        if (configuration.semanticContext != SemanticContext.EMPTY_SEMANTIC_CONTEXT) {
1521:                            /*
1522:                            SemanticContext altsExistingPred =(SemanticContext)
1523:                            		altToPredicateContextMap.get(Utils.integer(configuration.alt));
1524:                            if ( altsExistingPred!=null ) {
1525:                            	// must merge all predicates from configs with same alt
1526:                            	SemanticContext combinedContext =
1527:                            			SemanticContext.or(
1528:                            					altsExistingPred,
1529:                            					configuration.semanticContext);
1530:                            	System.out.println(altsExistingPred+" OR "+
1531:                            					   configuration.semanticContext+
1532:                            					   "="+combinedContext);
1533:                            	altToPredicateContextMap.put(
1534:                            			Utils.integer(configuration.alt),
1535:                            			combinedContext
1536:                            	);
1537:                            }
1538:                            else {
1539:                            	// not seen before, just add it
1540:                            	altToPredicateContextMap.put(
1541:                            			Utils.integer(configuration.alt),
1542:                            			configuration.semanticContext
1543:                            	);
1544:                            }
1545:                             */
1546:                            Set predSet = (Set) altToSetOfContextsMap.get(altI);
1547:                            predSet.add(configuration.semanticContext);
1548:                        } else {
1549:                            // if no predicate, but it's part of nondeterministic alt
1550:                            // then at least one path exists not covered by a predicate.
1551:                            // must remove predicate for this alt; track incomplete alts
1552:                            altToIncompletePredicateContextSet.add(altI);
1553:                        }
1554:                    }
1555:                }
1556:
1557:                // For each alt, OR together all unique predicates associated with
1558:                // all configurations
1559:                // Also, track the list of incompletely covered alts: those alts
1560:                // with at least 1 predicate and at least one configuration w/o a
1561:                // predicate. We want this in order to report to the decision probe.
1562:                List incompletelyCoveredAlts = new ArrayList();
1563:                for (Iterator it = nondeterministicAlts.iterator(); it
1564:                        .hasNext();) {
1565:                    Integer altI = (Integer) it.next();
1566:                    Set predSet = (Set) altToSetOfContextsMap.get(altI);
1567:                    if (altToIncompletePredicateContextSet.contains(altI)) {
1568:                        SemanticContext insufficientPred = (SemanticContext) altToPredicateContextMap
1569:                                .get(altI);
1570:                        if (predSet.size() > 0) {
1571:                            incompletelyCoveredAlts.add(altI);
1572:                        }
1573:                        continue;
1574:                    }
1575:                    SemanticContext combinedContext = null;
1576:                    for (Iterator itrSet = predSet.iterator(); itrSet.hasNext();) {
1577:                        SemanticContext ctx = (SemanticContext) itrSet.next();
1578:                        combinedContext = SemanticContext.or(combinedContext,
1579:                                ctx);
1580:                    }
1581:                    altToPredicateContextMap.put(altI, combinedContext);
1582:                }
1583:
1584:                // remove any predicates from incompletely covered alts
1585:                /*
1586:                iter = altToIncompletePredicateContextSet.iterator();
1587:                List incompletelyCoveredAlts = new ArrayList();
1588:                while (iter.hasNext()) {
1589:                	Integer alt = (Integer) iter.next();
1590:                	SemanticContext insufficientPred =(SemanticContext)
1591:                			altToPredicateContextMap.get(alt);
1592:                	if ( insufficientPred!=null ) {
1593:                		incompletelyCoveredAlts.add(alt);
1594:                	}
1595:                	altToPredicateContextMap.remove(alt);
1596:                }
1597:                 */
1598:
1599:                if (incompletelyCoveredAlts.size() > 0) {
1600:                    dfa.probe.reportIncompletelyCoveredAlts(d,
1601:                            incompletelyCoveredAlts);
1602:                }
1603:
1604:                return altToPredicateContextMap;
1605:            }
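
            // --- Illustrative sketch; not part of the original ANTLR source. ---
            // The grouping above, reduced to plain java.util types: each element of
            // altPredPairs is a hypothetical {Integer alt, String predOrNull} pair.
            // Alts containing any predicate-less configuration are dropped (p||true
            // is a tautology); the rest get the "||" of their unique predicates,
            // mirroring the SemanticContext.or calls above.
            protected static Map predicatesPerAltSketch(List altPredPairs) {
                Map altToPreds = new HashMap(); // Integer -> Set of String
                Set uncovered = new HashSet();  // alts with a pred-less config
                for (Iterator it = altPredPairs.iterator(); it.hasNext();) {
                    Object[] pair = (Object[]) it.next();
                    Integer alt = (Integer) pair[0];
                    String pred = (String) pair[1];
                    if (pred == null) {
                        uncovered.add(alt);
                        continue;
                    }
                    Set preds = (Set) altToPreds.get(alt);
                    if (preds == null) {
                        preds = new HashSet();
                        altToPreds.put(alt, preds);
                    }
                    preds.add(pred);
                }
                Map altToCombinedPred = new HashMap(); // Integer -> String
                for (Iterator it = altToPreds.keySet().iterator(); it.hasNext();) {
                    Integer alt = (Integer) it.next();
                    if (uncovered.contains(alt)) {
                        continue; // cannot be resolved with predicates
                    }
                    StringBuffer combined = new StringBuffer();
                    Set preds = (Set) altToPreds.get(alt);
                    for (Iterator p = preds.iterator(); p.hasNext();) {
                        if (combined.length() > 0) {
                            combined.append("||");
                        }
                        combined.append((String) p.next());
                    }
                    altToCombinedPred.put(alt, combined.toString());
                }
                return altToCombinedPred;
            }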
1606:
1607:            /** OR together all predicates from the alts.  Note that the predicate
1608:             *  for an alt could itself be a combination of predicates.
1609:             */
1610:            protected static SemanticContext getUnionOfPredicates(
1611:                    Map altToPredMap) {
1612:                Iterator iter;
1613:                SemanticContext unionOfPredicatesFromAllAlts = null;
1614:                iter = altToPredMap.values().iterator();
1615:                while (iter.hasNext()) {
1616:                    SemanticContext semCtx = (SemanticContext) iter.next();
1617:                    if (unionOfPredicatesFromAllAlts == null) {
1618:                        unionOfPredicatesFromAllAlts = semCtx;
1619:                    } else {
1620:                        unionOfPredicatesFromAllAlts = SemanticContext.or(
1621:                                unionOfPredicatesFromAllAlts, semCtx);
1622:                    }
1623:                }
1624:                return unionOfPredicatesFromAllAlts;
1625:            }
1626:
1627:            /** For each NFA config in d, look for the "predicate required" flag set
1628:             *  during nondeterminism resolution.
1629:             *
1630:             *  Add the predicate edges sorted by the alternative number; I'm fairly
1631:             *  sure that I could walk the configs backwards so they are added to
1632:             *  the predDFATarget in the right order, but it's best to make sure.
1633:             *  Predicates succeed in the order they are specified.  Alt i wins
1634:             *  over alt i+1 if both predicates are true.
1635:             */
1636:            protected void addPredicateTransitions(DFAState d) {
1637:                List configsWithPreds = new ArrayList();
1638:                // get a list of all configs with predicates
1639:                Iterator iter = d.getNFAConfigurations().iterator();
1640:                while (iter.hasNext()) {
1641:                    NFAConfiguration c = (NFAConfiguration) iter.next();
1642:                    if (c.resolveWithPredicate) {
1643:                        configsWithPreds.add(c);
1644:                    }
1645:                }
1646:                // Sort ascending according to alt; alt i has higher precedence than i+1
1647:                Collections.sort(configsWithPreds, new Comparator() {
1648:                    public int compare(Object a, Object b) {
1649:                        NFAConfiguration ca = (NFAConfiguration) a;
1650:                        NFAConfiguration cb = (NFAConfiguration) b;
1651:                        if (ca.alt < cb.alt)
1652:                            return -1;
1653:                        else if (ca.alt > cb.alt)
1654:                            return 1;
1655:                        return 0;
1656:                    }
1657:                });
1658:                List predConfigsSortedByAlt = configsWithPreds;
1659:                // Now, we can add edges emanating from d for these preds in right order
1660:                for (int i = 0; i < predConfigsSortedByAlt.size(); i++) {
1661:                    NFAConfiguration c = (NFAConfiguration) predConfigsSortedByAlt
1662:                            .get(i);
1663:                    DFAState predDFATarget = d.dfa.getAcceptState(c.alt);
1664:                    if (predDFATarget == null) {
1665:                        predDFATarget = dfa.newState(); // create if not there.
1666:                        // create a new DFA state that is a target of the predicate from d
1667:                        predDFATarget.addNFAConfiguration(dfa.nfa
1668:                                .getState(c.state), c.alt, c.context,
1669:                                c.semanticContext);
1670:                        predDFATarget.setAcceptState(true);
1671:                        DFAState existingState = dfa.addState(predDFATarget);
1672:                        if (predDFATarget != existingState) {
1673:                            // already there...use/return the existing DFA state that
1674:                            // is a target of this predicate.  Make this state number
1675:                            // point at the existing state
1676:                            dfa.setState(predDFATarget.stateNumber,
1677:                                    existingState);
1678:                            predDFATarget = existingState;
1679:                        }
1680:                    }
1681:                    // add a transition to pred target from d
1682:                    d.addTransition(
1683:                            predDFATarget,
1684:                            new Label(c.semanticContext));
1685:                }
1686:            }
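
            // Illustrative note; not part of the original ANTLR source.  The sort
            // above makes predicate edges leave the DFA state in alt order, so in a
            // decision such as
            //
            //   a : {p1}? ID | {p2}? ID ;
            //
            // p1 is tested before p2 and alt 1 wins whenever both predicates are
            // true at parse time, matching the precedence described in the comment
            // before this method.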
1687:
1688:            protected void initContextTrees(int numberOfAlts) {
1689:                contextTrees = new NFAContext[numberOfAlts];
1690:                for (int i = 0; i < contextTrees.length; i++) {
1691:                    int alt = i + 1;
1692:                    // add a dummy root node so that an NFA configuration can
1693:                    // always point at an NFAContext.  If a context refers to this
1694:                    // node then it implies there is no call stack for
1695:                    // that configuration
1696:                    contextTrees[i] = new NFAContext(null, null);
1697:                }
1698:            }
1699:
1700:            public static int max(Set s) {
1701:                if (s == null) {
1702:                    return Integer.MIN_VALUE;
1703:                }
1704:                int i = 0;
1705:                int m = 0;
1706:                for (Iterator it = s.iterator(); it.hasNext();) {
1707:                    i++;
1708:                    Integer I = (Integer) it.next();
1709:                    if (i == 1) { // init m with first value
1710:                        m = I.intValue();
1711:                        continue;
1712:                    }
1713:                    if (I.intValue() > m) {
1714:                        m = I.intValue();
1715:                    }
1716:                }
1717:                return m;
1718:            }
1719:        }