Source Code Cross Referenced for URI.java in XML » xerces-2_9_1 » org.apache.xerces.util
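
The listing classifies ASCII characters with a 128-entry byte table (fgLookupTable) whose entries are OR-ed bit flags, so that one table lookup plus one mask answers questions such as "may this character appear in a scheme?". The following is a minimal standalone sketch of that technique, not code from Xerces: the class name CharClassSketch, the flag names and the helper isSchemeChar are all hypothetical, and only the scheme-related flags are modelled.

```java
final class CharClassSketch {
    // Hypothetical flags: one bit per character class, so a character
    // can belong to several classes at once.
    static final int ALPHA = 0x01;
    static final int DIGIT = 0x02;
    static final int SCHEME = 0x04; // the extra scheme characters + - .

    // A scheme character is alphanumeric or one of + - .
    static final int MASK_SCHEME = ALPHA | DIGIT | SCHEME;

    // One entry per 7-bit ASCII code point.
    private static final byte[] TABLE = new byte[128];

    static {
        for (int c = '0'; c <= '9'; ++c) {
            TABLE[c] |= DIGIT;
        }
        for (int c = 'A'; c <= 'Z'; ++c) {
            TABLE[c] |= ALPHA;
            TABLE[c + 0x20] |= ALPHA; // matching lower-case letter
        }
        TABLE['+'] |= SCHEME;
        TABLE['-'] |= SCHEME;
        TABLE['.'] |= SCHEME;
    }

    // Assumed helper: true if c is ASCII and carries any scheme-class bit.
    static boolean isSchemeChar(char c) {
        return c < 128 && (TABLE[c] & MASK_SCHEME) != 0;
    }

    public static void main(String[] args) {
        System.out.println(isSchemeChar('+')); // true
        System.out.println(isSchemeChar('/')); // false
    }
}
```

Combining flags into masks keeps each test to a single AND, at the cost of a small static table that must be filled once when the class is loaded.
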

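The initialize method in the listing decides whether a specification string starts with a scheme by locating the first ':' and checking that no '/', '?' or '#' occurs before it. The sketch below isolates that one rule under simplifying assumptions: the class SchemeSketch and method findScheme are hypothetical names, the method returns null where the listing would throw MalformedURIException or fall back to a relative URI, and the scheme characters themselves are not validated.

```java
final class SchemeSketch {
    /**
     * Returns the text before the first ':' when it can be a scheme,
     * or null when the spec has no scheme (assumed simplification).
     */
    static String findScheme(String spec) {
        int colon = spec.indexOf(':');
        if (colon <= 0) {
            return null; // no colon at all, or the colon is the first character
        }
        // Search backwards, starting from the character before ':'.
        int from = colon - 1;
        if (spec.lastIndexOf('/', from) != -1
                || spec.lastIndexOf('?', from) != -1
                || spec.lastIndexOf('#', from) != -1) {
            return null; // the ':' belongs to a path, query or fragment
        }
        return spec.substring(0, colon);
    }

    public static void main(String[] args) {
        System.out.println(findScheme("http://example.org/a:b")); // http
        System.out.println(findScheme("a/b:c"));                  // null
    }
}
```
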
/*
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements.  See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.
 * The ASF licenses this file to You under the Apache License, Version 2.0
 * (the "License"); you may not use this file except in compliance with
 * the License.  You may obtain a copy of the License at
 *
 *      http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

package org.apache.xerces.util;

import java.io.IOException;
import java.io.Serializable;

/**********************************************************************
 * A class to represent a Uniform Resource Identifier (URI). This class
 * is designed to handle the parsing of URIs and provide access to
 * the various components (scheme, host, port, userinfo, path, query
 * string and fragment) that may constitute a URI.
 * <p>
 * Parsing of a URI specification is done according to the URI
 * syntax described in
 * <a href="http://www.ietf.org/rfc/rfc2396.txt?number=2396">RFC 2396</a>,
 * and amended by
 * <a href="http://www.ietf.org/rfc/rfc2732.txt?number=2732">RFC 2732</a>.
 * <p>
 * Every absolute URI consists of a scheme, followed by a colon (':'),
 * followed by a scheme-specific part. For URIs that follow the
 * "generic URI" syntax, the scheme-specific part begins with two
 * slashes ("//") and may be followed by an authority segment (comprised
 * of user information, host, and port), path segment, query segment
 * and fragment. Note that RFC 2396 no longer specifies the use of the
 * parameters segment and excludes the "user:password" syntax as part of
 * the authority segment. If "user:password" appears in a URI, the entire
 * user/password string is stored as userinfo.
 * <p>
 * For URIs that do not follow the "generic URI" syntax (e.g. mailto),
 * the entire scheme-specific part is treated as the "path" portion
 * of the URI.
 * <p>
 * Note that, unlike the java.net.URL class, this class does not provide
 * any built-in network access functionality nor does it provide any
 * scheme-specific functionality (for example, it does not know a
 * default port for a specific scheme). Rather, it only knows the
 * grammar and basic set of operations that can be applied to a URI.
 *
 * @version  $Id: URI.java 447241 2006-09-18 05:12:57Z mrglavas $
 *
 **********************************************************************/
public class URI implements Serializable {

    /*******************************************************************
     * MalformedURIExceptions are thrown in the process of building a URI
     * or setting fields on a URI when an operation would result in an
     * invalid URI specification.
     *
     ********************************************************************/
    public static class MalformedURIException extends IOException {

        /** Serialization version. */
        static final long serialVersionUID = -6695054834342951930L;

        /******************************************************************
         * Constructs a <code>MalformedURIException</code> with no specified
         * detail message.
         ******************************************************************/
        public MalformedURIException() {
            super();
        }

        /*****************************************************************
         * Constructs a <code>MalformedURIException</code> with the
         * specified detail message.
         *
         * @param p_msg the detail message.
         ******************************************************************/
        public MalformedURIException(String p_msg) {
            super(p_msg);
        }
    }

    /** Serialization version. */
    static final long serialVersionUID = 1601921774685357214L;

    private static final byte[] fgLookupTable = new byte[128];

    /**
     * Character Classes
     */

    /** reserved characters ;/?:@&=+$,[] */
    //RFC 2732 added '[' and ']' as reserved characters
    private static final int RESERVED_CHARACTERS = 0x01;

    /** URI punctuation mark characters: -_.!~*'() - these, combined with
        alphanumerics, constitute the "unreserved" characters */
    private static final int MARK_CHARACTERS = 0x02;

    /** scheme can be composed of alphanumerics and these characters: +-. */
    private static final int SCHEME_CHARACTERS = 0x04;

    /** userinfo can be composed of unreserved, escaped and these
        characters: ;:&=+$, */
    private static final int USERINFO_CHARACTERS = 0x08;

    /** ASCII letter characters */
    private static final int ASCII_ALPHA_CHARACTERS = 0x10;

    /** ASCII digit characters */
    private static final int ASCII_DIGIT_CHARACTERS = 0x20;

    /** ASCII hex characters */
    private static final int ASCII_HEX_CHARACTERS = 0x40;

    /** Path characters */
    private static final int PATH_CHARACTERS = 0x80;

    /** Mask for alpha-numeric characters */
    private static final int MASK_ALPHA_NUMERIC = ASCII_ALPHA_CHARACTERS
            | ASCII_DIGIT_CHARACTERS;

    /** Mask for unreserved characters */
    private static final int MASK_UNRESERVED_MASK = MASK_ALPHA_NUMERIC
            | MARK_CHARACTERS;

    /** Mask for URI allowable characters except for % */
    private static final int MASK_URI_CHARACTER = MASK_UNRESERVED_MASK
            | RESERVED_CHARACTERS;

    /** Mask for scheme characters */
    private static final int MASK_SCHEME_CHARACTER = MASK_ALPHA_NUMERIC
            | SCHEME_CHARACTERS;

    /** Mask for userinfo characters */
    private static final int MASK_USERINFO_CHARACTER = MASK_UNRESERVED_MASK
            | USERINFO_CHARACTERS;

    /** Mask for path characters */
    private static final int MASK_PATH_CHARACTER = MASK_UNRESERVED_MASK
            | PATH_CHARACTERS;

    static {
        // Add ASCII Digits and ASCII Hex Numbers
        for (int i = '0'; i <= '9'; ++i) {
            fgLookupTable[i] |= ASCII_DIGIT_CHARACTERS
                    | ASCII_HEX_CHARACTERS;
        }

        // Add ASCII Letters and ASCII Hex Numbers
        for (int i = 'A'; i <= 'F'; ++i) {
            fgLookupTable[i] |= ASCII_ALPHA_CHARACTERS
                    | ASCII_HEX_CHARACTERS;
            fgLookupTable[i + 0x00000020] |= ASCII_ALPHA_CHARACTERS
                    | ASCII_HEX_CHARACTERS;
        }

        // Add ASCII Letters
        for (int i = 'G'; i <= 'Z'; ++i) {
            fgLookupTable[i] |= ASCII_ALPHA_CHARACTERS;
            fgLookupTable[i + 0x00000020] |= ASCII_ALPHA_CHARACTERS;
        }

        // Add Reserved Characters
        fgLookupTable[';'] |= RESERVED_CHARACTERS;
        fgLookupTable['/'] |= RESERVED_CHARACTERS;
        fgLookupTable['?'] |= RESERVED_CHARACTERS;
        fgLookupTable[':'] |= RESERVED_CHARACTERS;
        fgLookupTable['@'] |= RESERVED_CHARACTERS;
        fgLookupTable['&'] |= RESERVED_CHARACTERS;
        fgLookupTable['='] |= RESERVED_CHARACTERS;
        fgLookupTable['+'] |= RESERVED_CHARACTERS;
        fgLookupTable['$'] |= RESERVED_CHARACTERS;
        fgLookupTable[','] |= RESERVED_CHARACTERS;
        fgLookupTable['['] |= RESERVED_CHARACTERS;
        fgLookupTable[']'] |= RESERVED_CHARACTERS;

        // Add Mark Characters
        fgLookupTable['-'] |= MARK_CHARACTERS;
        fgLookupTable['_'] |= MARK_CHARACTERS;
        fgLookupTable['.'] |= MARK_CHARACTERS;
        fgLookupTable['!'] |= MARK_CHARACTERS;
        fgLookupTable['~'] |= MARK_CHARACTERS;
        fgLookupTable['*'] |= MARK_CHARACTERS;
        fgLookupTable['\''] |= MARK_CHARACTERS;
        fgLookupTable['('] |= MARK_CHARACTERS;
        fgLookupTable[')'] |= MARK_CHARACTERS;

        // Add Scheme Characters
        fgLookupTable['+'] |= SCHEME_CHARACTERS;
        fgLookupTable['-'] |= SCHEME_CHARACTERS;
        fgLookupTable['.'] |= SCHEME_CHARACTERS;

        // Add Userinfo Characters
        fgLookupTable[';'] |= USERINFO_CHARACTERS;
        fgLookupTable[':'] |= USERINFO_CHARACTERS;
        fgLookupTable['&'] |= USERINFO_CHARACTERS;
        fgLookupTable['='] |= USERINFO_CHARACTERS;
        fgLookupTable['+'] |= USERINFO_CHARACTERS;
        fgLookupTable['$'] |= USERINFO_CHARACTERS;
        fgLookupTable[','] |= USERINFO_CHARACTERS;

        // Add Path Characters
        fgLookupTable[';'] |= PATH_CHARACTERS;
        fgLookupTable['/'] |= PATH_CHARACTERS;
        fgLookupTable[':'] |= PATH_CHARACTERS;
        fgLookupTable['@'] |= PATH_CHARACTERS;
        fgLookupTable['&'] |= PATH_CHARACTERS;
        fgLookupTable['='] |= PATH_CHARACTERS;
        fgLookupTable['+'] |= PATH_CHARACTERS;
        fgLookupTable['$'] |= PATH_CHARACTERS;
        fgLookupTable[','] |= PATH_CHARACTERS;
    }

    /** Stores the scheme (usually the protocol) for this URI. */
    private String m_scheme = null;

    /** If specified, stores the userinfo for this URI; otherwise null */
    private String m_userinfo = null;

    /** If specified, stores the host for this URI; otherwise null */
    private String m_host = null;

    /** If specified, stores the port for this URI; otherwise -1 */
    private int m_port = -1;

    /** If specified, stores the registry based authority for this URI; otherwise null */
    private String m_regAuthority = null;

    /** If specified, stores the path for this URI; otherwise null */
    private String m_path = null;

    /** If specified, stores the query string for this URI; otherwise
        null.  */
    private String m_queryString = null;

    /** If specified, stores the fragment for this URI; otherwise null */
    private String m_fragment = null;

    private static boolean DEBUG = false;

    /**
     * Construct a new and uninitialized URI.
     */
    public URI() {
    }

    /**
     * Construct a new URI from another URI. All fields for this URI are
     * set equal to the fields of the URI passed in.
     *
     * @param p_other the URI to copy (cannot be null)
     */
    public URI(URI p_other) {
        initialize(p_other);
    }

    /**
     * Construct a new URI from a URI specification string. If the
     * specification follows the "generic URI" syntax (two slashes
     * following the first colon), the specification will be parsed
     * accordingly - setting the scheme, userinfo, host, port, path, query
     * string and fragment fields as necessary. If the specification does
     * not follow the "generic URI" syntax, the specification is parsed
     * into a scheme and scheme-specific part (stored as the path) only.
     *
     * @param p_uriSpec the URI specification string (cannot be null or
     *                  empty)
     *
     * @exception MalformedURIException if p_uriSpec violates any syntax
     *                                   rules
     */
    public URI(String p_uriSpec) throws MalformedURIException {
        this((URI) null, p_uriSpec);
    }

    /**
     * Construct a new URI from a URI specification string. If the
     * specification follows the "generic URI" syntax (two slashes
     * following the first colon), the specification will be parsed
     * accordingly - setting the scheme, userinfo, host, port, path, query
     * string and fragment fields as necessary. If the specification does
     * not follow the "generic URI" syntax, the specification is parsed
     * into a scheme and scheme-specific part (stored as the path) only.
     * If allowNonAbsoluteURI is true and p_uriSpec is not a valid
     * absolute URI, a relative URI is constructed instead of throwing
     * an exception.
     *
     * @param p_uriSpec the URI specification string (cannot be null or
     *                  empty)
     * @param allowNonAbsoluteURI true to permit non-absolute URIs,
     *                            false otherwise.
     *
     * @exception MalformedURIException if p_uriSpec violates any syntax
     *                                   rules
     */
    public URI(String p_uriSpec, boolean allowNonAbsoluteURI)
            throws MalformedURIException {
        this((URI) null, p_uriSpec, allowNonAbsoluteURI);
    }

    /**
     * Construct a new URI from a base URI and a URI specification string.
     * The URI specification string may be a relative URI.
     *
     * @param p_base the base URI (cannot be null if p_uriSpec is null or
     *               empty)
     * @param p_uriSpec the URI specification string (cannot be null or
     *                  empty if p_base is null)
     *
     * @exception MalformedURIException if p_uriSpec violates any syntax
     *                                  rules
     */
    public URI(URI p_base, String p_uriSpec)
            throws MalformedURIException {
        initialize(p_base, p_uriSpec);
    }

    /**
     * Construct a new URI from a base URI and a URI specification string.
     * The URI specification string may be a relative URI.
     * If allowNonAbsoluteURI is true, p_uriSpec is not a valid absolute
     * URI and p_base is null, a relative URI is constructed instead of
     * throwing an exception.
     *
     * @param p_base the base URI (cannot be null if p_uriSpec is null or
     *               empty)
     * @param p_uriSpec the URI specification string (cannot be null or
     *                  empty if p_base is null)
     * @param allowNonAbsoluteURI true to permit non-absolute URIs,
     *                            false otherwise.
     *
     * @exception MalformedURIException if p_uriSpec violates any syntax
     *                                  rules
     */
    public URI(URI p_base, String p_uriSpec, boolean allowNonAbsoluteURI)
            throws MalformedURIException {
        initialize(p_base, p_uriSpec, allowNonAbsoluteURI);
    }

    /**
     * Construct a new URI that does not follow the generic URI syntax.
     * Only the scheme and scheme-specific part (stored as the path) are
     * initialized.
     *
     * @param p_scheme the URI scheme (cannot be null or empty)
     * @param p_schemeSpecificPart the scheme-specific part (cannot be
     *                             null or empty)
     *
     * @exception MalformedURIException if p_scheme violates any
     *                                  syntax rules
     */
    public URI(String p_scheme, String p_schemeSpecificPart)
            throws MalformedURIException {
        if (p_scheme == null || p_scheme.trim().length() == 0) {
            throw new MalformedURIException(
                    "Cannot construct URI with null/empty scheme!");
        }
        if (p_schemeSpecificPart == null
                || p_schemeSpecificPart.trim().length() == 0) {
            throw new MalformedURIException(
                    "Cannot construct URI with null/empty scheme-specific part!");
        }
        setScheme(p_scheme);
        setPath(p_schemeSpecificPart);
    }

    /**
     * Construct a new URI that follows the generic URI syntax from its
     * component parts. Each component is validated for syntax and some
     * basic semantic checks are performed as well.  See the individual
     * setter methods for specifics.
     *
     * @param p_scheme the URI scheme (cannot be null or empty)
     * @param p_host the hostname, IPv4 address or IPv6 reference for the URI
     * @param p_path the URI path - if the path contains '?' or '#',
     *               then the query string and/or fragment will be
     *               set from the path; however, if the query and
     *               fragment are specified both in the path and as
     *               separate parameters, an exception is thrown
     * @param p_queryString the URI query string (cannot be specified
     *                      if path is null)
     * @param p_fragment the URI fragment (cannot be specified if path
     *                   is null)
     *
     * @exception MalformedURIException if any of the parameters violates
     *                                  syntax rules or semantic rules
     */
    public URI(String p_scheme, String p_host, String p_path,
            String p_queryString, String p_fragment)
            throws MalformedURIException {
        this(p_scheme, null, p_host, -1, p_path, p_queryString,
                p_fragment);
    }

    /**
     * Construct a new URI that follows the generic URI syntax from its
     * component parts. Each component is validated for syntax and some
     * basic semantic checks are performed as well.  See the individual
     * setter methods for specifics.
     *
     * @param p_scheme the URI scheme (cannot be null or empty)
     * @param p_userinfo the URI userinfo (cannot be specified if host
     *                   is null)
     * @param p_host the hostname, IPv4 address or IPv6 reference for the URI
     * @param p_port the URI port (may be -1 for "unspecified"; cannot
     *               be specified if host is null)
     * @param p_path the URI path - if the path contains '?' or '#',
     *               then the query string and/or fragment will be
     *               set from the path; however, if the query and
     *               fragment are specified both in the path and as
     *               separate parameters, an exception is thrown
     * @param p_queryString the URI query string (cannot be specified
     *                      if path is null)
     * @param p_fragment the URI fragment (cannot be specified if path
     *                   is null)
     *
     * @exception MalformedURIException if any of the parameters violates
     *                                  syntax rules or semantic rules
     */
    public URI(String p_scheme, String p_userinfo, String p_host,
            int p_port, String p_path, String p_queryString,
            String p_fragment) throws MalformedURIException {
        if (p_scheme == null || p_scheme.trim().length() == 0) {
            throw new MalformedURIException("Scheme is required!");
        }

        if (p_host == null) {
            if (p_userinfo != null) {
                throw new MalformedURIException(
                        "Userinfo may not be specified if host is not specified!");
            }
            if (p_port != -1) {
                throw new MalformedURIException(
                        "Port may not be specified if host is not specified!");
            }
        }

        if (p_path != null) {
            if (p_path.indexOf('?') != -1 && p_queryString != null) {
                throw new MalformedURIException(
                        "Query string cannot be specified in path and query string!");
            }

            if (p_path.indexOf('#') != -1 && p_fragment != null) {
                throw new MalformedURIException(
                        "Fragment cannot be specified in both the path and fragment!");
            }
        }

        setScheme(p_scheme);
        setHost(p_host);
        setPort(p_port);
        setUserinfo(p_userinfo);
        setPath(p_path);
        setQueryString(p_queryString);
        setFragment(p_fragment);
    }

0466:            /**
0467:             * Initialize all fields of this URI from another URI.
0468:             *
0469:             * @param p_other the URI to copy (cannot be null)
0470:             */
0471:            private void initialize(URI p_other) {
0472:                m_scheme = p_other.getScheme();
0473:                m_userinfo = p_other.getUserinfo();
0474:                m_host = p_other.getHost();
0475:                m_port = p_other.getPort();
0476:                m_regAuthority = p_other.getRegBasedAuthority();
0477:                m_path = p_other.getPath();
0478:                m_queryString = p_other.getQueryString();
0479:                m_fragment = p_other.getFragment();
0480:            }
0481:
0482:            /**
0483:             * Initializes this URI from a base URI and a URI specification string.
0484:             * See RFC 2396 Section 4 and Appendix B for specifications on parsing
0485:             * the URI and Section 5 for specifications on resolving relative URIs
0486:             * and relative paths.
0487:             *
0488:             * @param p_base the base URI (may be null if p_uriSpec is an absolute
0489:             *               URI)
0490:             * @param p_uriSpec the URI spec string which may be an absolute or
0491:             *                  relative URI (can only be null/empty if p_base
0492:             *                  is not null)
0493:             * @param allowNonAbsoluteURI true to permit non-absolute URIs, 
0494:             *                         in case of relative URI, false otherwise.
0495:             *
0496:             * @exception MalformedURIException if p_base is null and p_uriSpec
0497:             *                                  is not an absolute URI or if
0498:             *                                  p_uriSpec violates syntax rules
0499:             */
0500:            private void initialize(URI p_base, String p_uriSpec,
0501:                    boolean allowNonAbsoluteURI) throws MalformedURIException {
0502:
0503:                String uriSpec = p_uriSpec;
0504:                int uriSpecLen = (uriSpec != null) ? uriSpec.length() : 0;
0505:
0506:                if (p_base == null && uriSpecLen == 0) {
0507:                    if (allowNonAbsoluteURI) {
0508:                        m_path = "";
0509:                        return;
0510:                    }
0511:                    throw new MalformedURIException(
0512:                            "Cannot initialize URI with empty parameters.");
0513:                }
0514:
0515:                // just make a copy of the base if spec is empty
0516:                if (uriSpecLen == 0) {
0517:                    initialize(p_base);
0518:                    return;
0519:                }
0520:
0521:                int index = 0;
0522:
0523:                // Check for scheme, which must be before '/', '?' or '#'.
0524:                int colonIdx = uriSpec.indexOf(':');
0525:                if (colonIdx != -1) {
0526:                    final int searchFrom = colonIdx - 1;
0527:                    // search backwards starting from character before ':'.
0528:                    int slashIdx = uriSpec.lastIndexOf('/', searchFrom);
0529:                    int queryIdx = uriSpec.lastIndexOf('?', searchFrom);
0530:                    int fragmentIdx = uriSpec.lastIndexOf('#', searchFrom);
0531:
0532:                    if (colonIdx == 0 || slashIdx != -1 || queryIdx != -1
0533:                            || fragmentIdx != -1) {
0534:                        // A standalone base is a valid URI according to spec
0535:                        if (colonIdx == 0
0536:                                || (p_base == null && fragmentIdx != 0 && !allowNonAbsoluteURI)) {
0537:                            throw new MalformedURIException(
0538:                                    "No scheme found in URI.");
0539:                        }
0540:                    } else {
0541:                        initializeScheme(uriSpec);
0542:                        index = m_scheme.length() + 1;
0543:
0544:                        // Neither 'scheme:' or 'scheme:#fragment' are valid URIs.
0545:                        if (colonIdx == uriSpecLen - 1
0546:                                || uriSpec.charAt(colonIdx + 1) == '#') {
0547:                            throw new MalformedURIException(
0548:                                    "Scheme specific part cannot be empty.");
0549:                        }
0550:                    }
0551:                } else if (p_base == null && uriSpec.indexOf('#') != 0
0552:                        && !allowNonAbsoluteURI) {
0553:                    throw new MalformedURIException("No scheme found in URI.");
0554:                }
0555:
0556:                // Two slashes means we may have authority, but definitely means we're either
0557:                // matching net_path or abs_path. These two productions are ambiguous in that
0558:                // every net_path (except those containing an IPv6Reference) is an abs_path. 
0559:                // RFC 2396 resolves this ambiguity by applying a greedy leftmost matching rule.
0560:                // Try matching net_path first, and if that fails we don't have authority so 
0561:                // then attempt to match abs_path.
0562:                //
0563:                // net_path = "//" authority [ abs_path ]
0564:                // abs_path = "/"  path_segments
0565:                if (((index + 1) < uriSpecLen)
0566:                        && (uriSpec.charAt(index) == '/' && uriSpec
0567:                                .charAt(index + 1) == '/')) {
0568:                    index += 2;
0569:                    int startPos = index;
0570:
0571:                    // Authority will be everything up to path, query or fragment
0572:                    char testChar = '\0';
0573:                    while (index < uriSpecLen) {
0574:                        testChar = uriSpec.charAt(index);
0575:                        if (testChar == '/' || testChar == '?'
0576:                                || testChar == '#') {
0577:                            break;
0578:                        }
0579:                        index++;
0580:                    }
0581:
0582:                    // Attempt to parse authority. If the section is an empty string
0583:                    // this is a valid server based authority, so set the host to this
0584:                    // value.
0585:                    if (index > startPos) {
0586:                        // If we didn't find authority we need to back up. Attempt to
0587:                        // match against abs_path next.
0588:                        if (!initializeAuthority(uriSpec.substring(startPos,
0589:                                index))) {
0590:                            index = startPos - 2;
0591:                        }
0592:                    } else {
0593:                        m_host = "";
0594:                    }
0595:                }
0596:
0597:                initializePath(uriSpec, index);
0598:
0599:                // Resolve relative URI to base URI - see RFC 2396 Section 5.2
0600:                // In some cases, it might make more sense to throw an exception
0601:                // (when scheme is specified in the string spec and the base URI
0602:                // is also specified, for example), but we're just following the
0603:                // RFC specifications
0604:                if (p_base != null) {
0605:                    absolutize(p_base);
0606:                }
0607:            }
0608:
0609:            /**
0610:             * Initializes this URI from a base URI and a URI specification string.
0611:             * See RFC 2396 Section 4 and Appendix B for specifications on parsing
0612:             * the URI and Section 5 for specifications on resolving relative URIs
0613:             * and relative paths.
0614:             *
0615:             * @param p_base the base URI (may be null if p_uriSpec is an absolute
0616:             *               URI)
0617:             * @param p_uriSpec the URI spec string which may be an absolute or
0618:             *                  relative URI (can only be null/empty if p_base
0619:             *                  is not null)
0620:             *
0621:             * @exception MalformedURIException if p_base is null and p_uriSpec
0622:             *                                  is not an absolute URI or if
0623:             *                                  p_uriSpec violates syntax rules
0624:             */
0625:            private void initialize(URI p_base, String p_uriSpec)
0626:                    throws MalformedURIException {
0627:
0628:                String uriSpec = p_uriSpec;
0629:                int uriSpecLen = (uriSpec != null) ? uriSpec.length() : 0;
0630:
0631:                if (p_base == null && uriSpecLen == 0) {
0632:                    throw new MalformedURIException(
0633:                            "Cannot initialize URI with empty parameters.");
0634:                }
0635:
0636:                // just make a copy of the base if spec is empty
0637:                if (uriSpecLen == 0) {
0638:                    initialize(p_base);
0639:                    return;
0640:                }
0641:
0642:                int index = 0;
0643:
0644:                // Check for scheme, which must be before '/', '?' or '#'.
0645:                int colonIdx = uriSpec.indexOf(':');
0646:                if (colonIdx != -1) {
0647:                    final int searchFrom = colonIdx - 1;
0648:                    // search backwards starting from character before ':'.
0649:                    int slashIdx = uriSpec.lastIndexOf('/', searchFrom);
0650:                    int queryIdx = uriSpec.lastIndexOf('?', searchFrom);
0651:                    int fragmentIdx = uriSpec.lastIndexOf('#', searchFrom);
0652:
0653:                    if (colonIdx == 0 || slashIdx != -1 || queryIdx != -1
0654:                            || fragmentIdx != -1) {
0655:                        // A standalone base is a valid URI according to spec
0656:                        if (colonIdx == 0
0657:                                || (p_base == null && fragmentIdx != 0)) {
0658:                            throw new MalformedURIException(
0659:                                    "No scheme found in URI.");
0660:                        }
0661:                    } else {
0662:                        initializeScheme(uriSpec);
0663:                        index = m_scheme.length() + 1;
0664:
0665:                        // Neither 'scheme:' nor 'scheme:#fragment' is a valid URI.
0666:                        if (colonIdx == uriSpecLen - 1
0667:                                || uriSpec.charAt(colonIdx + 1) == '#') {
0668:                            throw new MalformedURIException(
0669:                                    "Scheme specific part cannot be empty.");
0670:                        }
0671:                    }
0672:                } else if (p_base == null && uriSpec.indexOf('#') != 0) {
0673:                    throw new MalformedURIException("No scheme found in URI.");
0674:                }
0675:
0676:                // Two slashes means we may have authority, but definitely means we're either
0677:                // matching net_path or abs_path. These two productions are ambiguous in that
0678:                // every net_path (except those containing an IPv6Reference) is an abs_path. 
0679:                // RFC 2396 resolves this ambiguity by applying a greedy leftmost matching rule.
0680:                // Try matching net_path first, and if that fails we don't have authority so 
0681:                // then attempt to match abs_path.
0682:                //
0683:                // net_path = "//" authority [ abs_path ]
0684:                // abs_path = "/"  path_segments
0685:                if (((index + 1) < uriSpecLen)
0686:                        && (uriSpec.charAt(index) == '/' && uriSpec
0687:                                .charAt(index + 1) == '/')) {
0688:                    index += 2;
0689:                    int startPos = index;
0690:
0691:                    // Authority will be everything up to path, query or fragment
0692:                    char testChar = '\0';
0693:                    while (index < uriSpecLen) {
0694:                        testChar = uriSpec.charAt(index);
0695:                        if (testChar == '/' || testChar == '?'
0696:                                || testChar == '#') {
0697:                            break;
0698:                        }
0699:                        index++;
0700:                    }
0701:
0702:                    // Attempt to parse authority. If the section is an empty string
0703:                    // this is a valid server based authority, so set the host to this
0704:                    // value.
0705:                    if (index > startPos) {
0706:                        // If we didn't find authority we need to back up. Attempt to
0707:                        // match against abs_path next.
0708:                        if (!initializeAuthority(uriSpec.substring(startPos,
0709:                                index))) {
0710:                            index = startPos - 2;
0711:                        }
0712:                    } else {
0713:                        m_host = "";
0714:                    }
0715:                }
0716:
0717:                initializePath(uriSpec, index);
0718:
0719:                // Resolve relative URI to base URI - see RFC 2396 Section 5.2
0720:                // In some cases, it might make more sense to throw an exception
0721:                // (when scheme is specified in the string spec and the base URI
0722:                // is also specified, for example), but we're just following the
0723:                // RFC specifications
0724:                if (p_base != null) {
0725:                    absolutize(p_base);
0726:                }
0727:            }
0728:
0729:            /**
0730:             * Absolutize URI with given base URI.
0731:             *
0732:             * @param p_base base URI for absolutization
0733:             */
0734:            public void absolutize(URI p_base) {
0735:
0736:                // check to see if this is the current doc - RFC 2396 5.2 #2
0737:                // note that this is slightly different from the RFC spec in that
0738:                // we don't include the check for query string being null
0739:                // - this handles cases where the urispec is just a query
0740:                // string or a fragment (e.g. "?y" or "#s") -
0741:                // see <http://www.ics.uci.edu/~fielding/url/test1.html> which
0742:                // identified this as a bug in the RFC
0743:                if (m_path.length() == 0 && m_scheme == null && m_host == null
0744:                        && m_regAuthority == null) {
0745:                    m_scheme = p_base.getScheme();
0746:                    m_userinfo = p_base.getUserinfo();
0747:                    m_host = p_base.getHost();
0748:                    m_port = p_base.getPort();
0749:                    m_regAuthority = p_base.getRegBasedAuthority();
0750:                    m_path = p_base.getPath();
0751:
0752:                    if (m_queryString == null) {
0753:                        m_queryString = p_base.getQueryString();
0754:
0755:                        if (m_fragment == null) {
0756:                            m_fragment = p_base.getFragment();
0757:                        }
0758:                    }
0759:                    return;
0760:                }
0761:
0762:                // check for scheme - RFC 2396 5.2 #3
0763:                // if we found a scheme, it means absolute URI, so we're done
0764:                if (m_scheme == null) {
0765:                    m_scheme = p_base.getScheme();
0766:                } else {
0767:                    return;
0768:                }
0769:
0770:                // check for authority - RFC 2396 5.2 #4
0771:                // if we found a host, then we've got a network path, so we're done
0772:                if (m_host == null && m_regAuthority == null) {
0773:                    m_userinfo = p_base.getUserinfo();
0774:                    m_host = p_base.getHost();
0775:                    m_port = p_base.getPort();
0776:                    m_regAuthority = p_base.getRegBasedAuthority();
0777:                } else {
0778:                    return;
0779:                }
0780:
0781:                // check for absolute path - RFC 2396 5.2 #5
0782:                if (m_path.length() > 0 && m_path.startsWith("/")) {
0783:                    return;
0784:                }
0785:
0786:                // if we get to this point, we need to resolve relative path
0787:                // RFC 2396 5.2 #6
0788:                String path = "";
0789:                String basePath = p_base.getPath();
0790:
0791:                // 6a - get all but the last segment of the base URI path
0792:                if (basePath != null && basePath.length() > 0) {
0793:                    int lastSlash = basePath.lastIndexOf('/');
0794:                    if (lastSlash != -1) {
0795:                        path = basePath.substring(0, lastSlash + 1);
0796:                    }
0797:                } else if (m_path.length() > 0) {
0798:                    path = "/";
0799:                }
0800:
0801:                // 6b - append the relative URI path
0802:                path = path.concat(m_path);
0803:
0804:                // 6c - remove all "./" where "." is a complete path segment
0805:                int index = -1;
0806:                while ((index = path.indexOf("/./")) != -1) {
0807:                    path = path.substring(0, index + 1).concat(
0808:                            path.substring(index + 3));
0809:                }
0810:
0811:                // 6d - remove "." if path ends with "." as a complete path segment
0812:                if (path.endsWith("/.")) {
0813:                    path = path.substring(0, path.length() - 1);
0814:                }
0815:
0816:                // 6e - remove all "<segment>/../" where "<segment>" is a complete
0817:                // path segment not equal to ".."
0818:                index = 1;
0819:                int segIndex = -1;
0820:                String tempString = null;
0821:
0822:                while ((index = path.indexOf("/../", index)) > 0) {
0823:                    tempString = path.substring(0, index);
0824:                    segIndex = tempString.lastIndexOf('/');
0825:                    if (segIndex != -1) {
0826:                        if (!tempString.substring(segIndex + 1).equals("..")) {
0827:                            path = path.substring(0, segIndex + 1).concat(
0828:                                    path.substring(index + 4));
0829:                            index = segIndex;
0830:                        } else {
0831:                            index += 4;
0832:                        }
0833:                    } else {
0834:                        index += 4;
0835:                    }
0836:                }
0837:
0838:                // 6f - remove ending "<segment>/.." where "<segment>" is a
0839:                // complete path segment
0840:                if (path.endsWith("/..")) {
0841:                    tempString = path.substring(0, path.length() - 3);
0842:                    segIndex = tempString.lastIndexOf('/');
0843:                    if (segIndex != -1) {
0844:                        path = path.substring(0, segIndex + 1);
0845:                    }
0846:                }
0847:                m_path = path;
0848:            }
0849:
0850:            /**
0851:             * Initialize the scheme for this URI from a URI string spec.
0852:             *
0853:             * @param p_uriSpec the URI specification (cannot be null)
0854:             *
0855:             * @exception MalformedURIException if URI does not have a conformant
0856:             *                                  scheme
0857:             */
0858:            private void initializeScheme(String p_uriSpec)
0859:                    throws MalformedURIException {
0860:                int uriSpecLen = p_uriSpec.length();
0861:                int index = 0;
0862:                String scheme = null;
0863:                char testChar = '\0';
0864:
0865:                while (index < uriSpecLen) {
0866:                    testChar = p_uriSpec.charAt(index);
0867:                    if (testChar == ':' || testChar == '/' || testChar == '?'
0868:                            || testChar == '#') {
0869:                        break;
0870:                    }
0871:                    index++;
0872:                }
0873:                scheme = p_uriSpec.substring(0, index);
0874:
0875:                if (scheme.length() == 0) {
0876:                    throw new MalformedURIException("No scheme found in URI.");
0877:                } else {
0878:                    setScheme(scheme);
0879:                }
0880:            }
0881:
0882:            /**
0883:             * Initialize the authority (either server or registry based)
0884:             * for this URI from a URI string spec.
0885:             *
0886:             * @param p_uriSpec the URI specification (cannot be null)
0887:             * 
0888:             * @return true if the given string matched server or registry
0889:             * based authority
0890:             */
0891:            private boolean initializeAuthority(String p_uriSpec) {
0892:
0893:                int index = 0;
0894:                int start = 0;
0895:                int end = p_uriSpec.length();
0896:
0897:                char testChar = '\0';
0898:                String userinfo = null;
0899:
0900:                // userinfo is everything up to @
0901:                if (p_uriSpec.indexOf('@', start) != -1) {
0902:                    while (index < end) {
0903:                        testChar = p_uriSpec.charAt(index);
0904:                        if (testChar == '@') {
0905:                            break;
0906:                        }
0907:                        index++;
0908:                    }
0909:                    userinfo = p_uriSpec.substring(start, index);
0910:                    index++;
0911:                }
0912:
0913:                // host is everything up to last ':', or up to 
0914:                // and including ']' if followed by ':'.
0915:                String host = null;
0916:                start = index;
0917:                boolean hasPort = false;
0918:                if (index < end) {
0919:                    if (p_uriSpec.charAt(start) == '[') {
0920:                        int bracketIndex = p_uriSpec.indexOf(']', start);
0921:                        index = (bracketIndex != -1) ? bracketIndex : end;
0922:                        if (index + 1 < end
0923:                                && p_uriSpec.charAt(index + 1) == ':') {
0924:                            ++index;
0925:                            hasPort = true;
0926:                        } else {
0927:                            index = end;
0928:                        }
0929:                    } else {
0930:                        int colonIndex = p_uriSpec.lastIndexOf(':', end);
0931:                        index = (colonIndex > start) ? colonIndex : end;
0932:                        hasPort = (index != end);
0933:                    }
0934:                }
0935:                host = p_uriSpec.substring(start, index);
0936:                int port = -1;
0937:                if (host.length() > 0) {
0938:                    // port
0939:                    if (hasPort) {
0940:                        index++;
0941:                        start = index;
0942:                        while (index < end) {
0943:                            index++;
0944:                        }
0945:                        String portStr = p_uriSpec.substring(start, index);
0946:                        if (portStr.length() > 0) {
0947:                            // REVISIT: Remove this code.
0948:                            /** for (int i = 0; i < portStr.length(); i++) {
0949:                              if (!isDigit(portStr.charAt(i))) {
0950:                                throw new MalformedURIException(
0951:                                     portStr +
0952:                                     " is invalid. Port should only contain digits!");
0953:                              }
0954:                            }**/
0955:                            // REVISIT: Remove this code.
0956:                            // Store port value as string instead of integer.
0957:                            try {
0958:                                port = Integer.parseInt(portStr);
0959:                                if (port == -1)
0960:                                    --port;
0961:                            } catch (NumberFormatException nfe) {
0962:                                port = -2;
0963:                            }
0964:                        }
0965:                    }
0966:                }
0967:
0968:                if (isValidServerBasedAuthority(host, port, userinfo)) {
0969:                    m_host = host;
0970:                    m_port = port;
0971:                    m_userinfo = userinfo;
0972:                    return true;
0973:                }
0974:                // Note: Registry based authority is being removed from a
0975:                // new spec for URI which would obsolete RFC 2396. If the
0976:                // spec is added to XML errata, processing of reg_name
0977:                // needs to be removed. - mrglavas.
0978:                else if (isValidRegistryBasedAuthority(p_uriSpec)) {
0979:                    m_regAuthority = p_uriSpec;
0980:                    return true;
0981:                }
0982:                return false;
0983:            }
0984:
0985:            /**
0986:             * Determines whether the components host, port, and user info
0987:             * are valid as a server authority.
0988:             * 
0989:             * @param host the host component of authority
0990:             * @param port the port number component of authority
0991:             * @param userinfo the user info component of authority
0992:             * 
0993:             * @return true if the given host, port, and userinfo compose
0994:             * a valid server authority
0995:             */
0996:            private boolean isValidServerBasedAuthority(String host, int port,
0997:                    String userinfo) {
0998:
0999:                // Check if the host is well formed.
1000:                if (!isWellFormedAddress(host)) {
1001:                    return false;
1002:                }
1003:
1004:                // Check that port is well formed if it exists.
1005:                // REVISIT: There's no restriction on port value ranges, but
1006:                // perform the same check as in setPort to be consistent. Pass
1007:                // in a string to this method instead of an integer.
1008:                if (port < -1 || port > 65535) {
1009:                    return false;
1010:                }
1011:
1012:                // Check that userinfo is well formed if it exists.
1013:                if (userinfo != null) {
1014:                    // Userinfo can contain alphanumerics, mark characters, escaped
1015:                    // octets, and ';',':','&','=','+','$',','
1016:                    int index = 0;
1017:                    int end = userinfo.length();
1018:                    char testChar = '\0';
1019:                    while (index < end) {
1020:                        testChar = userinfo.charAt(index);
1021:                        if (testChar == '%') {
1022:                            if (index + 2 >= end
1023:                                    || !isHex(userinfo.charAt(index + 1))
1024:                                    || !isHex(userinfo.charAt(index + 2))) {
1025:                                return false;
1026:                            }
1027:                            index += 2;
1028:                        } else if (!isUserinfoCharacter(testChar)) {
1029:                            return false;
1030:                        }
1031:                        ++index;
1032:                    }
1033:                }
1034:                return true;
1035:            }
1036:
1037:            /**
1038:             * Determines whether the given string is a registry based authority.
1039:             * 
1040:             * @param authority the authority component of a URI
1041:             * 
1042:             * @return true if the given string is a registry based authority
1043:             */
1044:            private boolean isValidRegistryBasedAuthority(String authority) {
1045:                int index = 0;
1046:                int end = authority.length();
1047:                char testChar;
1048:
1049:                while (index < end) {
1050:                    testChar = authority.charAt(index);
1051:
1052:                    // check for valid escape sequence
1053:                    if (testChar == '%') {
1054:                        if (index + 2 >= end
1055:                                || !isHex(authority.charAt(index + 1))
1056:                                || !isHex(authority.charAt(index + 2))) {
1057:                            return false;
1058:                        }
1059:                        index += 2;
1060:                    }
1061:                    // can check against path characters because the set
1062:                    // is the same except for '/' which we've already excluded.
1063:                    else if (!isPathCharacter(testChar)) {
1064:                        return false;
1065:                    }
1066:                    ++index;
1067:                }
1068:                return true;
1069:            }
1070:
1071:            /**
1072:             * Initialize the path for this URI from a URI string spec.
1073:             *
1074:             * @param p_uriSpec the URI specification (cannot be null)
1075:             * @param p_nStartIndex the index to begin scanning from
1076:             *
1077:             * @exception MalformedURIException if p_uriSpec violates syntax rules
1078:             */
1079:            private void initializePath(String p_uriSpec, int p_nStartIndex)
1080:                    throws MalformedURIException {
1081:                if (p_uriSpec == null) {
1082:                    throw new MalformedURIException(
1083:                            "Cannot initialize path from null string!");
1084:                }
1085:
1086:                int index = p_nStartIndex;
1087:                int start = p_nStartIndex;
1088:                int end = p_uriSpec.length();
1089:                char testChar = '\0';
1090:
1091:                // path - everything up to query string or fragment
1092:                if (start < end) {
1093:                    // RFC 2732 only allows '[' and ']' to appear in the opaque part.
1094:                    if (getScheme() == null || p_uriSpec.charAt(start) == '/') {
1095:
1096:                        // Scan path.
1097:                        // abs_path = "/"  path_segments
1098:                        // rel_path = rel_segment [ abs_path ]
1099:                        while (index < end) {
1100:                            testChar = p_uriSpec.charAt(index);
1101:
1102:                            // check for valid escape sequence
1103:                            if (testChar == '%') {
1104:                                if (index + 2 >= end
1105:                                        || !isHex(p_uriSpec.charAt(index + 1))
1106:                                        || !isHex(p_uriSpec.charAt(index + 2))) {
1107:                                    throw new MalformedURIException(
1108:                                            "Path contains invalid escape sequence!");
1109:                                }
1110:                                index += 2;
1111:                            }
1112:                            // Path segments cannot contain '[' or ']' since pchar
1113:                            // production was not changed by RFC 2732.
1114:                            else if (!isPathCharacter(testChar)) {
1115:                                if (testChar == '?' || testChar == '#') {
1116:                                    break;
1117:                                }
1118:                                throw new MalformedURIException(
1119:                                        "Path contains invalid character: "
1120:                                                + testChar);
1121:                            }
1122:                            ++index;
1123:                        }
1124:                    } else {
1125:
1126:                        // Scan opaque part.
1127:                        // opaque_part = uric_no_slash *uric
1128:                        while (index < end) {
1129:                            testChar = p_uriSpec.charAt(index);
1130:
1131:                            if (testChar == '?' || testChar == '#') {
1132:                                break;
1133:                            }
1134:
1135:                            // check for valid escape sequence
1136:                            if (testChar == '%') {
1137:                                if (index + 2 >= end
1138:                                        || !isHex(p_uriSpec.charAt(index + 1))
1139:                                        || !isHex(p_uriSpec.charAt(index + 2))) {
1140:                                    throw new MalformedURIException(
1141:                                            "Opaque part contains invalid escape sequence!");
1142:                                }
1143:                                index += 2;
1144:                            }
1145:                            // If the scheme specific part is opaque, it can contain '['
1146:                            // and ']'. uric_no_slash wasn't modified by RFC 2732, which
1147:                            // I've interpreted as an error in the spec, since the 
1148:                            // production should be equivalent to (uric - '/'), and uric
1149:                            // contains '[' and ']'. - mrglavas
1150:                            else if (!isURICharacter(testChar)) {
1151:                                throw new MalformedURIException(
1152:                                        "Opaque part contains invalid character: "
1153:                                                + testChar);
1154:                            }
1155:                            ++index;
1156:                        }
1157:                    }
1158:                }
1159:                m_path = p_uriSpec.substring(start, index);
1160:
1161:                // query - starts with ? and up to fragment or end
1162:                if (testChar == '?') {
1163:                    index++;
1164:                    start = index;
1165:                    while (index < end) {
1166:                        testChar = p_uriSpec.charAt(index);
1167:                        if (testChar == '#') {
1168:                            break;
1169:                        }
1170:                        if (testChar == '%') {
1171:                            if (index + 2 >= end
1172:                                    || !isHex(p_uriSpec.charAt(index + 1))
1173:                                    || !isHex(p_uriSpec.charAt(index + 2))) {
1174:                                throw new MalformedURIException(
1175:                                        "Query string contains invalid escape sequence!");
1176:                            }
1177:                            index += 2;
1178:                        } else if (!isURICharacter(testChar)) {
1179:                            throw new MalformedURIException(
1180:                                    "Query string contains invalid character: "
1181:                                            + testChar);
1182:                        }
1183:                        index++;
1184:                    }
1185:                    m_queryString = p_uriSpec.substring(start, index);
1186:                }
1187:
1188:                // fragment - starts with #
1189:                if (testChar == '#') {
1190:                    index++;
1191:                    start = index;
1192:                    while (index < end) {
1193:                        testChar = p_uriSpec.charAt(index);
1194:
1195:                        if (testChar == '%') {
1196:                            if (index + 2 >= end
1197:                                    || !isHex(p_uriSpec.charAt(index + 1))
1198:                                    || !isHex(p_uriSpec.charAt(index + 2))) {
1199:                                throw new MalformedURIException(
1200:                                        "Fragment contains invalid escape sequence!");
1201:                            }
1202:                            index += 2;
1203:                        } else if (!isURICharacter(testChar)) {
1204:                            throw new MalformedURIException(
1205:                                    "Fragment contains invalid character: "
1206:                                            + testChar);
1207:                        }
1208:                        index++;
1209:                    }
1210:                    m_fragment = p_uriSpec.substring(start, index);
1211:                }
1212:            }
1213:
1214:            /**
1215:             * Get the scheme for this URI.
1216:             *
1217:             * @return the scheme for this URI
1218:             */
1219:            public String getScheme() {
1220:                return m_scheme;
1221:            }
1222:
1223:            /**
1224:             * Get the scheme-specific part for this URI (everything following the
1225:             * scheme and the first colon). See RFC 2396 Section 5.2 for spec.
1226:             *
1227:             * @return the scheme-specific part for this URI
1228:             */
1229:            public String getSchemeSpecificPart() {
1230:                StringBuffer schemespec = new StringBuffer();
1231:
1232:                if (m_host != null || m_regAuthority != null) {
1233:                    schemespec.append("//");
1234:
1235:                    // Server based authority.
1236:                    if (m_host != null) {
1237:
1238:                        if (m_userinfo != null) {
1239:                            schemespec.append(m_userinfo);
1240:                            schemespec.append('@');
1241:                        }
1242:
1243:                        schemespec.append(m_host);
1244:
1245:                        if (m_port != -1) {
1246:                            schemespec.append(':');
1247:                            schemespec.append(m_port);
1248:                        }
1249:                    }
1250:                    // Registry based authority.
1251:                    else {
1252:                        schemespec.append(m_regAuthority);
1253:                    }
1254:                }
1255:
1256:                if (m_path != null) {
1257:                    schemespec.append(m_path);
1258:                }
1259:
1260:                if (m_queryString != null) {
1261:                    schemespec.append('?');
1262:                    schemespec.append(m_queryString);
1263:                }
1264:
1265:                if (m_fragment != null) {
1266:                    schemespec.append('#');
1267:                    schemespec.append(m_fragment);
1268:                }
1269:
1270:                return schemespec.toString();
1271:            }
1272:
1273:            /**
1274:             * Get the userinfo for this URI.
1275:             *
1276:             * @return the userinfo for this URI (null if not specified).
1277:             */
1278:            public String getUserinfo() {
1279:                return m_userinfo;
1280:            }
1281:
1282:            /**
1283:             * Get the host for this URI.
1284:             *
1285:             * @return the host for this URI (null if not specified).
1286:             */
1287:            public String getHost() {
1288:                return m_host;
1289:            }
1290:
1291:            /**
1292:             * Get the port for this URI.
1293:             *
1294:             * @return the port for this URI (-1 if not specified).
1295:             */
1296:            public int getPort() {
1297:                return m_port;
1298:            }
1299:
1300:            /**
1301:             * Get the registry based authority for this URI.
1302:             * 
1303:             * @return the registry based authority (null if not specified).
1304:             */
1305:            public String getRegBasedAuthority() {
1306:                return m_regAuthority;
1307:            }
1308:
1309:            /**
1310:             * Get the authority for this URI.
1311:             * 
1312:             * @return the authority prefixed with "//" (empty string if none is set)
1313:             */
1314:            public String getAuthority() {
1315:                StringBuffer authority = new StringBuffer();
1316:                if (m_host != null || m_regAuthority != null) {
1317:                    authority.append("//");
1318:
1319:                    // Server based authority.
1320:                    if (m_host != null) {
1321:
1322:                        if (m_userinfo != null) {
1323:                            authority.append(m_userinfo);
1324:                            authority.append('@');
1325:                        }
1326:
1327:                        authority.append(m_host);
1328:
1329:                        if (m_port != -1) {
1330:                            authority.append(':');
1331:                            authority.append(m_port);
1332:                        }
1333:                    }
1334:                    // Registry based authority.
1335:                    else {
1336:                        authority.append(m_regAuthority);
1337:                    }
1338:                }
1339:                return authority.toString();
1340:            }
1341:
1342:            /**
1343:             * Get the path for this URI (optionally with the query string and
1344:             * fragment).
1345:             *
1346:             * @param p_includeQueryString if true (and query string is not null),
1347:             *                             then a "?" followed by the query string
1348:             *                             will be appended
1349:             * @param p_includeFragment if true (and fragment is not null),
1350:             *                             then a "#" followed by the fragment
1351:             *                             will be appended
1352:             *
1353:             * @return the path for this URI possibly including the query string
1354:             *         and fragment
1355:             */
1356:            public String getPath(boolean p_includeQueryString,
1357:                    boolean p_includeFragment) {
1358:                StringBuffer pathString = new StringBuffer(m_path);
1359:
1360:                if (p_includeQueryString && m_queryString != null) {
1361:                    pathString.append('?');
1362:                    pathString.append(m_queryString);
1363:                }
1364:
1365:                if (p_includeFragment && m_fragment != null) {
1366:                    pathString.append('#');
1367:                    pathString.append(m_fragment);
1368:                }
1369:                return pathString.toString();
1370:            }
1371:
1372:            /**
1373:             * Get the path for this URI. Note that the value returned is the path
1374:             * only and does not include the query string or fragment.
1375:             *
1376:             * @return the path for this URI.
1377:             */
1378:            public String getPath() {
1379:                return m_path;
1380:            }
1381:
1382:            /**
1383:             * Get the query string for this URI.
1384:             *
1385:             * @return the query string for this URI. Null is returned if there
1386:             *         was no "?" in the URI spec, empty string if there was a
1387:             *         "?" but no query string following it.
1388:             */
1389:            public String getQueryString() {
1390:                return m_queryString;
1391:            }
1392:
1393:            /**
1394:             * Get the fragment for this URI.
1395:             *
1396:             * @return the fragment for this URI. Null is returned if there
1397:             *         was no "#" in the URI spec, empty string if there was a
1398:             *         "#" but no fragment following it.
1399:             */
1400:            public String getFragment() {
1401:                return m_fragment;
1402:            }
1403:
1404:            /**
1405:             * Set the scheme for this URI. The scheme is converted to lowercase
1406:             * before it is set.
1407:             *
1408:             * @param p_scheme the scheme for this URI (cannot be null)
1409:             *
1410:             * @exception MalformedURIException if p_scheme is not a conformant
1411:             *                                  scheme name
1412:             */
1413:            public void setScheme(String p_scheme) throws MalformedURIException {
1414:                if (p_scheme == null) {
1415:                    throw new MalformedURIException(
1416:                            "Cannot set scheme from null string!");
1417:                }
1418:                if (!isConformantSchemeName(p_scheme)) {
1419:                    throw new MalformedURIException(
1420:                            "The scheme is not conformant.");
1421:                }
1422:
1423:                m_scheme = p_scheme.toLowerCase();
1424:            }
1425:
1426:            /**
1427:             * Set the userinfo for this URI. If a non-null value is passed in and
1428:             * the host value is null, then an exception is thrown.
1429:             *
1430:             * @param p_userinfo the userinfo for this URI
1431:             *
1432:             * @exception MalformedURIException if p_userinfo contains invalid
1433:             *                                  characters
1434:             */
1435:            public void setUserinfo(String p_userinfo)
1436:                    throws MalformedURIException {
1437:                if (p_userinfo == null) {
1438:                    m_userinfo = null;
1439:                    return;
1440:                } else {
1441:                    if (m_host == null) {
1442:                        throw new MalformedURIException(
1443:                                "Userinfo cannot be set when host is null!");
1444:                    }
1445:
1446:                    // userinfo can contain alphanumerics, mark characters, escaped
1447:                    // octets (%HH), and ';',':','&','=','+','$',','
1448:                    int index = 0;
1449:                    int end = p_userinfo.length();
1450:                    char testChar = '\0';
1451:                    while (index < end) {
1452:                        testChar = p_userinfo.charAt(index);
1453:                        if (testChar == '%') {
1454:                            if (index + 2 >= end
1455:                                    || !isHex(p_userinfo.charAt(index + 1))
1456:                                    || !isHex(p_userinfo.charAt(index + 2))) {
1457:                                throw new MalformedURIException(
1458:                                        "Userinfo contains invalid escape sequence!");
1459:                            }
1460:                        } else if (!isUserinfoCharacter(testChar)) {
1461:                            throw new MalformedURIException(
1462:                                    "Userinfo contains invalid character: "
1463:                                            + testChar);
1464:                        }
1465:                        index++;
1466:                    }
1467:                }
1468:                m_userinfo = p_userinfo;
1469:            }
1470:
1471:            /**
1472:             * <p>Set the host for this URI. If null or an empty string is passed
1473:             * in, the userinfo field is also set to null and the port is set to -1.</p>
1474:             * 
1475:             * <p>Note: This method overwrites registry based authority if it
1476:             * previously existed in this URI.</p>
1477:             *
1478:             * @param p_host the host for this URI
1479:             *
1480:             * @exception MalformedURIException if p_host is not a valid IP
1481:             *                                  address or DNS hostname.
1482:             */
1483:            public void setHost(String p_host) throws MalformedURIException {
1484:                if (p_host == null || p_host.length() == 0) {
1485:                    if (p_host != null) {
1486:                        m_regAuthority = null;
1487:                    }
1488:                    m_host = p_host;
1489:                    m_userinfo = null;
1490:                    m_port = -1;
1491:                    return;
1492:                } else if (!isWellFormedAddress(p_host)) {
1493:                    throw new MalformedURIException(
1494:                            "Host is not a well formed address!");
1495:                }
1496:                m_host = p_host;
1497:                m_regAuthority = null;
1498:            }
1499:
1500:            /**
1501:             * Set the port for this URI. -1 is used to indicate that the port is
1502:             * not specified, otherwise valid port numbers are between 0 and 65535.
1503:             * If a valid port number is passed in and the host field is null,
1504:             * an exception is thrown.
1505:             *
1506:             * @param p_port the port number for this URI
1507:             *
1508:             * @exception MalformedURIException if p_port is not -1 and not a
1509:             *                                  valid port number
1510:             */
1511:            public void setPort(int p_port) throws MalformedURIException {
1512:                if (p_port >= 0 && p_port <= 65535) {
1513:                    if (m_host == null) {
1514:                        throw new MalformedURIException(
1515:                                "Port cannot be set when host is null!");
1516:                    }
1517:                } else if (p_port != -1) {
1518:                    throw new MalformedURIException("Invalid port number!");
1519:                }
1520:                m_port = p_port;
1521:            }
1522:
1523:            /**
1524:             * <p>Sets the registry based authority for this URI.</p>
1525:             * 
1526:             * <p>Note: This method overwrites server based authority
1527:             * if it previously existed in this URI.</p>
1528:             * 
1529:             * @param authority the registry based authority for this URI
1530:             * 
1531:             * @exception MalformedURIException if authority is not a
1532:             * well formed registry based authority
1533:             */
1534:            public void setRegBasedAuthority(String authority)
1535:                    throws MalformedURIException {
1536:
1537:                if (authority == null) {
1538:                    m_regAuthority = null;
1539:                    return;
1540:                }
1541:                // reg_name = 1*( unreserved | escaped | "$" | "," | 
1542:                //            ";" | ":" | "@" | "&" | "=" | "+" )
1543:                else if (authority.length() < 1
1544:                        || !isValidRegistryBasedAuthority(authority)
1545:                        || authority.indexOf('/') != -1) {
1546:                    throw new MalformedURIException(
1547:                            "Registry based authority is not well formed.");
1548:                }
1549:                m_regAuthority = authority;
1550:                m_host = null;
1551:                m_userinfo = null;
1552:                m_port = -1;
1553:            }
1554:
1555:            /**
1556:             * Set the path for this URI. If the supplied path is null, then the
1557:             * query string and fragment are set to null as well. If the supplied
1558:             * path includes a query string and/or fragment, these fields will be
1559:             * parsed and set as well. Note that, for URIs following the "generic
1560:             * URI" syntax, the path specified should start with a slash.
1561:             * For URIs that do not follow the generic URI syntax, this method
1562:             * sets the scheme-specific part.
1563:             *
1564:             * @param p_path the path for this URI (may be null)
1565:             *
1566:             * @exception MalformedURIException if p_path contains invalid
1567:             *                                  characters
1568:             */
1569:            public void setPath(String p_path) throws MalformedURIException {
1570:                if (p_path == null) {
1571:                    m_path = null;
1572:                    m_queryString = null;
1573:                    m_fragment = null;
1574:                } else {
1575:                    initializePath(p_path, 0);
1576:                }
1577:            }
1578:
1579:            /**
1580:             * Append to the end of the path of this URI. If the current path does
1581:             * not end in a slash and the path to be appended does not begin with
1582:             * a slash, a slash will be appended to the current path before the
1583:             * new segment is added. Also, if the current path ends in a slash
1584:             * and the new segment begins with a slash, the extra slash will be
1585:             * removed before the new segment is appended.
1586:             *
1587:             * @param p_addToPath the new segment to be added to the current path
1588:             *
1589:             * @exception MalformedURIException if p_addToPath contains syntax
1590:             *                                  errors
1591:             */
1592:            public void appendPath(String p_addToPath)
1593:                    throws MalformedURIException {
1594:                if (p_addToPath == null || p_addToPath.trim().length() == 0) {
1595:                    return;
1596:                }
1597:
1598:                if (!isURIString(p_addToPath)) {
1599:                    throw new MalformedURIException(
1600:                            "Path contains invalid character!");
1601:                }
1602:
1603:                if (m_path == null || m_path.trim().length() == 0) {
1604:                    if (p_addToPath.startsWith("/")) {
1605:                        m_path = p_addToPath;
1606:                    } else {
1607:                        m_path = "/" + p_addToPath;
1608:                    }
1609:                } else if (m_path.endsWith("/")) {
1610:                    if (p_addToPath.startsWith("/")) {
1611:                        m_path = m_path.concat(p_addToPath.substring(1));
1612:                    } else {
1613:                        m_path = m_path.concat(p_addToPath);
1614:                    }
1615:                } else {
1616:                    if (p_addToPath.startsWith("/")) {
1617:                        m_path = m_path.concat(p_addToPath);
1618:                    } else {
1619:                        m_path = m_path.concat("/" + p_addToPath);
1620:                    }
1621:                }
1622:            }
1623:
1624:            /**
1625:             * Set the query string for this URI. A non-null value is valid only
1626:             * if this is a URI conforming to the generic URI syntax and
1627:             * the path value is not null.
1628:             *
1629:             * @param p_queryString the query string for this URI
1630:             *
1631:             * @exception MalformedURIException if p_queryString is not null and this
1632:             *                                  URI does not conform to the generic
1633:             *                                  URI syntax or if the path is null
1634:             */
1635:            public void setQueryString(String p_queryString)
1636:                    throws MalformedURIException {
1637:                if (p_queryString == null) {
1638:                    m_queryString = null;
1639:                } else if (!isGenericURI()) {
1640:                    throw new MalformedURIException(
1641:                            "Query string can only be set for a generic URI!");
1642:                } else if (getPath() == null) {
1643:                    throw new MalformedURIException(
1644:                            "Query string cannot be set when path is null!");
1645:                } else if (!isURIString(p_queryString)) {
1646:                    throw new MalformedURIException(
1647:                            "Query string contains invalid character!");
1648:                } else {
1649:                    m_queryString = p_queryString;
1650:                }
1651:            }
1652:
1653:            /**
1654:             * Set the fragment for this URI. A non-null value is valid only
1655:             * if this is a URI conforming to the generic URI syntax and
1656:             * the path value is not null.
1657:             *
1658:             * @param p_fragment the fragment for this URI
1659:             *
1660:             * @exception MalformedURIException if p_fragment is not null and this
1661:             *                                  URI does not conform to the generic
1662:             *                                  URI syntax or if the path is null
1663:             */
1664:            public void setFragment(String p_fragment)
1665:                    throws MalformedURIException {
1666:                if (p_fragment == null) {
1667:                    m_fragment = null;
1668:                } else if (!isGenericURI()) {
1669:                    throw new MalformedURIException(
1670:                            "Fragment can only be set for a generic URI!");
1671:                } else if (getPath() == null) {
1672:                    throw new MalformedURIException(
1673:                            "Fragment cannot be set when path is null!");
1674:                } else if (!isURIString(p_fragment)) {
1675:                    throw new MalformedURIException(
1676:                            "Fragment contains invalid character!");
1677:                } else {
1678:                    m_fragment = p_fragment;
1679:                }
1680:            }
1681:
1682:            /**
1683:             * Determines if the passed-in Object is equivalent to this URI.
1684:             *
1685:             * @param p_test the Object to test for equality.
1686:             *
1687:             * @return true if p_test is a URI with all values equal to this
1688:             *         URI, false otherwise
1689:             */
1690:            public boolean equals(Object p_test) {
1691:                if (p_test instanceof URI) {
1692:                    URI testURI = (URI) p_test;
1693:                    if (((m_scheme == null && testURI.m_scheme == null) || (m_scheme != null
1694:                            && testURI.m_scheme != null && m_scheme
1695:                            .equals(testURI.m_scheme)))
1696:                            && ((m_userinfo == null && testURI.m_userinfo == null) || (m_userinfo != null
1697:                                    && testURI.m_userinfo != null && m_userinfo
1698:                                    .equals(testURI.m_userinfo)))
1699:                            && ((m_host == null && testURI.m_host == null) || (m_host != null
1700:                                    && testURI.m_host != null && m_host
1701:                                    .equals(testURI.m_host)))
1702:                            && m_port == testURI.m_port
1703:                            && ((m_path == null && testURI.m_path == null) || (m_path != null
1704:                                    && testURI.m_path != null && m_path
1705:                                    .equals(testURI.m_path)))
1706:                            && ((m_queryString == null && testURI.m_queryString == null) || (m_queryString != null
1707:                                    && testURI.m_queryString != null && m_queryString
1708:                                    .equals(testURI.m_queryString)))
1709:                            && ((m_fragment == null && testURI.m_fragment == null) || (m_fragment != null
1710:                                    && testURI.m_fragment != null && m_fragment
1711:                                    .equals(testURI.m_fragment)))) {
1712:                        return true;
1713:                    }
1714:                }
1715:                return false;
1716:            }
1717:
1718:            /**
1719:             * Get the URI as a string specification. See RFC 2396 Section 5.2.
1720:             *
1721:             * @return the URI string specification
1722:             */
1723:            public String toString() {
1724:                StringBuffer uriSpecString = new StringBuffer();
1725:
1726:                if (m_scheme != null) {
1727:                    uriSpecString.append(m_scheme);
1728:                    uriSpecString.append(':');
1729:                }
1730:                uriSpecString.append(getSchemeSpecificPart());
1731:                return uriSpecString.toString();
1732:            }
1733:
1734:            /**
1735:             * Get the indicator as to whether this URI uses the "generic URI"
1736:             * syntax.
1737:             *
1738:             * @return true if this URI uses the "generic URI" syntax, false
1739:             *         otherwise
1740:             */
1741:            public boolean isGenericURI() {
1742:                // presence of the host (whether valid or empty) means
1743:                // double-slashes which means generic uri
1744:                return (m_host != null);
1745:            }
1746:
1747:            /**
1748:             * Returns whether this URI represents an absolute URI.
1749:             *
1750:             * @return true if this URI represents an absolute URI, false
1751:             *         otherwise
1752:             */
1753:            public boolean isAbsoluteURI() {
1754:                // presence of the scheme means absolute uri
1755:                return (m_scheme != null);
1756:            }
1757:
1758:            /**
1759:             * Determine whether a scheme conforms to the rules for a scheme name.
1760:             * A scheme is conformant if it starts with an alphabetic character, and
1761:             * otherwise contains only alphanumerics, '+', '-' and '.'.
1762:             *
1763:             * @return true if the scheme is conformant, false otherwise
1764:             */
1765:            public static boolean isConformantSchemeName(String p_scheme) {
1766:                if (p_scheme == null || p_scheme.trim().length() == 0) {
1767:                    return false;
1768:                }
1769:
1770:                if (!isAlpha(p_scheme.charAt(0))) {
1771:                    return false;
1772:                }
1773:
1774:                char testChar;
1775:                int schemeLength = p_scheme.length();
1776:                for (int i = 1; i < schemeLength; ++i) {
1777:                    testChar = p_scheme.charAt(i);
1778:                    if (!isSchemeCharacter(testChar)) {
1779:                        return false;
1780:                    }
1781:                }
1782:
1783:                return true;
1784:            }
1785:
1786:            /**
1787:             * Determine whether a string is syntactically capable of representing
1788:             * a valid IPv4 address, IPv6 reference or the domain name of a network host. 
1789:             * A valid IPv4 address consists of four decimal digit groups separated by a
1790:             * '.'. Each group must consist of one to three digits. See RFC 2732 Section 3,
1791:             * and RFC 2373 Section 2.2, for the definition of IPv6 references. A hostname 
1792:             * consists of domain labels (each of which must begin and end with an alphanumeric 
1793:             * but may contain '-') separated by a '.'. See RFC 2396 Section 3.2.2.
1794:             *
1795:             * @return true if the string is a syntactically valid IPv4 address, 
1796:             * IPv6 reference or hostname
1797:             */
1798:            public static boolean isWellFormedAddress(String address) {
1799:                if (address == null) {
1800:                    return false;
1801:                }
1802:
1803:                int addrLength = address.length();
1804:                if (addrLength == 0) {
1805:                    return false;
1806:                }
1807:
1808:                // Check if the host is a valid IPv6reference.
1809:                if (address.startsWith("[")) {
1810:                    return isWellFormedIPv6Reference(address);
1811:                }
1812:
1813:                // Cannot start with a '.', '-', or end with a '-'.
1814:                if (address.startsWith(".") || address.startsWith("-")
1815:                        || address.endsWith("-")) {
1816:                    return false;
1817:                }
1818:
1819:                // rightmost domain label starting with digit indicates IP address
1820:                // since top level domain label can only start with an alpha
1821:                // see RFC 2396 Section 3.2.2
1822:                int index = address.lastIndexOf('.');
1823:                if (address.endsWith(".")) {
1824:                    index = address.substring(0, index).lastIndexOf('.');
1825:                }
1826:
1827:                if (index + 1 < addrLength
1828:                        && isDigit(address.charAt(index + 1))) {
1829:                    return isWellFormedIPv4Address(address);
1830:                } else {
1831:                    // hostname      = *( domainlabel "." ) toplabel [ "." ]
1832:                    // domainlabel   = alphanum | alphanum *( alphanum | "-" ) alphanum
1833:                    // toplabel      = alpha | alpha *( alphanum | "-" ) alphanum
1834:
1835:                    // RFC 2396 states that hostnames take the form described in 
1836:                    // RFC 1034 (Section 3) and RFC 1123 (Section 2.1). According
1837:                    // to RFC 1034, hostnames are limited to 255 characters.
1838:                    if (addrLength > 255) {
1839:                        return false;
1840:                    }
1841:
1842:                    // domain labels can contain alphanumerics and '-'
1843:                    // but must start and end with an alphanumeric
1844:                    char testChar;
1845:                    int labelCharCount = 0;
1846:
1847:                    for (int i = 0; i < addrLength; i++) {
1848:                        testChar = address.charAt(i);
1849:                        if (testChar == '.') {
1850:                            if (!isAlphanum(address.charAt(i - 1))) {
1851:                                return false;
1852:                            }
1853:                            if (i + 1 < addrLength
1854:                                    && !isAlphanum(address.charAt(i + 1))) {
1855:                                return false;
1856:                            }
1857:                            labelCharCount = 0;
1858:                        } else if (!isAlphanum(testChar) && testChar != '-') {
1859:                            return false;
1860:                        }
1861:                        // RFC 1034: Labels must be 63 characters or less.
1862:                        else if (++labelCharCount > 63) {
1863:                            return false;
1864:                        }
1865:                    }
1866:                }
1867:                return true;
1868:            }
1869:
1870:            /**
1871:             * <p>Determines whether a string is an IPv4 address as defined by 
1872:             * RFC 2373, and under the further constraint that it must be a 32-bit
1873:             * address. Though not expressed in the grammar, in order to satisfy 
1874:             * the 32-bit address constraint, each segment of the address cannot 
1875:             * be greater than 255 (8 bits of information).</p>
1876:             *
1877:             * <p><code>IPv4address = 1*3DIGIT "." 1*3DIGIT "." 1*3DIGIT "." 1*3DIGIT</code></p>
1878:             *
1879:             * @return true if the string is a syntactically valid IPv4 address
1880:             */
1881:            public static boolean isWellFormedIPv4Address(String address) {
1882:
1883:                int addrLength = address.length();
1884:                char testChar;
1885:                int numDots = 0;
1886:                int numDigits = 0;
1887:
1888:                // make sure that 1) we see only digits and dot separators, 2) that
1889:                // any dot separator is preceded and followed by a digit and
1890:                // 3) that we find 3 dots
1891:                //
1892:                // RFC 2732 amended RFC 2396 by replacing the definition 
1893:                // of IPv4address with the one defined by RFC 2373. - mrglavas
1894:                //
1895:                // IPv4address = 1*3DIGIT "." 1*3DIGIT "." 1*3DIGIT "." 1*3DIGIT
1896:                //
1897:                // One to three digits must be in each segment.
1898:                for (int i = 0; i < addrLength; i++) {
1899:                    testChar = address.charAt(i);
1900:                    if (testChar == '.') {
1901:                        if ((i > 0 && !isDigit(address.charAt(i - 1)))
1902:                                || (i + 1 < addrLength && !isDigit(address
1903:                                        .charAt(i + 1)))) {
1904:                            return false;
1905:                        }
1906:                        numDigits = 0;
1907:                        if (++numDots > 3) {
1908:                            return false;
1909:                        }
1910:                    } else if (!isDigit(testChar)) {
1911:                        return false;
1912:                    }
1913:                    // Check that there are no more than three digits
1914:                    // in this segment.
1915:                    else if (++numDigits > 3) {
1916:                        return false;
1917:                    }
1918:                    // Check that this segment is not greater than 255.
1919:                    else if (numDigits == 3) {
1920:                        char first = address.charAt(i - 2);
1921:                        char second = address.charAt(i - 1);
1922:                        if (!(first < '2' || (first == '2' && (second < '5' || (second == '5' && testChar <= '5'))))) {
1923:                            return false;
1924:                        }
1925:                    }
1926:                }
1927:                return (numDots == 3);
1928:            }
1929:
1930:            /**
1931:             * <p>Determines whether a string is an IPv6 reference as defined
1932:             * by RFC 2732, where IPv6address is defined in RFC 2373. The 
1933:             * IPv6 address is parsed according to Section 2.2 of RFC 2373,
1934:             * with the additional constraint that the address be composed of
1935:             * 128 bits of information.</p>
1936:             *
1937:             * <p><code>IPv6reference = "[" IPv6address "]"</code></p>
1938:             *
1939:             * <p>Note: The BNF expressed in RFC 2373 Appendix B does not 
1940:             * accurately describe section 2.2, and was in fact removed from
1941:             * RFC 3513, the successor of RFC 2373.</p>
1942:             *
1943:             * @return true if the string is a syntactically valid IPv6 reference
1944:             */
1945:            public static boolean isWellFormedIPv6Reference(String address) {
1946:
1947:                int addrLength = address.length();
1948:                int index = 1;
1949:                int end = addrLength - 1;
1950:
1951:                // Check if string is a potential match for IPv6reference.
1952:                if (!(addrLength > 2 && address.charAt(0) == '[' && address
1953:                        .charAt(end) == ']')) {
1954:                    return false;
1955:                }
1956:
1957:                // Counter for the number of 16-bit sections read in the address.
1958:                int[] counter = new int[1];
1959:
1960:                // Scan hex sequence before possible '::' or IPv4 address.
1961:                index = scanHexSequence(address, index, end, counter);
1962:                if (index == -1) {
1963:                    return false;
1964:                }
1965:                // Address must contain 128 bits of information.
1966:                else if (index == end) {
1967:                    return (counter[0] == 8);
1968:                }
1969:
1970:                if (index + 1 < end && address.charAt(index) == ':') {
1971:                    if (address.charAt(index + 1) == ':') {
1972:                        // '::' represents at least one 16-bit group of zeros.
1973:                        if (++counter[0] > 8) {
1974:                            return false;
1975:                        }
1976:                        index += 2;
1977:                        // Trailing zeros will fill out the rest of the address.
1978:                        if (index == end) {
1979:                            return true;
1980:                        }
1981:                    }
1982:                    // If the second character wasn't ':', in order to be valid,
1983:                    // the remainder of the string must match IPv4Address, 
1984:                    // and we must have read exactly 6 16-bit groups.
1985:                    else {
1986:                        return (counter[0] == 6)
1987:                                && isWellFormedIPv4Address(address.substring(
1988:                                        index + 1, end));
1989:                    }
1990:                } else {
1991:                    return false;
1992:                }
1993:
1994:                // Scan hex sequence after '::'.
1995:                int prevCount = counter[0];
1996:                index = scanHexSequence(address, index, end, counter);
1997:
1998:                // We've either reached the end of the string, the address ends in
1999:                // an IPv4 address, or it is invalid. scanHexSequence has already 
2000:                // made sure that we have the right number of bits. 
2001:                return (index == end)
2002:                        || (index != -1 && isWellFormedIPv4Address(address
2003:                                .substring((counter[0] > prevCount) ? index + 1
2004:                                        : index, end)));
2005:            }
2006:
2007:            /**
2008:             * Helper method for isWellFormedIPv6Reference which scans the 
2009:             * hex sequences of an IPv6 address. It returns the index of the 
2010:             * next character to scan in the address, or -1 if the string 
2011:             * cannot match a valid IPv6 address. 
2012:             *
2013:             * @param address the string to be scanned
2014:             * @param index the beginning index (inclusive)
2015:             * @param end the ending index (exclusive)
2016:             * @param counter a counter for the number of 16-bit sections read
2017:             * in the address
2018:             *
2019:             * @return the index of the next character to scan, or -1 if the
2020:             * string cannot match a valid IPv6 address
2021:             */
2022:            private static int scanHexSequence(String address, int index,
2023:                    int end, int[] counter) {
2024:
2025:                char testChar;
2026:                int numDigits = 0;
2027:                int start = index;
2028:
2029:                // Trying to match the following productions:
2030:                // hexseq = hex4 *( ":" hex4)
2031:                // hex4   = 1*4HEXDIG
2032:                for (; index < end; ++index) {
2033:                    testChar = address.charAt(index);
2034:                    if (testChar == ':') {
2035:                        // IPv6 addresses are 128-bit, so there can be at most eight sections.
2036:                        if (numDigits > 0 && ++counter[0] > 8) {
2037:                            return -1;
2038:                        }
2039:                        // This could be '::'.
2040:                        if (numDigits == 0
2041:                                || ((index + 1 < end) && address
2042:                                        .charAt(index + 1) == ':')) {
2043:                            return index;
2044:                        }
2045:                        numDigits = 0;
2046:                    }
2047:                    // This might be invalid or an IPv4address. If it's potentially an IPv4address,
2048:                    // back up to just after the last valid character that matches hexseq.
2049:                    else if (!isHex(testChar)) {
2050:                        if (testChar == '.' && numDigits < 4 && numDigits > 0
2051:                                && counter[0] <= 6) {
2052:                            int back = index - numDigits - 1;
2053:                            return (back >= start) ? back : (back + 1);
2054:                        }
2055:                        return -1;
2056:                    }
2057:                    // There can be at most 4 hex digits per group.
2058:                    else if (++numDigits > 4) {
2059:                        return -1;
2060:                    }
2061:                }
2062:                return (numDigits > 0 && ++counter[0] <= 8) ? end : -1;
2063:            }
2064:
2065:            /**
2066:             * Determine whether a char is a digit.
2067:             *
2068:             * @return true if the char is between '0' and '9', false otherwise
2069:             */
2070:            private static boolean isDigit(char p_char) {
2071:                return p_char >= '0' && p_char <= '9';
2072:            }
2073:
2074:            /**
2075:             * Determine whether a character is a hexadecimal character.
2076:             *
2077:             * @return true if the char is between '0' and '9', 'a' and 'f'
2078:             *         or 'A' and 'F', false otherwise
2079:             */
2080:            private static boolean isHex(char p_char) {
2081:                return (p_char <= 'f' && (fgLookupTable[p_char] & ASCII_HEX_CHARACTERS) != 0);
2082:            }
2083:
2084:            /**
2085:             * Determine whether a char is an alphabetic character: a-z or A-Z
2086:             *
2087:             * @return true if the char is alphabetic, false otherwise
2088:             */
2089:            private static boolean isAlpha(char p_char) {
2090:                return ((p_char >= 'a' && p_char <= 'z') || (p_char >= 'A' && p_char <= 'Z'));
2091:            }
2092:
2093:            /**
2094:             * Determine whether a char is an alphanumeric: 0-9, a-z or A-Z
2095:             *
2096:             * @return true if the char is alphanumeric, false otherwise
2097:             */
2098:            private static boolean isAlphanum(char p_char) {
2099:                return (p_char <= 'z' && (fgLookupTable[p_char] & MASK_ALPHA_NUMERIC) != 0);
2100:            }
2101:
2102:            /**
2103:             * Determine whether a character is a reserved character:
2104:             * ';', '/', '?', ':', '@', '&', '=', '+', '$', ',', '[', or ']'
2105:             *
2106:             * @return true if the char is a reserved character, false otherwise
2107:             */
2108:            private static boolean isReservedCharacter(char p_char) {
2109:                return (p_char <= ']' && (fgLookupTable[p_char] & RESERVED_CHARACTERS) != 0);
2110:            }
2111:
2112:            /**
2113:             * Determine whether a char is an unreserved character.
2114:             *
2115:             * @return true if the char is unreserved, false otherwise
2116:             */
2117:            private static boolean isUnreservedCharacter(char p_char) {
2118:                return (p_char <= '~' && (fgLookupTable[p_char] & MASK_UNRESERVED_MASK) != 0);
2119:            }
2120:
2121:            /**
2122:             * Determine whether a char is a URI character (reserved or 
2123:             * unreserved, not including '%' for escaped octets).
2124:             *
2125:             * @return true if the char is a URI character, false otherwise
2126:             */
2127:            private static boolean isURICharacter(char p_char) {
2128:                return (p_char <= '~' && (fgLookupTable[p_char] & MASK_URI_CHARACTER) != 0);
2129:            }
2130:
2131:            /**
2132:             * Determine whether a char is a scheme character.
2133:             *
2134:             * @return true if the char is a scheme character, false otherwise
2135:             */
2136:            private static boolean isSchemeCharacter(char p_char) {
2137:                return (p_char <= 'z' && (fgLookupTable[p_char] & MASK_SCHEME_CHARACTER) != 0);
2138:            }
2139:
2140:            /**
2141:             * Determine whether a char is a userinfo character.
2142:             *
2143:             * @return true if the char is a userinfo character, false otherwise
2144:             */
2145:            private static boolean isUserinfoCharacter(char p_char) {
2146:                return (p_char <= 'z' && (fgLookupTable[p_char] & MASK_USERINFO_CHARACTER) != 0);
2147:            }
2148:
2149:            /**
2150:             * Determine whether a char is a path character.
2151:             * 
2152:             * @return true if the char is a path character, false otherwise
2153:             */
2154:            private static boolean isPathCharacter(char p_char) {
2155:                return (p_char <= '~' && (fgLookupTable[p_char] & MASK_PATH_CHARACTER) != 0);
2156:            }
2157:
2158:            /**
2159:             * Determine whether a given string contains only URI characters (also
2160:             * called "uric" in RFC 2396). uric consist of all reserved
2161:             * characters, unreserved characters and escaped characters.
2162:             *
2163:             * @return true if the string is comprised of uric, false otherwise
2164:             */
2165:            private static boolean isURIString(String p_uric) {
2166:                if (p_uric == null) {
2167:                    return false;
2168:                }
2169:                int end = p_uric.length();
2170:                char testChar = '\0';
2171:                for (int i = 0; i < end; i++) {
2172:                    testChar = p_uric.charAt(i);
2173:                    if (testChar == '%') {
2174:                        if (i + 2 >= end || !isHex(p_uric.charAt(i + 1))
2175:                                || !isHex(p_uric.charAt(i + 2))) {
2176:                            return false;
2177:                        } else {
2178:                            i += 2;
2179:                            continue;
2180:                        }
2181:                    }
2182:                    if (isURICharacter(testChar)) {
2183:                        continue;
2184:                    } else {
2185:                        return false;
2186:                    }
2187:                }
2188:                return true;
2189:            }
2190:        }
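
The IPv4 check in the listing compares digit characters to enforce the 0..255 limit on each segment. As a point of comparison, here is a minimal standalone sketch of the same grammar (`1*3DIGIT "." 1*3DIGIT "." 1*3DIGIT "." 1*3DIGIT`) that accumulates each segment's numeric value instead. The class name `Ipv4Sketch` and the method name `isDottedQuad` are hypothetical and are not part of Xerces. This sketch also rejects an empty first or last segment (a leading or trailing dot), which the listed loop does not check on its own.

```java
final class Ipv4Sketch {

    private Ipv4Sketch() {
    }

    /**
     * Returns true if s consists of exactly four decimal segments separated
     * by dots, each segment having one to three digits and a value of 0..255.
     * Illustrative only; not the Xerces implementation.
     */
    static boolean isDottedQuad(String s) {
        if (s == null) {
            return false;
        }
        int dots = 0;    // number of '.' separators seen so far
        int digits = 0;  // digits seen in the current segment
        int value = 0;   // numeric value of the current segment
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '.') {
                // A dot must close a non-empty segment; at most three dots.
                if (digits == 0 || ++dots > 3) {
                    return false;
                }
                digits = 0;
                value = 0;
            } else if (c >= '0' && c <= '9') {
                value = value * 10 + (c - '0');
                // At most three digits per segment, and at most 255.
                if (++digits > 3 || value > 255) {
                    return false;
                }
            } else {
                return false;
            }
        }
        // Exactly three dots, and the final segment must not be empty.
        return dots == 3 && digits > 0;
    }

    public static void main(String[] args) {
        String[] samples = { "192.168.0.1", "256.1.1.1", "1.2.3", "1.2.3.", ".1.2.3" };
        for (String sample : samples) {
            System.out.println(sample + " -> " + isDottedQuad(sample));
        }
    }
}
```

Accumulating the value keeps the range check to a single comparison, at the cost of a little arithmetic per character. The character-comparison form in the listing avoids the arithmetic altogether.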