Source Code Cross Referenced for StringTokenizer.java (6.0 JDK Core, java.util)

/*
 * Copyright 1994-2004 Sun Microsystems, Inc.  All Rights Reserved.
 * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
 *
 * This code is free software; you can redistribute it and/or modify it
 * under the terms of the GNU General Public License version 2 only, as
 * published by the Free Software Foundation.  Sun designates this
 * particular file as subject to the "Classpath" exception as provided
 * by Sun in the LICENSE file that accompanied this code.
 *
 * This code is distributed in the hope that it will be useful, but WITHOUT
 * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
 * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
 * version 2 for more details (a copy is included in the LICENSE file that
 * accompanied this code).
 *
 * You should have received a copy of the GNU General Public License version
 * 2 along with this work; if not, write to the Free Software Foundation,
 * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
 *
 * Please contact Sun Microsystems, Inc., 4150 Network Circle, Santa Clara,
 * CA 95054 USA or visit www.sun.com if you need additional information or
 * have any questions.
 */

package java.util;

import java.lang.*;

/**
 * The string tokenizer class allows an application to break a
 * string into tokens. The tokenization method is much simpler than
 * the one used by the <code>StreamTokenizer</code> class. The
 * <code>StringTokenizer</code> methods do not distinguish among
 * identifiers, numbers, and quoted strings, nor do they recognize
 * and skip comments.
 * <p>
 * The set of delimiters (the characters that separate tokens) may
 * be specified either at creation time or on a per-token basis.
 * <p>
 * An instance of <code>StringTokenizer</code> behaves in one of two
 * ways, depending on whether it was created with the
 * <code>returnDelims</code> flag having the value <code>true</code>
 * or <code>false</code>:
 * <ul>
 * <li>If the flag is <code>false</code>, delimiter characters serve to
 *     separate tokens. A token is a maximal sequence of consecutive
 *     characters that are not delimiters.
 * <li>If the flag is <code>true</code>, delimiter characters are themselves
 *     considered to be tokens. A token is thus either one delimiter
 *     character, or a maximal sequence of consecutive characters that are
 *     not delimiters.
 * </ul><p>
 * A <tt>StringTokenizer</tt> object internally maintains a current
 * position within the string to be tokenized. Some operations advance this
 * current position past the characters processed.<p>
 * A token is returned by taking a substring of the string that was used to
 * create the <tt>StringTokenizer</tt> object.
 * <p>
 * The following is one example of the use of the tokenizer. The code:
 * <blockquote><pre>
 *     StringTokenizer st = new StringTokenizer("this is a test");
 *     while (st.hasMoreTokens()) {
 *         System.out.println(st.nextToken());
 *     }
 * </pre></blockquote>
 * <p>
 * prints the following output:
 * <blockquote><pre>
 *     this
 *     is
 *     a
 *     test
 * </pre></blockquote>
 *
 * <p>
 * <tt>StringTokenizer</tt> is a legacy class that is retained for
 * compatibility reasons although its use is discouraged in new code. It is
 * recommended that anyone seeking this functionality use the <tt>split</tt>
 * method of <tt>String</tt> or the java.util.regex package instead.
 * <p>
 * The following example illustrates how the <tt>String.split</tt>
 * method can be used to break up a string into its basic tokens:
 * <blockquote><pre>
 *     String[] result = "this is a test".split("\\s");
 *     for (int x=0; x&lt;result.length; x++)
 *         System.out.println(result[x]);
 * </pre></blockquote>
 * <p>
 * prints the following output:
 * <blockquote><pre>
 *     this
 *     is
 *     a
 *     test
 * </pre></blockquote>
 *
 * @author  unascribed
 * @version 1.41, 05/05/07
 * @see     java.io.StreamTokenizer
 * @since   JDK1.0
 */
public class StringTokenizer implements Enumeration<Object> {
    private int currentPosition;
    private int newPosition;
    private int maxPosition;
    private String str;
    private String delimiters;
    private boolean retDelims;
    private boolean delimsChanged;

    /**
     * maxDelimCodePoint stores the value of the delimiter character with the
     * highest value. It is used to optimize the detection of delimiter
     * characters.
     *
     * It is unlikely to provide any optimization benefit in the
     * hasSurrogates case because most string characters will be
     * smaller than the limit, but we keep it so that the two code
     * paths remain similar.
     */
    private int maxDelimCodePoint;

    /**
     * If delimiters include any surrogates (including surrogate
     * pairs), hasSurrogates is true and the tokenizer uses a
     * different code path. This is because String.indexOf(int)
     * doesn't handle unpaired surrogates as a single character.
     */
    private boolean hasSurrogates = false;

    /**
     * When hasSurrogates is true, delimiters are converted to code
     * points and isDelimiter(int) is used to determine if the given
     * codepoint is a delimiter.
     */
    private int[] delimiterCodePoints;

    /**
     * Sets maxDelimCodePoint to the highest code point in the delimiter set.
     * If any delimiter is a surrogate, also sets hasSurrogates and fills
     * delimiterCodePoints.
     */
    private void setMaxDelimCodePoint() {
        if (delimiters == null) {
            maxDelimCodePoint = 0;
            return;
        }

        int m = 0;
        int c;
        int count = 0;
        for (int i = 0; i < delimiters.length(); i += Character.charCount(c)) {
            c = delimiters.charAt(i);
            if (c >= Character.MIN_HIGH_SURROGATE
                    && c <= Character.MAX_LOW_SURROGATE) {
                c = delimiters.codePointAt(i);
                hasSurrogates = true;
            }
            if (m < c)
                m = c;
            count++;
        }
        maxDelimCodePoint = m;

        if (hasSurrogates) {
            delimiterCodePoints = new int[count];
            for (int i = 0, j = 0; i < count; i++, j += Character.charCount(c)) {
                c = delimiters.codePointAt(j);
                delimiterCodePoints[i] = c;
            }
        }
    }

    /**
     * Constructs a string tokenizer for the specified string. All
     * characters in the <code>delim</code> argument are the delimiters
     * for separating tokens.
     * <p>
     * If the <code>returnDelims</code> flag is <code>true</code>, then
     * the delimiter characters are also returned as tokens. Each
     * delimiter is returned as a string of length one. If the flag is
     * <code>false</code>, the delimiter characters are skipped and only
     * serve as separators between tokens.
     * <p>
     * Note that if <tt>delim</tt> is <tt>null</tt>, this constructor does
     * not throw an exception. However, trying to invoke other methods on the
     * resulting <tt>StringTokenizer</tt> may result in a
     * <tt>NullPointerException</tt>.
     *
     * @param   str            a string to be parsed.
     * @param   delim          the delimiters.
     * @param   returnDelims   flag indicating whether to return the delimiters
     *                         as tokens.
     * @exception NullPointerException if str is <CODE>null</CODE>
     */
    public StringTokenizer(String str, String delim,
            boolean returnDelims) {
        currentPosition = 0;
        newPosition = -1;
        delimsChanged = false;
        this.str = str;
        maxPosition = str.length();
        delimiters = delim;
        retDelims = returnDelims;
        setMaxDelimCodePoint();
    }

    /**
     * Constructs a string tokenizer for the specified string. The
     * characters in the <code>delim</code> argument are the delimiters
     * for separating tokens. Delimiter characters themselves will not
     * be treated as tokens.
     * <p>
     * Note that if <tt>delim</tt> is <tt>null</tt>, this constructor does
     * not throw an exception. However, trying to invoke other methods on the
     * resulting <tt>StringTokenizer</tt> may result in a
     * <tt>NullPointerException</tt>.
     *
     * @param   str     a string to be parsed.
     * @param   delim   the delimiters.
     * @exception NullPointerException if str is <CODE>null</CODE>
     */
    public StringTokenizer(String str, String delim) {
        this(str, delim, false);
    }

    /**
     * Constructs a string tokenizer for the specified string. The
     * tokenizer uses the default delimiter set, which is
     * <code>"&nbsp;&#92;t&#92;n&#92;r&#92;f"</code>: the space character,
     * the tab character, the newline character, the carriage-return character,
     * and the form-feed character. Delimiter characters themselves will
     * not be treated as tokens.
     *
     * @param   str   a string to be parsed.
     * @exception NullPointerException if str is <CODE>null</CODE>
     */
    public StringTokenizer(String str) {
        this(str, " \t\n\r\f", false);
    }

    /**
     * Skips delimiters starting from the specified position. If retDelims
     * is false, returns the index of the first non-delimiter character at or
     * after startPos. If retDelims is true, startPos is returned.
     */
    private int skipDelimiters(int startPos) {
        if (delimiters == null)
            throw new NullPointerException();

        int position = startPos;
        while (!retDelims && position < maxPosition) {
            if (!hasSurrogates) {
                char c = str.charAt(position);
                if ((c > maxDelimCodePoint)
                        || (delimiters.indexOf(c) < 0))
                    break;
                position++;
            } else {
                int c = str.codePointAt(position);
                if ((c > maxDelimCodePoint) || !isDelimiter(c)) {
                    break;
                }
                position += Character.charCount(c);
            }
        }
        return position;
    }

    /**
     * Skips ahead from startPos and returns the index of the next delimiter
     * character encountered, or maxPosition if no such delimiter is found.
     */
    private int scanToken(int startPos) {
        int position = startPos;
        while (position < maxPosition) {
            if (!hasSurrogates) {
                char c = str.charAt(position);
                if ((c <= maxDelimCodePoint)
                        && (delimiters.indexOf(c) >= 0))
                    break;
                position++;
            } else {
                int c = str.codePointAt(position);
                if ((c <= maxDelimCodePoint) && isDelimiter(c))
                    break;
                position += Character.charCount(c);
            }
        }
        if (retDelims && (startPos == position)) {
            if (!hasSurrogates) {
                char c = str.charAt(position);
                if ((c <= maxDelimCodePoint)
                        && (delimiters.indexOf(c) >= 0))
                    position++;
            } else {
                int c = str.codePointAt(position);
                if ((c <= maxDelimCodePoint) && isDelimiter(c))
                    position += Character.charCount(c);
            }
        }
        return position;
    }

    /**
     * Returns true if codePoint is one of delimiterCodePoints. Used only
     * on the hasSurrogates code path.
     */
    private boolean isDelimiter(int codePoint) {
        for (int i = 0; i < delimiterCodePoints.length; i++) {
            if (delimiterCodePoints[i] == codePoint) {
                return true;
            }
        }
        return false;
    }

    /**
     * Tests if there are more tokens available from this tokenizer's string.
     * If this method returns <tt>true</tt>, then a subsequent call to
     * <tt>nextToken</tt> with no argument will successfully return a token.
     *
     * @return  <code>true</code> if and only if there is at least one token
     *          in the string after the current position; <code>false</code>
     *          otherwise.
     */
    public boolean hasMoreTokens() {
        /*
         * Temporarily store this position and use it in the following
         * nextToken() method only if the delimiters haven't been changed in
         * that nextToken() invocation.
         */
        newPosition = skipDelimiters(currentPosition);
        return (newPosition < maxPosition);
    }

    /**
     * Returns the next token from this string tokenizer.
     *
     * @return     the next token from this string tokenizer.
     * @exception  NoSuchElementException  if there are no more tokens in this
     *               tokenizer's string.
     */
    public String nextToken() {
        /*
         * If the next position was already computed in hasMoreTokens() and
         * the delimiters have not changed between that computation and this
         * invocation, then use the computed value.
         */

        currentPosition = (newPosition >= 0 && !delimsChanged) ? newPosition
                : skipDelimiters(currentPosition);

        /* Reset these anyway */
        delimsChanged = false;
        newPosition = -1;

        if (currentPosition >= maxPosition)
            throw new NoSuchElementException();
        int start = currentPosition;
        currentPosition = scanToken(currentPosition);
        return str.substring(start, currentPosition);
    }

    /**
     * Returns the next token in this string tokenizer's string. First,
     * the set of characters considered to be delimiters by this
     * <tt>StringTokenizer</tt> object is changed to be the characters in
     * the string <tt>delim</tt>. Then the next token in the string
     * after the current position is returned. The current position is
     * advanced beyond the recognized token.  The new delimiter set
     * remains the default after this call.
     *
     * @param      delim   the new delimiters.
     * @return     the next token, after switching to the new delimiter set.
     * @exception  NoSuchElementException  if there are no more tokens in this
     *               tokenizer's string.
     * @exception NullPointerException if delim is <CODE>null</CODE>
     */
    public String nextToken(String delim) {
        delimiters = delim;

        /* delimiter string specified, so set the appropriate flag. */
        delimsChanged = true;

        setMaxDelimCodePoint();
        return nextToken();
    }

    /**
     * Returns the same value as the <code>hasMoreTokens</code>
     * method. It exists so that this class can implement the
     * <code>Enumeration</code> interface.
     *
     * @return  <code>true</code> if there are more tokens;
     *          <code>false</code> otherwise.
     * @see     java.util.Enumeration
     * @see     java.util.StringTokenizer#hasMoreTokens()
     */
    public boolean hasMoreElements() {
        return hasMoreTokens();
    }

    /**
     * Returns the same value as the <code>nextToken</code> method,
     * except that its declared return value is <code>Object</code> rather than
     * <code>String</code>. It exists so that this class can implement the
     * <code>Enumeration</code> interface.
     *
     * @return     the next token in the string.
     * @exception  NoSuchElementException  if there are no more tokens in this
     *               tokenizer's string.
     * @see        java.util.Enumeration
     * @see        java.util.StringTokenizer#nextToken()
     */
    public Object nextElement() {
        return nextToken();
    }

    /**
     * Calculates the number of times that this tokenizer's
     * <code>nextToken</code> method can be called before it generates an
     * exception. The current position is not advanced.
     *
     * @return  the number of tokens remaining in the string using the current
     *          delimiter set.
     * @see     java.util.StringTokenizer#nextToken()
     */
    public int countTokens() {
        int count = 0;
        int currpos = currentPosition;
        while (currpos < maxPosition) {
            currpos = skipDelimiters(currpos);
            if (currpos >= maxPosition)
                break;
            currpos = scanToken(currpos);
            count++;
        }
        return count;
    }
}
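
Usage sketch (not part of StringTokenizer.java). The class documentation above describes the returnDelims flag, per-token delimiter changes through nextToken(String), and countTokens() in prose only. The following is a minimal illustration of those three behaviors, with String.split shown for contrast. The class name StringTokenizerDemo and the helpers drain, tokens, and switchDelimiters are hypothetical names invented for this example; only the java.util.StringTokenizer and java.lang.String calls come from the JDK.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringTokenizer;

public class StringTokenizerDemo {

    // Hypothetical helper: collects every remaining token into a list.
    static List<String> drain(StringTokenizer st) {
        List<String> out = new ArrayList<>();
        while (st.hasMoreTokens()) {
            out.add(st.nextToken());
        }
        return out;
    }

    // Hypothetical helper: tokenizes s using the three-argument constructor.
    static List<String> tokens(String s, String delims, boolean returnDelims) {
        return drain(new StringTokenizer(s, delims, returnDelims));
    }

    // Hypothetical helper: reads one token with the default whitespace
    // delimiters, then switches to newDelims for the rest of the string.
    static List<String> switchDelimiters(String s, String newDelims) {
        StringTokenizer st = new StringTokenizer(s);
        List<String> out = new ArrayList<>();
        out.add(st.nextToken());          // split on whitespace
        out.add(st.nextToken(newDelims)); // newDelims take effect here
        out.addAll(drain(st));            // newDelims stay in effect
        return out;
    }

    public static void main(String[] args) {
        // returnDelims == false: runs of delimiters collapse, no empty tokens.
        System.out.println(String.join("|", tokens("a,,b", ",", false))); // a|b

        // returnDelims == true: each delimiter is a one-character token.
        System.out.println(String.join("|", tokens("a,,b", ",", true))); // a|,|,|b

        // String.split keeps the empty string between the two commas.
        System.out.println(String.join("|", Arrays.asList("a,,b".split(",")))); // a||b

        // After the switch, the space before "beta" is no longer a delimiter,
        // so it becomes part of the second token.
        System.out.println(String.join("|",
                switchDelimiters("alpha beta;gamma delta", ";"))); // alpha| beta|gamma delta

        // countTokens() reports the remaining tokens without advancing.
        StringTokenizer st = new StringTokenizer("one two  three");
        System.out.println(st.countTokens()); // 3
        st.nextToken();
        System.out.println(st.countTokens()); // 2
    }
}
```

The second token in the switchDelimiters case starts with a space because skipDelimiters() runs with the new delimiter set, under which the space is ordinary text. This is one reason the class documentation recommends String.split or java.util.regex for new code.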