/*
 *
 * @(#)CollationElementIterator.java	1.37 06/10/03
 *
 * Portions Copyright  2000-2006 Sun Microsystems, Inc. All Rights
 * Reserved.  Use is subject to license terms.
 * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER
 *
 * This program is free software; you can redistribute it and/or
 * modify it under the terms of the GNU General Public License version
 * 2 only, as published by the Free Software Foundation.
 *
 * This program is distributed in the hope that it will be useful, but
 * WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
 * General Public License version 2 for more details (a copy is
 * included at /legal/license.txt).
 *
 * You should have received a copy of the GNU General Public License
 * version 2 along with this work; if not, write to the Free Software
 * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA
 * 02110-1301 USA
 *
 * Please contact Sun Microsystems, Inc., 4150 Network Circle, Santa
 * Clara, CA 95054 or visit www.sun.com if you need additional
 * information or have any questions.
 */

/*
 * (C) Copyright Taligent, Inc. 1996, 1997 - All Rights Reserved
 * (C) Copyright IBM Corp. 1996-1998 - All Rights Reserved
 *
 *   The original version of this source code and documentation is copyrighted
 * and owned by Taligent, Inc., a wholly-owned subsidiary of IBM. These
 * materials are provided under terms of a License Agreement between Taligent
 * and Sun. This technology is protected by multiple US and International
 * patents. This notice and attribution to Taligent may not be removed.
 *   Taligent is a registered trademark of Taligent, Inc.
 *
 */

package java.text;

import java.lang.Character;
import java.util.Vector;
import sun.text.Normalizer;
import sun.text.NormalizerUtilities;

/**
 * The <code>CollationElementIterator</code> class is used as an iterator
 * to walk through each character of an international string. Use the iterator
 * to return the ordering priority of the positioned character. The ordering
 * priority of a character, which we refer to as a key, defines how a character
 * is collated in the given collation object.
 *
 * <p>
 * For example, consider the following in Spanish:
 * <blockquote>
 * <pre>
 * "ca" -> the first key is key('c') and second key is key('a').
 * "cha" -> the first key is key('ch') and second key is key('a').
 * </pre>
 * </blockquote>
 * And in German,
 * <blockquote>
 * <pre>
 * "\u00e4b"-> the first key is key('a'), the second key is key('e'), and
 * the third key is key('b').
 * </pre>
 * </blockquote>
 * The key of a character is an integer composed of primary order(short),
 * secondary order(byte), and tertiary order(byte). Java strictly defines
 * the size and signedness of its primitive data types. Therefore, the static
 * functions <code>primaryOrder</code>, <code>secondaryOrder</code>, and
 * <code>tertiaryOrder</code> return <code>int</code>, <code>short</code>,
 * and <code>short</code> respectively to ensure the correctness of the key
 * value.
 *
 * <p>
 * Example of the iterator usage:
 * <blockquote>
 * <pre>
 *
 *  String testString = "This is a test";
 *  RuleBasedCollator ruleBasedCollator = (RuleBasedCollator)Collator.getInstance();
 *  CollationElementIterator collationElementIterator = ruleBasedCollator.getCollationElementIterator(testString);
 *  int primaryOrder = CollationElementIterator.primaryOrder(collationElementIterator.next());
 * </pre>
 * </blockquote>
 *
 * <p>
 * <code>CollationElementIterator.next</code> returns the collation order
 * of the next character. A collation order consists of primary order,
 * secondary order and tertiary order. The data type of the collation
 * order is <strong>int</strong>. The first 16 bits of a collation order
 * are its primary order; the next 8 bits are the secondary order and the
 * last 8 bits are the tertiary order.
 *
 * @see                Collator
 * @see                RuleBasedCollator
 * @version            1.24 07/27/98
 * @author             Helena Shih, Laura Werner, Richard Gillam
 */
public final class CollationElementIterator {
    /**
     * Null order which indicates the end of string is reached by the
     * cursor.
     */
    public final static int NULLORDER = 0xffffffff;

    /**
     * CollationElementIterator constructor.  This takes the source string and
     * the collation object.  The cursor will walk through the source string based
     * on the predefined collation rules.  If the source string is empty,
     * NULLORDER will be returned on the calls to next().
     * @param sourceText the source string.
     * @param owner the collation object.
     */
    CollationElementIterator(String sourceText, RuleBasedCollator owner) {
        this.owner = owner;
        ordering = owner.getTables();
        if (sourceText.length() != 0) {
            Normalizer.Mode mode = NormalizerUtilities
                    .toNormalizerMode(owner.getDecomposition());
            text = new Normalizer(sourceText, mode);
        }
    }

    /**
     * CollationElementIterator constructor.  This takes the source text and
     * the collation object.  The cursor will walk through the source text based
     * on the predefined collation rules.  If the source text is empty,
     * NULLORDER will be returned on the calls to next().
     * @param sourceText the source text.
     * @param owner the collation object.
     */
    CollationElementIterator(CharacterIterator sourceText,
            RuleBasedCollator owner) {
        this.owner = owner;
        ordering = owner.getTables();
        Normalizer.Mode mode = NormalizerUtilities
                .toNormalizerMode(owner.getDecomposition());
        text = new Normalizer(sourceText, mode);
    }

    /**
     * Resets the cursor to the beginning of the string.  The next call
     * to next() will return the first collation element in the string.
     */
    public void reset() {
        if (text != null) {
            text.reset();
            Normalizer.Mode mode = NormalizerUtilities
                    .toNormalizerMode(owner.getDecomposition());
            text.setMode(mode);
        }
        buffer = null;
        expIndex = 0;
        swapOrder = 0;
    }

    /**
     * Get the next collation element in the string.  <p>This iterator iterates
     * over a sequence of collation elements that were built from the string.
     * Because there isn't necessarily a one-to-one mapping from characters to
     * collation elements, this doesn't mean the same thing as "return the
     * collation element [or ordering priority] of the next character in the
     * string".</p>
     * <p>This function returns the collation element that the iterator is currently
     * pointing to and then updates the internal pointer to point to the next element.
     * previous() updates the pointer first and then returns the element.  This
     * means that when you change direction while iterating (i.e., call next() and
     * then call previous(), or call previous() and then call next()), you'll get
     * back the same element twice.</p>
     */
    public int next() {
        if (text == null) {
            return NULLORDER;
        }
        Normalizer.Mode textMode = text.getMode();
        // convert the owner's mode to something the Normalizer understands
        Normalizer.Mode ownerMode = NormalizerUtilities
                .toNormalizerMode(owner.getDecomposition());
        if (textMode != ownerMode) {
            text.setMode(ownerMode);
        }

        // if buffer contains any decomposed char values
        // return their strength orders before continuing in
        // the Normalizer's CharacterIterator.
        if (buffer != null) {
            if (expIndex < buffer.length) {
                return strengthOrder(buffer[expIndex++]);
            } else {
                buffer = null;
                expIndex = 0;
            }
        } else if (swapOrder != 0) {
            int order = swapOrder << 16;
            swapOrder = 0;
            return order;
        }

        char ch = text.next();

        // are we at the end of Normalizer's text?
        if (ch == Normalizer.DONE) {
            return NULLORDER;
        }

        int value = ordering.getUnicodeOrder(ch);
        if (value == RuleBasedCollator.UNMAPPED) {
            swapOrder = ch;
            return UNMAPPEDCHARVALUE;
        } else if (value >= RuleBasedCollator.CONTRACTCHARINDEX) {
            value = nextContractChar(ch);
        }
        if (value >= RuleBasedCollator.EXPANDCHARINDEX) {
            buffer = ordering.getExpandValueList(value);
            expIndex = 0;
            value = buffer[expIndex++];
        }

        if (ordering.isSEAsianSwapping()) {
            char consonant;
            if (isThaiPreVowel(ch)) {
                consonant = text.next();
                if (isThaiBaseConsonant(consonant)) {
                    buffer = makeReorderedBuffer(consonant, value,
                            buffer, true);
                    value = buffer[0];
                    expIndex = 1;
                } else {
                    text.previous();
                }
            }
            if (isLaoPreVowel(ch)) {
                consonant = text.next();
                if (isLaoBaseConsonant(consonant)) {
                    buffer = makeReorderedBuffer(consonant, value,
                            buffer, true);
                    value = buffer[0];
                    expIndex = 1;
                } else {
                    text.previous();
                }
            }
        }

        return strengthOrder(value);
    }
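The loop that next() is designed for can be sketched from the caller's side using only the public API. This is a minimal illustration, not part of the original file: the class name `CollationElementWalk` and its helper methods are hypothetical, and it assumes that the collator returned for `Locale.US` is a `RuleBasedCollator`.

```java
import java.text.CollationElementIterator;
import java.text.Collator;
import java.text.RuleBasedCollator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper class, for illustration only.
final class CollationElementWalk {

    // Assumption: the default US collator is a RuleBasedCollator.
    static RuleBasedCollator usCollator() {
        return (RuleBasedCollator) Collator.getInstance(Locale.US);
    }

    // Collects every collation element of text, stopping at NULLORDER.
    static List<Integer> elementsOf(RuleBasedCollator collator, String text) {
        List<Integer> elements = new ArrayList<>();
        CollationElementIterator it = collator.getCollationElementIterator(text);
        int element;
        while ((element = it.next()) != CollationElementIterator.NULLORDER) {
            elements.add(element);
        }
        return elements;
    }

    public static void main(String[] args) {
        for (int element : elementsOf(usCollator(), "abc")) {
            System.out.printf("0x%08x%n", element);
        }
    }
}
```

Because characters and collation elements do not map one-to-one, the size of the returned list should not be assumed to equal the length of the input string.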

    /**
     * Get the previous collation element in the string.  <p>This iterator iterates
     * over a sequence of collation elements that were built from the string.
     * Because there isn't necessarily a one-to-one mapping from characters to
     * collation elements, this doesn't mean the same thing as "return the
     * collation element [or ordering priority] of the previous character in the
     * string".</p>
     * <p>This function updates the iterator's internal pointer to point to the
     * collation element preceding the one it's currently pointing to and then
     * returns that element, while next() returns the current element and then
     * updates the pointer.  This means that when you change direction while
     * iterating (i.e., call next() and then call previous(), or call previous()
     * and then call next()), you'll get back the same element twice.</p>
     * @since 1.2
     */
    public int previous() {
        if (text == null) {
            return NULLORDER;
        }
        Normalizer.Mode textMode = text.getMode();
        // convert the owner's mode to something the Normalizer understands
        Normalizer.Mode ownerMode = NormalizerUtilities
                .toNormalizerMode(owner.getDecomposition());
        if (textMode != ownerMode) {
            text.setMode(ownerMode);
        }
        if (buffer != null) {
            if (expIndex > 0) {
                return strengthOrder(buffer[--expIndex]);
            } else {
                buffer = null;
                expIndex = 0;
            }
        } else if (swapOrder != 0) {
            int order = swapOrder << 16;
            swapOrder = 0;
            return order;
        }
        char ch = text.previous();
        if (ch == Normalizer.DONE) {
            return NULLORDER;
        }

        int value = ordering.getUnicodeOrder(ch);

        if (value == RuleBasedCollator.UNMAPPED) {
            swapOrder = UNMAPPEDCHARVALUE;
            return ch;
        } else if (value >= RuleBasedCollator.CONTRACTCHARINDEX) {
            value = prevContractChar(ch);
        }
        if (value >= RuleBasedCollator.EXPANDCHARINDEX) {
            buffer = ordering.getExpandValueList(value);
            expIndex = buffer.length;
            value = buffer[--expIndex];
        }

        if (ordering.isSEAsianSwapping()) {
            char vowel;
            if (isThaiBaseConsonant(ch)) {
                vowel = text.previous();
                if (isThaiPreVowel(vowel)) {
                    buffer = makeReorderedBuffer(vowel, value, buffer,
                            false);
                    expIndex = buffer.length - 1;
                    value = buffer[expIndex];
                } else {
                    text.next();
                }
            }
            if (isLaoBaseConsonant(ch)) {
                vowel = text.previous();
                if (isLaoPreVowel(vowel)) {
                    buffer = makeReorderedBuffer(vowel, value, buffer,
                            false);
                    expIndex = buffer.length - 1;
                    value = buffer[expIndex];
                } else {
                    text.next();
                }
            }
        }

        return strengthOrder(value);
    }
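The "same element twice" behaviour described in the Javadoc of next() and previous() can be observed from outside the class. The sketch below is illustrative only: `DirectionChangeDemo` and its methods are hypothetical names, the input is deliberately limited to plain Latin letters, and the `Locale.US` collator is assumed to be a `RuleBasedCollator`.

```java
import java.text.CollationElementIterator;
import java.text.Collator;
import java.text.RuleBasedCollator;
import java.util.Locale;

// Hypothetical helper class, for illustration only.
final class DirectionChangeDemo {

    // Assumption: the default US collator is a RuleBasedCollator.
    static RuleBasedCollator usCollator() {
        return (RuleBasedCollator) Collator.getInstance(Locale.US);
    }

    // Returns { element from next(), element from the immediately following previous() }.
    static int[] nextThenPrevious(RuleBasedCollator collator, String text) {
        CollationElementIterator it = collator.getCollationElementIterator(text);
        int forward = it.next();       // returns the current element, then advances
        int backward = it.previous();  // steps back first, then returns that element
        return new int[] { forward, backward };
    }

    public static void main(String[] args) {
        int[] pair = nextThenPrevious(usCollator(), "ab");
        System.out.println(pair[0] == pair[1]
                ? "direction change returned the same element twice"
                : "direction change returned different elements");
    }
}
```

Code that scans in both directions, such as a search routine, usually has to discard one of the two duplicate elements after each turn.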

    /**
     * Return the primary component of a collation element.
     * @param order the collation element
     * @return the element's primary component
     */
    public final static int primaryOrder(int order) {
        order &= RBCollationTables.PRIMARYORDERMASK;
        return (order >>> RBCollationTables.PRIMARYORDERSHIFT);
    }

    /**
     * Return the secondary component of a collation element.
     * @param order the collation element
     * @return the element's secondary component
     */
    public final static short secondaryOrder(int order) {
        order = order & RBCollationTables.SECONDARYORDERMASK;
        return ((short) (order >> RBCollationTables.SECONDARYORDERSHIFT));
    }

    /**
     * Return the tertiary component of a collation element.
     * @param order the collation element
     * @return the element's tertiary component
     */
    public final static short tertiaryOrder(int order) {
        return ((short) (order & RBCollationTables.TERTIARYORDERMASK));
    }
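The three accessors above split a collation element along the 16/8/8 bit layout described in the class comment. A small sketch of using them through the public API follows; the class `CollationElementParts`, the method `describe`, and the sample value `0x00530201` are made up for illustration and do not come from any real collation table.

```java
import java.text.CollationElementIterator;

// Hypothetical helper class, for illustration only.
final class CollationElementParts {

    // Formats the primary, secondary and tertiary components of one element.
    static String describe(int element) {
        int primary = CollationElementIterator.primaryOrder(element);       // upper 16 bits
        short secondary = CollationElementIterator.secondaryOrder(element); // next 8 bits
        short tertiary = CollationElementIterator.tertiaryOrder(element);   // lowest 8 bits
        return String.format("primary=0x%04x secondary=0x%02x tertiary=0x%02x",
                primary, secondary, tertiary);
    }

    public static void main(String[] args) {
        // Arbitrary sample value chosen so each component is easy to spot.
        System.out.println(describe(0x00530201));
        // prints "primary=0x0053 secondary=0x02 tertiary=0x01"
    }
}
```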

    /**
     *  Get the comparison order in the desired strength.  Ignore the other
     *  differences.
     *  @param order The order value
     */
    final int strengthOrder(int order) {
        int s = owner.getStrength();
        if (s == Collator.PRIMARY) {
            order &= RBCollationTables.PRIMARYDIFFERENCEONLY;
        } else if (s == Collator.SECONDARY) {
            order &= RBCollationTables.SECONDARYDIFFERENCEONLY;
        }
        return order;
    }
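The effect of strengthOrder() can be sketched as plain bit masking. The listing does not show the values of the `RBCollationTables` constants, so the two masks below are assumptions derived from the 16/8/8 layout in the class comment; `StrengthMaskSketch` and its members are hypothetical names.

```java
import java.text.Collator;

// Hypothetical standalone sketch, not the class's actual implementation.
final class StrengthMaskSketch {

    // Assumed mask values: keep only the primary bits, or the primary and secondary bits.
    static final int PRIMARY_DIFFERENCE_ONLY = 0xffff0000;
    static final int SECONDARY_DIFFERENCE_ONLY = 0xffffff00;

    // Clears the components that the given strength ignores.
    static int strengthOrder(int order, int strength) {
        if (strength == Collator.PRIMARY) {
            return order & PRIMARY_DIFFERENCE_ONLY;
        } else if (strength == Collator.SECONDARY) {
            return order & SECONDARY_DIFFERENCE_ONLY;
        }
        return order; // TERTIARY and IDENTICAL keep every component
    }

    public static void main(String[] args) {
        int sample = 0x00530201; // arbitrary sample element
        System.out.printf("primary:   0x%08x%n", strengthOrder(sample, Collator.PRIMARY));
        System.out.printf("secondary: 0x%08x%n", strengthOrder(sample, Collator.SECONDARY));
        System.out.printf("tertiary:  0x%08x%n", strengthOrder(sample, Collator.TERTIARY));
    }
}
```

Under these assumed masks, two elements that differ only in their tertiary byte become equal once a SECONDARY or PRIMARY strength is applied.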

    /**
     * Sets the iterator to point to the collation element corresponding to
     * the specified character (the parameter is a CHARACTER offset in the
     * original string, not an offset into its corresponding sequence of
     * collation elements).  The value returned by the next call to next()
     * will be the collation element corresponding to the specified position
     * in the text.  If that position is in the middle of a contracting
     * character sequence, the result of the next call to next() is the
     * collation element for that sequence.  This means that getOffset()
     * is not guaranteed to return the same value as was passed to a preceding
     * call to setOffset().
     *
     * @param newOffset The new character offset into the original text.
     * @since 1.2
     */
    public void setOffset(int newOffset) {
        if (text != null) {
            if (newOffset < text.getBeginIndex()
                    || newOffset >= text.getEndIndex()) {
                text.setIndexOnly(newOffset);
            } else {
                char c = text.setIndex(newOffset);
                // if the desired character isn't used in a contracting character
                // sequence, bypass all the backing-up logic-- we're sitting on
                // the right character already
                if (ordering.usedInContractSeq(c)) {
                    // walk backwards through the string until we see a character
                    // that DOESN'T participate in a contracting character sequence
                    while (ordering.usedInContractSeq(c)) {
                        c = text.previous();
                    }
                    // now walk forward using this object's next() method until
                    // we pass the starting point and set our current position
                    // to the beginning of the last "character" before or at
                    // our starting position
                    int last = text.getIndex();
                    while (text.getIndex() <= newOffset) {
                        last = text.getIndex();
                        next();
                    }
                    text.setIndexOnly(last);
                    // we don't need this, since last is the index at which the
                    // contraction that encompasses newOffset starts
                    // text.previous();
                }
            }
        }
        buffer = null;
        expIndex = 0;
        swapOrder = 0;
    }

    /**
     * Returns the character offset in the original text corresponding to the next
     * collation element.  (That is, getOffset() returns the position in the text
     * corresponding to the collation element that will be returned by the next
     * call to next().)  This value will always be the index of the FIRST character
     * corresponding to the collation element (a contracting character sequence is
     * when two or more characters all correspond to the same collation element).
     * This means if you do setOffset(x) followed immediately by getOffset(), getOffset()
     * won't necessarily return x.
     *
     * @return The character offset in the original text corresponding to the collation
     * element that will be returned by the next call to next().
     * @since 1.2
     */
    public int getOffset() {
        return (text != null) ? text.getIndex() : 0;
    }
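The caveat that setOffset(x) followed by getOffset() need not return x can be tried out with a collator that defines a contraction. The sketch below is an illustration under assumptions: the rule string `"< a < b < c < ch < d"` is a made-up rule set in which "ch" contracts, and `OffsetRoundTrip` and `offsetAfterSet` are hypothetical names. The exact value reported may depend on the JDK implementation, so no specific output is claimed.

```java
import java.text.CollationElementIterator;
import java.text.ParseException;
import java.text.RuleBasedCollator;

// Hypothetical helper class, for illustration only.
final class OffsetRoundTrip {

    // Calls setOffset(offset) and reports what getOffset() returns afterwards.
    static int offsetAfterSet(String rules, String text, int offset) {
        try {
            RuleBasedCollator collator = new RuleBasedCollator(rules);
            CollationElementIterator it = collator.getCollationElementIterator(text);
            it.setOffset(offset);
            return it.getOffset();
        } catch (ParseException e) {
            throw new IllegalArgumentException("invalid collation rules: " + rules, e);
        }
    }

    public static void main(String[] args) {
        // Offset 1 points at 'h', which sits inside the contraction "ch".
        int reported = offsetAfterSet("< a < b < c < ch < d", "cha", 1);
        System.out.println("requested offset 1, getOffset() reports " + reported);
    }
}
```

If the iterator backs up to the start of the contraction, the reported offset is smaller than the requested one, which is why callers should read the position back with getOffset() rather than reuse the value they passed in.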

    /**
     * Return the maximum length of any expansion sequences that end
     * with the specified comparison order.
     * @param order a collation order returned by previous or next.
     * @return the maximum length of any expansion sequences ending
     *         with the specified order.
     * @since 1.2
     */
    public int getMaxExpansion(int order) {
        return ordering.getMaxExpansion(order);
    }

    /**
     * Set a new string over which to iterate.
     *
     * @param source  the new source text
     * @since 1.2
     */
    public void setText(String source) {
        buffer = null;
        swapOrder = 0;
        expIndex = 0;
        Normalizer.Mode mode = NormalizerUtilities
                .toNormalizerMode(owner.getDecomposition());
        if (text == null) {
            text = new Normalizer(source, mode);
        } else {
            text.setMode(mode);
            text.setText(source);
        }
    }

    /**
     * Set a new source text, supplied as a CharacterIterator, over which
     * to iterate.
     *
     * @param source  the new source text.
     * @since 1.2
     */
    public void setText(CharacterIterator source) {
        buffer = null;
        swapOrder = 0;
        expIndex = 0;
        Normalizer.Mode mode = NormalizerUtilities
                .toNormalizerMode(owner.getDecomposition());
        if (text == null) {
            text = new Normalizer(source, mode);
        } else {
            text.setMode(mode);
            text.setText(source);
        }
    }

    //============================================================
    // privates
    //============================================================

    /**
     * Determine if a character is a Thai vowel (which sorts after
     * its base consonant).
     */
    private final static boolean isThaiPreVowel(char ch) {
        return (ch >= '\u0e40') && (ch <= '\u0e44');
    }

    /**
     * Determine if a character is a Thai base consonant
     */
    private final static boolean isThaiBaseConsonant(char ch) {
        return (ch >= '\u0e01') && (ch <= '\u0e2e');
    }

    /**
     * Determine if a character is a Lao vowel (which sorts after
     * its base consonant).
     */
    private final static boolean isLaoPreVowel(char ch) {
        return (ch >= '\u0ec0') && (ch <= '\u0ec4');
    }

    /**
     * Determine if a character is a Lao base consonant
     */
    private final static boolean isLaoBaseConsonant(char ch) {
        return (ch >= '\u0e81') && (ch <= '\u0eae');
    }

    /**
     * This method produces a buffer which contains the collation
     * elements for the two characters, with colFirst's values preceding
     * another character's.  Presumably, the other character precedes colFirst
     * in logical order (otherwise you wouldn't need this method, would you?).
     * The assumption is that the other char's value(s) have already been
     * computed.  If this char has a single element it is passed to this
     * method as lastValue, and lastExpansion is null.  If it has an
     * expansion it is passed in lastExpansion, and lastValue is ignored.
     */
    private int[] makeReorderedBuffer(char colFirst, int lastValue,
            int[] lastExpansion, boolean forward) {

        int[] result;

        int firstValue = ordering.getUnicodeOrder(colFirst);
        if (firstValue >= RuleBasedCollator.CONTRACTCHARINDEX) {
            firstValue = forward ? nextContractChar(colFirst)
                    : prevContractChar(colFirst);
        }

        int[] firstExpansion = null;
        if (firstValue >= RuleBasedCollator.EXPANDCHARINDEX) {
            firstExpansion = ordering.getExpandValueList(firstValue);
        }

        if (!forward) {
            int temp1 = firstValue;
            firstValue = lastValue;
            lastValue = temp1;
            int[] temp2 = firstExpansion;
            firstExpansion = lastExpansion;
            lastExpansion = temp2;
        }

        if (firstExpansion == null && lastExpansion == null) {
            result = new int[2];
            result[0] = firstValue;
            result[1] = lastValue;
        } else {
            int firstLength = firstExpansion == null ? 1
                    : firstExpansion.length;
            int lastLength = lastExpansion == null ? 1
                    : lastExpansion.length;
            result = new int[firstLength + lastLength];

            if (firstExpansion == null) {
                result[0] = firstValue;
            } else {
                System.arraycopy(firstExpansion, 0, result, 0,
                        firstLength);
            }

            if (lastExpansion == null) {
                result[firstLength] = lastValue;
            } else {
                System.arraycopy(lastExpansion, 0, result, firstLength,
                        lastLength);
            }
        }

        return result;
    }
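The buffer-building step at the end of makeReorderedBuffer() amounts to concatenating two runs of collation elements, where each run is either a single value or an expansion array. The standalone sketch below isolates only that step, without the table lookups or the forward/backward swap; `ReorderedBufferSketch` and `concat` are hypothetical names and the sample values are arbitrary.

```java
import java.util.Arrays;

// Hypothetical standalone sketch of the concatenation step only.
final class ReorderedBufferSketch {

    // A null expansion means "use the single value instead".
    static int[] concat(int firstValue, int[] firstExpansion,
                        int lastValue, int[] lastExpansion) {
        int firstLength = (firstExpansion == null) ? 1 : firstExpansion.length;
        int lastLength = (lastExpansion == null) ? 1 : lastExpansion.length;
        int[] result = new int[firstLength + lastLength];

        if (firstExpansion == null) {
            result[0] = firstValue;
        } else {
            System.arraycopy(firstExpansion, 0, result, 0, firstLength);
        }
        if (lastExpansion == null) {
            result[firstLength] = lastValue;
        } else {
            System.arraycopy(lastExpansion, 0, result, firstLength, lastLength);
        }
        return result;
    }

    public static void main(String[] args) {
        // Single value followed by single value.
        System.out.println(Arrays.toString(concat(1, null, 2, null)));
        // Two-element expansion followed by a single value; firstValue is ignored.
        System.out.println(Arrays.toString(concat(0, new int[] { 5, 6 }, 7, null)));
    }
}
```

The original method special-cases the "no expansions" path to allocate a two-element array directly; the general branch yields the same result, so the sketch uses only the general branch.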

    /**
     *  Check if a comparison order is ignorable.
     *  @return true if a character is ignorable, false otherwise.
     */
    final static boolean isIgnorable(int order) {
        return primaryOrder(order) == 0;
    }
611:
    /**
     * Get the ordering priority of the next contracting character in the
     * string.
     * @param ch the starting character of a contracting character token
     * @return the next contracting character's ordering.  Returns NULLORDER
     * if the end of string is reached.
     */
    private int nextContractChar(char ch) {
        // First get the ordering of this single character,
        // which is always the first element in the list.
        Vector list = ordering.getContractValues(ch);
        EntryPair pair = (EntryPair) list.firstElement();
        int order = pair.value;

        // Find the length of the longest contracting character sequence in the list.
        // There's logic in the builder code to make sure the longest sequence is always
        // the last.
        pair = (EntryPair) list.lastElement();
        int maxLength = pair.entryName.length();

        // The Normalizer is cloned here so that the seeking we do in the next loop
        // won't affect our real position in the text.
        Normalizer tempText = (Normalizer) text.clone();

        // Extract the next maxLength characters in the string and store them in
        // "fragment".  We have to do this using the Normalizer to ensure that our
        // offsets correspond to those the rest of the iterator is using.
        tempText.previous();
        key.setLength(0);
        char c = tempText.next();
        while (maxLength > 0 && c != Normalizer.DONE) {
            key.append(c);
            --maxLength;
            c = tempText.next();
        }
        String fragment = key.toString();

        // Now that we have the fragment, iterate through the list looking for the
        // longest sequence that matches the characters in the actual text.  maxLength
        // is reused here to track the length of the longest match found so far.
        // Upon exit from this loop, maxLength holds the length of the matching
        // sequence and order holds the collation-element value corresponding
        // to that sequence.
        maxLength = 1;
        for (int i = list.size() - 1; i > 0; i--) {
            pair = (EntryPair) list.elementAt(i);
            if (!pair.fwd)
                continue;

            if (fragment.startsWith(pair.entryName)
                    && pair.entryName.length() > maxLength) {
                maxLength = pair.entryName.length();
                order = pair.value;
            }
        }

        // Advance our real iteration position to the end of the matching sequence
        // and return the appropriate collation-element value.  If there was no
        // matching sequence, we are already at the right position and order already
        // holds the collation-element value for the single character.
        while (maxLength > 1) {
            text.next();
            --maxLength;
        }
        return order;
    }

    /**
     * Get the ordering priority of the previous contracting character in the
     * string.
     * @param ch the starting character of a contracting character token
     * @return the previous contracting character's ordering.  Returns NULLORDER
     * if the start of the string is reached.
     */
    private int prevContractChar(char ch) {
        // This function is identical to nextContractChar(), except that the
        // next() and previous() calls on the Normalizer are swapped and we skip
        // entry pairs with the fwd flag turned on rather than off.  Notice that
        // we still use append() and startsWith() when working on the fragment.
        // This is because the entry pairs that are used in reverse iteration
        // have their names reversed already.
        Vector list = ordering.getContractValues(ch);
        EntryPair pair = (EntryPair) list.firstElement();
        int order = pair.value;

        pair = (EntryPair) list.lastElement();
        int maxLength = pair.entryName.length();

        Normalizer tempText = (Normalizer) text.clone();

        // Collect up to maxLength characters, reading backwards, into "fragment".
        tempText.next();
        key.setLength(0);
        char c = tempText.previous();
        while (maxLength > 0 && c != Normalizer.DONE) {
            key.append(c);
            --maxLength;
            c = tempText.previous();
        }
        String fragment = key.toString();

        // Find the longest reverse-direction entry that matches the fragment.
        maxLength = 1;
        for (int i = list.size() - 1; i > 0; i--) {
            pair = (EntryPair) list.elementAt(i);
            if (pair.fwd)
                continue;

            if (fragment.startsWith(pair.entryName)
                    && pair.entryName.length() > maxLength) {
                maxLength = pair.entryName.length();
                order = pair.value;
            }
        }

        // Move our real iteration position back past the matched sequence.
        while (maxLength > 1) {
            text.previous();
            --maxLength;
        }
        return order;
    }

    final static int UNMAPPEDCHARVALUE = 0x7FFF0000;

    private Normalizer text = null;
    private int[] buffer = null;
    private int expIndex = 0;
    private StringBuffer key = new StringBuffer(5);
    private int swapOrder = 0;
    private RBCollationTables ordering;
    private RuleBasedCollator owner;
}
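
The longest-match search used by nextContractChar() can be tried in isolation. The sketch below is only an illustration under stated assumptions, not the class's actual code: the `ContractionMatchSketch` class, its `Entry` type (a stand-in for `EntryPair`), the `sampleEntries()` helper and the order values 10, 20 and 30 are all hypothetical. A plain `String` with an index replaces the cloned Normalizer. As in the method above, the first entry is assumed to be the single starting character.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for the contraction lookup; not part of java.text.
class ContractionMatchSketch {

    /** Hypothetical stand-in for EntryPair: a character sequence and its order value. */
    static final class Entry {
        final String name;
        final int value;

        Entry(String name, int value) {
            this.name = name;
            this.value = value;
        }
    }

    /**
     * Finds the longest entry whose name matches the text starting at pos.
     * entries.get(0) is assumed to be the single-character entry, which is
     * used as the fallback when no longer sequence matches.
     *
     * @return a two-element array: {order value, number of characters matched}
     */
    static int[] longestMatch(String text, int pos, List<Entry> entries) {
        int order = entries.get(0).value;
        int maxLength = 1;
        // Scan every entry except the first, keeping the longest match.
        for (int i = entries.size() - 1; i > 0; i--) {
            Entry e = entries.get(i);
            if (text.startsWith(e.name, pos) && e.name.length() > maxLength) {
                maxLength = e.name.length();
                order = e.value;
            }
        }
        return new int[] { order, maxLength };
    }

    /** Made-up contraction list for the starting character 'c'. */
    static List<Entry> sampleEntries() {
        List<Entry> entries = new ArrayList<>();
        entries.add(new Entry("c", 10));   // single character, always first
        entries.add(new Entry("ch", 20));
        entries.add(new Entry("chr", 30)); // longest sequence, last
        return entries;
    }

    public static void main(String[] args) {
        int[] result = longestMatch("chat", 0, sampleEntries());
        System.out.println("order=" + result[0] + ", length=" + result[1]);
    }
}
```

The returned length plays the role of maxLength in the original method: the caller would advance its position by that many characters minus one, since the first character has already been consumed.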