Java Doc for CollationElementIterator.java in » Internationalization-Localization » icu4j » com » ibm » icu » text


java.lang.Object

com.ibm.icu.text.CollationElementIterator

CollationElementIterator

final public class CollationElementIterator

CollationElementIterator is an iterator created by a RuleBasedCollator to walk through a string. The result of each iteration is a 32-bit collation element that defines the ordering priority of the next character or sequence of characters in the source string.

For illustration, consider the following in Spanish:

 "ca" -> the first collation element is collation_element('c') and second
 collation element is collation_element('a').
 Since "ch" in Spanish sorts as one entity, the below example returns one
 collation element for the two characters 'c' and 'h'
 "cha" -> the first collation element is collation_element('ch') and second
 collation element is collation_element('a').

And in German,

 Since the character 'æ' is a composed character of 'a' and 'e', the
 iterator returns two collation elements for the single character 'æ'
 "æb" -> the first collation element is collation_element('a'), the
 second collation element is collation_element('e'), and the
 third collation element is collation_element('b').
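
The contraction case above can be tried out with a short sketch. It is a stand-in rather than ICU4J code: it assumes the JDK's java.text.RuleBasedCollator and java.text.CollationElementIterator, which offer a similar next()/NULLORDER API, so that it runs without the ICU4J jar. The rule string, the class name ContractionSketch, and its helper methods are hypothetical and do not reproduce real Spanish collation data.

```java
import java.text.CollationElementIterator;
import java.text.ParseException;
import java.text.RuleBasedCollator;

final class ContractionSketch {
    // Illustrative rules (an assumption): "ch" is one unit that sorts after "c".
    static final String RULES = "< a < b < c < ch < d";

    static RuleBasedCollator collator() {
        try {
            return new RuleBasedCollator(RULES);
        } catch (ParseException e) {
            throw new IllegalStateException("invalid rule string", e);
        }
    }

    /** Counts the collation elements produced for the given text. */
    static int countElements(String text) {
        CollationElementIterator it = collator().getCollationElementIterator(text);
        int count = 0;
        while (it.next() != CollationElementIterator.NULLORDER) {
            count++;
        }
        return count;
    }

    /** Primary order of the first collation element of the given text. */
    static int firstPrimary(String text) {
        CollationElementIterator it = collator().getCollationElementIterator(text);
        return CollationElementIterator.primaryOrder(it.next());
    }

    public static void main(String[] args) {
        System.out.println("\"ca\"  -> " + countElements("ca") + " elements");
        System.out.println("\"cha\" -> " + countElements("cha") + " elements");
    }
}
```

Both strings should yield the same number of elements even though "cha" has one more character, because 'c' and 'h' are consumed together as a single contraction.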

For collation ordering comparison, the collation element results cannot be compared simply by using basic arithmetic operators, e.g. <, == or >; further processing has to be done. Details can be found in the ICU user guide. An example of using the CollationElementIterator for collation ordering comparison is the class com.ibm.icu.text.StringSearch.
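
One kind of further processing is to compare only one level of each element. The following is a minimal sketch of a primary-level equality check, not the algorithm used by StringSearch. It assumes the JDK's java.text.RuleBasedCollator and java.text.CollationElementIterator as stand-ins for the ICU4J classes so that it runs without the ICU4J jar; the rule string, the class name PrimaryKeySketch, and its methods are hypothetical.

```java
import java.text.CollationElementIterator;
import java.text.ParseException;
import java.text.RuleBasedCollator;
import java.util.ArrayList;
import java.util.List;

final class PrimaryKeySketch {
    // Illustrative rules (an assumption): upper and lower case differ only at the tertiary level.
    static final String RULES = "< a, A < b, B < c, C < d, D";

    static RuleBasedCollator collator() {
        try {
            return new RuleBasedCollator(RULES);
        } catch (ParseException e) {
            throw new IllegalStateException("invalid rule string", e);
        }
    }

    /** Collects the non-zero primary orders of all collation elements of the text. */
    static List<Integer> primaryKeys(String text) {
        CollationElementIterator it = collator().getCollationElementIterator(text);
        List<Integer> keys = new ArrayList<>();
        int ce;
        while ((ce = it.next()) != CollationElementIterator.NULLORDER) {
            int primary = CollationElementIterator.primaryOrder(ce);
            if (primary != 0) { // a zero primary carries no primary-level weight
                keys.add(primary);
            }
        }
        return keys;
    }

    /** True if both texts have identical primary-order sequences. */
    static boolean primaryEqual(String a, String b) {
        return primaryKeys(a).equals(primaryKeys(b));
    }

    public static void main(String[] args) {
        System.out.println("abc vs ABC: " + primaryEqual("abc", "ABC"));
        System.out.println("abc vs abd: " + primaryEqual("abc", "abd"));
    }
}
```

Comparing the raw 32-bit elements with == would treat "abc" and "ABC" as different, because the tertiary bits differ; masking down to the primary order is what makes the two strings match.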

To construct a CollationElementIterator object, users call the method getCollationElementIterator() on a RuleBasedCollator that defines the desired sorting order.

Example:

 String testString = "This is a test";
 // The rule-string constructor throws an exception if the rules are invalid.
 RuleBasedCollator rbc = new RuleBasedCollator("&a<b");
 CollationElementIterator iterator = rbc.getCollationElementIterator(testString);
 int order;
 // Loop on the raw collation element: NULLORDER marks the end of the iteration.
 // (Testing the primary order instead would never see NULLORDER and loop forever.)
 while ((order = iterator.next()) != CollationElementIterator.NULLORDER) {
     if (order != CollationElementIterator.IGNORABLE) {
         // order is valid and not ignorable, so we do something with it
         int primaryOrder = CollationElementIterator.primaryOrder(order);
         System.out.println("Next primary order 0x" +
             Integer.toHexString(primaryOrder));
     }
 }

This class is not subclassable

See Also:   Collator
See Also:   RuleBasedCollator
See Also:   StringSearch
author:
   Syn Wee Quek

Field Summary
final static int	CE_CONTRACTION_TAG_
final static int	CE_DIGIT_TAG_
final static int	CE_EXPANSION_TAG_
final static int	CE_NOT_FOUND_
final static int	CE_SPEC_PROC_TAG_
final public static int	IGNORABLE
final public static int	NULLORDER
int	m_CEBufferOffset_ This is the CE from CEs buffer that should be returned.
int	m_CEBufferSize_ This is the position to which we have stored processed CEs.
int	m_FCDStart_
boolean	m_isCodePointHiragana_

Constructor Summary
	CollationElementIterator(String source, RuleBasedCollator collator) CollationElementIterator constructor.
	CollationElementIterator(CharacterIterator source, RuleBasedCollator collator) CollationElementIterator constructor.
	CollationElementIterator(UCharacterIterator source, RuleBasedCollator collator) CollationElementIterator constructor.

Method Summary
public boolean	equals(Object that) Tests whether the argument object is equal to this CollationElementIterator.
public int	getMaxExpansion(int ce) Returns the maximum length of any expansion sequence that ends with the specified collation element.
public int	getOffset() Returns the character offset in the source string corresponding to the next collation element.
boolean	isInBuffer()
public int	next() Get the next collation element in the source string. This iterator iterates over a sequence of collation elements that were built from the string.
public int	previous() Get the previous collation element in the source string. This iterator iterates over a sequence of collation elements that were built from the string.
final public static int	primaryOrder(int ce) Return the primary order of the specified collation element, i.e. the first 16 bits.
public void	reset() Resets the cursor to the beginning of the string.
final public static int	secondaryOrder(int ce) Return the secondary order of the specified collation element, i.e. the 16th to 23rd bits, inclusive.
void	setCollator(RuleBasedCollator collator) Sets the collator used.
void	setExactOffset(int offset) Sets the iterator to point to the collation element corresponding to the specified character (the parameter is a CHARACTER offset in the original string, not an offset into its corresponding sequence of collation elements).
public void	setOffset(int offset) Sets the iterator to point to the collation element corresponding to the character at the specified offset.
public void	setText(String source) Set a new source string for iteration, and reset the offset to the beginning of the text.
public void	setText(UCharacterIterator source) Set a new source string iterator for iteration, and reset the offset to the beginning of the text.
public void	setText(CharacterIterator source) Set a new source string iterator for iteration, and reset the offset to the beginning of the text.
void	setText(UCharacterIterator source, int offset) Sets the iterator to point to the collation element corresponding to the specified character (the parameter is a CHARACTER offset in the original string, not an offset into its corresponding sequence of collation elements).
final public static int	tertiaryOrder(int ce) Return the tertiary order of the specified collation element, i.e. the last 8 bits.

Field Detail

CE_CONTRACTION_TAG_
final static int CE_CONTRACTION_TAG_

CE_DIGIT_TAG_
final static int CE_DIGIT_TAG_
	Collate Digits As Numbers (CODAN) implementation

CE_EXPANSION_TAG_
final static int CE_EXPANSION_TAG_

CE_NOT_FOUND_
final static int CE_NOT_FOUND_

CE_SPEC_PROC_TAG_
final static int CE_SPEC_PROC_TAG_

IGNORABLE

final public static int IGNORABLE

This constant is returned by the iterator in the methods next() and previous() when a collation element result is to be ignored.

See class documentation for an example of use.

See Also:   CollationElementIterator.next
See Also:   CollationElementIterator.previous

NULLORDER

final public static int NULLORDER

This constant is returned by the iterator in the methods next() and previous() when the end or the beginning of the source string has been reached, and there are no more valid collation elements to return.

See class documentation for an example of use.

See Also:   CollationElementIterator.next
See Also:   CollationElementIterator.previous

m_CEBufferOffset_
int m_CEBufferOffset_
	This is the CE from CEs buffer that should be returned. Initial value is 0. Forwards iteration will end with m_CEBufferOffset_ == m_CEBufferSize_, backwards will end with m_CEBufferOffset_ == 0. The next/previous after we reach the end/beginning of the m_CEBuffer_ will cause this value to be reset to 0.

m_CEBufferSize_
int m_CEBufferSize_
	This is the position to which we have stored processed CEs. Initial value is 0. The next/previous after we reach the end/beginning of the m_CEBuffer_ will cause this value to be reset to 0.

m_FCDStart_
int m_FCDStart_
	Position in the original string that starts with a non-FCD sequence

m_isCodePointHiragana_
boolean m_isCodePointHiragana_
	true if current codepoint was Hiragana

Constructor Detail

CollationElementIterator
CollationElementIterator(String source, RuleBasedCollator collator)
	CollationElementIterator constructor. This takes a source string and a RuleBasedCollator. The iterator will walk through the source string based on the rules defined by the collator. If the source string is empty, NULLORDER will be returned on the first call to next().
	Parameters:
	source - the source string.
	Parameters:
	collator - the RuleBasedCollator

CollationElementIterator
CollationElementIterator(CharacterIterator source, RuleBasedCollator collator)
	CollationElementIterator constructor. This takes a source character iterator and a RuleBasedCollator. The iterator will walk through the source string based on the rules defined by the collator. If the source string is empty, NULLORDER will be returned on the first call to next().
	Parameters:
	source - the source string iterator.
	Parameters:
	collator - the RuleBasedCollator

CollationElementIterator
CollationElementIterator(UCharacterIterator source, RuleBasedCollator collator)
	CollationElementIterator constructor. This takes a source character iterator and a RuleBasedCollator. The iterator will walk through the source string based on the rules defined by the collator. If the source string is empty, NULLORDER will be returned on the first call to next().
	Parameters:
	source - the source string iterator.
	Parameters:
	collator - the RuleBasedCollator
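
The empty-source behaviour described for the constructors can be checked with a small sketch. It assumes the JDK's java.text.RuleBasedCollator and java.text.CollationElementIterator as stand-ins for the ICU4J classes so that it runs without the ICU4J jar, and it obtains the iterator through getCollationElementIterator() because the constructors are not public. The rule string and the class name EmptySourceSketch are hypothetical.

```java
import java.text.CollationElementIterator;
import java.text.ParseException;
import java.text.RuleBasedCollator;

final class EmptySourceSketch {
    /** Returns the first collation element produced for the given source text. */
    static int firstElement(String source) {
        RuleBasedCollator collator;
        try {
            collator = new RuleBasedCollator("< a < b < c"); // illustrative rules
        } catch (ParseException e) {
            throw new IllegalStateException("invalid rule string", e);
        }
        CollationElementIterator it = collator.getCollationElementIterator(source);
        return it.next();
    }

    public static void main(String[] args) {
        boolean endReached = firstElement("") == CollationElementIterator.NULLORDER;
        System.out.println("empty source ends immediately: " + endReached);
    }
}
```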

Method Detail

equals
public boolean equals(Object that)
	Tests whether the argument object is equal to this CollationElementIterator. Iterators are equal if they use the same RuleBasedCollator, have the same source text, and are at the same position in the iteration.
	Parameters:
	that - the object to test for equality with this CollationElementIterator

getMaxExpansion
public int getMaxExpansion(int ce)
	Returns the maximum length of any expansion sequence that ends with the specified collation element. If there is no expansion with this collation element as the last element, returns 1.
	Parameters:
	ce - a collation element returned by previous() or next().
	the maximum length of any expansion sequence ending with the specified collation element.

getOffset

public int getOffset()

Returns the character offset in the source string corresponding to the next collation element. I.e., getOffset() returns the position in the source string corresponding to the collation element that will be returned by the next call to next(). This value could be any of:

The index of the first character corresponding to the next collation element. (This means that if setOffset(offset) sets the index in the middle of a contraction, getOffset() returns the index of the first character in the contraction, which may not be equal to the original offset that was set. Hence calling getOffset() immediately after setOffset(offset) does not guarantee that the original offset set will be returned.)
If normalization is on, the index of the immediate subsequent character, or composite character with the first character, having a combining class of 0.
The length of the source string, if iteration has reached the end.

The character offset in the source string corresponding to the collation element that will be returned by the next call to next().

isInBuffer
boolean isInBuffer()
	Checks if the iterator is in the buffer zone.
	true if the iterator is in the buffer zone, false otherwise

next

public int next()

Get the next collation element in the source string.

This iterator iterates over a sequence of collation elements that were built from the string. Because there isn't necessarily a one-to-one mapping from characters to collation elements, this doesn't mean the same thing as "return the collation element [or ordering priority] of the next character in the string".

This function returns the collation element that the iterator is currently pointing to, and then updates the internal pointer to point to the next element. previous() updates the pointer first, and then returns the element. This means that when you change direction while iterating (i.e., call next() and then call previous(), or call previous() and then call next()), you'll get back the same element twice.

the next collation element or NULLORDER if the end of the iteration has been reached.

previous

public int previous()

Get the previous collation element in the source string.

This function updates the iterator's internal pointer to point to the collation element preceding the one it's currently pointing to and then returns that element, while next() returns the current element and then updates the pointer. This means that when you change direction while iterating (i.e., call next() and then call previous(), or call previous() and then call next()), you'll get back the same element twice.

the previous collation element, or NULLORDER when the start of the iteration has been reached.
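
The direction-change rule described for next() and previous() can be observed with a short sketch. It assumes the JDK's java.text.RuleBasedCollator and java.text.CollationElementIterator, whose documentation describes the same pointer behaviour, as stand-ins for the ICU4J classes so that it runs without the ICU4J jar. The rule string and the class name DirectionChangeSketch are hypothetical.

```java
import java.text.CollationElementIterator;
import java.text.ParseException;
import java.text.RuleBasedCollator;

final class DirectionChangeSketch {
    /** Calls next() and then previous(); returns both results in that order. */
    static int[] nextThenPrevious(String text) {
        RuleBasedCollator collator;
        try {
            collator = new RuleBasedCollator("< a < b < c"); // illustrative rules
        } catch (ParseException e) {
            throw new IllegalStateException("invalid rule string", e);
        }
        CollationElementIterator it = collator.getCollationElementIterator(text);
        int forward = it.next();      // returns the current element, then advances
        int backward = it.previous(); // steps back first, then returns that element
        return new int[] {forward, backward};
    }

    public static void main(String[] args) {
        int[] pair = nextThenPrevious("abc");
        System.out.println("same element twice: " + (pair[0] == pair[1]));
    }
}
```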

primaryOrder
final public static int primaryOrder(int ce)
	Return the primary order of the specified collation element, i.e. the first (most significant) 16 bits. This value is unsigned.
	Parameters:
	ce - the collation element
	the element's 16-bit primary order.

reset

public void reset()

Resets the cursor to the beginning of the string. The next call to next() or previous() will return the first and last collation element in the string, respectively.

If the RuleBasedCollator used by this iterator has had its attributes changed, calling reset() will reinitialize the iterator to use the new attributes.
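
The cursor part of reset() can be illustrated as follows. This sketch assumes the JDK's java.text.RuleBasedCollator and java.text.CollationElementIterator as stand-ins for the ICU4J classes so that it runs without the ICU4J jar, and it does not demonstrate the reinitialization after collator attribute changes. The rule string and the class name ResetSketch are hypothetical.

```java
import java.text.CollationElementIterator;
import java.text.ParseException;
import java.text.RuleBasedCollator;

final class ResetSketch {
    /** Returns {first element, second element, first element after reset()}. */
    static int[] iterateResetIterate(String text) {
        RuleBasedCollator collator;
        try {
            collator = new RuleBasedCollator("< a < b < c"); // illustrative rules
        } catch (ParseException e) {
            throw new IllegalStateException("invalid rule string", e);
        }
        CollationElementIterator it = collator.getCollationElementIterator(text);
        int first = it.next();
        int second = it.next();
        it.reset(); // move the cursor back to the beginning of the string
        int afterReset = it.next();
        return new int[] {first, second, afterReset};
    }

    public static void main(String[] args) {
        int[] r = iterateResetIterate("abc");
        System.out.println("restarted from the first element: " + (r[2] == r[0]));
    }
}
```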

secondaryOrder
final public static int secondaryOrder(int ce)
	Return the secondary order of the specified collation element, i.e. the 16th to 23rd bits, inclusive (counting from the most significant bit, starting at 0). This value is unsigned.
	Parameters:
	ce - the collation element
	the element's 8-bit secondary order

setCollator
void setCollator(RuleBasedCollator collator)
	Sets the collator used. For internal use; all data members will be reset to their default values.
	Parameters:
	collator - the collator to set

setExactOffset
void setExactOffset(int offset)
	Sets the iterator to point to the collation element corresponding to the specified character (the parameter is a CHARACTER offset in the original string, not an offset into its corresponding sequence of collation elements). The value returned by the next call to next() will be the collation element corresponding to the specified position in the text. Unlike the public method setOffset(int), this method does not try to readjust the offset to the start of a contracting sequence. getOffset() is guaranteed to return the same value as was passed to a preceding call to this method.
	Parameters:
	offset - new character offset into the original text to set.

setOffset

public void setOffset(int offset)

Sets the iterator to point to the collation element corresponding to the character at the specified offset. The value returned by the next call to next() will be the collation element corresponding to the characters at offset.

If offset is in the middle of a contracting character sequence, the iterator is adjusted to the start of the contracting sequence. This means that getOffset() is not guaranteed to return the same value set by this method.

If the decomposition mode is on, and offset is in the middle of a decomposable range of source text, the iterator may not return a correct result for the next forwards or backwards iteration. The user must ensure that the offset is not in the middle of a decomposable range.

Parameters:
offset - the character offset into the original source string to set. Note that this is not an offset into the corresponding sequence of collation elements.
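
The contraction adjustment described for setOffset() can be sketched as follows. The sketch assumes the JDK's java.text.RuleBasedCollator and java.text.CollationElementIterator, whose setOffset() is documented with the same contraction caveat, as stand-ins for the ICU4J classes so that it runs without the ICU4J jar. The rule string (with "ch" as a contraction), the sample text "acha", and the class name SetOffsetSketch are hypothetical.

```java
import java.text.CollationElementIterator;
import java.text.ParseException;
import java.text.RuleBasedCollator;

final class SetOffsetSketch {
    // Illustrative rules (an assumption): "ch" is a contracting sequence.
    static final String RULES = "< a < b < c < ch < d";

    /** Calls setOffset(offset) on a fresh iterator and reports getOffset(). */
    static int offsetAfterSet(String text, int offset) {
        RuleBasedCollator collator;
        try {
            collator = new RuleBasedCollator(RULES);
        } catch (ParseException e) {
            throw new IllegalStateException("invalid rule string", e);
        }
        CollationElementIterator it = collator.getCollationElementIterator(text);
        it.setOffset(offset);
        return it.getOffset();
    }

    public static void main(String[] args) {
        // In "acha", index 2 is the 'h' in the middle of the contraction "ch".
        System.out.println("set 2, got " + offsetAfterSet("acha", 2));
        // Index 3 is the final 'a', which is not part of any contraction.
        System.out.println("set 3, got " + offsetAfterSet("acha", 3));
    }
}
```

An offset inside the contraction is expected to be pulled back to the start of "ch", while an offset on an ordinary character is reported unchanged.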

setText
public void setText(String source)
	Set a new source string for iteration, and reset the offset to the beginning of the text.
	Parameters:
	source - the new source string for iteration.
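
Because setText(String) resets the offset, one iterator can be reused for several strings. The sketch below illustrates this under the assumption that the JDK's java.text.RuleBasedCollator and java.text.CollationElementIterator stand in for the ICU4J classes, so that it runs without the ICU4J jar. The rule string and the class name SetTextSketch are hypothetical.

```java
import java.text.CollationElementIterator;
import java.text.ParseException;
import java.text.RuleBasedCollator;

final class SetTextSketch {
    /** Counts the collation elements of each text, reusing a single iterator. */
    static int[] countAll(String... texts) {
        RuleBasedCollator collator;
        try {
            collator = new RuleBasedCollator("< a < b < c"); // illustrative rules
        } catch (ParseException e) {
            throw new IllegalStateException("invalid rule string", e);
        }
        CollationElementIterator it = collator.getCollationElementIterator("");
        int[] counts = new int[texts.length];
        for (int i = 0; i < texts.length; i++) {
            it.setText(texts[i]); // new source; iteration starts again at the beginning
            while (it.next() != CollationElementIterator.NULLORDER) {
                counts[i]++;
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(countAll("abc", "ab", "")));
    }
}
```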

setText

public void setText(UCharacterIterator source)

Set a new source string iterator for iteration, and reset the offset to the beginning of the text.

The source iterator's integrity will be preserved since a new copy will be created for use.

Parameters:
source - the new source string iterator for iteration.

setText
public void setText(CharacterIterator source)
	Set a new source string iterator for iteration, and reset the offset to the beginning of the text.
	Parameters:
	source - the new source string iterator for iteration.

setText

void setText(UCharacterIterator source, int offset)

Sets the iterator to point to the collation element corresponding to the specified character (the parameter is a CHARACTER offset in the original string, not an offset into its corresponding sequence of collation elements). The value returned by the next call to next() will be the collation element corresponding to the specified position in the text. Unlike the public method setOffset(int), this method does not try to readjust the offset to the start of a contracting sequence. getOffset() is guaranteed to return the same value as was passed to a preceding call to setOffset().

Parameters:
source - the new source string iterator for iteration.
Parameters:
offset - to the source

tertiaryOrder
final public static int tertiaryOrder(int ce)
	Return the tertiary order of the specified collation element, i.e. the last (least significant) 8 bits. This value is unsigned.
	Parameters:
	ce - the collation element
	the element's 8-bit tertiary order
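
Read together, primaryOrder, secondaryOrder, and tertiaryOrder describe a 32-bit layout of 16 primary bits, then 8 secondary bits, then 8 tertiary bits. The following sketch shows that layout with plain bit operations. It is not the ICU4J implementation; the class name CollationElementFields and the sample value 0x12345678 are hypothetical.

```java
final class CollationElementFields {
    /** Upper 16 bits, returned as an unsigned value in the range 0..0xFFFF. */
    static int primary(int ce) {
        return (ce >>> 16) & 0xFFFF;
    }

    /** Next 8 bits, returned as an unsigned value in the range 0..0xFF. */
    static int secondary(int ce) {
        return (ce >>> 8) & 0xFF;
    }

    /** Lowest 8 bits, returned as an unsigned value in the range 0..0xFF. */
    static int tertiary(int ce) {
        return ce & 0xFF;
    }

    public static void main(String[] args) {
        int ce = 0x12345678; // hypothetical collation element, not taken from real data
        System.out.printf("primary=0x%04x secondary=0x%02x tertiary=0x%02x%n",
                primary(ce), secondary(ce), tertiary(ce));
    }
}
```

The unsigned shift (>>>) matters for the primary order: with a signed shift, an element whose top bit is set would come back negative before masking.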

Methods inherited from java.lang.Object

native protected Object clone() throws CloneNotSupportedException
public boolean equals(Object obj)
protected void finalize() throws Throwable
final native public Class getClass()
native public int hashCode()
final native public void notify()
final native public void notifyAll()
public String toString()
final native public void wait(long timeout) throws InterruptedException
final public void wait(long timeout, int nanos) throws InterruptedException
final public void wait() throws InterruptedException
