sun.text.normalizer
Java Source File Name | Type | Comment |
CharacterIteratorWrapper.java | Class | |
CharTrie.java | Class | Trie implementation which stores data in char, 16 bits. |
ICUBinary.java | Class | |
ICUData.java | Class | Provides access to ICU data files as InputStreams. |
IntTrie.java | Class | Trie implementation which stores data in int, 32 bits. |
NormalizerBase.java | Class | Unicode normalization API.
normalize transforms Unicode text into an equivalent composed or
decomposed form, allowing for easier sorting and searching of text.
normalize supports the standard normalization forms described in
Unicode Standard Annex #15, Unicode Normalization Forms.
Characters with accents or other adornments can be encoded in
several different ways in Unicode. |
NormalizerDataReader.java | Class | |
NormalizerImpl.java | Class | |
RangeValueIterator.java | Interface | Interface for enabling iteration over sets of <int index, int value>,
where index is the sorted integer index in ascending order and value is its
associated integer value.
The result of each iteration is the consecutive range of
<int index, int value> with the same value. |
Replaceable.java | Interface | Replaceable is an interface representing a
string of characters that supports the replacement of a range of
itself with a new string of characters. |
ReplaceableString.java | Class | ReplaceableString is an adapter class that implements the
Replaceable API around an ordinary StringBuffer.
Note: This class does not support attributes and is not
intended for general use. |
ReplaceableUCharacterIterator.java | Class | DLF docs must define behavior when Replaceable is mutated underneath
the iterator. |
RuleCharacterIterator.java | Class | An iterator that returns 32-bit code points. |
SymbolTable.java | Interface | An interface that defines both lookup protocol and parsing of
symbolic names.
A symbol table maintains two kinds of mappings. |
Trie.java | Class | A trie is a kind of compressed, serializable table of values
associated with Unicode code points (0..0x10ffff).
This class defines the basic structure of a trie and provides methods
to retrieve the offsets to the actual data.
Data will be in the form of an array of basic types, char or int.
The actual data format will have to be specified by the user in the
inner static interface com.ibm.icu.impl.Trie.DataManipulate.
This trie implementation is optimized for getting offset while walking
forward through a UTF-16 string. |
TrieIterator.java | Class | Class enabling iteration of the values in a Trie.
Result of each iteration contains the interval of codepoints that have
the same value type and the value type itself.
Codepoint values are compared via extract(), whose default
implementation returns the value as is.
Method extract() can be overridden to manipulate
codepoint values in order to perform specialized comparison.
TrieIterator is designed to be a generic iterator for the CharTrie
and the IntTrie, hence to accommodate both types of data, the return
result will be in terms of int (32 bit) values.
See com.ibm.icu.text.UCharacterTypeIterator for examples of use.
Notes for porting utrie_enum from icu4c to icu4j:
Internally, icu4c's utrie_enum performs all iterations in its body. |
UCharacter.java | Class |
The UCharacter class provides extensions to the
java.lang.Character class. |
UCharacterIterator.java | Class | Abstract class that defines an API for iteration on text objects. This is an
interface for forward and backward iteration and random access into a text
object. |
UCharacterProperty.java | Class | Internal class used for Unicode character property database.
This class stores binary data read from uprops.icu.
It does not have the capability to parse the data into more high-level
information. |
UCharacterPropertyReader.java | Class | Internal reader class for ICU data file uprops.icu containing
Unicode codepoint data.
This class simply reads uprops.icu, authenticates that it is a valid
ICU data file and splits its contents up into blocks of data for use in
com.ibm.icu.impl.UCharacterProperty. |
UnicodeMatcher.java | Interface | UnicodeMatcher defines a protocol for objects that can
match a range of characters in a Replaceable string. |
UnicodeSet.java | Class | A mutable set of Unicode characters and multicharacter strings. |
UnicodeSetIterator.java | Class | UnicodeSetIterator iterates over the contents of a UnicodeSet. |
UProperty.java | Interface | Selection constants for Unicode properties. |
UTF16.java | Class | Standalone utility class providing UTF16 character conversions and
indexing conversions.
Code that uses strings alone rarely needs modification.
By design, UTF-16 does not allow overlap, so searching for strings is a safe
operation. |
Utility.java | Class | |
VersionInfo.java | Class | Class to store version numbers of the form major.minor.milli.micro. |
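
The NormalizerBase entry above notes that accented characters can be encoded in several different ways and that normalize maps them to an equivalent composed or decomposed form. The following is a minimal sketch of that idea. It assumes the public java.text.Normalizer API rather than the internal sun.text.normalizer classes listed in this table, and the class and method names (NormalizationSketch, compose, decompose, canonicallyEqual) are hypothetical names chosen for illustration.

```java
import java.text.Normalizer;

public class NormalizationSketch {
    // NFC: canonical decomposition followed by canonical composition.
    static String compose(String text) {
        return Normalizer.normalize(text, Normalizer.Form.NFC);
    }

    // NFD: canonical decomposition into base characters and combining marks.
    static String decompose(String text) {
        return Normalizer.normalize(text, Normalizer.Form.NFD);
    }

    // Two strings are treated as equal here if their NFD forms match.
    static boolean canonicallyEqual(String a, String b) {
        return decompose(a).equals(decompose(b));
    }

    public static void main(String[] args) {
        String precomposed = "\u00E9"; // e with acute accent as one code point
        String combining = "e\u0301";  // letter e followed by a combining acute accent

        System.out.println(precomposed.equals(combining));            // false
        System.out.println(canonicallyEqual(precomposed, combining)); // true
        System.out.println(compose(combining).length());              // 1
        System.out.println(decompose(precomposed).length());          // 2
    }
}
```

Comparing the normalized forms instead of the raw strings is what makes sorting and searching behave consistently regardless of how the input happened to encode its accents.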