| java.lang.Object sun.text.normalizer.UCharacterProperty
UCharacterProperty | final public class UCharacterProperty implements Trie.DataManipulate(Code) | | Internal class used for the Unicode character property database.
This class stores binary data read from uprops.icu.
It does not have the capability to parse the data into higher-level
information. It only returns bytes of information when required.
Because of the form most commonly used for retrieval, an array of char is used
to store the binary data.
UCharacterPropertyDB also contains information on accessing indexes to
significant points in the binary data.
Responsibility for molding the binary data into a more meaningful form lies with
UCharacter.
author: Syn Wee Quek since: release 2.1, February 1st 2002 |
Method Summary | |
public UnicodeSet | addPropertyStarts(UnicodeSet set) | public int | getAdditional(int codepoint) Gets the unicode additional properties. | public VersionInfo | getAge(int codepoint) Get the "age" of the code point.
The "age" is the Unicode version when the code point was first
designated (as a non-character or for Private Use) or assigned a
character.
This can be useful to avoid emitting code points to receiving
processes that do not accept newer characters.
The data is from the UCD file DerivedAge.txt.
This API does not check the validity of the codepoint.
Parameters: codepoint - The code point. | public int | getException(int index, int etype) Gets the exception value at the index, assuming that data type is
available. | public static int | getExceptionIndex(int prop) | public void | getFoldCase(int index, int count, StringBuffer str) | public int | getFoldingOffset(int value) Called by com.ibm.icu.util.Trie to extract from a lead surrogate's
data the index array offset of the indexes for that lead surrogate. | public UnicodeSet | getInclusions() | public static UCharacterProperty | getInstance() Loads the property data and initializes the UCharacterProperty instance. | public int | getProperty(int ch) Gets the property value at the index. | public static int | getRawSupplementary(char lead, char trail) | public static int | getSignedValue(int prop) | public boolean | hasExceptionValue(int index, int indicator) | public static boolean | isRuleWhiteSpace(int c) Checks if the argument c is to be treated as a white space in ICU
rules. | public void | setIndexData(CharTrie.FriendAgent friendagent) |
EXCEPTION_MASK | final public static int EXCEPTION_MASK(Code) | | Exception test mask
|
EXC_CASE_FOLDING_ | final public static int EXC_CASE_FOLDING_(Code) | | Exception indicator for case folding type
|
EXC_COMBINING_CLASS_ | final public static int EXC_COMBINING_CLASS_(Code) | | EXC_COMBINING_CLASS_ is not found in ICU.
Used to retrieve the combining class of the character in the exception
value
|
EXC_DENOMINATOR_VALUE_ | final public static int EXC_DENOMINATOR_VALUE_(Code) | | Exception indicator for denominator type
|
EXC_LOWERCASE_ | final public static int EXC_LOWERCASE_(Code) | | Exception indicator for lowercase type
|
EXC_MIRROR_MAPPING_ | final public static int EXC_MIRROR_MAPPING_(Code) | | Exception indicator for mirror type
|
EXC_NUMERIC_VALUE_ | final public static int EXC_NUMERIC_VALUE_(Code) | | Exception indicator for numeric type
|
EXC_SPECIAL_CASING_ | final public static int EXC_SPECIAL_CASING_(Code) | | Exception indicator for special casing type
|
EXC_TITLECASE_ | final public static int EXC_TITLECASE_(Code) | | Exception indicator for titlecase type
|
EXC_UNUSED_ | final public static int EXC_UNUSED_(Code) | | Exception indicator for digit type
|
EXC_UPPERCASE_ | final public static int EXC_UPPERCASE_(Code) | | Exception indicator for uppercase type
|
LATIN_SMALL_LETTER_I_ | final public static char LATIN_SMALL_LETTER_I_(Code) | | Latin lowercase i
|
TYPE_MASK | final public static int TYPE_MASK(Code) | | Character type mask
|
m_additionalColumnsCount_ | int m_additionalColumnsCount_(Code) | | Number of additional columns
|
m_additionalTrie_ | CharTrie m_additionalTrie_(Code) | | Extra property trie
|
m_additionalVectors_ | int m_additionalVectors_(Code) | | Extra property vectors, 1st column for age and second for binary
properties.
|
m_case_ | char m_case_(Code) | | Case table
|
m_exception_ | int m_exception_(Code) | | Exception property table
|
m_maxBlockScriptValue_ | int m_maxBlockScriptValue_(Code) | | Maximum values for block, bits used as in vector word 0
|
m_maxJTGValue_ | int m_maxJTGValue_(Code) | | Maximum values for script, bits used as in vector word 0
|
m_property_ | public int m_property_(Code) | | Character property table
|
m_trieData_ | public char[] m_trieData_(Code) | | Optimization
CharTrie data array
|
m_trieIndex_ | public char[] m_trieIndex_(Code) | | Optimization
CharTrie index array
|
m_trieInitialValue_ | public int m_trieInitialValue_(Code) | | Optimization
CharTrie data offset
|
getAdditional | public int getAdditional(int codepoint)(Code) | | Gets the Unicode additional properties.
The C version is getUnicodeProperties.
Parameters: codepoint - code point whose additional properties are to be retrieved Returns: Unicode properties |
getAge | public VersionInfo getAge(int codepoint)(Code) | | Get the "age" of the code point.
The "age" is the Unicode version when the code point was first
designated (as a non-character or for Private Use) or assigned a
character.
This can be useful to avoid emitting code points to receiving
processes that do not accept newer characters.
The data is from the UCD file DerivedAge.txt.
This API does not check the validity of the codepoint.
Parameters: codepoint - The code point. Returns: the Unicode version number |
getException | public int getException(int index, int etype)(Code) | | Gets the exception value at the index, assuming that data type is
available. Result is undefined if data is not available. Use
hasExceptionValue() to determine data's availability.
Parameters: index - Parameters: etype - exception data type Returns: exception data type value at index |
getExceptionIndex | public static int getExceptionIndex(int prop)(Code) | | Gets the exception index for the property argument.
Parameters: prop - character property Returns: exception index |
getFoldCase | public void getFoldCase(int index, int count, StringBuffer str)(Code) | | Gets the folded case value at the index
Parameters: index - of the case value to be retrieved Parameters: count - number of characters to retrieve Parameters: str - string buffer to which to append the result |
getFoldingOffset | public int getFoldingOffset(int value)(Code) | | Called by com.ibm.icu.util.Trie to extract from a lead surrogate's
data the index array offset of the indexes for that lead surrogate.
Parameters: value - data value for a surrogate from the trie, including the folding offset Returns: data offset or 0 if there is no data for the lead surrogate |
getProperty | public int getProperty(int ch)(Code) | | Gets the property value at the index.
This is optimized.
Note this is a little different from CharTrie: the index into m_trieData_
is never negative.
Parameters: ch - code point whose property value is to be retrieved Returns: property value of code point |
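The lookup that getProperty performs can be pictured as a two-stage table: an index array selects a block, and the low bits of the character select an entry within that block. The following is only a minimal sketch of that general idea, restricted to BMP characters. The class name, the block size (SHIFT), and the way offsets are stored are assumptions made for illustration and are not taken from this class. Because both arrays hold char values, which are unsigned in Java, the computed position in the data array cannot be negative.

```java
// Hypothetical two-stage lookup table; not the actual layout used by UCharacterProperty.
final class TwoStageLookupSketch {
    static final int SHIFT = 5;                 // assumed block size of 32 entries
    static final int MASK = (1 << SHIFT) - 1;   // low bits select the entry inside a block

    // Stage 1: one entry per block, holding the start offset of that block in data.
    final char[] index = new char[0x10000 >> SHIFT];
    // Stage 2: the property values; blocks with identical content can share an offset.
    final char[] data;

    TwoStageLookupSketch(char[] data) {
        this.data = data;
    }

    /** Returns the value stored for a BMP character. */
    int lookup(char ch) {
        // char is unsigned, so the sum below is always a non-negative array position.
        return data[index[ch >> SHIFT] + (ch & MASK)];
    }

    public static void main(String[] args) {
        char[] data = new char[64];             // block 0 (all zero) and one custom block
        data[32 + ('A' & MASK)] = 9;            // arbitrary demo value for 'A'
        TwoStageLookupSketch table = new TwoStageLookupSketch(data);
        table.index['A' >> SHIFT] = 32;         // the block containing 'A' starts at offset 32
        System.out.println(table.lookup('A'));  // 9
        System.out.println(table.lookup('a'));  // 0, served from the shared default block
    }
}
```

Every block that is not explicitly assigned keeps offset 0 and therefore shares the default block, which is what keeps such a table compact.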
getRawSupplementary | public static int getRawSupplementary(char lead, char trail)(Code) | | Forms a supplementary code point from the argument characters.
Note this is for internal use, hence no checks for the validity of the
surrogate characters are done.
Parameters: lead - lead surrogate character Parameters: trail - trailing surrogate character Returns: code point of the supplementary character |
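As an illustration of what forming a supplementary code point from a surrogate pair involves, here is a minimal sketch of the standard UTF-16 arithmetic. The class, method, and constant names are made up for this example, and the code is not claimed to be the implementation of getRawSupplementary. Like the documented method, it performs no validation of its arguments.

```java
// Illustrative only: standard UTF-16 surrogate arithmetic with hypothetical names.
final class SurrogateSketch {
    // Removes the two surrogate bases and adds the start of the supplementary range.
    private static final int SURROGATE_OFFSET = 0x10000 - (0xD800 << 10) - 0xDC00;

    /** Combines a lead and a trail surrogate without validating either argument. */
    static int rawSupplementary(char lead, char trail) {
        return (lead << 10) + trail + SURROGATE_OFFSET;
    }

    public static void main(String[] args) {
        // U+1F600 is encoded in UTF-16 as the pair D83D DE00.
        int cp = rawSupplementary((char) 0xD83D, (char) 0xDE00);
        System.out.println(Integer.toHexString(cp)); // prints "1f600"
    }
}
```

For valid pairs the result matches java.lang.Character.toCodePoint(lead, trail). For invalid input the result is simply whatever the arithmetic yields, which is why callers are expected to pass real surrogates.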
getSignedValue | public static int getSignedValue(int prop)(Code) | | Gets the signed numeric value of a character embedded in the property
argument.
Parameters: prop - the character Returns: signed numeric value |
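A signed value embedded in the high bits of a property word can be recovered with an arithmetic right shift, which preserves the sign. The sketch below shows that technique only. The shift amount of 20 and the pack helper are assumptions chosen for the example, and the actual bit layout of the property word is not stated on this page.

```java
// Illustrative only: VALUE_SHIFT and pack() are hypothetical, not taken from this class.
final class SignedFieldSketch {
    static final int VALUE_SHIFT = 20; // assumed position of the signed field

    /** Extracts the signed field; >> (not >>>) keeps the sign bit. */
    static int signedValue(int prop) {
        return prop >> VALUE_SHIFT;
    }

    /** Builds a demo property word from a signed value and some low bits. */
    static int pack(int value, int lowBits) {
        return (value << VALUE_SHIFT) | (lowBits & ((1 << VALUE_SHIFT) - 1));
    }

    public static void main(String[] args) {
        System.out.println(signedValue(pack(-3, 0x5)));     // -3
        System.out.println(signedValue(pack(7, 0xFFFFF)));  // 7
    }
}
```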
hasExceptionValue | public boolean hasExceptionValue(int index, int indicator)(Code) | | Determines if the exception value passed in has the kind of information
which the indicator wants, e.g. if the exception value contains the digit
value of the character.
Parameters: index - exception index Parameters: indicator - type indicator Returns: true if the type value exists |
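The intended calling pattern is to test for a value with hasExceptionValue() before reading it with getException(). The sketch below shows one way such an indicator scheme can work. The record layout is an assumption made for this example (a flags word followed by only the values whose flag bit is set), and the indicator numbers are placeholders, not the values of the EXC_ constants listed above.

```java
// Hypothetical exception-record layout; not confirmed to match UCharacterProperty.
final class ExceptionRecordSketch {
    // Placeholder indicator numbers for the demo.
    static final int UPPERCASE = 0;
    static final int LOWERCASE = 1;
    static final int TITLECASE = 2;

    private final int[] table;

    ExceptionRecordSketch(int[] table) {
        this.table = table;
    }

    /** True if the record starting at index carries a value of the given type. */
    boolean hasValue(int index, int indicator) {
        return (table[index] & (1 << indicator)) != 0;
    }

    /** Reads the value of type etype; only meaningful when hasValue() returned true. */
    int getValue(int index, int etype) {
        // Values are stored in indicator order, so skip one slot per lower flag that is set.
        int lowerFlags = table[index] & ((1 << etype) - 1);
        return table[index + 1 + Integer.bitCount(lowerFlags)];
    }

    public static void main(String[] args) {
        // One record: uppercase and titlecase values present, lowercase absent.
        int[] table = { 0b101, 0x01C4, 0x01C5 };
        ExceptionRecordSketch exc = new ExceptionRecordSketch(table);
        if (exc.hasValue(0, TITLECASE)) {
            System.out.println(Integer.toHexString(exc.getValue(0, TITLECASE))); // 1c5
        }
        System.out.println(exc.hasValue(0, LOWERCASE)); // false
    }
}
```

Skipping absent values keeps records short, at the cost of requiring the presence check before every read.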
isRuleWhiteSpace | public static boolean isRuleWhiteSpace(int c)(Code) | | Checks if the argument c is to be treated as a white space in ICU
rules. Usually ICU rule white spaces are ignored unless quoted.
Parameters: c - code point to check Returns: true if c is an ICU white space |
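To make "ignored unless quoted" concrete, here is a minimal sketch. It assumes that rule white space corresponds to the Unicode Pattern_White_Space set (U+0009 to U+000D, U+0020, U+0085, U+200E, U+200F, U+2028, U+2029); this page does not list the set, so treat that as an assumption. The stripUnquoted helper and its single-quote handling are hypothetical and simplified (no escape processing).

```java
// Illustrative only: the character set and the quoting rule are assumptions.
final class RuleWhiteSpaceSketch {
    /** Assumed definition: membership in the Unicode Pattern_White_Space set. */
    static boolean isRuleWhiteSpace(int c) {
        return c >= 0x0009 && c <= 0x2029
            && (c <= 0x000D || c == 0x0020 || c == 0x0085
                || c == 0x200E || c == 0x200F || c >= 0x2028);
    }

    /** Drops rule white space outside single quotes; quoted text is kept verbatim. */
    static String stripUnquoted(String rule) {
        StringBuilder out = new StringBuilder();
        boolean inQuote = false;
        for (int i = 0; i < rule.length(); ) {
            int c = rule.codePointAt(i);
            i += Character.charCount(c);
            if (c == '\'') {
                inQuote = !inQuote;
            }
            if (inQuote || !isRuleWhiteSpace(c)) {
                out.appendCodePoint(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(stripUnquoted("a > b ; 'c d'")); // a>b;'c d'
    }
}
```

Note that this set differs from Character.isWhitespace: for example, U+00A0 (no-break space) is not in it, while U+200E (left-to-right mark) is.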