| java.lang.Object com.ibm.icu.impl.UCharacterProperty
UCharacterProperty | final public class UCharacterProperty (Code) | | Internal class used for Unicode character property database.
This class stores binary data read from uprops.icu.
It cannot parse the data into higher-level
information; it only returns bytes of information when required.
Because of the form most commonly used for retrieval, an array of char is used
to store the binary data.
UCharacterPropertyDB also contains information on accessing indexes to
significant points in the binary data.
Responsibility for molding the binary data into a more meaningful form lies with
UCharacter.
author: Syn Wee Quek since: release 2.1, February 1st 2002 |
Method Summary | |
public UnicodeSet | addPropertyStarts(UnicodeSet set) | public int | getAdditional(int codepoint, int column) Gets the unicode additional properties. | public VersionInfo | getAge(int codepoint) Get the "age" of the code point.
The "age" is the Unicode version when the code point was first
designated (as a non-character or for Private Use) or assigned a
character.
This can be useful to avoid emitting code points to receiving
processes that do not accept newer characters.
The data is from the UCD file DerivedAge.txt.
This API does not check the validity of the codepoint.
Parameters: codepoint - The code point. | public static UCharacterProperty | getInstance() Loads the property data and initializes the UCharacterProperty instance. | final public static int | getMask(int type) | public int | getMaxValues(int column) Gets the maximum values for some enum/int properties. | final public int | getProperty(int ch) Gets the property value at the index. | public static int | getRawSupplementary(char lead, char trail) | public static int | getSignedValue(int prop) | final public int | getSource(int which) | public static int | getUnsignedValue(int prop) | public boolean | hasBinaryProperty(int codepoint, int property) Check a binary Unicode property for a code point.
Unicode, especially in version 3.2, defines many more properties
than the original set in UnicodeData.txt.
This API is intended to reflect Unicode properties as defined in
the Unicode Character Database (UCD) and Unicode Technical Reports
(UTR).
For details about the properties see
http://www.unicode.org/.
For names of Unicode properties see the UCD file
PropertyAliases.txt.
This API does not check the validity of the codepoint.
Important: If ICU is built with UCD files from Unicode versions
below 3.2, then properties marked with "new" are unavailable or
only partially available.
Parameters: codepoint - Code point to test. Parameters: property - selector constant from com.ibm.icu.lang.UProperty, identifies which binary property to check. | public static boolean | isRuleWhiteSpace(int c) Checks if the argument c is to be treated as a white space in ICU
rules. | public void | setIndexData(CharTrie.FriendAgent friendagent) | public void | uhst_addPropertyStarts(UnicodeSet set) | public void | upropsvec_addPropertyStarts(UnicodeSet set) |
LATIN_CAPITAL_LETTER_I_WITH_DOT_ABOVE_ | final public static char LATIN_CAPITAL_LETTER_I_WITH_DOT_ABOVE_(Code) | | Latin capital letter i with dot above
|
LATIN_SMALL_LETTER_DOTLESS_I_ | final public static char LATIN_SMALL_LETTER_DOTLESS_I_(Code) | | Latin small letter dotless i
|
LATIN_SMALL_LETTER_I_ | final public static char LATIN_SMALL_LETTER_I_(Code) | | Latin lowercase i
|
MY_MASK | final static int MY_MASK(Code) | | |
NT_COUNT | final public static int NT_COUNT(Code) | | |
NT_FRACTION | final public static int NT_FRACTION(Code) | | |
NT_LARGE | final public static int NT_LARGE(Code) | | |
SRC_BIDI | final public static int SRC_BIDI(Code) | | From ubidi_props.c/ubidi.icu
|
SRC_CASE | final public static int SRC_CASE(Code) | | From ucase.c/ucase.icu
|
SRC_CHAR | final public static int SRC_CHAR(Code) | | From uchar.c/uprops.icu main trie
|
SRC_CHAR_AND_PROPSVEC | final public static int SRC_CHAR_AND_PROPSVEC(Code) | | From uchar.c/uprops.icu main trie as well as properties vectors trie
|
SRC_COUNT | final public static int SRC_COUNT(Code) | | One more than the highest UPropertySource (SRC_) constant.
|
SRC_HST | final public static int SRC_HST(Code) | | Hangul_Syllable_Type, from uchar.c/uprops.icu
|
SRC_NAMES | final public static int SRC_NAMES(Code) | | From unames.c/unames.icu
|
SRC_NONE | final public static int SRC_NONE(Code) | | No source, not a supported property.
|
SRC_NORM | final public static int SRC_NORM(Code) | | From unorm.cpp/unorm.icu
|
SRC_PROPSVEC | final public static int SRC_PROPSVEC(Code) | | From uchar.c/uprops.icu properties vectors trie
|
TYPE_MASK | final public static int TYPE_MASK(Code) | | Character type mask
|
binProps | BinaryProperties[] binProps(Code) | | |
m_additionalColumnsCount_ | int m_additionalColumnsCount_(Code) | | Number of additional columns
|
m_additionalTrie_ | CharTrie m_additionalTrie_(Code) | | Extra property trie
|
m_additionalVectors_ | int m_additionalVectors_(Code) | | Extra property vectors, 1st column for age and second for binary
properties.
|
m_maxBlockScriptValue_ | int m_maxBlockScriptValue_(Code) | | Maximum values for block, bits used as in vector word 0
|
m_maxJTGValue_ | int m_maxJTGValue_(Code) | | Maximum values for script, bits used as in vector word 0
|
m_trieData_ | public char[] m_trieData_(Code) | | Optimization
CharTrie data array
|
m_trieIndex_ | public char[] m_trieIndex_(Code) | | Optimization
CharTrie index array
|
m_trieInitialValue_ | public int m_trieInitialValue_(Code) | | Optimization
CharTrie data offset
|
getAdditional | public int getAdditional(int codepoint, int column)(Code) | | Gets the additional Unicode properties.
Corresponds to getUnicodeProperties in the C version.
Parameters: codepoint - code point whose additional properties are to be retrieved Parameters: column - the column to read Returns: the Unicode properties in that column |
getAge | public VersionInfo getAge(int codepoint)(Code) | | Get the "age" of the code point.
The "age" is the Unicode version when the code point was first
designated (as a non-character or for Private Use) or assigned a
character.
This can be useful to avoid emitting code points to receiving
processes that do not accept newer characters.
The data is from the UCD file DerivedAge.txt.
This API does not check the validity of the codepoint.
Parameters: codepoint - The code point. Returns: the Unicode version number |
getMask | final public static int getMask(int type)(Code) | | Gets the type mask.
Parameters: type - character type Returns: mask |
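The following is a minimal sketch of how a type mask is commonly used, not the class's actual implementation. It assumes that the mask for a character type is a single bit at the position of the type number (1 << type), which this page does not confirm. TypeMaskSketch, maskOf and isInCategories are hypothetical names, and java.lang.Character.getType stands in for the ICU type lookup.

```java
final class TypeMaskSketch {
    // Assumption: one bit per character type value.
    static int maskOf(int type) {
        return 1 << type;
    }

    // True if the type of codePoint is one of the types set in categoryMask.
    static boolean isInCategories(int codePoint, int categoryMask) {
        return (maskOf(Character.getType(codePoint)) & categoryMask) != 0;
    }

    public static void main(String[] args) {
        // Combine several types into one mask, then test with a single AND.
        int letters = maskOf(Character.UPPERCASE_LETTER)
                    | maskOf(Character.LOWERCASE_LETTER);
        System.out.println(isInCategories('A', letters));
        System.out.println(isInCategories('1', letters));
    }
}
```

Combining masks with bitwise OR lets a caller test membership in several character types with one operation instead of a chain of comparisons.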
getMaxValues | public int getMaxValues(int column)(Code) | | Gets the maximum values for some enum/int properties.
Returns: maximum values for the integer properties. |
getProperty | final public int getProperty(int ch)(Code) | | Gets the property value at the index.
This is optimized.
Note that this differs slightly from CharTrie: the index into m_trieData_
is never negative.
Parameters: ch - code point whose property value is to be retrieved Returns: property value of the code point |
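To illustrate the general idea of an index array plus a data array, here is a toy two-stage lookup. It is a sketch under stated assumptions, not the CharTrie algorithm: the block size of 32, the class TwoStageLookupSketch and its fields are all hypothetical, and supplementary code points are not handled.

```java
final class TwoStageLookupSketch {
    private static final int SHIFT = 5;               // assumed block size: 32 entries
    private static final int MASK = (1 << SHIFT) - 1; // low bits select the entry in a block

    private final char[] index; // per block: start offset of that block in data
    private final char[] data;  // property values, stored block by block

    TwoStageLookupSketch(char[] index, char[] data) {
        this.index = index;
        this.data = data;
    }

    // Java char is unsigned, so the offset computed here cannot be negative.
    int lookup(char c) {
        int blockStart = index[c >> SHIFT];
        return data[blockStart + (c & MASK)];
    }

    public static void main(String[] args) {
        // Blocks 0, 1 and 3 share the all-zero block at offset 0;
        // block 2 (code units 64..95) has its own block at offset 32.
        char[] index = {0, 0, 32, 0};
        char[] data = new char[64];
        data[32 + ('A' & MASK)] = 7; // hypothetical property value for 'A'
        TwoStageLookupSketch trie = new TwoStageLookupSketch(index, data);
        System.out.println(trie.lookup('A'));
        System.out.println(trie.lookup('a'));
    }
}
```

Sharing identical blocks is what keeps such a table compact: most ranges of code points have the same properties and can point at one block.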
getRawSupplementary | public static int getRawSupplementary(char lead, char trail)(Code) | | Forms a supplementary code point from the argument characters.
Note that this is for internal use, hence the validity of the
surrogate characters is not checked.
Parameters: lead - lead surrogate character Parameters: trail - trail surrogate character Returns: code point of the supplementary character |
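The combination of a surrogate pair can be sketched with the standard UTF-16 arithmetic. This is an illustration, not necessarily how the method is written; RawSupplementarySketch and rawSupplementary are hypothetical names. As with the method described above, the inputs are not validated.

```java
final class RawSupplementarySketch {
    // Standard UTF-16 decoding: 10 bits from each surrogate, offset by 0x10000.
    // No check is made that lead and trail really are surrogates.
    static int rawSupplementary(char lead, char trail) {
        return ((lead - 0xD800) << 10) + (trail - 0xDC00) + 0x10000;
    }

    public static void main(String[] args) {
        int cp = rawSupplementary('\uD83D', '\uDE00');
        System.out.println(Integer.toHexString(cp)); // prints 1f600
        // For valid pairs this agrees with the JDK's own conversion.
        System.out.println(cp == Character.toCodePoint('\uD83D', '\uDE00')); // prints true
    }
}
```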
getSignedValue | public static int getSignedValue(int prop)(Code) | | Gets the signed numeric value of a character embedded in the property
argument.
Parameters: prop - the property value of the character Returns: signed numeric value |
getSource | final public int getSource(int which)(Code) | | |
getUnsignedValue | public static int getUnsignedValue(int prop)(Code) | | Gets the unsigned numeric value of a character embedded in the property
argument.
Parameters: prop - the property value of the character Returns: unsigned numeric value |
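The difference between reading an embedded value as signed or unsigned can be sketched as follows. The field position (VALUE_SHIFT = 8) is an assumption made for illustration and is not stated on this page; PackedValueSketch and its methods are hypothetical names.

```java
final class PackedValueSketch {
    // Assumption: the numeric value occupies the bits above bit 7.
    static final int VALUE_SHIFT = 8;

    // Arithmetic shift: the sign bit is propagated, so negative values survive.
    static int signedValue(int prop) {
        return prop >> VALUE_SHIFT;
    }

    // Logical shift: the vacated high bits are zero-filled, so the result is never negative.
    static int unsignedValue(int prop) {
        return prop >>> VALUE_SHIFT;
    }

    public static void main(String[] args) {
        int prop = (-3 << VALUE_SHIFT) | 0x09; // value -3, low 8 bits hold other data
        System.out.println(signedValue(prop));   // prints -3
        System.out.println(unsignedValue(prop)); // prints 16777213
    }
}
```

For non-negative embedded values the two readings agree; they differ only when the top bit of the property word is set.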
hasBinaryProperty | public boolean hasBinaryProperty(int codepoint, int property)(Code) | | Check a binary Unicode property for a code point.
Unicode, especially in version 3.2, defines many more properties
than the original set in UnicodeData.txt.
This API is intended to reflect Unicode properties as defined in
the Unicode Character Database (UCD) and Unicode Technical Reports
(UTR).
For details about the properties see
http://www.unicode.org/.
For names of Unicode properties see the UCD file
PropertyAliases.txt.
This API does not check the validity of the codepoint.
Important: If ICU is built with UCD files from Unicode versions
below 3.2, then properties marked with "new" are unavailable or
only partially available.
Parameters: codepoint - Code point to test. Parameters: property - selector constant from com.ibm.icu.lang.UProperty, identifies which binary property to check. Returns: true or false according to the binary Unicode property value for codepoint. Also false if property is out of bounds or if the Unicode version does not have data for the property at all, or not for this code point. See Also: com.ibm.icu.lang.UProperty |
isRuleWhiteSpace | public static boolean isRuleWhiteSpace(int c)(Code) | | Checks if the argument c is to be treated as a white space in ICU
rules. ICU rule white space is usually ignored unless quoted.
Equivalent to testing for the Pattern_White_Space Unicode property.
This is a stable set of characters that will not change.
See UAX #31 Identifier and Pattern Syntax: http://www.unicode.org/reports/tr31/
Parameters: c - code point to check Returns: true if c is an ICU white space |
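Because Pattern_White_Space is a small, stable set, an equivalent test can be written out directly. The sketch below lists the eleven code points that UAX #31 assigns to Pattern_White_Space; PatternWhiteSpaceSketch is a hypothetical name and this is not necessarily how isRuleWhiteSpace is implemented.

```java
final class PatternWhiteSpaceSketch {
    // Pattern_White_Space per UAX #31:
    // U+0009..U+000D, U+0020, U+0085, U+200E, U+200F, U+2028, U+2029.
    static boolean isPatternWhiteSpace(int c) {
        return (c >= 0x0009 && c <= 0x000D)
            || c == 0x0020
            || c == 0x0085
            || c == 0x200E || c == 0x200F
            || c == 0x2028 || c == 0x2029;
    }

    public static void main(String[] args) {
        System.out.println(isPatternWhiteSpace(' '));    // prints true
        System.out.println(isPatternWhiteSpace(0x00A0)); // prints false (no-break space)
    }
}
```

Note that the set differs from general white space: it includes the directional marks U+200E and U+200F but excludes U+00A0 and the other space separators.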
upropsvec_addPropertyStarts | public void upropsvec_addPropertyStarts(UnicodeSet set)(Code) | | |