Java Doc for UCharacter.java (6.0 JDK Modules sun » text » sun.text.normalizer)

java.lang.Object

sun.text.normalizer.UCharacter

UCharacter

final public class UCharacter

The UCharacter class provides extensions to the java.lang.Character class. These extensions provide support for Unicode 3.2 properties and together with the UTF16 class, provide support for supplementary characters (those with code points above U+FFFF).

Code points are represented in this API using ints. While it would be more convenient in Java to have a separate primitive datatype for them, ints suffice in the meantime.
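As a minimal sketch of what "code points as ints" means in practice, the following uses only java.lang.String and java.lang.Character (not UCharacter or UTF16, which are assumed to be unavailable to ordinary application code). The class and method names are hypothetical.

```java
// Sketch only: relies on java.lang APIs, not on sun.text.normalizer.
final class CodePointSketch {
    /** Counts code points, treating each valid surrogate pair as one unit. */
    static int countCodePoints(String s) {
        int count = 0;
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);     // an int; may be above U+FFFF
            i += Character.charCount(cp);  // advance by 1 or 2 chars
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // 'a' followed by the supplementary code point U+1D50A
        String s = "a\uD835\uDD0A";
        System.out.println("chars: " + s.length());
        System.out.println("code points: " + countCodePoints(s));
    }
}
```

A supplementary character occupies two chars in a String, which is why the loop advances by Character.charCount rather than by one.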

To use this class, please add the jar file icu4j.jar to the class path, since it contains the data files which supply the information used by this class.
E.g. in Windows:
set CLASSPATH=%CLASSPATH%;$JAR_FILE_PATH/ucharacter.jar
Alternatively, copy the files uprops.dat and unames.icu from the icu4j source subdirectory $ICU4J_SRC/src/com.ibm.icu.impl.data to your class directory $ICU4J_CLASS/com.ibm.icu.impl.data.

Aside from the additions for UTF-16 support, and the updated Unicode 3.2 properties, the main differences between UCharacter and Character are:

UCharacter is not designed to be a char wrapper and does not have APIs that involve managing a single char.
These include:
- char charValue(),
- int compareTo(java.lang.Character, java.lang.Character), etc.
UCharacter does not include Character APIs that are deprecated, nor does it include the Java-specific character information, such as boolean isJavaIdentifierPart(char ch).
Character maps characters 'A' - 'Z' and 'a' - 'z' to the numeric values '10' - '35'. UCharacter also does this in digit and getNumericValue, to adhere to the Java semantics of these methods. The new methods unicodeDigit and getUnicodeNumericValue do not treat the above code points as having numeric values. This is a semantic change from ICU4J 1.3.1.
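The letter-to-number mapping described in the last point can be observed directly on java.lang.Character, whose semantics UCharacter.digit is said to follow. This is a small illustration using only the standard class; the wrapper class name is hypothetical.

```java
// Sketch only: shows java.lang.Character's treatment of Latin letters.
final class LetterValueSketch {
    /** Returns Character's values for 'A' and 'z' via digit and getNumericValue. */
    static int[] letterValues() {
        return new int[] {
            Character.digit('A', 36),        // letter treated as a digit in radix 36
            Character.digit('z', 36),
            Character.getNumericValue('A'),  // same mapping, no radix involved
            Character.getNumericValue('z')
        };
    }

    public static void main(String[] args) {
        for (int v : letterValues()) {
            System.out.println(v);
        }
    }
}
```

By contrast, the text above states that getUnicodeNumericValue does not assign numeric values to these letters.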

Further detailed differences can be determined using the program com.ibm.icu.dev.test.lang.UCharacterCompare.

This class is not subclassable.

author:
Syn Wee Quek
See Also: com.ibm.icu.lang.UCharacterEnums

Inner Class: public static interface NumericType

Inner Class: public static interface HangulSyllableType

Inner Class: public static interface ECharacterCategory

Field Summary
final public static int	MAX_VALUE The highest Unicode code point value (scalar value) according to the Unicode Standard.
final public static int	MIN_VALUE The lowest Unicode code point value.
final public static double	NO_NUMERIC_VALUE Special value that is returned by getUnicodeNumericValue(int) when no numeric value is defined for a code point.
final public static int	SUPPLEMENTARY_MIN_VALUE

Method Summary
public static int	digit(int ch, int radix) Retrieves the numeric value of a decimal digit code point.
public static String	foldCase(String str, boolean defaultmapping) Maps the given string to its case folding equivalent according to UnicodeData.txt and CaseFolding.txt.
public static VersionInfo	getAge(int ch) Gets the "age" of the code point.
public static int	getCodePoint(char lead, char trail) Returns a code point corresponding to the two UTF16 characters.
public static int	getDirection(int ch) Returns the Bidirection property of a code point.
public static int	getIntPropertyValue(int ch, int type) Gets the property value for a Unicode property type of a code point.
public static int	getType(int ch) Returns a value indicating a code point's Unicode category.
public static double	getUnicodeNumericValue(int ch) Gets the numeric value for a Unicode code point as defined in the Unicode Character Database.

Field Detail

MAX_VALUE
final public static int MAX_VALUE
	The highest Unicode code point value (scalar value) according to the Unicode Standard. This is a 21-bit value (21 bits, rounded up). Up-to-date Unicode implementation of java.lang.Character.MAX_VALUE.

MIN_VALUE
final public static int MIN_VALUE
	The lowest Unicode code point value.

NO_NUMERIC_VALUE
final public static double NO_NUMERIC_VALUE
	Special value that is returned by getUnicodeNumericValue(int) when no numeric value is defined for a code point. See Also: UCharacter.getUnicodeNumericValue

SUPPLEMENTARY_MIN_VALUE
final public static int SUPPLEMENTARY_MIN_VALUE
	The minimum value for supplementary code points.
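The code point range described by these fields can be checked against the corresponding java.lang.Character constants. This sketch assumes (it is not stated on this page) that MAX_VALUE, MIN_VALUE and SUPPLEMENTARY_MIN_VALUE line up with Character.MAX_CODE_POINT, Character.MIN_CODE_POINT and Character.MIN_SUPPLEMENTARY_CODE_POINT; the class name is hypothetical.

```java
// Sketch only: inspects java.lang.Character constants, not the UCharacter fields.
final class CodePointRangeSketch {
    /** Number of bits needed to represent a non-negative int value. */
    static int bitsNeeded(int value) {
        return 32 - Integer.numberOfLeadingZeros(value);
    }

    public static void main(String[] args) {
        System.out.println(Integer.toHexString(Character.MIN_CODE_POINT));
        System.out.println(Integer.toHexString(Character.MIN_SUPPLEMENTARY_CODE_POINT));
        System.out.println(Integer.toHexString(Character.MAX_CODE_POINT));
        System.out.println(bitsNeeded(Character.MAX_CODE_POINT) + " bits");
    }
}
```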

Method Detail

digit
public static int digit(int ch, int radix)
	Retrieves the numeric value of a decimal digit code point. This method observes the semantics of `java.lang.Character.digit()`. Note that this will return positive values for code points for which isDigit returns false, just like java.lang.Character.
	Semantic Change: In release 1.3.1 and prior, this did not treat the European letters as having a digit value, and also treated numeric letters and other numbers as digits. This has been changed to conform to the Java semantics.
	A code point is a valid digit if and only if ch is a decimal digit or one of the European letters, and the value of ch is less than the specified radix.
	Parameters: ch - the code point to query
	Parameters: radix - the radix
	Returns: the numeric value represented by the code point in the specified radix, or -1 if the code point is not a decimal digit or if its value is too large for the radix
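The validity rule above can be sketched in a few lines. This is an illustrative re-implementation under simplifying assumptions, not the UCharacter source: it treats only ASCII 'A'-'Z' and 'a'-'z' as "European letters" and borrows java.lang.Character for the decimal-digit test. The class and method names are hypothetical.

```java
// Sketch only: a simplified version of the rule, limited to ASCII letters.
final class DigitRuleSketch {
    static int digitSketch(int ch, int radix) {
        // Assumption: radix limits match java.lang.Character (2..36).
        if (radix < Character.MIN_RADIX || radix > Character.MAX_RADIX) {
            return -1;
        }
        int value;
        if (Character.isDigit(ch)) {
            value = Character.getNumericValue(ch); // 0..9 for decimal digits
        } else if (ch >= 'A' && ch <= 'Z') {
            value = ch - 'A' + 10;                 // 'A'..'Z' map to 10..35
        } else if (ch >= 'a' && ch <= 'z') {
            value = ch - 'a' + 10;                 // 'a'..'z' map to 10..35
        } else {
            return -1;                             // neither digit nor letter
        }
        return value < radix ? value : -1;         // too large for the radix
    }

    public static void main(String[] args) {
        System.out.println(digitSketch('7', 8));
        System.out.println(digitSketch('8', 8));
        System.out.println(digitSketch('f', 16));
        System.out.println(digitSketch(0x0663, 10)); // ARABIC-INDIC DIGIT THREE
    }
}
```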

foldCase
public static String foldCase(String str, boolean defaultmapping)
	The given string is mapped to its case folding equivalent according to UnicodeData.txt and CaseFolding.txt; if any character has no case folding equivalent, the character itself is returned. "Full", multiple-code point case folding mappings are returned here. For "simple" single-code point mappings use the API foldCase(int ch, boolean defaultmapping).
	Parameters: str - the String to be converted
	Parameters: defaultmapping - Indicates whether all mappings defined in CaseFolding.txt are to be used; otherwise the mappings for dotted I and dotless i marked with 'I' in CaseFolding.txt will be skipped.
	Returns: the case folding equivalent of the string, if any; otherwise the string itself.
	See Also: UCharacter.foldCase(int,boolean)
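java.lang.String has no case folding method, so the following is only a rough analogy using case conversion. It is meant to show two ideas mentioned above: a "full" mapping may turn one code point into several, and the letter I needs special handling for Turkic languages. The class and method names are hypothetical, and none of this calls foldCase itself.

```java
import java.util.Locale;

// Sketch only: case conversion via java.lang.String, used as a stand-in for folding.
final class CaseMappingSketch {
    /** Locale-independent upper-casing; may change the string length. */
    static String upperRoot(String s) {
        return s.toUpperCase(Locale.ROOT);
    }

    /** Lower-casing under the rules of the given language tag, e.g. "tr" or "en". */
    static String lowerIn(String s, String languageTag) {
        return s.toLowerCase(Locale.forLanguageTag(languageTag));
    }

    public static void main(String[] args) {
        // One code point (sharp s) becomes two letters: a multi-code point mapping.
        System.out.println(upperRoot("\u00DF").length());
        // Under Turkish rules, 'I' lower-cases to dotless i (U+0131), not to 'i'.
        System.out.println((int) lowerIn("I", "tr").charAt(0));
        System.out.println((int) lowerIn("I", "en").charAt(0));
    }
}
```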

getAge

public static VersionInfo getAge(int ch)

Get the "age" of the code point.

The "age" is the Unicode version when the code point was first designated (as a non-character or for Private Use) or assigned a character.

This can be useful to avoid emitting code points to receiving processes that do not accept newer characters.

The data is from the UCD file DerivedAge.txt.

Parameters:
ch - The code point.
Returns:
the Unicode version number

getCodePoint
public static int getCodePoint(char lead, char trail)
	Returns a code point corresponding to the two UTF16 characters.
	Parameters: lead - the lead char
	Parameters: trail - the trail char
	Returns: code point if surrogate characters are valid.
	exception: IllegalArgumentException - thrown when argument characters do not form a valid code point
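The arithmetic behind combining a lead and trail surrogate is standard UTF-16 and can be sketched as follows. This is an illustration of the technique, not the UCharacter source; the class and method names are hypothetical, and the result is compared against java.lang.Character.toCodePoint in main.

```java
// Sketch only: standard UTF-16 surrogate pair decoding with validation.
final class SurrogatePairSketch {
    static int codePointFromPair(char lead, char trail) {
        boolean leadOk = lead >= 0xD800 && lead <= 0xDBFF;    // high-surrogate range
        boolean trailOk = trail >= 0xDC00 && trail <= 0xDFFF; // low-surrogate range
        if (!leadOk || !trailOk) {
            throw new IllegalArgumentException("not a valid surrogate pair");
        }
        // Each surrogate carries 10 bits; the pair encodes values from U+10000 up.
        return ((lead - 0xD800) << 10) + (trail - 0xDC00) + 0x10000;
    }

    public static void main(String[] args) {
        int cp = codePointFromPair('\uD835', '\uDD0A');
        System.out.println(Integer.toHexString(cp));
        System.out.println(cp == Character.toCodePoint('\uD835', '\uDD0A'));
    }
}
```

Note that Character.toCodePoint does not validate its arguments, whereas the method documented above is described as throwing IllegalArgumentException for invalid input.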

getDirection
public static int getDirection(int ch)
	Returns the bidirectional property of a code point. For example, 0x0041 (letter A) has the LEFT_TO_RIGHT directional property. The result returned belongs to the interface UCharacterDirection.
	Parameters: ch - the code point whose direction is to be determined
	Returns: direction constant from UCharacterDirection.
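For comparison, java.lang.Character exposes a similar lookup through getDirectionality. The sketch below uses that standard method; its DIRECTIONALITY_* constants belong to java.lang.Character and should not be assumed to share numeric values with UCharacterDirection. The class and method names are hypothetical.

```java
// Sketch only: bidirectional class lookup via java.lang.Character.
final class DirectionSketch {
    /** True if the code point has a strong right-to-left directionality. */
    static boolean isRightToLeft(int cp) {
        byte d = Character.getDirectionality(cp);
        return d == Character.DIRECTIONALITY_RIGHT_TO_LEFT
            || d == Character.DIRECTIONALITY_RIGHT_TO_LEFT_ARABIC;
    }

    public static void main(String[] args) {
        System.out.println(isRightToLeft(0x0041)); // LATIN CAPITAL LETTER A
        System.out.println(isRightToLeft(0x05D0)); // HEBREW LETTER ALEF
        System.out.println(isRightToLeft(0x0627)); // ARABIC LETTER ALEF
    }
}
```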

getIntPropertyValue

public static int getIntPropertyValue(int ch, int type)

Gets the property value for a Unicode property type of a code point. Also returns binary and mask property values.

Unicode, especially in version 3.2, defines many more properties than the original set in UnicodeData.txt.

The properties APIs are intended to reflect Unicode properties as defined in the Unicode Character Database (UCD) and Unicode Technical Reports (UTR). For details about the properties see http://www.unicode.org/.

For names of Unicode properties see the UCD file PropertyAliases.txt.

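 Sample usage:
 // enumerated property: the result is the enumerated constant's numeric value
 int ea = UCharacter.getIntPropertyValue(c, UProperty.EAST_ASIAN_WIDTH);
 // binary property: the result is 0 (false) or 1 (true)
 int ideo = UCharacter.getIntPropertyValue(c, UProperty.IDEOGRAPHIC);
 boolean b = (ideo == 1);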

Parameters:
  ch - code point to test.
Parameters:
  type - UProperty selector constant, identifies which binary property to check. Must be UProperty.BINARY_START <= type < UProperty.BINARY_LIMIT or UProperty.INT_START <= type < UProperty.INT_LIMIT or UProperty.MASK_START <= type < UProperty.MASK_LIMIT.
Returns:
  numeric value that is directly the property value or, for enumerated properties, corresponds to the numeric value of the enumerated constant of the respective property value enumeration type (cast to enum type if necessary). Returns 0 or 1 (for false / true) for binary Unicode properties. Returns a bit-mask for mask properties. Returns 0 if 'type' is out of bounds or if the Unicode version does not have data for the property at all, or not for this code point.
See Also:   UProperty
See Also:   UCharacter.hasBinaryProperty
See Also:   UCharacter.getIntPropertyMinValue
See Also:   UCharacter.getIntPropertyMaxValue
See Also:   UCharacter.getUnicodeVersion
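The 0-or-1 convention for binary properties can be illustrated without UProperty by wrapping a boolean query from java.lang.Character. This sketch assumes Java 7 or later for Character.isIdeographic; the helper name is hypothetical and only mirrors the return convention described above.

```java
// Sketch only: mirrors the "0 or 1 for binary properties" return convention.
final class BinaryPropertySketch {
    /** Returns 1 if the code point is ideographic, otherwise 0. */
    static int ideographicAsInt(int cp) {
        return Character.isIdeographic(cp) ? 1 : 0;
    }

    public static void main(String[] args) {
        int ideo = ideographicAsInt(0x4E00); // a CJK unified ideograph
        boolean b = (ideo == 1);             // convert back to a boolean
        System.out.println(b);
        System.out.println(ideographicAsInt('A'));
    }
}
```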

getType
public static int getType(int ch)
	Returns a value indicating a code point's Unicode category. Up-to-date Unicode implementation of java.lang.Character.getType() except for the above mentioned code points that had their category changed. Return results are constants from the interface UCharacterCategory.
	NOTE: the UCharacterCategory values are not compatible with those returned by java.lang.Character.getType. UCharacterCategory values match the ones used in ICU4C, while java.lang.Character type values, though similar, skip the value 17.
	Parameters: ch - code point whose type is to be determined
	Returns: category which is a value of UCharacterCategory
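The gap at 17 mentioned in the note is visible in the java.lang.Character constants themselves. The sketch below uses only the standard class to classify a code point and to print the two constants on either side of the gap; the class and method names are hypothetical.

```java
// Sketch only: general category lookup via java.lang.Character.
final class CategorySketch {
    /** True if the code point's general category is one of the letter categories. */
    static boolean isLetterCategory(int cp) {
        switch (Character.getType(cp)) {
            case Character.UPPERCASE_LETTER:
            case Character.LOWERCASE_LETTER:
            case Character.TITLECASE_LETTER:
            case Character.MODIFIER_LETTER:
            case Character.OTHER_LETTER:
                return true;
            default:
                return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isLetterCategory('A'));
        // Adjacent constants in java.lang.Character; no constant takes the value 17.
        System.out.println(Character.FORMAT);
        System.out.println(Character.PRIVATE_USE);
    }
}
```

Because of this gap, a category value obtained from one API should not be compared numerically against constants from the other.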

getUnicodeNumericValue

public static double getUnicodeNumericValue(int ch)

Get the numeric value for a Unicode code point as defined in the Unicode Character Database.

A "double" return type is necessary because some numeric values are fractions, negative, or too large for int.

For characters without any numeric values in the Unicode Character Database, this function will return NO_NUMERIC_VALUE.

API Change: In release 2.2 and prior, this API had a return type of int and returned -1 when the argument ch did not have a corresponding numeric value. This has been changed to sync with ICU4C.

This corresponds to the ICU4C function u_getNumericValue.
Parameters:
ch - Code point to get the numeric value for.
Returns:
numeric value of ch, or NO_NUMERIC_VALUE if none is defined.
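To see why a double return type and a sentinel are useful, consider a toy lookup that handles fractions. Everything here is hypothetical: the three-entry fraction table, the sentinel constant and the method name are illustrative stand-ins and are not taken from UCharacter. For reference, java.lang.Character.getNumericValue is documented to return -2 for values it cannot represent as a non-negative int, such as fractions.

```java
// Sketch only: a toy numeric-value lookup with a double result and a sentinel.
final class NumericValueSketch {
    /** Hypothetical sentinel meaning "no numeric value"; not the real constant. */
    static final double NO_VALUE_SKETCH = -123456789.0;

    static double numericValue(int ch) {
        switch (ch) {
            case 0x00BC: return 0.25; // VULGAR FRACTION ONE QUARTER
            case 0x00BD: return 0.5;  // VULGAR FRACTION ONE HALF
            case 0x00BE: return 0.75; // VULGAR FRACTION THREE QUARTERS
            default:
                // Decimal digits only; letters such as 'A' get no numeric value here.
                return Character.isDigit(ch)
                        ? Character.digit(ch, 10)
                        : NO_VALUE_SKETCH;
        }
    }

    public static void main(String[] args) {
        System.out.println(numericValue(0x00BD));
        System.out.println(numericValue('7'));
        System.out.println(numericValue('A') == NO_VALUE_SKETCH);
    }
}
```

Callers compare the result against the sentinel rather than against -1, since negative numeric values are possible according to the description above.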

Methods inherited from java.lang.Object

native protected Object clone() throws CloneNotSupportedException
public boolean equals(Object obj)
protected void finalize() throws Throwable
final native public Class getClass()
native public int hashCode()
final native public void notify()
final native public void notifyAll()
public String toString()
final native public void wait(long timeout) throws InterruptedException
final public void wait(long timeout, int nanos) throws InterruptedException
final public void wait() throws InterruptedException