| java.lang.Object com.ibm.icu.text.CharsetRecognizer com.ibm.icu.text.CharsetRecog_2022
CharsetRecog_2022 | abstract class CharsetRecog_2022 extends CharsetRecognizer | | Class CharsetRecog_2022 is part of the ICU charset detection implementation.
This is a superclass for the individual detectors for
each of the detectable members of the ISO 2022 family
of encodings.
The separate classes are nested within this class.
|
Method Summary | |
int | match(byte[] text, int textLen, byte[][] escapeSequences) Matching function shared among the ISO 2022 detectors (JP, CN and KR).
Counts the number of legal and unrecognized escape sequences in
the sample of text, and computes a score based on the total number and
the proportion that fit the encoding.
Parameters: text - the byte buffer containing the text to analyse. Parameters: textLen - the size of the text in the byte buffer. Parameters: escapeSequences - the byte escape sequences to test for. |
match | int match(byte[] text, int textLen, byte[][] escapeSequences) | | Matching function shared among the ISO 2022 detectors (JP, CN and KR).
Counts the number of legal and unrecognized escape sequences in
the sample of text, and computes a score based on the total number and
the proportion that fit the encoding.
Parameters: text - the byte buffer containing the text to analyse. Parameters: textLen - the size of the text in the byte buffer. Parameters: escapeSequences - the byte escape sequences to test for. Returns: match quality, in the range of 0-100. |
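The counting-and-scoring idea described for match can be illustrated with a minimal sketch. This is not the ICU source: the class name Iso2022ScoreSketch, the helper matchLength, the linear scoring formula, and the penalty threshold of five escape sequences are all assumptions chosen for illustration. Only the parameter list and the 0-100 result range come from the documentation above.

```java
/** Illustrative sketch only; names and scoring constants are assumptions, not ICU's. */
final class Iso2022ScoreSketch {
    private static final byte ESC = 0x1b;

    /** Scores how well the first textLen bytes fit the given escape sequences (0-100). */
    static int score(byte[] text, int textLen, byte[][] escapeSequences) {
        int hits = 0;    // escape sequences that belong to the encoding
        int misses = 0;  // ESC bytes that start no known sequence
        int i = 0;
        while (i < textLen) {
            if (text[i] == ESC) {
                int matched = matchLength(text, textLen, i, escapeSequences);
                if (matched > 0) {
                    hits++;
                    i += matched;  // skip past the recognized sequence
                    continue;
                }
                misses++;
            }
            i++;
        }
        if (hits == 0) {
            return 0;  // no evidence for this encoding at all
        }
        // Assumed formula: 100 if every sequence is recognized,
        // 0 if half or fewer are, linear in between.
        int quality = (100 * hits - 100 * misses) / (hits + misses);
        // Assumed penalty: lower the score when very few sequences were seen.
        if (hits < 5) {
            quality -= (5 - hits) * 10;
        }
        return Math.max(quality, 0);
    }

    /** Returns the length of the sequence that matches at pos, or 0 if none does. */
    private static int matchLength(byte[] text, int textLen, int pos, byte[][] seqs) {
        for (byte[] seq : seqs) {
            if (textLen - pos < seq.length) {
                continue;  // not enough bytes left for this sequence
            }
            boolean same = true;
            for (int j = 0; j < seq.length; j++) {
                if (text[pos + j] != seq[j]) {
                    same = false;
                    break;
                }
            }
            if (same) {
                return seq.length;
            }
        }
        return 0;
    }
}
```

A caller would pass the escape sequences of one encoding, for example ESC $ B and ESC ( B for ISO-2022-JP, and compare the resulting scores across the JP, CN and KR sequence tables to pick the most plausible member of the family.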