Source code for UConverterDataReader.java (icu4j, package com.ibm.icu.charset)

/**
 *******************************************************************************
 * Copyright (C) 2006, International Business Machines Corporation and    *
 * others. All Rights Reserved.                                                *
 *******************************************************************************
 *
 *******************************************************************************
 */
package com.ibm.icu.charset;

import com.ibm.icu.impl.ICUBinary;

import java.io.IOException;
import java.io.InputStream;
import java.io.DataInputStream;
import java.nio.ByteBuffer;

/**
 * ucnvmbcs.h
 *
 * ICU conversion (.cnv) data file structure, following the usual UDataInfo
 * header.
 *
 * Format version: 6.2
 *
 * struct UConverterStaticData -- struct containing the converter name, IBM CCSID,
 *                                min/max bytes per character, etc.
 *                                see ucnv_bld.h
 *
 * --------------------
 *
 * The static data is followed by conversionType-specific data structures.
 * At the moment, there are only variations of MBCS converters. They all have
 * the same toUnicode structures, while the fromUnicode structures for SBCS
 * differ from those for other MBCS-style converters.
 *
 * _MBCSHeader.version 4.2 adds an optional conversion extension data structure.
 * If it is present, then an ICU version reading header versions 4.0 or 4.1
 * will be able to use the base table and ignore the extension.
 *
 * The unicodeMask in the static data is part of the base table data structure.
 * Especially, the UCNV_HAS_SUPPLEMENTARY flag determines the length of the
 * fromUnicode stage 1 array.
 * The static data unicodeMask refers only to the base table's properties if
 * a base table is included.
 * In an extension-only file, the static data unicodeMask is 0.
 * The extension data indexes have a separate field with the unicodeMask flags.
 *
 * MBCS-style data structure following the static data.
 * Offsets are counted in bytes from the beginning of the MBCS header structure.
 * Details about usage in comments in ucnvmbcs.c.
 *
 * struct _MBCSHeader (see the definition in this header file below)
 * contains 32-bit fields as follows:
 * 8 values:
 *  0   uint8_t[4]  MBCS version in UVersionInfo format (currently 4.2.0.0)
 *  1   uint32_t    countStates
 *  2   uint32_t    countToUFallbacks
 *  3   uint32_t    offsetToUCodeUnits
 *  4   uint32_t    offsetFromUTable
 *  5   uint32_t    offsetFromUBytes
 *  6   uint32_t    flags, bits:
 *                      31.. 8 offsetExtension -- _MBCSHeader.version 4.2 (ICU 2.8) and higher
 *                                                0 for older versions and if
 *                                                there is no extension structure
 *                       7.. 0 outputType
 *  7   uint32_t    fromUBytesLength -- _MBCSHeader.version 4.1 (ICU 2.4) and higher
 *                  counts bytes in fromUBytes[]
 *
 * if(outputType==MBCS_OUTPUT_EXT_ONLY) {
 *     -- base table name for extension-only table
 *     char baseTableName[variable]; -- with NUL plus padding for 4-alignment
 *
 *     -- all _MBCSHeader fields except for version and flags are 0
 * } else {
 *     -- normal base table with optional extension
 *
 *     int32_t stateTable[countStates][256];
 *
 *     struct _MBCSToUFallback { (fallbacks are sorted by offset)
 *         uint32_t offset;
 *         UChar32 codePoint;
 *     } toUFallbacks[countToUFallbacks];
 *
 *     uint16_t unicodeCodeUnits[(offsetFromUTable-offsetToUCodeUnits)/2];
 *                  (padded to an even number of units)
 *
 *     -- stage 1 tables
 *     if(staticData.unicodeMask&UCNV_HAS_SUPPLEMENTARY) {
 *         -- stage 1 table for all of Unicode
 *         uint16_t fromUTable[0x440]; (32-bit-aligned)
 *     } else {
 *         -- BMP-only tables have a smaller stage 1 table
 *         uint16_t fromUTable[0x40]; (32-bit-aligned)
 *     }
 *
 *     -- stage 2 tables
 *        length determined by top of stage 1 and bottom of stage 3 tables
 *     if(outputType==MBCS_OUTPUT_1) {
 *         -- SBCS: pure indexes
 *         uint16_t stage 2 indexes[?];
 *     } else {
 *         -- DBCS, MBCS, EBCDIC_STATEFUL, ...: roundtrip flags and indexes
 *         uint32_t stage 2 flags and indexes[?];
 *     }
 *
 *     -- stage 3 tables with byte results
 *     if(outputType==MBCS_OUTPUT_1) {
 *         -- SBCS: each 16-bit result contains flags and the result byte, see ucnvmbcs.c
 *         uint16_t fromUBytes[fromUBytesLength/2];
 *     } else {
 *         -- DBCS, MBCS, EBCDIC_STATEFUL, ... 2/3/4 bytes result, see ucnvmbcs.c
 *         uint8_t fromUBytes[fromUBytesLength]; or
 *         uint16_t fromUBytes[fromUBytesLength/2]; or
 *         uint32_t fromUBytes[fromUBytesLength/4];
 *     }
 * }
 *
 * -- extension table, details see ucnv_ext.h
 * int32_t indexes[>=32]; ...
 */
/*
 * ucnv_ext.h
 *
 * See icuhtml/design/conversion/conversion_extensions.html
 *
 * Conversion extensions serve two purposes:
 * 1. They support m:n mappings.
 * 2. They support extension-only conversion files that are used together
 *    with the regular conversion data in base files.
 *
 * A base file may contain an extension table (explicitly requested or
 * implicitly generated for m:n mappings), but its extension table is not
 * used when an extension-only file is used.
 *
 * It is an error if a base file contains any regular (not extension) mapping
 * from the same sequence as a mapping in the extension file
 * because the base mapping would hide the extension mapping.
 *
 *
 * Data for conversion extensions:
 *
 * One set of data structures per conversion direction (to/from Unicode).
 * The data structures are sorted by input units to allow for binary search.
 * Input sequences of more than one unit are handled like contraction tables
 * in collation:
 * The lookup value of a unit points to another table that is to be searched
 * for the next unit, recursively.
 *
 * For conversion from Unicode, the initial code point is looked up in
 * a 3-stage trie for speed,
 * with an additional table of unique results to save space.
 *
 * Long output strings are stored in separate arrays, with length and index
 * in the lookup tables.
 * Output results also include a flag distinguishing roundtrip from
 * (reverse) fallback mappings.
 *
 * Input Unicode strings must not begin or end with unpaired surrogates
 * to avoid problems with matches on parts of surrogate pairs.
 *
 * Mappings from multiple characters (code points or codepage state
 * table sequences) must be searched preferring the longest match.
 * For this to work and be efficient, the variable-width table must contain
 * all mappings that contain prefixes of the multiple characters.
 * If an extension table is built on top of a base table in another file
 * and a base table entry is a prefix of a multi-character mapping, then
 * this is an error.
 *
 *
 * Implementation note:
 *
 * Currently, the parser and several checks in the code limit the number
 * of UChars or bytes in a mapping to
 * UCNV_EXT_MAX_UCHARS and UCNV_EXT_MAX_BYTES, respectively,
 * which are output value limits in the data structure.
 *
 * For input, this is not strictly necessary - it is a hard limit only for the
 * buffers in UConverter that are used to store partial matches.
 *
 * Input sequences could otherwise be arbitrarily long if partial matches
 * need not be stored (i.e., if a sequence does not span several buffers with too
 * many units before the last buffer), although then results would differ
 * depending on whether partial matches exceed the limits or not,
 * which depends on the pattern of buffer sizes.
 *
 *
 * Data structure:
 *
 * int32_t indexes[>=32];
 *
 *   Array of indexes and lengths etc. The length of the array is at least 32.
 *   The actual length is stored in indexes[0] to be forward compatible.
 *
 *   Each index to another array is the number of bytes from indexes[].
 *   Each length of an array is the number of array base units in that array.
 *
 *   Some of the structures may not be present, in which case their indexes
 *   and lengths are 0.
 *
 *   Usage of indexes[i]:
 *   [0]  length of indexes[]
 *
 *   // to Unicode table
 *   [1]  index of toUTable[] (array of uint32_t)
 *   [2]  length of toUTable[]
 *   [3]  index of toUUChars[] (array of UChar)
 *   [4]  length of toUUChars[]
 *
 *   // from Unicode table, not for the initial code point
 *   [5]  index of fromUTableUChars[] (array of UChar)
 *   [6]  index of fromUTableValues[] (array of uint32_t)
 *   [7]  length of fromUTableUChars[] and fromUTableValues[]
 *   [8]  index of fromUBytes[] (array of char)
 *   [9]  length of fromUBytes[]
 *
 *   // from Unicode trie for initial-code point lookup
 *   [10] index of fromUStage12[] (combined array of uint16_t for stages 1 & 2)
 *   [11] length of stage 1 portion of fromUStage12[]
 *   [12] length of fromUStage12[]
 *   [13] index of fromUStage3[] (array of uint16_t indexes into fromUStage3b[])
 *   [14] length of fromUStage3[]
 *   [15] index of fromUStage3b[] (array of uint32_t like fromUTableValues[])
 *   [16] length of fromUStage3b[]
 *
 *   [17] Bit field containing numbers of bytes:
 *        31..24 reserved, 0
 *        23..16 maximum input bytes
 *        15.. 8 maximum output bytes
 *         7.. 0 maximum bytes per UChar
 *
 *   [18] Bit field containing numbers of UChars:
 *        31..24 reserved, 0
 *        23..16 maximum input UChars
 *        15.. 8 maximum output UChars
 *         7.. 0 maximum UChars per byte
 *
 *   [19] Bit field containing flags:
 *               (extension table unicodeMask)
 *         1     UCNV_HAS_SURROGATES flag for the extension table
 *         0     UCNV_HAS_SUPPLEMENTARY flag for the extension table
 *
 *   [20]..[30] reserved, 0
 *   [31] number of bytes for the entire extension structure
 *   [>31] reserved; there are indexes[0] indexes
 *
 *
 * uint32_t toUTable[];
 *
 *   Array of byte/value pairs for lookups for toUnicode conversion.
 *   The array is partitioned into sections like collation contraction tables.
 *   Each section contains one word with the number of following words and
 *   a default value for when the lookup in this section yields no match.
 *
 *   A section is sorted in ascending order of input bytes,
 *   allowing for fast linear or binary searches.
 *   The builder may store entries for a contiguous range of byte values
 *   (compare difference between the first and last one with count),
 *   which then allows for direct array access.
 *   The builder should always do this for the initial table section.
 *
 *   Entries may have 0 values, see below.
 *   No two entries in a section have the same byte values.
 *
 *   Each uint32_t contains an input byte value in bits 31..24 and the
 *   corresponding lookup value in bits 23..0.
 *   Interpret the value as follows:
 *     if(value==0) {
 *       no match, see below
 *     } else if(value<0x1f0000) {
 *       partial match - use value as index to the next toUTable section
 *       and match the next unit; (value indexes toUTable[value])
 *     } else {
 *       if(bit 23 set) {
 *         roundtrip;
 *       } else {
 *         fallback;
 *       }
 *       unset value bit 23;
 *       if(value<=0x2fffff) {
 *         (value-0x1f0000) is a code point; (BMP: value<=0x1fffff)
 *       } else {
 *         bits 17..0 (value&0x3ffff) is an index to
 *           the result UChars in toUUChars[]; (0 indexes toUUChars[0])
 *         length of the result=((value>>18)-12); (length=0..19)
 *       }
 *     }
 *
 *   The first word in a section contains the number of following words in the
 *   input byte position (bits 31..24, number=1..0xff).
 *   The value of the initial word is used when the current byte is not found
 *   in this section.
 *   If the value is not 0, then it represents a result as above.
 *   If the value is 0, then the search has to return a shorter match with an
 *   earlier default value as the result, or result in "unmappable" even for the
 *   initial bytes.
 *   If the value is 0 for the initial toUTable entry, then the initial byte
 *   does not start any mapping input.
 *
 *
 * UChar toUUChars[];
 *
 *   Contains toUnicode mapping results, stored as sequences of UChars.
 *   Indexes and lengths stored in the toUTable[].
 *
 *
 * UChar fromUTableUChars[];
 * uint32_t fromUTableValues[];
 *
 *   The fromUTable is split into two arrays, but works otherwise much like
 *   the toUTable. The array is partitioned into sections like collation
 *   contraction tables and toUTable.
 *   A row in the table consists of same-index entries in fromUTableUChars[]
 *   and fromUTableValues[].
 *
 *   Interpret a value as follows:
 *     if(value==0) {
 *       no match, see below
 *     } else if(value<=0xffffff) { (bits 31..24 are 0)
 *       partial match - use value as index to the next fromUTable section
 *       and match the next unit; (value indexes fromUTable[value])
 *     } else {
 *       if(value==0x80000001) {
 *         return no mapping, but request for <subchar1>;
 *       }
 *       if(bit 31 set) {
 *         roundtrip;
 *       } else {
 *         fallback;
 *       }
 *       // bits 30..29 reserved, 0
 *       length=(value>>24)&0x1f; (bits 28..24)
 *       if(length==1..3) {
 *         bits 23..0 contain 1..3 bytes, padded with 00s on the left;
 *       } else {
 *         bits 23..0 (value&0xffffff) is an index to
 *           the result bytes in fromUBytes[]; (0 indexes fromUBytes[0])
 *       }
 *     }
 *
 *   The first pair in a section contains the number of following pairs in the
 *   UChar position (16 bits, number=1..0xffff).
 *   The value of the initial pair is used when the current UChar is not found
 *   in this section.
 *   If the value is not 0, then it represents a result as above.
 *   If the value is 0, then the search has to return a shorter match with an
 *   earlier default value as the result, or result in "unmappable" even for the
 *   initial UChars.
 *
 *   If the from Unicode trie is present, then the from Unicode search tables
 *   are not used for initial code points.
 *   In this case, the first entries (index 0) in the tables are not used
 *   (reserved, set to 0) because a value of 0 is used in trie results
 *   to indicate no mapping.
 *
 *
 * uint16_t fromUStage12[];
 *
 *   Stages 1 & 2 of a trie that maps an initial code point.
 *   Indexes in stage 1 are all offset by the length of stage 1 so that the
 *   same array pointer can be used for both stages.
 *   If (c>>10)>=(length of stage 1) then c does not start any mapping.
 *   Same bit distribution as for regular conversion tries.
 *
 *
 * uint16_t fromUStage3[];
 * uint32_t fromUStage3b[];
 *
 *   Stage 3 of the trie. The first array simply contains indexes to the second,
 *   which contains words in the same format as fromUTableValues[].
 *   Use a stage 3 granularity of 4, which allows for 256k stage 3 entries,
 *   and 16-bit entries in stage 3 allow for 64k stage 3b entries.
 *   The stage 3 granularity means that the stage 2 entry needs to be left-shifted.
 *
 *   Two arrays are used because it is expected that more than half of the stage 3
 *   entries will be zero. The 16-bit index stage 3 array saves space even
 *   considering storing a total of 6 bytes per non-zero entry in both arrays
 *   together.
 *   Using a stage 3 granularity of >1 diminishes the compactability in that stage
 *   but provides a larger effective addressing space in stage 2.
 *   All but the final result stage use 16-bit entries to save space.
 *
 *   fromUStage3b[] contains a zero for "no mapping" at its index 0,
 *   and may contain UCNV_EXT_FROM_U_SUBCHAR1 at index 1 for "<subchar1> SUB mapping"
 *   (i.e., "no mapping" with preference for <subchar1> rather than <subchar>),
 *   and all other items are unique non-zero results.
 *
 *   The default value of a fromUTableValues[] section that is referenced
 *   _directly_ from a fromUStage3b[] item may also be UCNV_EXT_FROM_U_SUBCHAR1,
 *   but this value must not occur anywhere else in fromUTableValues[]
 *   because "no mapping" is always a property of a single code point,
 *   never of multiple.
 *
 *
 * char fromUBytes[];
 *
 *   Contains fromUnicode mapping results, stored as sequences of chars.
 *   Indexes and lengths stored in the fromUTableValues[].
 */

final class UConverterDataReader implements ICUBinary.Authenticate {
    //private final static boolean debug = ICUDebug.enabled("UConverterDataReader");

    /*
     *  UConverterDataReader(UConverterDataReader r)
        {
            dataInputStream = new DataInputStream(r.dataInputStream);
            unicodeVersion = r.unicodeVersion;
        }
     */

    /**
     * <p>Protected constructor.</p>
     * @param inputStream ICU converter (.cnv) data file input stream
     * @exception IOException thrown if the data file fails authentication
     * @draft 2.1
     */
    protected UConverterDataReader(InputStream inputStream)
            throws IOException {
        //if(debug) System.out.println("Bytes in inputStream " + inputStream.available());

        unicodeVersion = ICUBinary.readHeader(inputStream,
                DATA_FORMAT_ID, this);

        //if(debug) System.out.println("Bytes left in inputStream " +inputStream.available());

        dataInputStream = new DataInputStream(inputStream);

        //if(debug) System.out.println("Bytes left in dataInputStream " +dataInputStream.available());
    }

    // protected methods -------------------------------------------------

    protected void readStaticData(UConverterStaticData sd)
            throws IOException {
        sd.structSize = dataInputStream.readInt();
        byte[] name = new byte[UConverterConstants.MAX_CONVERTER_NAME_LENGTH];
        // readFully() fills the whole fixed-size field; a plain read() may
        // return fewer bytes and leave the stream out of sync.
        dataInputStream.readFully(name);
        // The name is NUL-terminated and padded: keep only the bytes before the first NUL.
        int length = 0;
        while (length < name.length && name[length] != 0) {
            ++length;
        }
        sd.name = new String(name, 0, length);
        sd.codepage = dataInputStream.readInt();
        sd.platform = dataInputStream.readByte();
        sd.conversionType = dataInputStream.readByte();
        sd.minBytesPerChar = dataInputStream.readByte();
        sd.maxBytesPerChar = dataInputStream.readByte();
        dataInputStream.readFully(sd.subChar);
        sd.subCharLen = dataInputStream.readByte();
        sd.hasToUnicodeFallback = dataInputStream.readByte();
        sd.hasFromUnicodeFallback = dataInputStream.readByte();
        sd.unicodeMask = (short) dataInputStream.readUnsignedByte();
        sd.subChar1 = dataInputStream.readByte();
        dataInputStream.readFully(sd.reserved);
    }

    protected void readMBCSHeader(CharsetMBCS.MBCSHeader h)
            throws IOException {
        // h.version is a fixed-size array, so it must be filled completely.
        dataInputStream.readFully(h.version);
        h.countStates = dataInputStream.readInt();
        h.countToUFallbacks = dataInputStream.readInt();
        h.offsetToUCodeUnits = dataInputStream.readInt();
        h.offsetFromUTable = dataInputStream.readInt();
        h.offsetFromUBytes = dataInputStream.readInt();
        h.flags = dataInputStream.readInt();
        h.fromUBytesLength = dataInputStream.readInt();
    }

    protected void readMBCSTable(int[][] stateTableArray,
            CharsetMBCS.MBCSToUFallback[] toUFallbacksArray,
            char[] unicodeCodeUnitsArray, char[] fromUnicodeTableArray,
            byte[] fromUnicodeBytesArray) throws IOException {
        int i, j;
        for (i = 0; i < stateTableArray.length; ++i)
            for (j = 0; j < stateTableArray[i].length; ++j)
                stateTableArray[i][j] = dataInputStream.readInt();
        for (i = 0; i < toUFallbacksArray.length; ++i) {
            toUFallbacksArray[i].offset = dataInputStream.readInt();
            toUFallbacksArray[i].codePoint = dataInputStream.readInt();
        }
        for (i = 0; i < unicodeCodeUnitsArray.length; ++i)
            unicodeCodeUnitsArray[i] = dataInputStream.readChar();
        for (i = 0; i < fromUnicodeTableArray.length; ++i)
            fromUnicodeTableArray[i] = dataInputStream.readChar();
        for (i = 0; i < fromUnicodeBytesArray.length; ++i)
            fromUnicodeBytesArray[i] = dataInputStream.readByte();
    }

    protected String readBaseTableName() throws IOException {
        char c;
        StringBuffer name = new StringBuffer();
        // Reads up to and including the terminating NUL.
        while ((c = (char) dataInputStream.readByte()) != 0)
            name.append(c);
        return name.toString();
    }

    //protected int[] readExtIndexes(int skip) throws IOException
    protected ByteBuffer readExtIndexes(int skip) throws IOException {
        dataInputStream.skipBytes(skip);

        // indexes[0] holds the length of indexes[] itself (at least 32).
        int n = dataInputStream.readInt();
        int[] indexes = new int[n];
        indexes[0] = n;
        for (int i = 1; i < n; ++i) {
            indexes[i] = dataInputStream.readInt();
        }
        //return indexes;

        // indexes[31] is the byte size of the entire extension structure.
        ByteBuffer b = ByteBuffer.allocate(indexes[31]);
        for (int i = 0; i < n; ++i) {
            b.putInt(indexes[i]);
        }
        // readFully() guarantees that the rest of the structure is read; read() might stop early.
        dataInputStream.readFully(b.array(), b.position(), b.remaining());
        return b;
    }

    protected byte[] readExtTables(int n) throws IOException {
        byte[] tables = new byte[n];
        dataInputStream.readFully(tables);
        return tables;
    }

    byte[] getDataFormatVersion() {
        return DATA_FORMAT_VERSION;
    }

    /**
     * Inherited method
     */
    public boolean isDataVersionAcceptable(byte version[]) {
        return version[0] == DATA_FORMAT_VERSION[0];
    }

    byte[] getUnicodeVersion() {
        return unicodeVersion;
    }

    // private data members -------------------------------------------------

    /**
     * ICU data file input stream
     */
    private DataInputStream dataInputStream;

    private byte[] unicodeVersion;

    /**
     * File format version that this class understands.
     * No guarantees are made if an older version is used;
     * see store.c of gennorm for more information and values.
     */
    // DATA_FORMAT_ID_ values taken from icu4c isCnvAcceptable (ucnv_bld.c)
    private static final byte DATA_FORMAT_ID[] = { (byte) 0x63,
            (byte) 0x6e, (byte) 0x76, (byte) 0x74 }; // dataFormat="cnvt"
    private static final byte DATA_FORMAT_VERSION[] = { (byte) 0x6 };

}
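
The ucnv_ext.h comment in the listing describes how each 32-bit word of toUTable[] packs an input byte and a lookup value. The sketch below follows that pseudocode step by step. It is only an illustration: the class ToUTableWord, its fields, and the sample words are hypothetical and are not part of ICU4J, and it assumes well-formed table data.

```java
/** Decoded form of one 32-bit toUTable[] word (hypothetical helper, not ICU4J API). */
final class ToUTableWord {
    enum Kind { NO_MATCH, PARTIAL_MATCH, CODE_POINT, STRING }

    // Thresholds and bit positions quoted from the ucnv_ext.h comment.
    private static final int MIN_CODE_POINT_VALUE = 0x1f0000;
    private static final int MAX_CODE_POINT_VALUE = 0x2fffff;
    private static final int ROUNDTRIP_BIT = 1 << 23;

    final int inputByte;     // bits 31..24 (in the first word of a section this is a count instead)
    final Kind kind;
    final boolean roundtrip; // meaningful only for CODE_POINT and STRING
    final int codePoint;     // CODE_POINT only, otherwise -1
    final int index;         // PARTIAL_MATCH: next toUTable section; STRING: start in toUUChars[]
    final int length;        // STRING only: number of result UChars

    private ToUTableWord(int inputByte, Kind kind, boolean roundtrip,
                         int codePoint, int index, int length) {
        this.inputByte = inputByte;
        this.kind = kind;
        this.roundtrip = roundtrip;
        this.codePoint = codePoint;
        this.index = index;
        this.length = length;
    }

    static ToUTableWord decode(int word) {
        int inputByte = word >>> 24;   // unsigned shift: Java ints are signed
        int value = word & 0xffffff;   // bits 23..0
        if (value == 0) {
            return new ToUTableWord(inputByte, Kind.NO_MATCH, false, -1, -1, 0);
        }
        if (value < MIN_CODE_POINT_VALUE) {
            // value indexes the next section: toUTable[value]
            return new ToUTableWord(inputByte, Kind.PARTIAL_MATCH, false, -1, value, 0);
        }
        boolean roundtrip = (value & ROUNDTRIP_BIT) != 0;
        value &= ~ROUNDTRIP_BIT;       // "unset value bit 23"
        if (value <= MAX_CODE_POINT_VALUE) {
            return new ToUTableWord(inputByte, Kind.CODE_POINT, roundtrip,
                    value - MIN_CODE_POINT_VALUE, -1, 0);
        }
        // bits 17..0 index toUUChars[]; the remaining bits encode length + 12
        return new ToUTableWord(inputByte, Kind.STRING, roundtrip,
                -1, value & 0x3ffff, (value >> 18) - 12);
    }

    @Override
    public String toString() {
        return String.format("byte=0x%02x kind=%s roundtrip=%b codePoint=%d index=%d length=%d",
                inputByte, kind, roundtrip, codePoint, index, length);
    }

    public static void main(String[] args) {
        // Made-up sample words, constructed by hand from the bit layout above.
        int[] samples = { 0x419f0041, 0x80380005, 0x81000010, 0x7f000000 };
        for (int word : samples) {
            System.out.println(decode(word));
        }
    }
}
```

The decoder uses the unsigned shift `>>>` for the input byte because a word whose top bit is set is a negative `int` in Java. Callers would still need to treat the first word of each section separately, since its top byte holds the number of following words rather than an input byte.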