CharacterEntityFormat.java (ivatamasks, package com.ivata.mask.web.format)
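
The class listed below converts between characters and HTML entities through two parallel structures built in its constructor: a string holding every source character, and an array holding the matching `&name;` text at the same index. The following is a minimal sketch of that lookup idea under stated assumptions, not the ivata masks implementation: the class name `EntitySketch`, the methods `encode` and `decode`, the constant `MAX_LEN` and the five-entry table are all hypothetical and chosen only for illustration.

```java
/** Hypothetical illustration only; not part of ivata masks. */
final class EntitySketch {
    /** Illustrative subset of character/entity-name pairs (assumption). */
    private static final String[][] PAIRS = {
        {"&", "amp"}, {"<", "lt"}, {">", "gt"}, {"\"", "quot"}, {"\u20AC", "euro"}
    };
    /** Longest entity text considered when decoding, '&' and ';' included. */
    private static final int MAX_LEN = 15;
    /** Character at index i corresponds to ENTITY_TEXT[i]. */
    private static final String CHARS;
    private static final String[] ENTITY_TEXT = new String[PAIRS.length];

    static {
        // Flatten the pairs once into a lookup string and a parallel array.
        StringBuilder chars = new StringBuilder();
        for (int i = 0; i < PAIRS.length; i++) {
            chars.append(PAIRS[i][0]);
            ENTITY_TEXT[i] = "&" + PAIRS[i][1] + ";";
        }
        CHARS = chars.toString();
    }

    /** Replace each known character with its entity text. */
    static String encode(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char ch = text.charAt(i);
            int idx = CHARS.indexOf(ch);
            out.append(idx >= 0 ? ENTITY_TEXT[idx] : String.valueOf(ch));
        }
        return out.toString();
    }

    /** Replace each known entity with its character; leave anything else alone. */
    static String decode(String text) {
        StringBuilder out = new StringBuilder(text.length());
        int i = 0;
        while (i < text.length()) {
            char ch = text.charAt(i);
            // Only an ampersand can start an entity; look ahead for ';'.
            int semi = (ch == '&') ? text.indexOf(';', i) : -1;
            int idx = -1;
            if (semi > i && semi - i + 1 <= MAX_LEN) {
                String candidate = text.substring(i, semi + 1);
                for (int k = 0; k < ENTITY_TEXT.length && idx < 0; k++) {
                    if (ENTITY_TEXT[k].equals(candidate)) {
                        idx = k;
                    }
                }
            }
            if (idx >= 0) {
                out.append(CHARS.charAt(idx));
                i = semi + 1; // skip past the whole entity
            } else {
                out.append(ch); // not a known entity: copy literally
                i++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode("a < b & c"));        // prints: a &lt; b &amp; c
        System.out.println(decode("a &lt; b &amp; c")); // prints: a < b & c
    }
}
```

The sketch fills its tables in a static initializer rather than lazily in a constructor, which is one way to avoid unsynchronized writes to static fields; whether that matters depends on how the class is used.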


/*
 * Copyright (c) 2001 - 2005 ivata limited.
 * All rights reserved.
 * -----------------------------------------------------------------------------
 * ivata masks may be redistributed under the GNU General Public
 * License as published by the Free Software Foundation;
 * version 2 of the License.
 *
 * These programs are free software; you can redistribute them and/or
 * modify them under the terms of the GNU General Public License
 * as published by the Free Software Foundation; version 2 of the License.
 *
 * These programs are distributed in the hope that they will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
 *
 * See the GNU General Public License in the file LICENSE.txt for more
 * details.
 *
 * If you would like a copy of the GNU General Public License write to
 *
 * Free Software Foundation, Inc.
 * 59 Temple Place - Suite 330
 * Boston, MA 02111-1307, USA.
 *
 *
 * To arrange commercial support and licensing, contact ivata at
 *                  http://www.ivata.com/contact.jsp
 * -----------------------------------------------------------------------------
 * $Log: CharacterEntityFormat.java,v $
 * Revision 1.5  2005/10/03 10:17:25  colinmacleod
 * Fixed some style and javadoc issues.
 *
 * Revision 1.4  2005/10/02 14:06:33  colinmacleod
 * Added/improved log4j logging.
 *
 * Revision 1.3  2005/04/11 14:45:38  colinmacleod
 * Changed HTMLFormat from an abstract class
 * into an interface.
 *
 * Revision 1.2  2005/04/09 18:04:18  colinmacleod
 * Changed copyright text to GPL v2 explicitly.
 *
 * Revision 1.1  2005/01/06 22:41:01  colinmacleod
 * Moved up a version number.
 * Changed copyright notices to 2005.
 * Updated the documentation:
 *   - started working on multiproject:site docu.
 *   - changed the logo.
 * Added checkstyle and fixed LOADS of style issues.
 * Added separate thirdparty subproject.
 * Added struts (in web), util and webgui (in webtheme) from ivata op.
 *
 * Revision 1.4  2004/11/03 17:07:21  colinmacleod
 * Fixed bug in entity matching.
 *
 * Revision 1.3  2004/03/21 21:16:37  colinmacleod
 * Shortened name to ivata op.
 *
 * Revision 1.2  2004/02/01 22:07:32  colinmacleod
 * Added full names to author tags
 *
 * Revision 1.1.1.1  2004/01/27 20:59:47  colinmacleod
 * Moved ivata op to SourceForge.
 *
 * Revision 1.2  2003/10/15 14:13:39  colin
 * Fixes for XDoclet.
 *
 * Revision 1.3  2003/02/26 08:15:03  colin
 * Fixed bug in append routine.
 *
 * Revision 1.2  2003/02/26 08:13:43  colin
 * added toString to entity StringBuffer - not supported in JDK 1.3
 *
 * Revision 1.1  2003/02/24 19:33:32  colin
 * Moved to new subproject.
 *
 * Revision 1.5  2003/02/04 17:43:46  colin
 * copyright notice
 *
 * Revision 1.4  2002/09/06 15:08:34  colin
 * split off the character entity map into a new file
 *
 * Revision 1.3  2002/09/04 08:10:36  colin
 * fixed bug when entities are converted by browser
 *
 * Revision 1.1  2002/06/21 11:58:37  colin
 * restructured com.ivata.mask.jsp into
 * format, JavaScript, theme and tree.
 * -----------------------------------------------------------------------------
 */
package com.ivata.mask.web.format;

import org.apache.log4j.Logger;

/**
 * Convert characters to their HTML character entity equivalents.
 *
 * @since ivata masks 0.4 (2002-06-19)
 * @author Colin MacLeod
 * <a href='mailto:colin.macleod@ivata.com'>colin.macleod@ivata.com</a>
 * @version $Revision: 1.5 $
 */
public class CharacterEntityFormat implements HTMLFormat {
    /**
     * Logger for this class.
     */
    private static final Logger logger = Logger
            .getLogger(CharacterEntityFormat.class);

    /**
     * Maintain the mapping via this translation array.
     */
    static final String[][] ENTITIES = {
            // quotation mark = APL quote
            // NOTE: the Unicode escape (\u0022) cannot be used inside the
            // string literal here: Java translates Unicode escapes before
            // lexing, so the escape would end the literal early
            { "\"", "quot" }, //\u0022
            { "\"", "#34" }, //\u0022
            // ampersand
            { "\u0026", "amp" }, { "\u0026", "#38" }, { "&", "amp" },
            // apostrophe
            { "\u0027", "#39" },
            // less-than sign
            { "\u003C", "lt" }, { "\u003C", "#60" }, { "<", "lt" },
            // greater-than sign
            { "\u003E", "gt" }, { "\u003E", "#62" }, { ">", "gt" },
            // Latin capital ligature OE
            { "\u0152", "OElig" }, { "\u0152", "#338" },
            // Latin small ligature oe
            { "\u0153", "oelig" }, { "\u0153", "#339" },
            // Latin capital letter S with caron
            { "\u0160", "Scaron" }, { "\u0160", "#352" },
            // Latin small letter s with caron
            { "\u0161", "scaron" }, { "\u0161", "#353" },
            // Latin capital letter Y with diaeresis
            { "\u0178", "Yuml" }, { "\u0178", "#376" },
            // Latin small f with hook = function = florin
            { "\u0192", "fnof" }, { "\u0192", "#402" },
            // modifier letter circumflex accent
            { "\u02C6", "circ" }, { "\u02C6", "#710" },
            // small tilde
            { "\u02DC", "tilde" }, { "\u02DC", "#732" },
            // greek capital letter alpha
            { "\u0391", "Alpha" }, { "\u0391", "#913" },
            // greek capital letter beta
            { "\u0392", "Beta" }, { "\u0392", "#914" },
            // greek capital letter gamma
            { "\u0393", "Gamma" }, { "\u0393", "#915" },
            // greek capital letter delta
            { "\u0394", "Delta" }, { "\u0394", "#916" },
            // greek capital letter epsilon
            { "\u0395", "Epsilon" }, { "\u0395", "#917" },
            // greek capital letter zeta
            { "\u0396", "Zeta" }, { "\u0396", "#918" },
            // greek capital letter eta
            { "\u0397", "Eta" }, { "\u0397", "#919" },
            // greek capital letter theta
            { "\u0398", "Theta" }, { "\u0398", "#920" },
            // greek capital letter iota
            { "\u0399", "Iota" }, { "\u0399", "#921" },
            // greek capital letter kappa
            { "\u039A", "Kappa" }, { "\u039A", "#922" },
            // greek capital letter lambda
            { "\u039B", "Lambda" }, { "\u039B", "#923" },
            // greek capital letter mu
            { "\u039C", "Mu" }, { "\u039C", "#924" },
            // greek capital letter nu
            { "\u039D", "Nu" }, { "\u039D", "#925" },
            // greek capital letter xi
            { "\u039E", "Xi" }, { "\u039E", "#926" },
            // greek capital letter omicron
            { "\u039F", "Omicron" }, { "\u039F", "#927" },
            // greek capital letter pi
            { "\u03A0", "Pi" }, { "\u03A0", "#928" },
            // greek capital letter rho
            { "\u03A1", "Rho" }, { "\u03A1", "#929" },
            // greek capital letter sigma
            { "\u03A3", "Sigma" }, { "\u03A3", "#931" },
            // greek capital letter tau
            { "\u03A4", "Tau" }, { "\u03A4", "#932" },
            // greek capital letter upsilon
            { "\u03A5", "Upsilon" }, { "\u03A5", "#933" },
            // greek capital letter phi
            { "\u03A6", "Phi" }, { "\u03A6", "#934" },
            // greek capital letter chi
            { "\u03A7", "Chi" }, { "\u03A7", "#935" },
            // greek capital letter psi
            { "\u03A8", "Psi" }, { "\u03A8", "#936" },
            // greek capital letter omega
            { "\u03A9", "Omega" }, { "\u03A9", "#937" },
            // greek small letter alpha
            { "\u03B1", "alpha" }, { "\u03B1", "#945" },
            // greek small letter beta
            { "\u03B2", "beta" }, { "\u03B2", "#946" },
            // greek small letter gamma
            { "\u03B3", "gamma" }, { "\u03B3", "#947" },
            // greek small letter delta
            { "\u03B4", "delta" }, { "\u03B4", "#948" },
            // greek small letter epsilon
            { "\u03B5", "epsilon" }, { "\u03B5", "#949" },
            // greek small letter zeta
            { "\u03B6", "zeta" }, { "\u03B6", "#950" },
            // greek small letter eta
            { "\u03B7", "eta" }, { "\u03B7", "#951" },
            // greek small letter theta
            { "\u03B8", "theta" }, { "\u03B8", "#952" },
            // greek small letter iota
            { "\u03B9", "iota" }, { "\u03B9", "#953" },
            // greek small letter kappa
            { "\u03BA", "kappa" }, { "\u03BA", "#954" },
            // greek small letter lambda
            { "\u03BB", "lambda" }, { "\u03BB", "#955" },
            // greek small letter mu
            { "\u03BC", "mu" }, { "\u03BC", "#956" },
            // greek small letter nu
            { "\u03BD", "nu" }, { "\u03BD", "#957" },
            // greek small letter xi
            { "\u03BE", "xi" }, { "\u03BE", "#958" },
            // greek small letter omicron
            { "\u03BF", "omicron" }, { "\u03BF", "#959" },
            // greek small letter pi
            { "\u03C0", "pi" }, { "\u03C0", "#960" },
            // greek small letter rho
            { "\u03C1", "rho" }, { "\u03C1", "#961" },
            // greek small letter final sigma
            { "\u03C2", "sigmaf" }, { "\u03C2", "#962" },
            // greek small letter sigma
            { "\u03C3", "sigma" }, { "\u03C3", "#963" },
            // greek small letter tau
            { "\u03C4", "tau" }, { "\u03C4", "#964" },
            // greek small letter upsilon
            { "\u03C5", "upsilon" }, { "\u03C5", "#965" },
            // greek small letter phi
            { "\u03C6", "phi" }, { "\u03C6", "#966" },
            // greek small letter chi
            { "\u03C7", "chi" }, { "\u03C7", "#967" },
            // greek small letter psi
            { "\u03C8", "psi" }, { "\u03C8", "#968" },
            // greek small letter omega
            { "\u03C9", "omega" }, { "\u03C9", "#969" },
            // greek small letter theta symbol
            { "\u03D1", "thetasym" }, { "\u03D1", "#977" },
            // greek upsilon with hook symbol
            { "\u03D2", "upsih" }, { "\u03D2", "#978" },
            // greek pi symbol
            { "\u03D6", "piv" }, { "\u03D6", "#982" },
            // en space
            { "\u2002", "ensp" }, { "\u2002", "#8194" },
            // em space
            { "\u2003", "emsp" }, { "\u2003", "#8195" },
            // thin space
            { "\u2009", "thinsp" }, { "\u2009", "#8201" },
            // zero width non-joiner
            { "\u200C", "zwnj" }, { "\u200C", "#8204" },
            // zero width joiner
            { "\u200D", "zwj" }, { "\u200D", "#8205" },
            // left-to-right mark
            { "\u200E", "lrm" }, { "\u200E", "#8206" },
            // right-to-left mark
            { "\u200F", "rlm" }, { "\u200F", "#8207" },
            // en dash
            { "\u2013", "ndash" }, { "\u2013", "#8211" },
            // em dash
            { "\u2014", "mdash" }, { "\u2014", "#8212" },
            // left single quotation mark
            { "\u2018", "lsquo" }, { "\u2018", "#8216" },
            // right single quotation mark
            { "\u2019", "rsquo" }, { "\u2019", "#8217" },
            // single low-9 quotation mark
            { "\u201A", "sbquo" }, { "\u201A", "#8218" },
            // left double quotation mark
            { "\u201C", "ldquo" }, { "\u201C", "#8220" },
            // right double quotation mark
            { "\u201D", "rdquo" }, { "\u201D", "#8221" },
            // double low-9 quotation mark
            { "\u201E", "bdquo" }, { "\u201E", "#8222" },
            // dagger
            { "\u2020", "dagger" }, { "\u2020", "#8224" },
            // double dagger
            { "\u2021", "Dagger" }, { "\u2021", "#8225" },
            // bullet = black small circle
            { "\u2022", "bull" }, { "\u2022", "#8226" },
            // horizontal ellipsis = three dot leader
            { "\u2026", "hellip" }, { "\u2026", "#8230" },
            // per mille sign
            { "\u2030", "permil" }, { "\u2030", "#8240" },
            // double prime = seconds = inches
            { "\u2033", "Prime" }, { "\u2033", "#8243" },
            // single left-pointing angle quotation mark
            { "\u2039", "lsaquo" }, { "\u2039", "#8249" },
            // single right-pointing angle quotation mark
            { "\u203A", "rsaquo" }, { "\u203A", "#8250" },
            // prime = minutes = feet
            { "\u2032", "prime" }, { "\u2032", "#8242" },
            // overline = spacing overscore
            { "\u203E", "oline" }, { "\u203E", "#8254" },
            // fraction slash
            { "\u2044", "frasl" }, { "\u2044", "#8260" },
            // euro sign
            { "\u20AC", "euro" }, { "\u20AC", "#8364" },
            // script capital P = power set = Weierstrass p
            { "\u2118", "weierp" }, { "\u2118", "#8472" },
            // blackletter capital I = imaginary part
            { "\u2111", "image" }, { "\u2111", "#8465" },
            // blackletter capital R = real part symbol
            { "\u211C", "real" }, { "\u211C", "#8476" },
            // trade mark sign
            { "\u2122", "trade" }, { "\u2122", "#8482" },
            // alef symbol = first transfinite cardinal
            { "\u2135", "alefsym" }, { "\u2135", "#8501" },
            // leftwards arrow
            { "\u2190", "larr" }, { "\u2190", "#8592" },
            // upwards arrow
            { "\u2191", "uarr" }, { "\u2191", "#8593" },
            // rightwards arrow
            { "\u2192", "rarr" }, { "\u2192", "#8594" },
            // downwards arrow
            { "\u2193", "darr" }, { "\u2193", "#8595" },
            // left right arrow
            { "\u2194", "harr" }, { "\u2194", "#8596" },
            // downwards arrow with corner leftwards = carriage return
            { "\u21B5", "crarr" }, { "\u21B5", "#8629" },
            // leftwards double arrow
            { "\u21D0", "lArr" }, { "\u21D0", "#8656" },
            // upwards double arrow
            { "\u21D1", "uArr" }, { "\u21D1", "#8657" },
            // rightwards double arrow
            { "\u21D2", "rArr" }, { "\u21D2", "#8658" },
            // downwards double arrow
            { "\u21D3", "dArr" }, { "\u21D3", "#8659" },
            // left right double arrow
            { "\u21D4", "hArr" }, { "\u21D4", "#8660" },
            // for all
            { "\u2200", "forall" }, { "\u2200", "#8704" },
            // partial differential
            { "\u2202", "part" }, { "\u2202", "#8706" },
            // there exists
            { "\u2203", "exist" }, { "\u2203", "#8707" },
            // empty set = null set = diameter
            { "\u2205", "empty" }, { "\u2205", "#8709" },
            // nabla = backward difference
            { "\u2207", "nabla" }, { "\u2207", "#8711" },
            // element of
            { "\u2208", "isin" }, { "\u2208", "#8712" },
            // not an element of
            { "\u2209", "notin" }, { "\u2209", "#8713" },
            // contains as member
            { "\u220B", "ni" }, { "\u220B", "#8715" },
            // n-ary product = product sign
            { "\u220F", "prod" }, { "\u220F", "#8719" },
            // n-ary summation
            { "\u2211", "sum" }, { "\u2211", "#8721" },
            // minus sign
            { "\u2212", "minus" }, { "\u2212", "#8722" },
            // asterisk operator
            { "\u2217", "lowast" }, { "\u2217", "#8727" },
            // square root = radical sign
            { "\u221A", "radic" }, { "\u221A", "#8730" },
            // proportional to
            { "\u221D", "prop" }, { "\u221D", "#8733" },
            // infinity
            { "\u221E", "infin" }, { "\u221E", "#8734" },
            // angle
            { "\u2220", "ang" }, { "\u2220", "#8736" },
            // logical and = wedge
            { "\u2227", "and" }, { "\u2227", "#8743" },
            // logical or = vee
            { "\u2228", "or" }, { "\u2228", "#8744" },
            // intersection = cap
            { "\u2229", "cap" }, { "\u2229", "#8745" },
            // union = cup
            { "\u222A", "cup" }, { "\u222A", "#8746" },
            // integral
            { "\u222B", "int" }, { "\u222B", "#8747" },
            // therefore
            { "\u2234", "there4" }, { "\u2234", "#8756" },
            // tilde operator = varies with = similar to
            { "\u223C", "sim" }, { "\u223C", "#8764" },
            // approximately equal to
            { "\u2245", "cong" }, { "\u2245", "#8773" },
            // almost equal to = asymptotic to
            { "\u2248", "asymp" }, { "\u2248", "#8776" },
            // not equal to
            { "\u2260", "ne" }, { "\u2260", "#8800" },
            // identical to
            { "\u2261", "equiv" }, { "\u2261", "#8801" },
            // less-than or equal to
            { "\u2264", "le" }, { "\u2264", "#8804" },
            // greater-than or equal to
            { "\u2265", "ge" }, { "\u2265", "#8805" },
            // subset of
            { "\u2282", "sub" }, { "\u2282", "#8834" },
            // superset of
            { "\u2283", "sup" }, { "\u2283", "#8835" },
            // not a subset of
            { "\u2284", "nsub" }, { "\u2284", "#8836" },
            // subset of or equal to
            { "\u2286", "sube" }, { "\u2286", "#8838" },
            // superset of or equal to
            { "\u2287", "supe" }, { "\u2287", "#8839" },
            // circled plus = direct sum
            { "\u2295", "oplus" }, { "\u2295", "#8853" },
            // circled times = vector product
            { "\u2297", "otimes" }, { "\u2297", "#8855" },
            // up tack = orthogonal to = perpendicular
            { "\u22A5", "perp" }, { "\u22A5", "#8869" },
            // dot operator
            { "\u22C5", "sdot" }, { "\u22C5", "#8901" },
            // left ceiling = apl upstile
            { "\u2308", "lceil" }, { "\u2308", "#8968" },
            // right ceiling
            { "\u2309", "rceil" }, { "\u2309", "#8969" },
            // left floor = apl downstile
            { "\u230A", "lfloor" }, { "\u230A", "#8970" },
            // right floor
            { "\u230B", "rfloor" }, { "\u230B", "#8971" },
            // left-pointing angle bracket = bra
            { "\u2329", "lang" }, { "\u2329", "#9001" },
            // right-pointing angle bracket = ket
            { "\u232A", "rang" }, { "\u232A", "#9002" },
            // lozenge
            { "\u25CA", "loz" }, { "\u25CA", "#9674" },
            // black spade suit
            { "\u2660", "spades" }, { "\u2660", "#9824" },
            // black club suit = shamrock
            { "\u2663", "clubs" }, { "\u2663", "#9827" },
            // black heart suit = valentine
            { "\u2665", "hearts" }, { "\u2665", "#9829" },
            // black diamond suit
            { "\u2666", "diams" }, { "\u2666", "#9830" } };
    /**
     * This array stores all of the character entities we want to convert.
     */
    private static String[] entitiesArray = null;
    /**
     * Each character in this string maps to the entity at the same index in
     * <code>entitiesArray</code>.
     */
    private static String entityMapString;
    /**
     * This tag is placed after anything which should not be converted. It is
     * used by other formats.
     */
    private static final String KEEP_END = "</KEEP:>";
    /**
     * This tag is placed before anything which should not be converted. It
     * is used by other formats.
     */
    private static final String KEEP_START = "<KEEP:>";
    /**
     * Just what it says on the tin - no character entity string can be longer
     * than this limit.
     */
    private static final int MAXIMUM_ENTITY_LENGTH = 15;
    /**
     * <copyDoc>Refer to {@link #isReverse}.</copyDoc>
     */
    private boolean reverse = false;

    /**
     * <p>
     * Default constructor.
     * </p>
     */
    public CharacterEntityFormat() {
        // this will speed up the conversion of HTML entities
        // we put them into the array of array of strings to make it more
        // manageable :-)
        if (entitiesArray == null) {
            int length = CharacterEntityFormat.ENTITIES.length;
            StringBuffer temporaryBuffer = new StringBuffer();
            entitiesArray = new String[length];
            for (int n = 0; n < length; ++n) {
                temporaryBuffer
                        .append(CharacterEntityFormat.ENTITIES[n][0]);
                entitiesArray[n] = "&"
                        + CharacterEntityFormat.ENTITIES[n][1] + ";";
                // this code can be used to calculate the maximum entity length
                /*
                 * if (entities[n][1].length() > MAXIMUM_ENTITY_LENGTH) {
                 * MAXIMUM_ENTITY_LENGTH = entities[n][1].length(); }
                 */
            }
            entityMapString = temporaryBuffer.toString();
        }
    }

489:            /**
490:             * <p>
491:             * Convert the character entities in the text provided.
492:             * </p>
493:             *
494:             * @param hTMLText a text to convert all the character entities in
495:             * @return formatted text where all of the characters are converted to the
496:             *      appropriate character entities.
497:             */
498:            public final String format(final String hTMLText) {
499:                if (logger.isDebugEnabled()) {
500:                    logger.debug("format(String hTMLText = " + hTMLText
501:                            + ") - start");
502:                }
503:
        StringBuffer returnBuffer = new StringBuffer();
        int length = hTMLText.length();
        int index;
        int indexStart = hTMLText.indexOf(KEEP_START);
        int indexEnd;
        for (int n = 0; n < length; ++n) {
            // if we have reached the next keep section (and there is one)
            if ((indexStart > -1) && (indexStart == n)) {
                // find the end of the keep section
                if ((indexEnd = hTMLText.indexOf(KEEP_END, indexStart)) != -1) {
                    int keepEndPosition = KEEP_END.length() + 1;
                    returnBuffer.append(hTMLText.substring(indexStart
                            + keepEndPosition, indexEnd));
                    n = indexEnd + keepEndPosition;
                    indexStart = hTMLText.indexOf(KEEP_START, n);
                } else {
                    // no end tag -> ignore
                    indexStart = -1;
                }
            } else {
                int semiIndex = n;
                char ch = hTMLText.charAt(n);
                StringBuffer entity = null;
                // is there a character entity at this point?
                if (ch == '&') {
                    // look ahead for the semicolon
                    for (entity = new StringBuffer(
                            MAXIMUM_ENTITY_LENGTH); (semiIndex < length)
                            && ((semiIndex - n + 1) <= MAXIMUM_ENTITY_LENGTH); ++semiIndex) {
                        char semi = hTMLText.charAt(semiIndex);
                        // add the character to the buffer
                        entity.append(semi);
                        // if we found a semicolon, that's great. this is a
                        // real entity.
                        if (semi == ';') {
                            break;
                        }
                        // if this is not alphanumeric or hash, remove the
                        // entity buffer
                        if ((semi != '&') && (semi != '#')
                                && !Character.isLetterOrDigit(semi)) {
                            entity = null;
                            break;
                        }
                    }
                    // if the look-ahead ran out (end of text or maximum
                    // length) without reaching a semicolon, this is not a
                    // real entity; treating it as one would skip a character
                    if ((entity != null)
                            && (entity.charAt(entity.length() - 1) != ';')) {
                        entity = null;
                    }
                }
                // if we go in reverse direction, look for character entities
                if (reverse) {
                    // if there was an entity, try to convert it
                    if (entity == null) {
                        returnBuffer.append(ch);
                    } else if (entity.toString().equalsIgnoreCase(
                            "&nbsp;")) {
                        // this is a special case - it only translates one way.
                        // the buffer includes the leading '&' and trailing ';'
                        returnBuffer.append(' ');
                        n = semiIndex;
                    } else {
                        String compare = entity.toString();
                        for (int arrayIndex = 0; arrayIndex < entitiesArray.length; ++arrayIndex) {
                            if (entitiesArray[arrayIndex]
                                    .equals(compare)) {
                                returnBuffer.append(entityMapString
                                        .charAt(arrayIndex));
                                entity = null;
                                n = semiIndex;
                                break;
                            }
                        }
                        // unknown entity: output the ampersand unchanged
                        if (entity != null) {
                            returnBuffer.append(ch);
                        }
                    }
                } else if (entity == null) {
                    // see if we should convert the character
                    if ((index = entityMapString.indexOf(ch)) == -1) {
                        returnBuffer.append(ch);
                    } else {
                        returnBuffer.append(entitiesArray[index]);
                    }
                } else {
                    // if this is not reverse direction, and an entity was found
                    // then skip past it no matter what
                    n = semiIndex;
                    returnBuffer.append(entity.toString());
                }
            }
        }
        String returnString = returnBuffer.toString();
        if (logger.isDebugEnabled()) {
            logger.debug("format(String) - end - return value = "
                    + returnString);
        }
        return returnString;
    }

    /**
     * <p>
     * Gets whether or not character entity conversion goes in the opposite
     * direction. If character entities are converted to characters then this
     * method returns <code>true</code>, otherwise <code>false</code>.
     * </p>
     *
     * @return <code>true</code> if character entities are converted to
     *     characters, otherwise <code>false</code>.
     */
    public final boolean isReverse() {
        if (logger.isDebugEnabled()) {
            logger.debug("isReverse() - start");
        }

        if (logger.isDebugEnabled()) {
            logger.debug("isReverse() - end - return value = "
                    + reverse);
        }
        return reverse;
    }

    /**
     * <p>
     * Sets whether or not character entity conversion goes in the opposite
     * direction.
     * </p>
     *
     * @param newReverse set to <code>true</code> if character entities should
     *      be converted to characters, otherwise <code>false</code>.
     */
    public final void setReverse(final boolean newReverse) {
        if (logger.isDebugEnabled()) {
            logger.debug("setReverse(boolean newReverse = "
                    + newReverse + ") - start");
        }

        reverse = newReverse;

        if (logger.isDebugEnabled()) {
            logger.debug("setReverse(boolean) - end");
        }
    }
}
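
The two conversion directions handled by the class above can be illustrated with a small, self-contained sketch. It is not the ivata masks implementation: the class name `EntitySketch`, the `ENTITIES` table, and the `MAX_LEN` limit are assumptions made for illustration, and keep sections and logging are left out. It assumes, as the listing suggests, that each entity string includes the leading `&` and the trailing `;`.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for illustration; not the ivata masks class. */
final class EntitySketch {
    // Assumed limit and mapping. The real MAXIMUM_ENTITY_LENGTH,
    // entityMapString and entitiesArray are defined elsewhere and may differ.
    private static final int MAX_LEN = 10;
    private static final Map<Character, String> ENTITIES = new LinkedHashMap<>();
    static {
        ENTITIES.put('&', "&amp;");
        ENTITIES.put('<', "&lt;");
        ENTITIES.put('>', "&gt;");
        ENTITIES.put('"', "&quot;");
    }

    /** Forward direction: characters become entities, existing entities pass through. */
    static String encode(String text) {
        StringBuilder out = new StringBuilder();
        for (int n = 0; n < text.length(); n++) {
            char ch = text.charAt(n);
            String existing = (ch == '&') ? entityAt(text, n) : null;
            if (existing != null) {
                // already an entity: copy it and skip past the semicolon
                out.append(existing);
                n += existing.length() - 1;
            } else if (ENTITIES.containsKey(ch)) {
                out.append(ENTITIES.get(ch));
            } else {
                out.append(ch);
            }
        }
        return out.toString();
    }

    /** Reverse direction: known entities become characters, anything else is kept. */
    static String decode(String text) {
        StringBuilder out = new StringBuilder();
        for (int n = 0; n < text.length(); n++) {
            char ch = text.charAt(n);
            String entity = (ch == '&') ? entityAt(text, n) : null;
            Character decoded = (entity == null) ? null : lookup(entity);
            if (decoded != null) {
                out.append(decoded.charValue());
                n += entity.length() - 1;
            } else {
                out.append(ch);
            }
        }
        return out.toString();
    }

    /** Returns the entity starting at start (with '&' and ';'), or null if none. */
    private static String entityAt(String text, int start) {
        for (int i = start + 1; i < text.length() && i - start < MAX_LEN; i++) {
            char c = text.charAt(i);
            if (c == ';') {
                return text.substring(start, i + 1);
            }
            if (c != '#' && !Character.isLetterOrDigit(c)) {
                return null;
            }
        }
        return null; // ran out of text or length before a semicolon
    }

    /** Maps an entity back to its character; the nbsp entity decodes to a plain space. */
    private static Character lookup(String entity) {
        if (entity.equalsIgnoreCase("&nbsp;")) {
            return ' ';
        }
        for (Map.Entry<Character, String> e : ENTITIES.entrySet()) {
            if (e.getValue().equals(entity)) {
                return e.getKey();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(encode("a < b & c"));
        System.out.println(decode("&lt;b&gt;&nbsp;x"));
    }
}
```

In this sketch `encode` plays the role of the formatter with `reverse` set to `false`, and `decode` the role with `reverse` set to `true`. As in the listing, the non-breaking space entity translates one way only: it decodes to a space, but a space is never encoded back.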