Java Doc for HtmlRewriter.java

java.lang.Object
   sunlabs.brazil.handler.HtmlRewriter

All known Subclasses:   sunlabs.brazil.template.RewriteContext

HtmlRewriter
public class HtmlRewriter
This class helps with parsing and rewriting an HTML document. The source document is not changed; a new HTML document is built.

The user can sequentially examine and rewrite each token in the source HTML document. As each token in the document is seen, the user has two choices:

  • modify the current token.
  • don't modify the current token.
If the user modifies (or replaces, deletes, etc.) the current token, then the resultant HTML document will contain that modification. On the other hand, if the user doesn't do anything with the current token, it will appear, unchanged, in the resultant HTML document.

Parsing is implemented lazily, meaning, for example, that unless the user actually asks for attributes of an HTML tag, this parser does not have to spend the time breaking up the attributes.
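
The lazy approach described above can be illustrated with a small standalone sketch. It is not the Brazil implementation: the class name LazyAttrsSketch, the parseCount field, and the simplified whitespace-based splitting (which assumes no quoted values containing spaces) are assumptions made for illustration only. The raw argument string is stored untouched and is broken up the first time get is called.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical illustration of lazy attribute parsing; not the Brazil source. */
class LazyAttrsSketch {
    private final String rawArgs;        // e.g. "border rows=2", kept unparsed
    private Map<String, String> attrs;   // built on the first get() call only
    int parseCount = 0;                  // how often the raw string was split

    LazyAttrsSketch(String rawArgs) {
        this.rawArgs = rawArgs;          // the constructor does no parsing
    }

    /** The raw arguments are available at no parsing cost. */
    String getArgs() {
        return rawArgs;
    }

    /** Splits the raw arguments on first use, then answers from the cache. */
    String get(String key) {
        if (attrs == null) {
            parseCount++;
            attrs = new HashMap<>();
            // Assumption: values are unquoted and contain no whitespace.
            for (String part : rawArgs.trim().split("\\s+")) {
                if (part.isEmpty()) {
                    continue;
                }
                int eq = part.indexOf('=');
                String name = (eq < 0) ? part : part.substring(0, eq);
                String value = (eq < 0) ? "" : part.substring(eq + 1);
                attrs.put(name.toLowerCase(Locale.ROOT), value);
            }
        }
        return attrs.get(key.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        LazyAttrsSketch t = new LazyAttrsSketch("border rows=2");
        System.out.println("parses before get: " + t.parseCount);
        System.out.println("rows = " + t.get("rows"));
        System.out.println("parses after get: " + t.parseCount);
    }
}
```

A caller that only copies tokens through never pays for the split; only callers that ask for an attribute do.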

This class is used by HTML filters to maintain the state of the document and allow the filters to perform arbitrary rewriting.
author:
   Colin Stevens (colin.stevens@sun.com)
version:
   1.9, 00/12/27



Field Summary
 boolean accumulate
     true if nextToken should automatically append unmodified tokens to the result.
 boolean appendToken
     true if the user has already explicitly appended something, so nextToken shouldn't append the unmodified token.
public  LexHTML lex
     The parser for the source HTML document.
 StringMap map
    
 boolean pushback
     true if the last token was pushed back and should be presented again next time.
public  StringBuffer sb
     Storage holding the resultant HTML document.
 String tag
    
 String token
    
 boolean tokenModified
     true if the user has modified the tag name or attributes of the current tag, so when this tag is appended, we need to write out its parts rather than just emitting the raw token.
 int type
    

Constructor Summary
public  HtmlRewriter(LexHTML lex)
     Creates a new HtmlRewriter from the given HTML parser.
public  HtmlRewriter(String str)
     Creates a new HtmlRewriter that will operate on the given string.

Method Summary
public  boolean accumulate(boolean accumulate)
     Turns on or off the automatic accumulation of each token.

After each token is processed, the current token is appended to the resultant HTML document unless the user has already appended something else.

public  void append(String str)
     Instead of modifying an existing token, this method allows the user to completely replace the current token with arbitrary new content.
public  void appendToken()
     Appends the current token to the resultant HTML document. If the caller has changed the current token using the setTag, put, or remove methods, those changes will be reflected.

By default, this method is automatically called after each token is processed unless the user has already appended something to the resultant HTML document.

public  String get(String key)
     Returns the value that the specified case-insensitive key maps to in the attributes for the current tag.
public  String getArgs()
     Gets the arguments of the current token as a string.
public  String getBody()
     Gets the body of the current token as a string.
public  String getTag()
     Gets the current tag's name.
public  String getToken()
     Gets the raw string making up the entire current token, including the angle brackets or comment delimiters, if applicable.
public  int getType()
     Gets the type of the current token.
public  Enumeration keys()
     Returns an enumeration of the keys in the current tag's attributes. The elements of the enumeration are the string keys.
public  void killToken()
     Tells this HtmlRewriter not to append the current token to the resultant HTML document.
public  boolean nextTag()
     A convenience method built on top of nextToken. Advances to the next HTML tag.
public  boolean nextToken()
     Advances to the next token in the source HTML document.

The other purpose of this function is to "do the right thing", which is to append the token we just processed to the resultant HTML document, unless the user has already appended something else.

public  void pushback()
     Puts the current token back.
public  void put(String key, String value)
     Maps the given case-insensitive key to the specified value in the current tag's attributes.

The value can be retrieved by calling get with a key that is case-insensitive equal to the given key.

If the attributes already contained a mapping for the given key, the old value is forgotten and the new specified value is used. The case of the prior key is retained in that case.

public static  String quote(String str)
     Helper method to quote an attribute's value when the value is being written to the resultant HTML document.
public  void remove(String key)
     Removes the given case-insensitive key and its corresponding value from the current tag's attributes.
public  void reset()
     Forgets all the tokens that have been appended to the resultant HTML document so far, including the current token.
public  void setTag(String tag)
     Changes the current tag's name.
public  void setType(int type)
     Sets the type of the current token.
public  String toString()
     Returns the "new" rewritten HTML document.

Field Detail
accumulate
boolean accumulate
true if nextToken should automatically append unmodified tokens to the result.



appendToken
boolean appendToken
true if the user has already explicitly appended something, so nextToken shouldn't append the unmodified token.



lex
public LexHTML lex
The parser for the source HTML document.



map
StringMap map



pushback
boolean pushback
true if the last token was pushed back and should be presented again next time. Made false once the pushed-back token is presented.



sb
public StringBuffer sb
Storage holding the resultant HTML document.



tag
String tag



token
String token



tokenModified
boolean tokenModified
true if the user has modified the tag name or attributes of the current tag, so when this tag is appended, we need to write out its parts rather than just emitting the raw token.



type
int type




Constructor Detail
HtmlRewriter
public HtmlRewriter(LexHTML lex)
Creates a new HtmlRewriter from the given HTML parser.
Parameters:
  lex - The HTML parser.



HtmlRewriter
public HtmlRewriter(String str)
Creates a new HtmlRewriter that will operate on the given string.
Parameters:
  str - The HTML document.




Method Detail
accumulate
public boolean accumulate(boolean accumulate)
Turns on or off the automatic accumulation of each token.

After each token is processed, the current token is appended to the resultant HTML document unless the user has already appended something else. By setting accumulate to false, this behavior is turned off. The user must then explicitly call appendToken to cause the current token to be appended.

Turning off accumulation takes effect immediately, while turning on accumulation takes effect on the next token. In other words, whether the user turns this setting off or on, the current token will not be added to the resultant HTML document unless the user explicitly calls appendToken.

Following is sample code that illustrates how to use this method to extract the contents of the <head> of the source HTML document.

 HtmlRewriter hr = new HtmlRewriter(str);
 // Don't accumulate tokens until we see the <head> below.
 hr.accumulate(false);
 while (hr.nextTag()) {
     if (hr.getTag().equals("head")) {
         // Start remembering the contents of the HTML document,
         // not including the <head> tag itself.
         hr.accumulate(true);
     } else if (hr.getTag().equals("/head")) {
         // Return everything accumulated so far.
         return hr.toString();
     }
 }
 
This method can be called any number of times while processing the source HTML document.
Parameters:
  accumulate - true to automatically accumulate tokens in the resultant HTML document, false to require that the user explicitly accumulate them.
Returns:
  The previous accumulate setting.
See Also:   HtmlRewriter.reset



append
public void append(String str)
Instead of modifying an existing token, this method allows the user to completely replace the current token with arbitrary new content.

This method may be called multiple times while processing the current token to add more and more data to the resultant HTML document. Before and/or after calling this method, the appendToken method may also be called explicitly in order to add the current token to the resultant HTML document.

Following is sample code illustrating how to use this method to put bold tags around all the <a> tags.

 HtmlRewriter hr = new HtmlRewriter(str);
 while (hr.nextTag()) {
     if (hr.getTag().equals("a")) {
         hr.append("<b>");
         hr.appendToken();
     } else if (hr.getTag().equals("/a")) {
         hr.appendToken();
         hr.append("</b>");
     }
 }
 
The calls to appendToken are necessary. Otherwise, the HtmlRewriter could not know where and when to append the existing token in addition to the new content provided by the user.
Parameters:
  str - The new content to append. May be null, in which case no new content is appended (the equivalent of appending "").
See Also:   HtmlRewriter.appendToken
See Also:   HtmlRewriter.killToken



appendToken
public void appendToken()
Appends the current token to the resultant HTML document. If the caller has changed the current token using the setTag, put, or remove methods, those changes will be reflected.

By default, this method is automatically called after each token is processed unless the user has already appended something to the resultant HTML document. Therefore, if the user appends something and also wants to append the current token, or if the user wants to append the current token a number of times, this method must be called.
See Also:   HtmlRewriter.append
See Also:   HtmlRewriter.killToken




get
public String get(String key)
Returns the value that the specified case-insensitive key maps to in the attributes for the current tag. For keys that were present in the tag's attributes without a value, the value returned is the empty string. In other words, for the tag <table border rows=2>:
  • get("border") returns the empty string "".
  • get("rows") returns 2.

Surrounding single and double quote marks that occur in the literal tag are removed from the values reported. So, for the tag <a href="/foo.html" target=_top onclick='alert("hello")'>:

  • get("href") returns /foo.html .
  • get("target") returns _top .
  • get("onclick") returns alert("hello") .

Parameters:
  key - The key to look up in the current tag's attributes.
Returns:
  The value to which the specified key is mapped, or null if the key was not in the attributes.
See Also:   LexHTML.getAttributes
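
The reporting rules for get (empty string for a valueless attribute, surrounding quotes stripped, null for a missing key) can be sketched in standalone Java. This is an illustrative assumption about the behavior, not LexHTML's actual parser: the class AttrValueSketch, its parse method, and the regular expression are hypothetical names and choices, and the input is the argument portion of a tag without the tag name or angle brackets.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch of how attribute values are reported; not LexHTML itself. */
class AttrValueSketch {
    // A name, optionally followed by '=' and a double-quoted,
    // single-quoted, or bare (unquoted) value.
    private static final Pattern ATTR = Pattern.compile(
        "([^\\s=]+)(?:\\s*=\\s*(?:\"([^\"]*)\"|'([^']*)'|([^\\s]+)))?");

    /** Parses the argument portion of a tag, e.g. "border rows=2". */
    static Map<String, String> parse(String args) {
        Map<String, String> result = new LinkedHashMap<>();
        Matcher m = ATTR.matcher(args);
        while (m.find()) {
            String value = "";   // attribute present without a value
            for (int g = 2; g <= 4; g++) {
                if (m.group(g) != null) {
                    value = m.group(g);   // captured without surrounding quotes
                }
            }
            // Keys are stored lower-cased, so lookups should use lower-case keys.
            result.put(m.group(1).toLowerCase(Locale.ROOT), value);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(parse("border rows=2"));
        System.out.println(
            parse("href=\"/foo.html\" target=_top onclick='alert(\"hello\")'"));
    }
}
```

Looking up a key that never appeared in the tag returns null from the map, which mirrors the documented return value.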



getArgs
public String getArgs()
Gets the arguments of the current token as a string.
Returns:
  The arguments.
See Also:   LexHTML.getArgs



getBody
public String getBody()
Gets the body of the current token as a string.
Returns:
  The body.
See Also:   LexHTML.getBody



getTag
public String getTag()
Gets the current tag's name. The name returned is converted to lower case.
Returns:
  The lower-cased tag name, or null if the current token does not have a tag name.
See Also:   LexHTML.getTag



getToken
public String getToken()
Gets the raw string making up the entire current token, including the angle brackets or comment delimiters, if applicable.
Returns:
  The current token.
See Also:   LexHTML.getToken



getType
public int getType()
Gets the type of the current token.
Returns:
  The type.
See Also:   LexHTML.getType



keys
public Enumeration keys()
Returns an enumeration of the keys in the current tag's attributes. The elements of the enumeration are the string keys. The keys can be passed to get to retrieve the values of the attributes.
Returns:
  An enumeration of the keys.



killToken
public void killToken()
Tells this HtmlRewriter not to append the current token to the resultant HTML document. Even if the user hasn't appended anything else, the current token will be ignored rather than appended.
See Also:   HtmlRewriter.append
See Also:   HtmlRewriter.appendToken



nextTag
public boolean nextTag()
A convenience method built on top of nextToken. Advances to the next HTML tag. All intervening strings and comments between the last tag and the new current tag are copied through unchanged. This method can be used when the caller wants to process only HTML tags, without having to manually check the type of each token to see if it is actually a tag.
Returns:
  true if there are tokens left to process, false otherwise.



nextToken
public boolean nextToken()
Advances to the next token in the source HTML document.

The other purpose of this function is to "do the right thing", which is to append the token we just processed to the resultant HTML document, unless the user has already appended something else.

A sample program follows. This program changes all <img> tags to <form> tags, deletes all <table> tags, capitalizes and bolds each string token, and passes all other tokens through unchanged, to illustrate how nextToken interacts with some of the other methods in this class.

 HtmlRewriter hr = new HtmlRewriter(str);
 while (hr.nextToken()) {
     switch (hr.getType()) {
     case LexHTML.TAG:
         if (hr.getTag().equals("img")) {
             // Change the tag name w/o affecting the attributes.
             hr.setTag("form");
         } else if (hr.getTag().equals("table")) {
             // Eliminate the entire "table" token.
             hr.killToken();
         }
         break;
     case LexHTML.STRING:
         // Append a new sequence in place of the existing token.
         hr.append("<b>" + hr.getToken().toUpperCase() + "</b>");
         break;
     }
     // Any tokens we didn't modify get copied through unchanged.
 }
 
Returns:
  true if there are tokens left to process, false otherwise.
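
The "append the previous token unless the user already did something with it" rule can be modeled in a few lines of standalone Java. This is a hypothetical miniature, not the Brazil source: the class AutoAppendSketch, its crude regular-expression tokenizer, the isTag helper, and the single handled flag are all assumptions chosen to keep the sketch short. It ignores token types, attributes, and the accumulate setting.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical miniature of the auto-append rule; not the Brazil source. */
class AutoAppendSketch {
    private final List<String> tokens = new ArrayList<>();
    private final StringBuilder out = new StringBuilder();
    private int pos = -1;
    private boolean handled;   // caller appended to or killed the current token

    AutoAppendSketch(String html) {
        // Crude tokenizer: a token is either "<...>" or a run of text.
        Matcher m = Pattern.compile("<[^>]*>|[^<]+").matcher(html);
        while (m.find()) {
            tokens.add(m.group());
        }
    }

    boolean nextToken() {
        // Flush the token just processed, unless the caller dealt with it.
        if (pos >= 0 && pos < tokens.size() && !handled) {
            out.append(tokens.get(pos));
        }
        handled = false;
        if (pos < tokens.size()) {
            pos++;
        }
        return pos < tokens.size();
    }

    String getToken() { return tokens.get(pos); }
    boolean isTag() { return getToken().startsWith("<"); }
    void append(String s) { out.append(s); handled = true; }
    void killToken() { handled = true; }
    @Override public String toString() { return out.toString(); }

    /** Drops "<table>" tokens, bolds and upper-cases text, copies the rest. */
    static String demo(String html) {
        AutoAppendSketch hr = new AutoAppendSketch(html);
        while (hr.nextToken()) {
            if (hr.getToken().equalsIgnoreCase("<table>")) {
                hr.killToken();
            } else if (!hr.isTag()) {
                hr.append("<b>" + hr.getToken().toUpperCase(Locale.ROOT) + "</b>");
            }
        }
        return hr.toString();
    }

    public static void main(String[] args) {
        System.out.println(demo("<p>hi</p><table>"));
    }
}
```

Because the flush happens at the start of the following nextToken call, the final token is only written when the loop makes the call that returns false.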



pushback
public void pushback()
Puts the current token back. The next time nextToken is called, it will be the current token again, rather than advancing to the next token in the source HTML document.

This is useful when a code fragment needs to read an indefinite number of tokens but, once some distinguished token is found, needs to push that token back so that normal processing can occur on it.
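
The read-ahead pattern described above can be sketched with a one-token pushback over a plain list of strings. The names PushbackSketch and readItems, and the "item:" prefix used to mark the tokens of interest, are hypothetical and only serve the illustration; the real class operates on HTML tokens.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Hypothetical one-token pushback over a list of strings; not the Brazil source. */
class PushbackSketch {
    private final Iterator<String> source;
    private String current;
    private boolean pushedBack;

    PushbackSketch(List<String> tokens) {
        this.source = tokens.iterator();
    }

    boolean nextToken() {
        if (pushedBack) {
            pushedBack = false;   // present the same token again
            return true;
        }
        if (!source.hasNext()) {
            return false;
        }
        current = source.next();
        return true;
    }

    String getToken() {
        return current;
    }

    void pushback() {
        // Only meaningful once a token has been read.
        if (current != null) {
            pushedBack = true;
        }
    }

    /** Collects "item:" tokens until another token appears, then hands it back. */
    static List<String> readItems(PushbackSketch in) {
        List<String> items = new ArrayList<>();
        while (in.nextToken()) {
            if (!in.getToken().startsWith("item:")) {
                in.pushback();   // let the caller process this token normally
                break;
            }
            items.add(in.getToken());
        }
        return items;
    }

    public static void main(String[] args) {
        PushbackSketch in =
            new PushbackSketch(Arrays.asList("item:a", "item:b", "end", "item:c"));
        System.out.println(readItems(in));
        in.nextToken();
        System.out.println(in.getToken());
    }
}
```

After readItems returns, the caller's next nextToken call sees the token that ended the run, so the main loop does not need to know that a helper read ahead.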




put
public void put(String key, String value)
Maps the given case-insensitive key to the specified value in the current tag's attributes.

The value can be retrieved by calling get with a key that is case-insensitive equal to the given key.

If the attributes already contained a mapping for the given key, the old value is forgotten and the new specified value is used. The case of the prior key is retained in that case. Otherwise the case of the new key is used and a new mapping is made.
Parameters:
  key - The new key. May not be null.
  value - The new value. May not be null.
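
The key-case rule for put (case-insensitive lookup, with the spelling of an existing key retained on replacement) can be sketched as follows. This is not the Brazil StringMap class: AttrMapSketch and its list-based storage are assumptions made for a short, self-contained illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical case-insensitive attribute map; not the Brazil StringMap. */
class AttrMapSketch {
    private final List<String> keys = new ArrayList<>();
    private final List<String> values = new ArrayList<>();

    private int indexOf(String key) {
        for (int i = 0; i < keys.size(); i++) {
            if (keys.get(i).equalsIgnoreCase(key)) {
                return i;
            }
        }
        return -1;
    }

    void put(String key, String value) {
        int i = indexOf(key);
        if (i >= 0) {
            values.set(i, value);   // keep the spelling of the earlier key
        } else {
            keys.add(key);          // new mapping uses the new key's case
            values.add(value);
        }
    }

    String get(String key) {
        int i = indexOf(key);
        return (i < 0) ? null : values.get(i);
    }

    void remove(String key) {
        int i = indexOf(key);
        if (i >= 0) {               // does nothing if the key is absent
            keys.remove(i);
            values.remove(i);
        }
    }

    List<String> keys() {
        return new ArrayList<>(keys);
    }

    public static void main(String[] args) {
        AttrMapSketch attrs = new AttrMapSketch();
        attrs.put("HREF", "/a.html");
        attrs.put("href", "/b.html");
        System.out.println(attrs.keys() + " -> " + attrs.get("Href"));
    }
}
```

Retaining the earlier spelling means a rewritten tag keeps the author's original attribute casing even when a filter replaces the value.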




quote
public static String quote(String str)
Helper method to quote an attribute's value when the value is being written to the resultant HTML document. Values set by the put method are automatically quoted as needed. This method is provided in case the user is dynamically constructing a new tag to be appended with append and needs to quote some arbitrary values.

The quoting algorithm is as follows:
  • If the string contains double-quotes, put single quotes around it.
  • If the string contains single-quotes or spaces, put double-quotes around it.

This algorithm is, of course, insufficient for complicated strings that include both single and double quotes. In that case, it is the user's responsibility to escape the special characters in the string using the HTML special symbols like &quot; or &#34;.
Returns:
  The quoted string, or the original string if it did not need to be quoted.
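
The two documented quoting rules translate directly into a short standalone method. The sketch below follows only the rules as stated; the class name QuoteSketch is hypothetical, and any behavior the description does not mention (for example, how an empty string is handled) is left as "return unchanged" by assumption.

```java
/** Hypothetical sketch of the documented quoting rules; not the Brazil source. */
class QuoteSketch {
    static String quote(String str) {
        // Rule 1: a value containing double-quotes is wrapped in single quotes.
        if (str.indexOf('"') >= 0) {
            return "'" + str + "'";
        }
        // Rule 2: a value containing single-quotes or spaces is wrapped in double quotes.
        if (str.indexOf('\'') >= 0 || str.indexOf(' ') >= 0) {
            return "\"" + str + "\"";
        }
        // Nothing special in the value, so no quoting is needed.
        return str;
    }

    public static void main(String[] args) {
        System.out.println(quote("plain"));
        System.out.println(quote("two words"));
        System.out.println(quote("alert(\"hello\")"));
    }
}
```

A value such as it's "x" hits the first rule and comes out wrapped in single quotes around an embedded single quote, which is the insufficiency the description warns about.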




remove
public void remove(String key)
Removes the given case-insensitive key and its corresponding value from the current tag's attributes. This method does nothing if the key is not in the attributes.
Parameters:
  key - The key that needs to be removed. Must not be null.



reset
public void reset()
Forgets all the tokens that have been appended to the resultant HTML document so far, including the current token.



setTag
public void setTag(String tag)
Changes the current tag's name. The tag's attributes are not changed.
Parameters:
  tag - New tag name



setType
public void setType(int type)
Sets the type of the current token.



toString
public String toString()
Returns the "new" rewritten HTML document. This is normally called once all of the tokens have been processed, and the user wants to send on this rewritten document.

At any time, this method can be called to return the current state of the HTML document. The return value is the result of processing the source document up to this point in time; the unprocessed remainder of the source document is not considered.

Due to the implementation, calling this method may be expensive. Specifically, calling this method a second (or further) time for a given HtmlRewriter may involve copying temporary strings around. The pessimal case would be to call this method every time a new token is appended.
Returns:
  The rewritten HTML document, up to this point in time.




Methods inherited from java.lang.Object
native protected Object clone() throws CloneNotSupportedException
public boolean equals(Object obj)
protected void finalize() throws Throwable
final native public Class getClass()
native public int hashCode()
final native public void notify()
final native public void notifyAll()
public String toString()
final native public void wait(long timeout) throws InterruptedException
final public void wait(long timeout, int nanos) throws InterruptedException
final public void wait() throws InterruptedException
