Java Doc for SourceFormatter.java in HTML Parser » jericho-html » au.id.jericho.lib.html



java.lang.Object
   au.id.jericho.lib.html.SourceFormatter

SourceFormatter
final public class SourceFormatter implements CharStreamSource
Formats HTML source by laying out each non-inline-level element on a new line with an appropriate indent.

Any indentation present in the original source text is removed.

Use one of the following methods to obtain the output:

  • writeTo(Writer)
  • toString()

The output text is functionally equivalent to the original source and should be rendered identically unless specified below.

The following points describe the process in general terms. Any aspect of the algorithm not specifically mentioned here is subject to change without notice in future versions.

  • Every element that is not an inline-level element appears on a new line with an indent corresponding to its depth in the document element hierarchy.
  • The indent is formed by writing n repetitions of the string specified in the IndentString property (see setIndentString(String)), where n is the depth of the indentation.
  • The content of an indented element starts on a new line and is indented at a depth one greater than that of the element, with the end tag appearing on a new line at the same depth as the start tag. If the content contains only text and inline-level elements, it may continue on the same line as the start tag. Additionally, if the output content contains no new lines, the end tag may also continue on the same line.
  • The content of preformatted elements such as PRE and TEXTAREA is not indented, nor is its white space modified in any way.
  • Only […] and […] elements are indented. All others are treated as inline-level elements.
  • White space and indentation inside HTML comments, CDATA sections, or any server tag is preserved, but with the indentation of new lines starting at a depth one greater than that of the surrounding text.
  • White space and indentation inside SCRIPT elements is preserved, but with the indentation of new lines starting at a depth one greater than that of the SCRIPT element.
  • If the TidyTags property (see setTidyTags(boolean)) is set to true, every tag in the document is replaced with the output from its Tag.tidy method. If this property is set to false, the tag from the original text is used, including all white space, but with any new lines indented at a depth one greater than that of the element.
  • If the CollapseWhiteSpace property (see setCollapseWhiteSpace(boolean)) is set to true, every string of one or more white space characters located outside of a tag is replaced with a single space in the output. White space located adjacent to a non-inline-level element tag (except server tags) may be removed.
  • If the IndentAllElements property (see setIndentAllElements(boolean)) is set to true, every element appears indented on a new line, including inline-level elements. This generates output that is a good representation of the actual document element hierarchy, but is very likely to introduce white space that compromises the functional equivalency of the document.
  • The NewLine property (see setNewLine(String)) specifies the character sequence to use for each newline in the output document.
  • If the source document contains server tags, the functional equivalency of the output document may be compromised.

Formatting an entire Source object performs a full sequential parse automatically.
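
The indent rule described above (n repetitions of the indent string, where n is the depth) can be illustrated with a small self-contained sketch. This is not the library's implementation; the class IndentSketch and its methods are hypothetical names introduced only for illustration, and only the repetition rule itself comes from the description.

```java
// Hypothetical illustration of the indent rule; not part of the Jericho library.
class IndentSketch {
    // Builds the indent for a given depth by repeating indentString depth times.
    static String indent(String indentString, int depth) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < depth; i++) {
            sb.append(indentString);
        }
        return sb.toString();
    }

    // Produces one output line at the given depth, terminated by newLine.
    static String line(String indentString, int depth, String text, String newLine) {
        return indent(indentString, depth) + text + newLine;
    }

    public static void main(String[] args) {
        // A block-level element at depth 0 whose content sits at depth 1.
        String out = line("  ", 0, "<div>", "\n")
                   + line("  ", 1, "<p>text</p>", "\n")
                   + line("  ", 0, "</div>", "\n");
        System.out.print(out);
    }
}
```

Passing "\t" as the indent string corresponds to the documented default of a single tab per level.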




Constructor Summary
public SourceFormatter(Segment segment)
     Constructs a new SourceFormatter based on the specified Segment.

Method Summary
public boolean getCollapseWhiteSpace()
     Indicates whether white space in the text between the tags is to be collapsed.
public long getEstimatedMaximumOutputLength()
public boolean getIndentAllElements()
     Indicates whether all elements are to be indented, including inline-level elements and those with preformatted contents.
public String getIndentString()
     Returns the string to be used for indentation.
public String getNewLine()
     Returns the string to be used to represent a newline in the output.
public boolean getTidyTags()
     Indicates whether the original text of each tag is to be replaced with the output from its Tag.tidy method.
public SourceFormatter setCollapseWhiteSpace(boolean collapseWhiteSpace)
     Sets whether white space in the text between the tags is to be collapsed.
public SourceFormatter setIndentAllElements(boolean indentAllElements)
     Sets whether all elements are to be indented, including inline-level elements and those with preformatted contents.
public SourceFormatter setIndentString(String indentString)
     Sets the string to be used for indentation.
public SourceFormatter setNewLine(String newLine)
     Sets the string to be used to represent a newline in the output.
public SourceFormatter setTidyTags(boolean tidyTags)
     Sets whether the original text of each tag is to be replaced with the output from its Tag.tidy method.
public String toString()
public void writeTo(Writer writer)


Constructor Detail
SourceFormatter
public SourceFormatter(Segment segment)
Constructs a new SourceFormatter based on the specified Segment.
Parameters:
  segment - the segment containing the HTML to be formatted.
See Also:   Source.getSourceFormatter




Method Detail
getCollapseWhiteSpace
public boolean getCollapseWhiteSpace()
Indicates whether white space in the text between the tags is to be collapsed.

See the setCollapseWhiteSpace(boolean) method for a full description of this property.
Returns:
  true if white space in the text between the tags is to be collapsed, otherwise false.




getEstimatedMaximumOutputLength
public long getEstimatedMaximumOutputLength()



getIndentAllElements
public boolean getIndentAllElements()
Indicates whether all elements are to be indented, including inline-level elements and those with preformatted contents.

See the setIndentAllElements(boolean) method for a full description of this property.
Returns:
  true if all elements are to be indented, otherwise false.




getIndentString
public String getIndentString()
Returns the string to be used for indentation.

See the setIndentString(String) method for a full description of this property.
Returns:
  the string to be used for indentation.




getNewLine
public String getNewLine()
Returns the string to be used to represent a newline in the output.

See the setNewLine(String) method for a full description of this property.
Returns:
  the string to be used to represent a newline in the output.




getTidyTags
public boolean getTidyTags()
Indicates whether the original text of each tag is to be replaced with the output from its Tag.tidy method.

See the setTidyTags(boolean) method for a full description of this property.
Returns:
  true if the original text of each tag is to be replaced with the output from its Tag.tidy method, otherwise false.




setCollapseWhiteSpace
public SourceFormatter setCollapseWhiteSpace(boolean collapseWhiteSpace)
Sets whether white space in the text between the tags is to be collapsed.

The default value is false.

If this property is set to true, every string of one or more white space characters located outside of a tag is replaced with a single space in the output. White space located adjacent to a non-inline-level element tag (except server tags) may be removed.
Parameters:
  collapseWhiteSpace - specifies whether white space in the text between the tags is to be collapsed.
Returns:
  this SourceFormatter instance, allowing multiple property setting methods to be chained in a single statement.
See Also:   SourceFormatter.getCollapseWhiteSpace()
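
The core of the collapsing rule (each run of one or more white space characters becomes a single space) can be approximated with a regular expression. The sketch below is a simplified illustration, not the library's code: it operates on a plain string of text between tags, does not model the optional removal of white space next to non-inline-level tags, and assumes that "white space" means space, tab, line feed, form feed, and carriage return. The class name CollapseSketch is hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical illustration of white space collapsing; not part of the Jericho library.
class CollapseSketch {
    // Assumption: white space is space, tab, line feed, form feed or carriage return.
    private static final Pattern WHITE_SPACE_RUN = Pattern.compile("[ \\t\\n\\f\\r]+");

    // Replaces every run of one or more white space characters with a single space.
    static String collapse(String textBetweenTags) {
        return WHITE_SPACE_RUN.matcher(textBetweenTags).replaceAll(" ");
    }

    public static void main(String[] args) {
        System.out.println(collapse("Hello,\r\n\t  world"));
    }
}
```

Because leading and trailing runs also collapse to a single space rather than disappearing, this sketch never joins two words that were separated in the input.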




setIndentAllElements
public SourceFormatter setIndentAllElements(boolean indentAllElements)
Sets whether all elements are to be indented, including inline-level elements and those with preformatted contents.

The default value is false.

If this property is set to true, every element appears indented on a new line, including inline-level elements.

This generates output that is a good representation of the actual document element hierarchy, but is very likely to introduce white space that compromises the functional equivalency of the document.
Parameters:
  indentAllElements - specifies whether all elements are to be indented.
Returns:
  this SourceFormatter instance, allowing multiple property setting methods to be chained in a single statement.
See Also:   SourceFormatter.getIndentAllElements()




setIndentString
public SourceFormatter setIndentString(String indentString)
Sets the string to be used for indentation.

The default value is a string containing a single tab character (U+0009).

The most commonly used indent strings are "\t" (single tab), " " (single space), "  " (2 spaces), and "    " (4 spaces).
Parameters:
  indentString - the string to be used for indentation, must not be null.
Returns:
  this SourceFormatter instance, allowing multiple property setting methods to be chained in a single statement.
See Also:   SourceFormatter.getIndentString()




setNewLine
public SourceFormatter setNewLine(String newLine)
Sets the string to be used to represent a newline in the output.

The default is to use the same new line string as is used in the source document, which is determined via the Source.getNewLine method. If the source document does not contain any new lines, a "best guess" is made by either taking the new line string of a previously parsed document, or using the value from Config.NewLine.

Specifying a null argument resets the property to its default value, which is to use the same new line string as is used in the source document.
Parameters:
  newLine - the string to be used to represent a newline in the output, may be null.
Returns:
  this SourceFormatter instance, allowing multiple property setting methods to be chained in a single statement.
See Also:   SourceFormatter.getNewLine()
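
The resolution order described for the NewLine property (explicit value, then the source document's own new line string, then a fallback) can be sketched as follows. This is an illustration under stated assumptions, not the library's code: how Source.getNewLine actually detects the new line string is not specified here, so the sketch simply takes the first newline sequence it finds, and the fallback parameter stands in for the "best guess" value. The class NewLineSketch and its methods are hypothetical.

```java
// Hypothetical illustration of newline resolution; not part of the Jericho library.
class NewLineSketch {
    // Returns the first newline sequence in text ("\r\n", "\n" or "\r"), or null if none.
    // Assumption: the first occurrence is representative of the whole document.
    static String detectNewLine(CharSequence text) {
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '\n') return "\n";
            if (c == '\r') {
                boolean followedByLf = i + 1 < text.length() && text.charAt(i + 1) == '\n';
                return followedByLf ? "\r\n" : "\r";
            }
        }
        return null;
    }

    // Resolution order: explicit setting, then the source's newline, then the fallback.
    static String resolveNewLine(String explicit, CharSequence sourceText, String fallback) {
        if (explicit != null) return explicit; // a null explicit value means "use the default"
        String detected = detectNewLine(sourceText);
        return detected != null ? detected : fallback;
    }

    public static void main(String[] args) {
        String resolved = resolveNewLine(null, "<p>a</p>\r\n<p>b</p>", "\n");
        System.out.println(resolved.equals("\r\n") ? "CRLF" : "other");
    }
}
```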




setTidyTags
public SourceFormatter setTidyTags(boolean tidyTags)
Sets whether the original text of each tag is to be replaced with the output from its Tag.tidy method.

The default value is false.

If this property is set to false, the tag from the original text is used, including all white space, but with any new lines indented at a depth one greater than that of the element.
Parameters:
  tidyTags - specifies whether the original text of each tag is to be replaced with the output from its Tag.tidy method.
Returns:
  this SourceFormatter instance, allowing multiple property setting methods to be chained in a single statement.
See Also:   SourceFormatter.getTidyTags()
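
Each setter above returns the instance it was called on, which is what allows several properties to be set in one statement. The following self-contained sketch shows that pattern with a hypothetical stand-in class, FormatOptionsSketch, which mirrors the documented property names and defaults but is not the library class and performs no formatting.

```java
// Hypothetical stand-in that mirrors the documented setters; not the Jericho class.
class FormatOptionsSketch {
    private String indentString = "\t";          // documented default: a single tab
    private boolean tidyTags = false;            // documented default
    private boolean collapseWhiteSpace = false;  // documented default
    private boolean indentAllElements = false;   // documented default

    FormatOptionsSketch setIndentString(String indentString) {
        // The documentation says the value must not be null; the exception type
        // used here is this sketch's own choice.
        if (indentString == null) {
            throw new IllegalArgumentException("indentString must not be null");
        }
        this.indentString = indentString;
        return this; // returning this is what makes chaining possible
    }

    FormatOptionsSketch setTidyTags(boolean tidyTags) {
        this.tidyTags = tidyTags;
        return this;
    }

    FormatOptionsSketch setCollapseWhiteSpace(boolean collapseWhiteSpace) {
        this.collapseWhiteSpace = collapseWhiteSpace;
        return this;
    }

    FormatOptionsSketch setIndentAllElements(boolean indentAllElements) {
        this.indentAllElements = indentAllElements;
        return this;
    }

    String getIndentString() { return indentString; }
    boolean getTidyTags() { return tidyTags; }
    boolean getCollapseWhiteSpace() { return collapseWhiteSpace; }
    boolean getIndentAllElements() { return indentAllElements; }

    public static void main(String[] args) {
        // Several properties set in a single chained statement.
        FormatOptionsSketch options = new FormatOptionsSketch()
            .setIndentString("  ")
            .setTidyTags(true)
            .setCollapseWhiteSpace(true);
        System.out.println(options.getTidyTags() + " " + options.getCollapseWhiteSpace());
    }
}
```

With the real class, the equivalent chain would start from new SourceFormatter(segment) and end with toString() or writeTo(Writer) to obtain the output.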




toString
public String toString()



writeTo
public void writeTo(Writer writer) throws IOException



Methods inherited from java.lang.Object
native protected Object clone() throws CloneNotSupportedException
public boolean equals(Object obj)
protected void finalize() throws Throwable
final native public Class getClass()
native public int hashCode()
final native public void notify()
final native public void notifyAll()
public String toString()
final native public void wait(long timeout) throws InterruptedException
final public void wait(long timeout, int nanos) throws InterruptedException
final public void wait() throws InterruptedException
