Java Doc for Clean.java in HTML Parser » JTidy » org.w3c.tidy



java.lang.Object
   org.w3c.tidy.Clean

Clean
public class Clean
Clean up misuse of presentation markup. Filters from other formats such as Microsoft Word often make excessive use of presentation markup such as font tags, B, I, and the align attribute. By applying a set of production rules, it is straightforward to transform this markup to use CSS.

Some rules replace some of the children of an element by style properties on the element, e.g.

<p><b>...</b></p> becomes <p style="font-weight: bold">...</p>

Such rules are applied to the element's content and then to the element itself until no more rules apply. Having applied all the rules to an element, it will have a style attribute with one or more properties.

Other rules strip the element they apply to, replacing it by style properties on the contents, e.g.

<dir><li><p>...</li></dir> becomes <p style="margin-left: 1em">...

These rules are applied to an element before processing its content and replace the current element by the first element in the exposed content.

After applying both sets of rules, you can replace the style attribute by a class value and style rule in the document head. To support this, an association of styles and class names is built. A naive approach is to rely on string matching to test when two property lists are the same. A better approach would be to first sort the properties before matching.
    author:
       Dave Raggett dsr@w3.org
    author:
       Andy Quick ac.quick@sympatico.ca (translation to Java)
    author:
       Fabrizio Giustina
    version:
       $Revision: 1.25 $ ($Author: fgiust $)
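
The class description above suggests sorting style properties before matching them, so that two property lists that differ only in order share one class name. A minimal sketch of that idea follows. It uses only the standard library; `StyleClassSketch`, `normalize`, `classFor`, and the generated names `c1`, `c2`, ... are hypothetical and are not taken from the Clean source.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper for illustration; not part of the JTidy API.
final class StyleClassSketch {
    // Canonical style text -> generated class name, in insertion order.
    private final Map<String, String> classByStyle = new LinkedHashMap<>();

    // Normalizes one "name: value" property; only the name is lower-cased.
    private static String normalizeProperty(String property) {
        int colon = property.indexOf(':');
        if (colon < 0) {
            return property.trim();
        }
        String name = property.substring(0, colon).trim().toLowerCase(Locale.ROOT);
        String value = property.substring(colon + 1).trim();
        return name + ": " + value;
    }

    // Sorts the properties so that lists differing only in order compare equal.
    // Assumption: property values contain no ';' (no quoted strings or data URLs).
    static String normalize(String style) {
        return Arrays.stream(style.split(";"))
                .map(StyleClassSketch::normalizeProperty)
                .filter(p -> !p.isEmpty())
                .sorted()
                .collect(Collectors.joining("; "));
    }

    // Returns the class name for a style, generating "c1", "c2", ... on first use.
    String classFor(String style) {
        String key = normalize(style);
        String name = classByStyle.get(key);
        if (name == null) {
            name = "c" + (classByStyle.size() + 1);
            classByStyle.put(key, name);
        }
        return name;
    }

    public static void main(String[] args) {
        StyleClassSketch table = new StyleClassSketch();
        System.out.println(table.classFor("font-weight: bold; margin-left: 1em"));
        System.out.println(table.classFor("margin-left:1em;font-weight:bold;"));
        System.out.println(table.classFor("color: red"));
    }
}
```

With plain string matching, the first two calls in `main` would produce two different class names. After sorting they map to the same one.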




    Constructor Summary
    public Clean(TagTable tagTable)
         Instantiates a new Clean.

    Method Summary
    public void bQ2Div(Node node)
         Replace implicit blockquote by div with an indent, taking care to reduce nested blockquotes to a single div with the indent set to match the nesting depth.
    static void bumpObject(Lexer lexer, Node html)
         Where appropriate, move object elements from head to body.
    public void cleanTree(Lexer lexer, Node doc)
         Clean an HTML tree.
    public void cleanWord2000(Lexer lexer, Node node)
         This is a major clean up to strip out all the extra stuff you get when you save as web page from Word 2000.
    public void dropSections(Lexer lexer, Node node)
         Drop if/endif sections inserted by Word 2000.
    public void emFromI(Node node)
         Replace i by em and b by strong.
    Node findEnclosingCell(Node node)
         Find the enclosing table cell for the given node.
    public boolean isWord2000(Node root)
         Check if the current document is a converted Word document.
    public void list2BQ(Node node)
         Some people use dir or ul without an li to indent the content.
    public void nestedEmphasis(Node node)
         Simplifies <b><b> ... </b> ... </b> etc.
    boolean noMargins(Node node)
         Used to hunt for hidden preformatted sections.
    public Node pruneSection(Lexer lexer, Node node)
         node is <![if ...]>; prune up to <![endif]>.
    public void purgeWord2000Attributes(Node node)
         Remove Word 2000 attributes from node.
    boolean singleSpace(Lexer lexer, Node node)
         Does element have a single space as its content?
    public Node stripSpan(Lexer lexer, Node span)
         Word 2000 uses span excessively, so we strip span out.


    Constructor Detail
    Clean
    public Clean(TagTable tagTable)
    Instantiates a new Clean.
    Parameters:
      tagTable - tag table instance




    Method Detail
    bQ2Div
    public void bQ2Div(Node node)
    Replace implicit blockquote by div with an indent, taking care to reduce nested blockquotes to a single div with the indent set to match the nesting depth.
    Parameters:
      node - root Node
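
The collapsing of nested implicit blockquotes can be sketched as follows. This is not the Clean implementation: `Elem` is a hypothetical stand-in for a parse tree node, and the indent of 2em per nesting level is an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class BlockquoteSketch {
    // Hypothetical stand-in for a parse tree node; not JTidy's Node class.
    static final class Elem {
        String tag;
        String style;      // null when the element has no style attribute
        boolean implicit;  // true when the parser inferred the element
        final List<Elem> children = new ArrayList<>();

        Elem(String tag, boolean implicit, Elem... kids) {
            this.tag = tag;
            this.implicit = implicit;
            children.addAll(Arrays.asList(kids));
        }
    }

    static boolean isImplicitBlockquote(Elem e) {
        return e.implicit && "blockquote".equals(e.tag);
    }

    // Turns each chain of nested implicit blockquotes into one indented div.
    static void bq2Div(Elem node) {
        if (isImplicitBlockquote(node)) {
            int depth = 1;
            // Absorb an only child that is itself an implicit blockquote.
            while (node.children.size() == 1 && isImplicitBlockquote(node.children.get(0))) {
                Elem inner = node.children.get(0);
                node.children.clear();
                node.children.addAll(inner.children);
                depth++;
            }
            node.tag = "div";
            node.implicit = false;
            // Assumed indent: 2em per collapsed nesting level.
            node.style = "margin-left: " + (2 * depth) + "em";
        }
        for (Elem child : node.children) {
            bq2Div(child);
        }
    }

    public static void main(String[] args) {
        Elem p = new Elem("p", false);
        Elem outer = new Elem("blockquote", true, new Elem("blockquote", true, p));
        bq2Div(outer);
        System.out.println(outer.tag + " style=\"" + outer.style + "\"");
    }
}
```

Explicit blockquotes are left alone in this sketch, because only inferred ones are treated as indentation devices.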



    bumpObject
    static void bumpObject(Lexer lexer, Node html)
    Where appropriate, move object elements from head to body.
    Parameters:
      lexer - Lexer
      html - html node



    cleanTree
    public void cleanTree(Lexer lexer, Node doc)
    Clean an HTML tree.
    Parameters:
      lexer - Lexer
      doc - root node



    cleanWord2000
    public void cleanWord2000(Lexer lexer, Node node)
    This is a major clean up to strip out all the extra stuff you get when you save as web page from Word 2000. It doesn't yet know what to do with VML tags, so these will appear as errors unless you declare them as new tags; for example, o:p needs to be declared as inline.
    Parameters:
      lexer - Lexer
      node - node to clean up



    dropSections
    public void dropSections(Lexer lexer, Node node)
    Drop if/endif sections inserted by Word 2000.
    Parameters:
      lexer - Lexer
      node - Node root node



    emFromI
    public void emFromI(Node node)
    Replace i by em and b by strong.
    Parameters:
      node - root Node
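
As a rough illustration of this substitution, the sketch below walks a tree and renames the two presentational tags. `Elem` and `render` are hypothetical helpers written for the example and are not JTidy classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class LogicalEmphasisSketch {
    // Hypothetical stand-in for a parse tree node; not JTidy's Node class.
    static final class Elem {
        String tag;
        final List<Elem> children = new ArrayList<>();

        Elem(String tag, Elem... kids) {
            this.tag = tag;
            children.addAll(Arrays.asList(kids));
        }
    }

    // Renames i to em and b to strong throughout the subtree rooted at node.
    static void emFromI(Elem node) {
        if ("i".equals(node.tag)) {
            node.tag = "em";
        } else if ("b".equals(node.tag)) {
            node.tag = "strong";
        }
        for (Elem child : node.children) {
            emFromI(child);
        }
    }

    // Serializes the element structure so the result is easy to inspect.
    static String render(Elem node) {
        StringBuilder sb = new StringBuilder("<" + node.tag + ">");
        for (Elem child : node.children) {
            sb.append(render(child));
        }
        return sb.append("</").append(node.tag).append(">").toString();
    }

    public static void main(String[] args) {
        Elem root = new Elem("p", new Elem("b", new Elem("i")), new Elem("u"));
        emFromI(root);
        System.out.println(render(root));
    }
}
```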



    findEnclosingCell
    Node findEnclosingCell(Node node)
    Find the enclosing table cell for the given node.
    Parameters:
      node - Node
    Returns:
      enclosing cell node



    isWord2000
    public boolean isWord2000(Node root)
    Check if the current document is a converted Word document.
    Parameters:
      root - root Node
    Returns:
      true if the document has been generated by Microsoft Word.



    list2BQ
    public void list2BQ(Node node)
    Some people use dir or ul without an li to indent the content. The pattern to look for is a list with a single implicit li. This is recursively replaced by an implicit blockquote.
    Parameters:
      node - root Node



    nestedEmphasis
    public void nestedEmphasis(Node node)
    Simplifies <b><b> ... </b> ... </b> etc.
    Parameters:
      node - root Node
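
One way to picture this simplification is to splice the content of a redundant inner element into its parent. The sketch below does that for b and i elements whose direct parent has the same tag. That matching rule is an assumption made for the example, and `Elem` and `render` are hypothetical helpers, not JTidy classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class NestedEmphasisSketch {
    // Hypothetical stand-in for a parse tree node; not JTidy's Node class.
    static final class Elem {
        final String tag;   // null for text nodes
        final String text;  // null for elements
        final List<Elem> children = new ArrayList<>();

        private Elem(String tag, String text) {
            this.tag = tag;
            this.text = text;
        }

        static Elem text(String text) {
            return new Elem(null, text);
        }

        static Elem of(String tag, Elem... kids) {
            Elem e = new Elem(tag, null);
            e.children.addAll(Arrays.asList(kids));
            return e;
        }
    }

    // Replaces a b or i child of a same-tag parent by the child's own content.
    static void nestedEmphasis(Elem parent) {
        for (int i = 0; i < parent.children.size(); ) {
            Elem child = parent.children.get(i);
            boolean emphasis = "b".equals(child.tag) || "i".equals(child.tag);
            if (emphasis && child.tag.equals(parent.tag)) {
                parent.children.remove(i);
                parent.children.addAll(i, child.children);
                continue; // re-examine the spliced-in nodes at the same index
            }
            nestedEmphasis(child);
            i++;
        }
    }

    static String render(Elem node) {
        if (node.tag == null) {
            return node.text;
        }
        StringBuilder sb = new StringBuilder("<" + node.tag + ">");
        for (Elem child : node.children) {
            sb.append(render(child));
        }
        return sb.append("</").append(node.tag).append(">").toString();
    }

    public static void main(String[] args) {
        Elem root = Elem.of("b", Elem.text("x"), Elem.of("b", Elem.text("y")), Elem.text("z"));
        nestedEmphasis(root);
        System.out.println(render(root));
    }
}
```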



    noMargins
    boolean noMargins(Node node)
    Used to hunt for hidden preformatted sections.
    Parameters:
      node - checked node
    Returns:
      true if the node has a "margin-top: 0" or "margin-bottom: 0" style



    pruneSection
    public Node pruneSection(Lexer lexer, Node node)
    node is <![if ...]>; prune up to <![endif]>.
    Parameters:
      lexer - Lexer
      node - Node
    Returns:
      cleaned up Node
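
The pruning of if/endif sections can be illustrated on a flat token list. The sketch below swaps in a depth counter over strings rather than working on parse tree nodes, so it shows the idea only and not how Clean is implemented. `SectionPruneSketch` and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class SectionPruneSketch {
    static boolean isIf(String token) {
        return token.startsWith("<![if");
    }

    static boolean isEndif(String token) {
        return token.startsWith("<![endif");
    }

    // Drops every if/endif section, including nested ones, and the markers themselves.
    static List<String> dropSections(List<String> tokens) {
        List<String> kept = new ArrayList<>();
        int depth = 0; // number of currently open if sections
        for (String token : tokens) {
            if (isIf(token)) {
                depth++;
            } else if (isEndif(token)) {
                if (depth > 0) {
                    depth--;
                }
            } else if (depth == 0) {
                kept.add(token);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList(
                "<p>", "<![if !supportLists]>", "x",
                "<![if gte mso 9]>", "y", "<![endif]>",
                "z", "<![endif]>", "</p>");
        System.out.println(dropSections(tokens));
    }
}
```

The counter is what keeps an inner endif from closing the outer section too early.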



    purgeWord2000Attributes
    public void purgeWord2000Attributes(Node node)
    Remove Word 2000 attributes from node.
    Parameters:
      node - node to clean up



    singleSpace
    boolean singleSpace(Lexer lexer, Node node)
    Does element have a single space as its content?
    Parameters:
      lexer - Lexer
      node - checked node
    Returns:
      true if the element has a single space as its content



    stripSpan
    public Node stripSpan(Lexer lexer, Node span)
    Word 2000 uses span excessively, so we strip span out.
    Parameters:
      lexer - Lexer
      span - Node span
    Returns:
      cleaned node



    Methods inherited from java.lang.Object
    native protected Object clone() throws CloneNotSupportedException
    public boolean equals(Object obj)
    protected void finalize() throws Throwable
    final native public Class getClass()
    native public int hashCode()
    final native public void notify()
    final native public void notifyAll()
    public String toString()
    final native public void wait(long timeout) throws InterruptedException
    final public void wait(long timeout, int nanos) throws InterruptedException
    final public void wait() throws InterruptedException
