org.cyberneko.html.filters |
Java Source File Name | Type | Comment |
DefaultFilter.java | Class | This class implements a filter that simply passes document
events to the next handler. |
ElementRemover.java | Class | This class is a document filter capable of removing specified
elements from the processing stream. |
Identity.java | Class | This filter performs the identity operation on the original
document event stream generated by the HTML scanner: it removes
events that are synthesized by the tag balancer. |
NamespaceBinder.java | Class | This filter binds namespaces if namespace processing is turned on
by setting the feature "http://xml.org/sax/features/namespaces"
to true. |
Purifier.java | Class | This filter purifies the HTML input to ensure XML well-formedness.
The purification process includes:
- fixing illegal characters in the document, including
  - element and attribute names,
  - processing instruction target and data,
  - document text;
- ensuring the string "--" does not appear in the content of
a comment;
- ensuring the string "]]>" does not appear in the content of
a CDATA section;
- ensuring that the XML declaration has required pseudo-attributes
and that the values are correct;
and
- synthesizing missing namespace bindings.
Illegal characters in XML names are converted to the character
sequence "_u####_" where "####" is the value of the Unicode
character represented in hexadecimal. |
Writer.java | Class | An HTML writer written as a filter. |
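
The name-escaping rule described for Purifier.java can be illustrated with a small standalone sketch. This is not the Purifier source: the class name NameEscapeSketch, the method escapeName, the ASCII-only notion of a legal name character, and the choice of four uppercase hex digits are all assumptions made for illustration.

```java
public class NameEscapeSketch {

    // Simplified assumption: only ASCII letters, '_' and ':' may start a name.
    // The real XML name rules also allow many non-ASCII characters.
    static boolean isNameStart(char c) {
        return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')
                || c == '_' || c == ':';
    }

    // Simplified assumption: later characters may also be digits, '-' or '.'.
    static boolean isNameChar(char c) {
        return isNameStart(c) || (c >= '0' && c <= '9') || c == '-' || c == '.';
    }

    // Replaces each illegal character with "_u####_", where "####" is the
    // character's value in hexadecimal (zero-padded to four digits here,
    // which is an assumption of this sketch).
    static String escapeName(String name) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            boolean legal = (i == 0) ? isNameStart(c) : isNameChar(c);
            if (legal) {
                out.append(c);
            } else {
                out.append(String.format("_u%04X_", (int) c));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escapeName("a b"));
        System.out.println(escapeName("1abc"));
        System.out.println(escapeName("div"));
    }
}
```

Because the first character is checked against a stricter rule than the rest, a leading digit is escaped while the same digit later in the name is kept.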