Builds an HTMLPage object from an HTML document. This behaves
similarly to the FastPageParser, but it is a complete rewrite that makes it simpler to add custom features such as
extraction and transformation of elements.
To customize the rules used, extend this class and override userDefinedRules().
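The extension point described above follows the usual overridable-hook pattern. The sketch below illustrates only that pattern. It does not use the real SiteMesh types: SketchPageParser, LinkExtractingParser, buildRules(), the List<String> parameter, and the rule names are all hypothetical stand-ins, and the actual signature of userDefinedRules() may differ.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the parser; NOT the real SiteMesh class.
class SketchPageParser {
    /** Collects the names of the rules to apply, built-in ones first. */
    final List<String> buildRules() {
        List<String> rules = new ArrayList<>();
        rules.add("title"); // illustrative built-in rule
        rules.add("body");  // illustrative built-in rule
        userDefinedRules(rules); // hook for subclasses
        return rules;
    }

    /** Extension hook (assumed shape): the default adds no extra rules. */
    protected void userDefinedRules(List<String> rules) {
        // intentionally empty
    }
}

// A subclass customizes the rule set by overriding the hook.
class LinkExtractingParser extends SketchPageParser {
    @Override
    protected void userDefinedRules(List<String> rules) {
        rules.add("link-extractor"); // hypothetical custom rule
    }
}
```

In this sketch the base class keeps control of the order in which rules are registered, so a subclass can only append to the defaults, not replace them.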
Author: Joe Walnes
See Also: HTMLProcessor