Applications that need strict XML parsing will create their parsers
through either the JAXP API or the SAX API. Applications that need
to parse HTML will generally instantiate the parser classes directly.
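As a minimal sketch of the JAXP route mentioned above, the following parses a string into a DOM tree using only the standard javax.xml.parsers classes. The class name JaxpParseSketch and its parse helper are hypothetical names chosen for illustration, and which parser implementation the factory returns depends on how the JVM is configured.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical illustration class; not part of the packages described here.
class JaxpParseSketch {
    // Parses a well-formed XML string into a DOM tree using whichever
    // JAXP parser implementation is configured for the JVM.
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<em>test xml doc</em>");
        System.out.println(doc.getDocumentElement().getTagName());
    }
}
```

Because the application only touches the JAXP interfaces, the underlying strict parser can be swapped without changing this code.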
There are four parser flavors with the same API:
Xml, LooseXml, Html and LooseHtml. The core of the API is in AbstractParser.
Xml is a strict XML parser.
LooseXml is a forgiving XML parser.
Html is a strict HTML parser.
LooseHtml is a forgiving HTML parser.
A document can be parsed into a DOM tree or processed through the
SAX callback API.
DOM parsing looks like:
Document doc = new Html().parseDocument("test.html");
Parsing directly from a string looks like:
String str = "<em>test html doc</em>";
Document doc = new Html().parseDocumentString(str);
SAX parsing looks like:
Html html = new Html();
html.setContentHandler(myContentHandler);
html.parse("test.html");
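The SAX snippet above leaves myContentHandler undefined. One possible handler is sketched below: it counts element start events. To keep the sketch self-contained it is driven by the standard JAXP SAX reader rather than the Html class; ElementCounter and countElements are hypothetical names, and only the setContentHandler call mirrors the snippet above.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration class; not part of the packages described here.
class SaxHandlerSketch {
    // Counts start-element events. DefaultHandler supplies no-op
    // implementations of the remaining ContentHandler callbacks.
    static class ElementCounter extends DefaultHandler {
        int count = 0;

        @Override
        public void startElement(String uri, String localName,
                                 String qName, Attributes attrs) {
            count++;
        }
    }

    // Parses the string with a standard SAX reader and returns the
    // number of elements the handler saw.
    static int countElements(String xml) throws Exception {
        XMLReader reader =
            SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        ElementCounter counter = new ElementCounter();
        reader.setContentHandler(counter);
        reader.parse(new InputSource(new StringReader(xml)));
        return counter.count;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(countElements("<p>a <em>test</em> doc</p>"));
    }
}
```

A handler like this never builds a tree, which is the usual reason to choose the callback style over DOM parsing for large inputs.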
The Xql package finds and selects XML nodes using XSL patterns. It
implements the pattern matching of the July 1999 W3C XSLT draft and
provides a simple API for finding XML nodes.
To find the first table, at any depth beneath the source node, that
contains an image at any depth:
Node found = Xql.find(".//table[.//img]", node);
Or to select every table that is a child of a section directly below the current node:
Iterator iter = Xql.select("section/table", node);
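To show what those two patterns match, here is a rough sketch that evaluates the same expressions with the JDK's javax.xml.xpath API in place of the Xql package. Both expressions happen to be valid XPath 1.0, but the July 1999 draft syntax and XPath 1.0 are not identical in general, so this is an approximation. The find and select helpers and the sample document are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical illustration class; uses the JDK XPath API, not Xql.
class PatternSketch {
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Returns a single node matching the expression, evaluated relative
    // to the context node, or null when nothing matches.
    static Node find(String expr, Node context) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        return (Node) xpath.evaluate(expr, context, XPathConstants.NODE);
    }

    // Returns every node matching the expression relative to the context node.
    static NodeList select(String expr, Node context) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        return (NodeList) xpath.evaluate(expr, context, XPathConstants.NODESET);
    }

    public static void main(String[] args) throws Exception {
        // Sample document: two sections, only the second table holds an image.
        Document doc = parse(
            "<doc>"
            + "<section><table id='a'/></section>"
            + "<section><table id='b'><tr><td><img/></td></tr></table></section>"
            + "</doc>");
        Node root = doc.getDocumentElement();
        Node found = find(".//table[.//img]", root);
        NodeList tables = select("section/table", root);
        System.out.println(found.getAttributes().getNamedItem("id").getNodeValue());
        System.out.println(tables.getLength());
    }
}
```

In the sample document only table b contains an image, so the first expression skips table a, while the second expression matches both tables because each is a child of a section directly below the context node.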
The XSLT transformation package. XSLT transforms XML trees into XML
trees using stylesheets. The steps for a typical transformation are:
Create the XSLT stylesheet.
Read the source document.
Transform the source document.
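The three steps above can be sketched with the standard JAXP javax.xml.transform API. This is not the Caucho API; it only illustrates the same sequence with classes that ship with the JDK. The stylesheet, the greeting element and its to attribute are made up for the example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical illustration class; uses JAXP, not the Caucho XSL package.
class TransformStepsSketch {
    // A made-up stylesheet that turns <greeting to="..."/> into plain text.
    static final String STYLESHEET =
        "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='text'/>"
        + "<xsl:template match='/greeting'>"
        + "Hello, <xsl:value-of select='@to'/>!"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    static String transform(String stylesheet, String sourceXml) throws Exception {
        // Step 1: create (compile) the stylesheet.
        Templates templates = TransformerFactory.newInstance()
                .newTemplates(new StreamSource(new StringReader(stylesheet)));
        // Step 2: supply the source document; it is read when the transform runs.
        StreamSource source = new StreamSource(new StringReader(sourceXml));
        // Step 3: transform the source document into the result.
        Transformer transformer = templates.newTransformer();
        StringWriter out = new StringWriter();
        transformer.transform(source, new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(transform(STYLESHEET, "<greeting to='world'/>"));
    }
}
```

Compiling the stylesheet once and creating a fresh transformer per document keeps the expensive step out of the per-document path.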
The Caucho XSL package supports two related stylesheet languages:
XSLT (W3C 1.0) and 'StyleScript'. Strict XSLT stylesheets are
created by parsing the XML externally, then generating the stylesheet:
// Compile the stylesheet with the strict XSLT factory.
StylesheetFactory factory = new Xsl();
Stylesheet style = factory.newStylesheet("mystyle.xsl");
StreamTransformer transformer = style.newStreamTransformer();
WriteStream os = Vfs.openWrite("test.html");
try {
  transformer.transform("test.xml", os);
} finally {
  os.close();  // close the stream even if the transform fails
}
StyleScript stylesheets use the same steps with a different stylesheet factory:
// Only the factory differs from the strict XSLT case.
StylesheetFactory factory = new StyleScript();
Stylesheet style = factory.newStylesheet("mystyle.xsl");
StreamTransformer transformer = style.newStreamTransformer();
WriteStream os = Vfs.openWrite("test.html");
try {
  transformer.transform("test.xml", os);
} finally {
  os.close();  // close the stream even if the transform fails
}
Transformers
Resin's XSL provides several different output methods, each represented
by a transformer:
StreamTransformer - print directly to an output stream
StringTransformer - create a string from the result
NodeTransformer - append the results to an XML node
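For comparison, the same three output styles can be approximated with the standard JAXP Result classes. The sketch below does not use Resin's transformer classes: it runs an identity transform (no stylesheet) into an output stream, a string, and an existing DOM node. All class and method names here are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical illustration class; uses JAXP, not Resin's transformers.
class OutputTargetsSketch {
    // An identity transformer copies the source to the result unchanged.
    static Transformer identity() throws Exception {
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        return t;
    }

    static StreamSource source(String xml) {
        return new StreamSource(new StringReader(xml));
    }

    // Stream style: write the result directly to an output stream.
    static byte[] toStream(String xml) throws Exception {
        ByteArrayOutputStream os = new ByteArrayOutputStream();
        identity().transform(source(xml), new StreamResult(os));
        return os.toByteArray();
    }

    // String style: collect the result in memory as a string.
    static String toText(String xml) throws Exception {
        StringWriter out = new StringWriter();
        identity().transform(source(xml), new StreamResult(out));
        return out.toString();
    }

    // Node style: append the result as children of an existing element.
    static Element appendTo(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        Element parent = doc.createElement("results");
        doc.appendChild(parent);
        identity().transform(source(xml), new DOMResult(parent));
        return parent;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(toText("<item>x</item>"));
        System.out.println(toStream("<item>x</item>").length);
        System.out.println(appendTo("<item>x</item>").getFirstChild().getNodeName());
    }
}
```

The choice between the three is about where the result is needed next: a stream for serving or saving, a string for embedding, and a node when the result feeds further tree processing.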