Filter to write an XML document from a SAX event stream.
Code and comments adapted from XMLWriter-0.2, written
by David Megginson and released into the public domain,
without warranty.
This class can be used by itself or as part of a SAX event
stream: it takes as input a series of SAX2 ContentHandler
events and uses the information in those events to write
an XML document. Since this class is a filter, it can also
pass the events on down a filter chain for further processing
(you can use the XMLWriter to take a snapshot of the current
state at any point in a filter chain), and it can be
used directly as a ContentHandler for a SAX2 XMLReader.
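As a rough illustration of the pass-through behaviour described above, the following sketch uses the standard org.xml.sax.helpers.XMLFilterImpl rather than XMLWriter itself. SnapshotFilter and SnapshotDemo are hypothetical names invented for this example. The filter only records each start tag before forwarding the event to the next handler in the chain.

```java
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical pass-through filter (not XMLWriter): records each start
// tag it sees, then forwards the event to the next handler in the chain.
class SnapshotFilter extends XMLFilterImpl {
    final StringBuilder seen = new StringBuilder();

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) throws SAXException {
        seen.append('<').append(qName).append('>');
        super.startElement(uri, localName, qName, atts); // pass it on
    }
}

class SnapshotDemo {
    static String run() {
        StringBuilder downstream = new StringBuilder();
        SnapshotFilter filter = new SnapshotFilter();
        // The next stage in the chain: here it only notes the element name.
        filter.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                downstream.append(qName);
            }
        });
        try {
            filter.startDocument();
            filter.startElement("", "greeting", "greeting", new AttributesImpl());
            filter.endElement("", "greeting", "greeting");
            filter.endDocument();
        } catch (SAXException e) {
            throw new IllegalStateException(e);
        }
        // What the filter recorded, then what the downstream handler saw.
        return filter.seen + "|" + downstream;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints "<greeting>|greeting"
    }
}
```

Both the filter and the downstream handler observe the same event, which is the property that lets a writer sit in the middle of a chain without disturbing it.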
The client creates a document by invoking the methods for
standard SAX2 events, always beginning with the
startDocument method and ending with the
endDocument method.
The following code will send a simple XML document to
standard output:
XMLWriter w = new XMLWriter();
w.startDocument();
w.dataElement("greeting", "Hello, world!");
w.endDocument();
The resulting document will look like this:
<?xml version="1.0"?>
<greeting>Hello, world!</greeting>
Whitespace
According to the XML Recommendation, all whitespace
in an XML document is potentially significant to an application,
so this class never adds newlines or indentation. If you
insert three elements in a row, as in
w.dataElement("item", "1");
w.dataElement("item", "2");
w.dataElement("item", "3");
you will end up with
<item>1</item><item>2</item><item>3</item>
You need to invoke one of the characters methods
explicitly to add newlines or indentation. Alternatively, you
can use samples.sax.DataFormatFilter, which adds line breaks and
indentation (but does not support mixed content properly).
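To make the point concrete, here is a minimal sketch of the explicit approach. VerbatimWriter is a hypothetical stand-in built on the standard DefaultHandler, not the XMLWriter class. It omits escaping, attributes and Namespaces, and only shows that line breaks reach the output when the client sends them as character events.

```java
import org.xml.sax.Attributes;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical stand-in for the writer: serializes SAX events exactly as
// received and never adds whitespace of its own. No escaping is done.
class VerbatimWriter extends DefaultHandler {
    private final StringBuilder out = new StringBuilder();

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) {
        out.append('<').append(qName).append('>');
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        out.append("</").append(qName).append('>');
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        out.append(ch, start, length);
    }

    // Convenience helper (assumed here): one element with text content.
    void dataElement(String name, String content) {
        startElement("", name, name, new AttributesImpl());
        characters(content.toCharArray(), 0, content.length());
        endElement("", name, name);
    }

    // The client, not the writer, decides where a line break goes.
    void newline() {
        characters(new char[] {'\n'}, 0, 1);
    }

    String result() {
        return out.toString();
    }

    static String demo() {
        VerbatimWriter w = new VerbatimWriter();
        w.dataElement("item", "1");
        w.newline();
        w.dataElement("item", "2");
        w.newline();
        w.dataElement("item", "3");
        return w.result();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // three <item> elements, one per line
    }
}
```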
Namespace Support
The writer contains extensive support for XML Namespaces, so that
a client application does not have to keep track of prefixes and
supply xmlns attributes. By default, the XML writer will
generate prefixes of the form _NS1, _NS2, etc., and declare them
wherever they are needed, as in the following example:
w.startDocument();
w.emptyElement("http://www.foo.com/ns/", "foo");
w.endDocument();
The resulting document will look like this:
<?xml version="1.0"?>
<_NS1:foo xmlns:_NS1="http://www.foo.com/ns/"/>
In many cases, document authors will prefer to choose their
own prefixes rather than using the (ugly) default names. The
XML writer offers two ways to select prefixes:
- the qualified name
- the setPrefix method.
Whenever the XML writer finds a new Namespace URI, it checks
to see if a qualified (prefixed) name is also available; if so
it attempts to use the name's prefix (as long as the prefix is
not already in use for another Namespace URI).
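The selection rule can be sketched as a small standalone class. This is an illustration only, not the writer's actual implementation: PrefixPicker and prefixFor are hypothetical names, element scoping is ignored, and the order of precedence (a mapping preset with setPrefix first, then the qualified name's prefix, then a generated _NSn name) is an assumption.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of prefix selection; not the writer's real code.
// Bindings are kept in one flat table, so element scoping is ignored.
class PrefixPicker {
    private final Map<String, String> preset = new HashMap<>(); // URI -> prefix
    private final Map<String, String> inUse = new HashMap<>();  // prefix -> URI
    private int counter = 0;

    void setPrefix(String uri, String prefix) {
        preset.put(uri, prefix);
    }

    String prefixFor(String uri, String qName) {
        // Reuse an existing binding for this URI if there is one.
        for (Map.Entry<String, String> e : inUse.entrySet()) {
            if (e.getValue().equals(uri)) {
                return e.getKey();
            }
        }
        // Assumed precedence: a preset mapping wins if its prefix is free.
        String prefix = preset.get(uri);
        if (prefix != null && inUse.containsKey(prefix)) {
            prefix = null;
        }
        // Otherwise try the prefix of the qualified name, if it is free.
        if (prefix == null && qName != null) {
            int colon = qName.indexOf(':');
            if (colon > 0) {
                String candidate = qName.substring(0, colon);
                if (!inUse.containsKey(candidate)) {
                    prefix = candidate;
                }
            }
        }
        // Fall back to generated names: _NS1, _NS2, ...
        while (prefix == null || inUse.containsKey(prefix)) {
            prefix = "_NS" + (++counter);
        }
        inUse.put(prefix, uri);
        return prefix;
    }

    public static void main(String[] args) {
        PrefixPicker p = new PrefixPicker();
        // "foo" is free, so the qualified name's prefix is used.
        System.out.println(p.prefixFor("http://www.foo.com/ns/", "foo:bar"));
        // "foo" is now taken by another URI, so a name is generated.
        System.out.println(p.prefixFor("http://www.purl.org/dc/", "foo:title"));
    }
}
```

Run as written, the first call yields foo and the second yields _NS1, because foo is already bound to a different Namespace URI.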
Before writing a document, the client can also pre-map a prefix
to a Namespace URI with the setPrefix method:
w.setPrefix("http://www.foo.com/ns/", "foo");
w.startDocument();
w.emptyElement("http://www.foo.com/ns/", "foo");
w.endDocument();
The resulting document will look like this:
<?xml version="1.0"?>
<foo:foo xmlns:foo="http://www.foo.com/ns/"/>
The default Namespace simply uses an empty string as the prefix:
w.setPrefix("http://www.foo.com/ns/", "");
w.startDocument();
w.emptyElement("http://www.foo.com/ns/", "foo");
w.endDocument();
The resulting document will look like this:
<?xml version="1.0"?>
<foo xmlns="http://www.foo.com/ns/"/>
By default, the XML writer will not declare a Namespace until
it is actually used. Sometimes, this approach will create
a large number of Namespace declarations, as in the following
example:
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description about="http://www.foo.com/ids/books/12345">
    <dc:title xmlns:dc="http://www.purl.org/dc/">A Dark Night</dc:title>
    <dc:creator xmlns:dc="http://www.purl.org/dc/">Jane Smith</dc:creator>
    <dc:date xmlns:dc="http://www.purl.org/dc/">2000-09-09</dc:date>
  </rdf:Description>
</rdf:RDF>
The "rdf" prefix is declared only once, because the RDF Namespace
is used by the root element and can be inherited by all of its
descendants; the "dc" prefix, on the other hand, is declared three
times, because no higher element uses the Namespace. To solve this
problem, you can instruct the XML writer to predeclare Namespaces
on the root element even if they are not used there:
w.forceNSDecl("http://www.purl.org/dc/");
Now, the "dc" prefix will be declared on the root element even
though it's not needed there, and can be inherited by its
descendants:
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://www.purl.org/dc/">
  <rdf:Description about="http://www.foo.com/ids/books/12345">
    <dc:title>A Dark Night</dc:title>
    <dc:creator>Jane Smith</dc:creator>
    <dc:date>2000-09-09</dc:date>
  </rdf:Description>
</rdf:RDF>
This approach is also useful for declaring Namespace prefixes
that will be used by qualified names appearing in attribute values or
character data.
See Also: XMLFilterBase