javax.xml.validation
This package provides an API for validation of XML documents. Validation is the process of verifying
that an XML document is an instance of a specified XML schema. An XML schema defines the
content model (also called a grammar or vocabulary) that its instance documents
will represent.
There are a number of popular technologies available for creating an XML schema. Some of the most
popular include:
- Document Type Definition (DTD) - XML's built-in schema language.
- W3C XML Schema (WXS) - an object-oriented XML schema
language. WXS also provides a type system for constraining the character data of an XML document.
WXS is maintained by the World Wide Web Consortium (W3C) and is a W3C
Recommendation (that is, a ratified W3C standard specification).
- RELAX NG (RNG) - a pattern-based,
user-friendly XML schema language. RNG schemas may also use types to constrain XML character data.
RNG is maintained by the Organization for the Advancement of
Structured Information Standards (OASIS) and is both an OASIS and an
ISO (International Organization for Standardization) standard.
- Schematron - a rules-based XML schema
language. Whereas DTD, WXS, and RNG are designed to express the structure of a content model,
Schematron is designed to enforce individual rules that are difficult or impossible to express
with other schema languages. Schematron is intended to supplement a schema written in
structural schema language such as the aforementioned. Schematron is in the process
of becoming an ISO standard.
Previous versions of JAXP supported validation as a feature of an XML parser, represented by
either a {@link javax.xml.parsers.SAXParser} or {@link javax.xml.parsers.DocumentBuilder} instance.
The JAXP validation API decouples the validation of an instance document from the parsing of an
XML document. This is advantageous for several reasons, some of which are:
- Support for additional schema langauges. As of JDK 1.5, the two most
popular JAXP parser implementations, Crimson and Xerces, only support a subset of the available
XML schema languages. The Validation API provides a standard mechanism through which applications
may take of advantage of specialization validation libraries which support additional schema
languages.
- Easy runtime coupling of an XML instance and schema. Specifying the location
of a schema to use for validation with JAXP parsers can be confusing. The Validation API makes this
process simple (see example below).
Usage example. The following example demonstrates validating
an XML document with the Validation API (for readability, some exception handling is not shown):
// parse an XML document into a DOM tree
DocumentBuilder parser = DocumentBuilderFactory.newInstance().newDocumentBuilder();
Document document = parser.parse(new File("instance.xml"));
// create a SchemaFactory capable of understanding WXS schemas
SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
// load a WXS schema, represented by a Schema instance
Source schemaFile = new StreamSource(new File("mySchema.xsd"));
Schema schema = factory.newSchema(schemaFile);
// create a Validator instance, which can be used to validate an instance document
Validator validator = schema.newValidator();
// validate the DOM tree
try {
validator.validate(new DOMSource(document));
} catch (SAXException e) {
// instance document is invalid!
}
The JAXP parsing API has been integrated with the Validation API. Applications may create a {@link javax.xml.validation.Schema} with the validation API
and associate it with a {@link javax.xml.parsers.DocumentBuilderFactory} or a {@link javax.xml.parsers.SAXParserFactory} instance
by using the {@link javax.xml.parsers.DocumentBuilderFactory#setSchema(Schema)} and {@link javax.xml.parsers.SAXParserFactory#setSchema(Schema)}
methods. You should not both set a schema and call setValidating(true) on a parser factory. The former technique
will cause parsers to use the new validation API; the latter will cause parsers to use their own internal validation
facilities. Turning on both of these options simultaneously will cause either redundant behavior or error conditions.
|