net.sf.saxon.s9api

Class DocumentBuilder


public class DocumentBuilder
extends java.lang.Object

A document builder holds properties controlling how a Saxon document tree should be built, and provides methods to invoke the tree construction.

This class has no public constructor. Users should construct a DocumentBuilder by calling the factory method Processor.newDocumentBuilder().

All documents used in a single Saxon query, transformation, or validation episode must be built with the same Configuration. However, there is no requirement that they should use the same DocumentBuilder.
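As an illustration of the factory-method pattern described above, the following is a minimal sketch, assuming Saxon is on the classpath and that a non-schema-aware Processor (created with new Processor(false)) is sufficient. The class name BuildExample and the helper buildFromString are hypothetical names introduced for this example only.

```java
import java.io.StringReader;
import javax.xml.transform.stream.StreamSource;
import net.sf.saxon.s9api.DocumentBuilder;
import net.sf.saxon.s9api.Processor;
import net.sf.saxon.s9api.SaxonApiException;
import net.sf.saxon.s9api.XdmNode;

public class BuildExample {

    // Hypothetical helper: obtains a DocumentBuilder from the Processor
    // (there is no public constructor) and builds a tree from an XML string.
    static XdmNode buildFromString(Processor processor, String xml)
            throws SaxonApiException {
        DocumentBuilder builder = processor.newDocumentBuilder();
        return builder.build(new StreamSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws SaxonApiException {
        // Assumption: schema-aware processing is not needed, hence 'false'.
        Processor processor = new Processor(false);

        // Both documents share the Processor, and therefore the same
        // Configuration, even though each call uses its own DocumentBuilder.
        XdmNode first = buildFromString(processor, "<greeting>hello</greeting>");
        XdmNode second = buildFromString(processor, "<greeting>world</greeting>");

        System.out.println(first.getStringValue());
        System.out.println(second.getStringValue());
    }
}
```

Creating every DocumentBuilder from one Processor is one straightforward way to satisfy the requirement that all documents in an episode share a Configuration.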

Since:
9.0

Constructor Summary

DocumentBuilder(Configuration config)
Create a DocumentBuilder.

Method Summary

XdmNode
build(File file)
Build a document from a supplied XML file.
XdmNode
build(Source source)
Load an XML document, to create a tree representation of the document in memory.
URI
getBaseURI()
Get the base URI of documents loaded using this DocumentBuilder when no other URI is available.
SchemaValidator
getSchemaValidator()
Get the SchemaValidator used to validate documents loaded using this DocumentBuilder.
WhitespaceStrippingPolicy
getWhitespaceStrippingPolicy()
Get the whitespace stripping policy applied when loading a document using this DocumentBuilder.
boolean
isDTDValidation()
Ask whether DTD validation is to be applied to documents loaded using this DocumentBuilder.
boolean
isLineNumbering()
Ask whether line numbering is enabled for documents loaded using this DocumentBuilder.
boolean
isRetainPSVI()
Ask whether the constructed tree should contain information derived from schema validation, specifically whether it should contain type annotations and expanded defaults of missing element and attribute content.
void
setBaseURI(URI uri)
Set the base URI of a document loaded using this DocumentBuilder.
void
setDTDValidation(boolean option)
Set whether DTD validation should be applied to documents loaded using this DocumentBuilder.
void
setLineNumbering(boolean option)
Set whether line numbering is to be enabled for documents constructed using this DocumentBuilder.
void
setRetainPSVI(boolean retainPSVI)
Set whether the constructed tree should contain information derived from schema validation, specifically whether it should contain type annotations and expanded defaults of missing element and attribute content.
void
setSchemaValidator(SchemaValidator validator)
Set the SchemaValidator to be used.
void
setWhitespaceStrippingPolicy(WhitespaceStrippingPolicy policy)
Set the whitespace stripping policy applied when loading a document using this DocumentBuilder.
XdmNode
wrap(Object node)
Create a node by wrapping a recognized external node from a supported object model.

Constructor Details

DocumentBuilder

protected DocumentBuilder(Configuration config)
Create a DocumentBuilder. This is a protected constructor. Users should construct a DocumentBuilder by calling the factory method Processor.newDocumentBuilder().
Parameters:
config - the Saxon configuration

Method Details

build

public XdmNode build(File file)
            throws SaxonApiException
Build a document from a supplied XML file.
Parameters:
file - the supplied file
Returns:
the XdmNode representing the root of the document tree
Throws:
SaxonApiException - if any failure occurs retrieving or parsing the document

build

public XdmNode build(Source source)
            throws SaxonApiException
Load an XML document, to create a tree representation of the document in memory.
Parameters:
source - A JAXP Source object identifying the source of the document. This can always be a javax.xml.transform.stream.StreamSource or a javax.xml.transform.sax.SAXSource.

An instance of javax.xml.transform.dom.DOMSource is accepted provided that the Saxon support code for DOM (in saxon9-dom.jar) is on the classpath.

If the source is an instance of NodeInfo then the subtree rooted at this node will be copied (applying schema validation if requested) to create a new tree.

Saxon also accepts an instance of PullSource, which can be used to supply a document that is to be parsed using a StAX parser.

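Returns:
An XdmNode. This will be the document node at the root of the tree of the resulting in-memory document.
Throws:
SaxonApiException - if any failure occurs building the document, for example a parsing error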
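The JAXP Source variants mentioned for build(Source) can be created with standard JDK classes. The following is a sketch only: the class name SourceKinds, the sample XML and the system ID are hypothetical, and the call that hands each Source to a Saxon DocumentBuilder is left as a comment because it needs Saxon (and, for DOMSource, Saxon's DOM support) on the classpath.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class SourceKinds {
    static final String XML = "<doc><item>1</item></doc>";
    // Hypothetical system ID, used as the base URI of the document.
    static final String SYSTEM_ID = "http://example.com/doc.xml";

    // A StreamSource wrapping a character stream, with an explicit system ID.
    static StreamSource streamSource() {
        return new StreamSource(new StringReader(XML), SYSTEM_ID);
    }

    // A SAXSource wrapping a SAX InputSource; the system ID is set on the InputSource.
    static SAXSource saxSource() {
        InputSource input = new InputSource(new StringReader(XML));
        input.setSystemId(SYSTEM_ID);
        return new SAXSource(input);
    }

    // A DOMSource wrapping a DOM Document parsed by the JDK's own parser.
    // Note: the JDK's javax.xml.parsers.DocumentBuilder is unrelated to
    // the Saxon DocumentBuilder documented on this page.
    static DOMSource domSource() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document dom = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            return new DOMSource(dom, SYSTEM_ID);
        } catch (Exception e) {
            throw new IllegalStateException("could not build the sample DOM", e);
        }
    }

    public static void main(String[] args) {
        // Each of these could then be passed to the Saxon builder, e.g.
        //   XdmNode doc = saxonBuilder.build(streamSource());
        System.out.println(streamSource().getSystemId());
        System.out.println(saxSource().getSystemId());
        System.out.println(domSource().getNode().getNodeName());
    }
}
```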

getBaseURI

public URI getBaseURI()
Get the base URI of documents loaded using this DocumentBuilder when no other URI is available.
Returns:
the base URI to be used, or null if no value has been set.

getSchemaValidator

public SchemaValidator getSchemaValidator()
Get the SchemaValidator used to validate documents loaded using this DocumentBuilder.
Returns:
the SchemaValidator if one has been set; otherwise null.

getWhitespaceStrippingPolicy

public WhitespaceStrippingPolicy getWhitespaceStrippingPolicy()
Get the whitespace stripping policy applied when loading a document using this DocumentBuilder.
Returns:
the policy for stripping whitespace-only text nodes

isDTDValidation

public boolean isDTDValidation()
Ask whether DTD validation is to be applied to documents loaded using this DocumentBuilder.
Returns:
true if DTD validation is to be applied

isLineNumbering

public boolean isLineNumbering()
Ask whether line numbering is enabled for documents loaded using this DocumentBuilder.
Returns:
true if line numbering is enabled

isRetainPSVI

public boolean isRetainPSVI()
Ask whether the constructed tree should contain information derived from schema validation, specifically whether it should contain type annotations and expanded defaults of missing element and attribute content. If no schema validator is set then this option has no effect.

Not yet implemented.

Returns:
true if the constructed tree will contain type annotations and expanded defaults of missing element and attribute content; false if the tree that is returned will be the same as if schema validation did not take place (except that if the document is invalid, no tree will be constructed).

setBaseURI

public void setBaseURI(URI uri)
Set the base URI of a document loaded using this DocumentBuilder.

This is used for resolving any relative URIs appearing within the document, for example in references to DTDs and external entities.

This information is required when the document is loaded from a source that does not provide an intrinsic URI, notably when loading from a Stream or a DOMSource. The value is ignored when loading from a source that does have an intrinsic base URI.

Parameters:
uri - the base URI of documents loaded using this DocumentBuilder. This must be an absolute URI.
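To make the role of the base URI concrete, the sketch below shows how relative references such as a DTD or external entity location resolve against an absolute base URI. It uses only java.net.URI and illustrates generic URI resolution, not the code path Saxon or the XML parser actually follows. The class name BaseUriExample, the helper resolveAgainstBase and the example URIs are hypothetical.

```java
import java.net.URI;

public class BaseUriExample {

    // Hypothetical helper: rejects a non-absolute base (mirroring the
    // requirement on setBaseURI) and resolves a relative reference against it.
    static URI resolveAgainstBase(URI base, String relative) {
        if (!base.isAbsolute()) {
            throw new IllegalArgumentException("base URI must be absolute: " + base);
        }
        return base.resolve(relative);
    }

    public static void main(String[] args) {
        // Hypothetical base URI of a document loaded from a stream.
        URI base = URI.create("http://example.com/docs/book.xml");

        // A DTD referenced from the document by a relative system identifier.
        System.out.println(resolveAgainstBase(base, "book.dtd"));

        // An external entity stored in a sibling directory.
        System.out.println(resolveAgainstBase(base, "../entities/chap1.ent"));
    }
}
```

Without a base URI there is nothing to resolve such references against, which is why the value matters for sources that carry no URI of their own.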

setDTDValidation

public void setDTDValidation(boolean option)
Set whether DTD validation should be applied to documents loaded using this DocumentBuilder.

By default, no DTD validation takes place.

Parameters:
option - true if DTD validation is to be applied to the document

setLineNumbering

public void setLineNumbering(boolean option)
Set whether line numbering is to be enabled for documents constructed using this DocumentBuilder.
Parameters:
option - true if line numbers are to be maintained, false otherwise.

setRetainPSVI

public void setRetainPSVI(boolean retainPSVI)
Set whether the constructed tree should contain information derived from schema validation, specifically whether it should contain type annotations and expanded defaults of missing element and attribute content. If no schema validator is set then this option has no effect. The default value is true.

Not yet implemented.

Parameters:
retainPSVI - if true, the constructed tree will contain type annotations and expanded defaults of missing element and attribute content. If false, the tree that is returned will be the same as if schema validation did not take place (except that if the document is invalid, no tree will be constructed)

setSchemaValidator

public void setSchemaValidator(SchemaValidator validator)
Set the SchemaValidator to be used. This determines whether schema validation is applied to an input document and whether type annotations in a supplied document are retained. If no SchemaValidator is supplied, then schema validation does not take place.

This option requires the schema-aware version of the Saxon product (Saxon-SA).

Parameters:
validator - the SchemaValidator to be used

setWhitespaceStrippingPolicy

public void setWhitespaceStrippingPolicy(WhitespaceStrippingPolicy policy)
Set the whitespace stripping policy applied when loading a document using this DocumentBuilder.

By default, whitespace text nodes appearing in element-only content are stripped, and all other whitespace text nodes are retained.

Parameters:
policy - the policy for stripping whitespace-only text nodes from source documents
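The effect of the policy can be observed by building the same document under two different settings. This is a minimal sketch, assuming Saxon is on the classpath and that the constants WhitespaceStrippingPolicy.ALL and WhitespaceStrippingPolicy.NONE are available in the release in use. The class name WhitespacePolicyExample and the helper stringValue are hypothetical.

```java
import java.io.StringReader;
import javax.xml.transform.stream.StreamSource;
import net.sf.saxon.s9api.DocumentBuilder;
import net.sf.saxon.s9api.Processor;
import net.sf.saxon.s9api.SaxonApiException;
import net.sf.saxon.s9api.WhitespaceStrippingPolicy;
import net.sf.saxon.s9api.XdmNode;

public class WhitespacePolicyExample {

    // Hypothetical helper: builds the document under the given policy and
    // returns the string value of the document node, i.e. the concatenation
    // of all text nodes that survived stripping.
    static String stringValue(WhitespaceStrippingPolicy policy, String xml)
            throws SaxonApiException {
        Processor processor = new Processor(false);
        DocumentBuilder builder = processor.newDocumentBuilder();
        builder.setWhitespaceStrippingPolicy(policy);
        XdmNode doc = builder.build(new StreamSource(new StringReader(xml)));
        return doc.getStringValue();
    }

    public static void main(String[] args) throws SaxonApiException {
        // The only text nodes in this document are whitespace-only.
        String xml = "<a> <b/> </a>";

        String stripped = stringValue(WhitespaceStrippingPolicy.ALL, xml);
        String retained = stringValue(WhitespaceStrippingPolicy.NONE, xml);

        System.out.println("ALL:  length " + stripped.length());
        System.out.println("NONE: length " + retained.length());
    }
}
```

Comparing string lengths is just a compact way to see whether the whitespace-only text nodes are still present in the tree.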

wrap

public XdmNode wrap(Object node)
            throws IllegalArgumentException
Create a node by wrapping a recognized external node from a supported object model. The support module for the external object model must be on the class path and registered with the Saxon configuration.

It is best to avoid calling this method repeatedly to wrap different nodes in the same document. Each such wrapper conceptually creates a new XDM tree instance with its own identity. Although the memory is shared, operations that rely on node identity might not have the expected result. It is best to create a single wrapper for the document node, and then to navigate to the other nodes in the tree using S9API interfaces.

Parameters:
node - the node in the external tree representation
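Returns:
the supplied node wrapped as an XdmNode
Throws:
IllegalArgumentException - if the supplied object is not recognized as a node from a supported object model, for example because the support module for that model is not on the class path or is not registered with the Saxon configuration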