Class StreamingXMLReader
public class StreamingXMLReader
extends java.lang.Object
Woodstox stream reader that extracts top level XML nodes, along with metadata describing their location in the XML file, and sends them to an
XMLStreamProcessor.
Because the file is streamed rather than loaded as a whole, large files can be processed without significant overhead.
-
Constructor Summary
Constructors
Constructor    Description
StreamingXMLReader()
Method Summary
Modifier and Type    Method    Description
java.lang.String    getCharacterEncodingScheme()    Returns the character encoding declared on the xml declaration. Returns null if none was declared.
java.util.Map<java.lang.String,java.lang.String>    getDocumentElementAttributeNamespaces()    A map containing the namespace prefix to URI pairs from the document element.
java.util.Map<java.lang.String,java.lang.String>    getDocumentElementAttributes()    A map containing the attribute name-value pairs from the document element.
java.lang.String    getDocumentElementTagName()    The local name of the document element tag.
org.w3c.dom.Document    getEmptyDocument()
org.w3c.dom.Document    getEmptyDocument(java.io.File file)
java.lang.String    getEncoding()    Return input encoding if known or null if unknown.
OuterDocument    getOuterDocument()
OuterDocument    getOuterDocument(java.io.File file)
java.lang.String    getPrefix()    Returns the prefix of the current event or null if the event does not have a prefix.
java.lang.String    getVersion()    Get the xml version declared on the xml declaration. Returns null if none was declared.
void    readFile(java.io.File file, XMLStreamProcessor processor)    Read an XML file using the Woodstox streaming API and supply the XMLStreamProcessor with Fragment objects.
void    readFile(java.io.File file, XMLStreamProcessor processor, java.util.List<java.lang.String> targetPaths)    Read an XML file using the Woodstox streaming API and supply the XMLStreamProcessor with Fragment objects.
-
Constructor Details
-
StreamingXMLReader
public StreamingXMLReader() throws javax.xml.parsers.ParserConfigurationException
- Throws:
javax.xml.parsers.ParserConfigurationException
-
-
Method Details
-
readFile
public void readFile(java.io.File file, XMLStreamProcessor processor) throws java.io.IOException, javax.xml.stream.XMLStreamException
Read an XML file using the Woodstox streaming API and supply the XMLStreamProcessor with Fragment objects.
All (and only) top level elements are returned. For example, given an XML file with a structure like
<feed>
  <product></product>
  <product></product>
  <product></product>
</feed>
all product nodes will be returned.
Fragment objects wrap the DOM node as a DocumentFragment, together with an XMLByteLocation object that describes the node's location in the XML file. This allows efficient retrieval of the nodes later using the RandomAccessXMLReader.
- Parameters:
file - The XML file to process
processor - XMLStreamProcessor instance
- Throws:
java.io.FileNotFoundException
javax.xml.stream.XMLStreamException
java.io.IOException
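The "top level elements only" behaviour described above can be illustrated with a depth counter over a streaming parser. The following is a minimal sketch that assumes only the JDK's javax.xml.stream API; it is not this class's implementation, it does not build Fragment or XMLByteLocation objects, and the names TopLevelScan and scan are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper for illustration; not part of the documented API.
final class TopLevelScan {

    // Calls onElement once for every direct child of the document element.
    static void scan(String xml, Consumer<String> onElement) throws XMLStreamException {
        XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
        try {
            int depth = 0; // 0 = outside the document element
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    depth++;
                    if (depth == 2) {
                        // Depth 1 is the document element, depth 2 its children.
                        onElement.accept(reader.getLocalName());
                    }
                } else if (event == XMLStreamConstants.END_ELEMENT) {
                    depth--;
                }
            }
        } finally {
            reader.close();
        }
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<feed><product></product><product></product><product></product></feed>";
        List<String> seen = new ArrayList<>();
        scan(xml, seen::add);
        System.out.println(seen);
    }
}
```

Only one event is held at a time, which is why this style of reading scales to large files; elements nested deeper than depth 2 are skipped by the callback but still consumed by the loop.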
-
readFile
public void readFile(java.io.File file, XMLStreamProcessor processor, java.util.List<java.lang.String> targetPaths) throws java.io.IOException, javax.xml.stream.XMLStreamException
Read an XML file using the Woodstox streaming API and supply the XMLStreamProcessor with Fragment objects. Specify a List of node paths to extract. Example:
<xml>
<Feed>
  <Category>
    <Name>Bolts</Name>
    <Product>Large</Product>
    <Product>Small</Product>
    <Services>
      <Service>Tightening</Service>
      <Service>Loosening</Service>
    </Services>
  </Category>
  <Category>
    <Name>Hammers</Name>
    <Product>Framing</Product>
    <Product>Dead Blow</Product>
    <Services>
      <Service>Banging</Service>
    </Services>
  </Category>
</Feed>
You can extract all Product and Service elements in the same read operation by passing in these paths:
/feed/category/Product
/feed/category/Services/Service
Namespace prefixes may be specified as they appear in the XML:
/aw:PurchaseOrders/aw:PurchaseOrder/aw:Address
Paths are absolute with respect to the document root. They are normalized to always have a leading slash and never a trailing slash. Overlapping paths are not supported; if two paths overlap, the least specific path is used.
Fragment objects wrap the DOM node as a DocumentFragment, together with an XMLByteLocation object that describes the node's location in the XML file. This allows efficient retrieval of the nodes later using the RandomAccessXMLReader.
- Parameters:
file - The XML file to process
processor - XMLStreamProcessor instance
targetPaths - List of node paths to target for extraction
- Throws:
java.io.FileNotFoundException
javax.xml.stream.XMLStreamException
java.io.IOException
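The path rules stated above (leading slash, no trailing slash, least specific path wins on overlap) can be sketched with plain string handling. This is an illustration under stated assumptions, not the library's own code: TargetPaths, normalize, and resolve are hypothetical names, and "overlap" is assumed here to mean that one path is a segment-wise prefix of another.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not part of the documented API.
final class TargetPaths {

    // Ensures exactly one leading slash and no trailing slash.
    static String normalize(String path) {
        String p = path.trim();
        while (p.endsWith("/")) {
            p = p.substring(0, p.length() - 1);
        }
        while (p.startsWith("/")) {
            p = p.substring(1);
        }
        return "/" + p;
    }

    // True when ancestor is a strict, segment-wise prefix of path.
    static boolean isAncestor(String ancestor, String path) {
        return path.startsWith(ancestor + "/");
    }

    // Normalizes every path, removes duplicates, and drops any path that lies
    // beneath another one, so the least specific of two overlapping paths is kept.
    static List<String> resolve(List<String> rawPaths) {
        List<String> normalized = new ArrayList<>();
        for (String raw : rawPaths) {
            String p = normalize(raw);
            if (!normalized.contains(p)) {
                normalized.add(p);
            }
        }
        List<String> kept = new ArrayList<>();
        for (String candidate : normalized) {
            boolean covered = false;
            for (String other : normalized) {
                if (isAncestor(other, candidate)) {
                    covered = true;
                    break;
                }
            }
            if (!covered) {
                kept.add(candidate);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        System.out.println(resolve(Arrays.asList(
                "feed/category/Product/",
                "/feed/category/Services/Service",
                "/feed/category/Services")));
    }
}
```

In the usage above, the third path overlaps the second, so under the assumed rule only /feed/category/Product and /feed/category/Services would remain.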
-
getDocumentElementAttributes
public java.util.Map<java.lang.String,java.lang.String> getDocumentElementAttributes()
A map containing the attribute name-value pairs from the document element.
- Returns:
Map<String,String>
-
getDocumentElementAttributeNamespaces
public java.util.Map<java.lang.String,java.lang.String> getDocumentElementAttributeNamespaces()
A map containing the namespace prefix to URI pairs from the document element.
- Returns:
Map<String,String>
-
getDocumentElementTagName
public java.lang.String getDocumentElementTagName()
The local name of the document element tag.
- Returns:
- The local name of the document element tag
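To make the three document element accessors above concrete, here is a minimal sketch of how the same information (local name, attribute pairs, namespace prefix to URI pairs) can be read from the first start tag with the JDK's javax.xml.stream API. DocumentElementInfo and its fields are hypothetical names, and mapping the default namespace to an empty-string key is an assumption of this sketch, not documented behaviour of StreamingXMLReader.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical value holder for illustration; not part of the documented API.
final class DocumentElementInfo {
    final String localName;
    final Map<String, String> attributes = new LinkedHashMap<>();
    final Map<String, String> namespaces = new LinkedHashMap<>();

    private DocumentElementInfo(String localName) {
        this.localName = localName;
    }

    // Reads only as far as the first start tag, then stops.
    static DocumentElementInfo read(String xml) throws XMLStreamException {
        XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
        try {
            while (reader.hasNext()) {
                if (reader.next() != XMLStreamConstants.START_ELEMENT) {
                    continue;
                }
                DocumentElementInfo info = new DocumentElementInfo(reader.getLocalName());
                for (int i = 0; i < reader.getAttributeCount(); i++) {
                    info.attributes.put(reader.getAttributeLocalName(i),
                            reader.getAttributeValue(i));
                }
                for (int i = 0; i < reader.getNamespaceCount(); i++) {
                    // The default namespace has no prefix; store it under "" (assumption).
                    String prefix = reader.getNamespacePrefix(i);
                    info.namespaces.put(prefix == null ? "" : prefix,
                            reader.getNamespaceURI(i));
                }
                return info;
            }
            throw new XMLStreamException("No document element found");
        } finally {
            reader.close();
        }
    }

    public static void main(String[] args) throws XMLStreamException {
        DocumentElementInfo info = read(
                "<aw:Orders xmlns:aw=\"urn:example:aw\" count=\"2\"><aw:Order/></aw:Orders>");
        System.out.println(info.localName + " " + info.attributes + " " + info.namespaces);
    }
}
```

With a namespace-aware parser, xmlns declarations are reported through the namespace accessors rather than as ordinary attributes, which is why the two maps stay separate.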
-
getPrefix
public java.lang.String getPrefix()
Returns the prefix of the current event or null if the event does not have a prefix.
- Returns:
- the prefix or null
-
getCharacterEncodingScheme
public java.lang.String getCharacterEncodingScheme()
Returns the character encoding declared in the XML declaration, or null if none was declared.
- Returns:
- the encoding declared in the document or null
- See Also:
XMLStreamReader
-
getEncoding
public java.lang.String getEncoding()
Returns the input encoding if known, or null if unknown.
- Returns:
- the encoding of this instance or null
- See Also:
XMLStreamReader
-
getVersion
public java.lang.String getVersion()
Returns the XML version declared in the XML declaration, or null if none was declared.
- Returns:
- the XML version or null
- See Also:
XMLStreamReader
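The three accessors above share their names with methods of javax.xml.stream.XMLStreamReader, which the See Also entries point to. As a minimal sketch, assuming only the JDK's StAX API (XmlDeclarationInfo and describe are hypothetical names), the values can be inspected on a freshly created reader like this:

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper for illustration; not part of the documented API.
final class XmlDeclarationInfo {

    // Returns { declared version, declared encoding, input encoding }.
    // Any entry may be null, as the method descriptions above allow.
    static String[] describe(byte[] xmlBytes) throws XMLStreamException {
        XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new ByteArrayInputStream(xmlBytes));
        try {
            // A newly created reader is positioned at START_DOCUMENT,
            // where the XML declaration values are available.
            return new String[] {
                reader.getVersion(),
                reader.getCharacterEncodingScheme(),
                reader.getEncoding()
            };
        } finally {
            reader.close();
        }
    }

    public static void main(String[] args) throws XMLStreamException {
        byte[] xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><feed/>"
                .getBytes(StandardCharsets.UTF_8);
        System.out.println(Arrays.toString(describe(xml)));
    }
}
```

The declared encoding comes from the XML declaration itself, while the input encoding is whatever the parser determined for the byte stream, so the two can differ or be null independently depending on the parser implementation.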
-
getEmptyDocument
public org.w3c.dom.Document getEmptyDocument(java.io.File file) throws java.io.FileNotFoundException, javax.xml.stream.XMLStreamException, javax.management.modelmbean.XMLParseException
- Parameters:
file - XML file to extract an empty document for
- Returns:
Document containing only the document element from the file provided
- Throws:
java.io.FileNotFoundException
javax.xml.stream.XMLStreamException
javax.management.modelmbean.XMLParseException
-
getEmptyDocument
public org.w3c.dom.Document getEmptyDocument() throws javax.management.modelmbean.XMLParseException
- Returns:
Document containing only the document element from the last file provided to this instance of StreamingXMLReader
- Throws:
javax.management.modelmbean.XMLParseException
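An "empty document" in the sense used above, a Document holding only the document element, can be approximated by copying the first start tag into a fresh DOM tree and ignoring everything after it. The sketch below assumes only JDK APIs; EmptyDocumentSketch and build are hypothetical names, and how StreamingXMLReader actually constructs its Document is not stated in this page.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper for illustration; not part of the documented API.
final class EmptyDocumentSketch {

    private static String emptyToNull(String s) {
        return (s == null || s.isEmpty()) ? null : s;
    }

    private static String qualifiedName(String prefix, String localName) {
        return emptyToNull(prefix) == null ? localName : prefix + ":" + localName;
    }

    // Builds a Document whose only node is a copy of the source's document element
    // (name, namespace declarations, attributes), without any children.
    static Document build(String xml) throws XMLStreamException, ParserConfigurationException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder().newDocument();

        XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
        try {
            while (reader.hasNext()) {
                if (reader.next() != XMLStreamConstants.START_ELEMENT) {
                    continue;
                }
                Element root = doc.createElementNS(emptyToNull(reader.getNamespaceURI()),
                        qualifiedName(reader.getPrefix(), reader.getLocalName()));
                for (int i = 0; i < reader.getNamespaceCount(); i++) {
                    String prefix = emptyToNull(reader.getNamespacePrefix(i));
                    String uri = reader.getNamespaceURI(i);
                    root.setAttributeNS(XMLConstants.XMLNS_ATTRIBUTE_NS_URI,
                            prefix == null ? "xmlns" : "xmlns:" + prefix,
                            uri == null ? "" : uri);
                }
                for (int i = 0; i < reader.getAttributeCount(); i++) {
                    root.setAttributeNS(emptyToNull(reader.getAttributeNamespace(i)),
                            qualifiedName(reader.getAttributePrefix(i),
                                    reader.getAttributeLocalName(i)),
                            reader.getAttributeValue(i));
                }
                doc.appendChild(root);
                return doc; // stop here; child content is never read
            }
            throw new XMLStreamException("No document element found");
        } finally {
            reader.close();
        }
    }

    public static void main(String[] args) throws Exception {
        Document doc = build("<feed count=\"3\"><product>a</product></feed>");
        System.out.println(doc.getDocumentElement().getTagName());
    }
}
```

Because the loop returns at the first start tag, the cost is independent of file size, which fits the streaming design of the class.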
-
getOuterDocument
- Parameters:
file - XML file to parse into an OuterDocument
- Returns:
OuterDocument wrapper containing the empty Document, which holds only the document element from the file provided
- Throws:
java.lang.Exception
-
getOuterDocument
- Returns:
OuterDocument wrapper containing the empty Document, which holds only the document element from the last file provided to this instance of StreamingXMLReader
- Throws:
java.lang.Exception
-