XmlReader (Data Pipeline JavaDoc)

java.lang.Object
- com.northconcepts.datapipeline.core.DataObject
  - com.northconcepts.datapipeline.core.Endpoint
    - com.northconcepts.datapipeline.core.DataEndpoint
      - com.northconcepts.datapipeline.core.DataReader
        - com.northconcepts.datapipeline.xml.XmlReader

Direct Known Subclasses: JavaBeanReader, JsonReader

public class XmlReader
extends DataReader

Obtains records from an XML stream. See the Read an XML file example.

XmlReader's addField(String, String), addField(String, String, boolean), and addRecordBreak(String) methods use a subset of the XPath 1.0 location path notation to identify field values and demarcate records.

Axis Specifiers

Axis	Abbreviated Syntax	Supported	Examples
ancestor
ancestor-or-self
attribute	@	yes	@lang or attribute::lang
child		yes	title or child::title
descendant		yes
descendant-or-self	//	yes	//book or /descendant-or-self::book/
following
following-sibling
namespace
parent	..
preceding
preceding-sibling
self	.	yes

Node Tests

The node tests comment(), text(), processing-instruction(), and node() are all supported.

Predicates

None supported

Functions and Operators

None supported
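
To make the supported notation concrete, the following sketch evaluates a few location paths that stay inside the subset above (child, attribute, and descendant-or-self axes, a text() node test, no predicates or functions). It uses the JDK's own javax.xml.xpath engine rather than XmlReader, purely to show what such paths select. The class name LocationPathSubsetDemo, the select helper, and the sample catalog document are assumptions made for illustration.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class; it does not use or depend on XmlReader.
final class LocationPathSubsetDemo {

    // Sample document invented for this sketch.
    static final String XML =
        "<catalog>"
        + "<book lang=\"en\"><title>Dune</title></book>"
        + "<book lang=\"fr\"><title>Candide</title></book>"
        + "</catalog>";

    // Evaluates a location path with the JDK XPath engine and joins the
    // text content of every matched node with commas, in document order.
    static String select(String path) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                path, new InputSource(new StringReader(XML)), XPathConstants.NODESET);
            StringBuilder out = new StringBuilder();
            for (int i = 0; i < nodes.getLength(); i++) {
                if (i > 0) {
                    out.append(',');
                }
                out.append(nodes.item(i).getTextContent());
            }
            return out.toString();
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(select("//book/title"));               // Dune,Candide
        System.out.println(select("//book/attribute::lang"));     // en,fr
        System.out.println(select("/catalog/book/title/text()")); // Dune,Candide
    }
}
```

Going by the method signatures on this page, the same path strings would be passed to XmlReader as, for example, addField("title", "//book/title") together with addRecordBreak("//book"). Whether XmlReader returns exactly these values for this document is not verified here.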

Nested Class Summary

Nested Classes
Modifier and Type	Class and Description
`static class`	`XmlReader.DuplicateFieldPolicy`

Nested classes/interfaces inherited from class com.northconcepts.datapipeline.core.DataEndpoint
DataEndpoint.State

Field Summary

Fields
Modifier and Type	Field and Description
`protected Record`	`currentRecord`
`protected List<XmlField>`	`fields`
`protected File`	`file`
`protected boolean`	`hasCascadingFields`
`protected XmlNodeReader`	`reader`
`protected List<XmlRecordBreak>`	`recordBreaks`

Fields inherited from class com.northconcepts.datapipeline.core.DataReader
fieldLineage, recordLineage

Fields inherited from class com.northconcepts.datapipeline.core.DataEndpoint
lastRecord, PRODUCT, PRODUCT_VERSION, VENDOR, XML_INPUT_FACTORY_KEY

Fields inherited from class com.northconcepts.datapipeline.core.Endpoint
BUFFER_SIZE, captureElapsedTime, DEFAULT_READ_BUFFER_SIZE

Fields inherited from class com.northconcepts.datapipeline.core.DataObject
id, log, name, TIMESTAMP_FORMAT

Constructor Summary

Constructors
Constructor and Description
`XmlReader(File file)`
`XmlReader(Reader reader)`
`XmlReader(XmlNodeReader reader)`
`XmlReader(XMLStreamReader streamReader)`

Method Summary

Modifier and Type	Method and Description
`DataException`	`addExceptionProperties(DataException exception)` Adds this endpoint's current state to a `DataException`.
`XmlReader`	`addField(String name, String locationPathAsString)` Identifies a new field using XPath in the XML stream.
`XmlReader`	`addField(String name, String locationPathAsString, boolean cascadeValues)` Identifies a new field using XPath in the XML stream.
`XmlReader`	`addField(String name, String locationPathAsString, String cascadeResetLocationPath)` Identifies a new field using XPath in the XML stream.
`XmlReader`	`addField(XmlField field)`
`protected void`	`addFieldValue(Field recordField, Object value)`
`protected Record`	`addLineage(Record record)`
`XmlReader`	`addRecordBreak(String locationPathAsString)`
`void`	`close()` Indicates that this endpoint has finished reading or writing.
`protected void`	`createRecord()`
`protected void`	`expandAndPushRecords(Record record)`
`protected void`	`expandListFieldsAsFields(Record record)`
`XmlReader.DuplicateFieldPolicy`	`getDuplicateFieldPolicy()`
`protected void`	`getFieldValues(long sequence)`
`protected XmlNodeReader`	`getXmlNodeReader()`
`boolean`	`isAddTextToParent()` Returns true if each child node's text should be concatenated to its parent during parsing (defaults to false).
`boolean`	`isAutoCloseReader()`
`boolean`	`isDebug()`
`boolean`	`isIgnoreNamespaces()` Indicates if namespaces on elements and attributes are ignored when matching expressions (defaults to true).
`boolean`	`isLineageSupported()`
`protected boolean`	`isRecordBreak(XmlNode node)`
`void`	`open()` Makes this endpoint ready for reading or writing.
`protected Record`	`readImpl()` Overridden by subclasses to read the next record from this `DataReader`.
`protected void`	`saveAncestorAttributeFieldValues(XmlNode node)`
`protected void`	`saveAncestorNodeFieldValues(XmlNode node)`
`protected void`	`saveFieldValues(XmlNode node)`
`XmlReader`	`setAddTextToParent(boolean addTextToParent)` Indicates if each child node's text should be concatenated to its parent during parsing (defaults to false).
`XmlReader`	`setAutoCloseReader(boolean autoCloseReader)`
`XmlReader`	`setDebug(boolean debug)`
`XmlReader`	`setDescription(String description)`
`XmlReader`	`setDuplicateFieldPolicy(XmlReader.DuplicateFieldPolicy duplicateFieldPolicy)`
`XmlReader`	`setIgnoreNamespaces(boolean ignoreNamespaces)` Indicates if namespaces on elements and attributes are ignored when matching expressions (defaults to true).
`protected void`	`setRecordField(XmlNode node, XmlField xmlField, LocationPath path)`
`XmlReader`	`setSaveLineage(boolean saveLineage)`

Methods inherited from class com.northconcepts.datapipeline.core.DataReader
available, getBufferSize, getNestedEndpoint, getNestedReader, getRootEndpoint, getRootReader, isExhausted, isSaveLineage, peek, pop, push, read, skip

Methods inherited from class com.northconcepts.datapipeline.core.DataEndpoint
decrementRecordCount, enableJmx, getLastRecord, getRecordCount, getRecordCountAsBigInteger, getRecordCountAsString, incrementRecordCount, isRecordCountBigInteger, resetRecordCount, toString

Methods inherited from class com.northconcepts.datapipeline.core.Endpoint
addElapsedtime, assertClosed, assertNotOpened, assertOpened, finalize, getClosedOn, getDescription, getElapsedTime, getElapsedTimeAsString, getOpenedOn, getOpenElapsedTime, getOpenElapsedTimeAsString, getSelfTime, getSelfTimeAsString, getState, isCaptureElapsedTime, isClosed, isOpen, setCaptureElapsedTime

Methods inherited from class com.northconcepts.datapipeline.core.DataObject
exception, exception, exception, getId, getName, resetID

Methods inherited from class java.lang.Object
clone, equals, getClass, hashCode, notify, notifyAll, wait, wait, wait

- Field Detail
  - recordBreaks
```
protected final List<XmlRecordBreak> recordBreaks
```
  - fields
```
protected final List<XmlField> fields
```
  - reader
```
protected final XmlNodeReader reader
```
  - file
```
protected final File file
```
  - currentRecord
```
protected Record currentRecord
```
  - hasCascadingFields
```
protected boolean hasCascadingFields
```
- Constructor Detail
  - XmlReader
```
public XmlReader(File file)
```
  - XmlReader
```
public XmlReader(Reader reader)
```
  - XmlReader
```
public XmlReader(XMLStreamReader streamReader)
```
  - XmlReader
```
public XmlReader(XmlNodeReader reader)
```
- Method Detail
  - isDebug
```
public boolean isDebug()
```
  - setDebug
```
public XmlReader setDebug(boolean debug)
```
  - getXmlNodeReader
```
protected XmlNodeReader getXmlNodeReader()
```
  - isAddTextToParent
```
public boolean isAddTextToParent()
```
    Returns true if each child node's text should be concatenated to its parent during parsing (defaults to false).
  - setAddTextToParent
```
public XmlReader setAddTextToParent(boolean addTextToParent)
```
    Indicates if each child node's text should be concatenated to its parent during parsing (defaults to false). Setting this to true will result in higher memory consumption.
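    
    As a rough illustration of what concatenating child text into the parent means, the sketch below collects the text of a root element with and without the text of its nested elements. It is written against the JDK's StAX API and is not XmlReader's implementation; the class AddTextToParentSketch, the method rootText, and the sample markup are hypothetical.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical sketch; models the described behaviour, not XmlReader's code.
final class AddTextToParentSketch {

    // Sample markup invented for this sketch: text mixed with a child element.
    static final String XML = "<p>Hello <b>world</b>!</p>";

    // Returns the text collected for the root element. When includeChildText
    // is true, text found inside nested elements is appended as well.
    static String rootText(boolean includeChildText) {
        StringBuilder text = new StringBuilder();
        int depth = 0;
        try {
            XMLStreamReader in =
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(XML));
            while (in.hasNext()) {
                switch (in.next()) {
                    case XMLStreamConstants.START_ELEMENT:
                        depth++;
                        break;
                    case XMLStreamConstants.END_ELEMENT:
                        depth--;
                        break;
                    case XMLStreamConstants.CHARACTERS:
                        // depth == 1 means the text sits directly inside the root.
                        if (depth == 1 || includeChildText) {
                            text.append(in.getText());
                        }
                        break;
                    default:
                        break;
                }
            }
            in.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
        return text.toString();
    }

    public static void main(String[] args) {
        System.out.println(rootText(false)); // Hello !
        System.out.println(rootText(true));  // Hello world!
    }
}
```

    The second call has to hold the text of every descendant until the parent closes, which is consistent with the higher memory consumption noted above.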
  - addRecordBreak
```
public XmlReader addRecordBreak(String locationPathAsString)
```
  - addField
```
public XmlReader addField(XmlField field)
```
  - addField
```
public XmlReader addField(String name,
                          String locationPathAsString,
                          boolean cascadeValues)
```
    Identifies a new field using XPath in the XML stream.
    
    Parameters:
    
    name - the field name to create
    
    locationPathAsString - the XPath to match to populate this field
    
    cascadeValues - indicates if this reader should return the last value seen for this field when no matches are available.
  - addField
```
public XmlReader addField(String name,
                          String locationPathAsString,
                          String cascadeResetLocationPath)
```
    Identifies a new field using XPath in the XML stream. If cascadeResetLocationPath is not null and not empty, it indicates when this field should be cleared; otherwise, this field returns the last value seen when no new matches are available.
    
    Parameters:
    
    name - the field name to create
    
    locationPathAsString - the XPath to match to populate this field
    
    cascadeResetLocationPath - the XPath to identify when cascading values (i.e. the last value seen for this field) should be cleared.
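    
    The difference between plain cascading and cascading with a reset path can be modelled in a few lines of plain Java. The sketch below is not XmlReader's implementation: it uses the JDK's StAX API, and the class CascadeSketch, the method read, the resetOnRegion flag, and the sample orders document are all assumptions made for illustration. It emits one record per order element and carries the last region name forward, optionally clearing it whenever a new region element starts.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical sketch of the cascading semantics described above.
final class CascadeSketch {

    // Sample document invented for this sketch; the second region has no name.
    static final String XML =
        "<orders>"
        + "<region name=\"east\"><order id=\"1\"/><order id=\"2\"/></region>"
        + "<region><order id=\"3\"/></region>"
        + "</orders>";

    // Returns one "id:region" string per <order>. The region value is kept
    // between records (cascaded). If resetOnRegion is true, the kept value is
    // cleared at the start of every <region>, mimicking a reset path.
    static List<String> read(boolean resetOnRegion) {
        List<String> records = new ArrayList<>();
        String region = null;
        try {
            XMLStreamReader in =
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(XML));
            while (in.hasNext()) {
                if (in.next() != XMLStreamConstants.START_ELEMENT) {
                    continue;
                }
                String element = in.getLocalName();
                if (element.equals("region")) {
                    if (resetOnRegion) {
                        region = null; // reset point reached: forget the old value
                    }
                    String name = in.getAttributeValue(null, "name");
                    if (name != null) {
                        region = name; // new match replaces the cascaded value
                    }
                } else if (element.equals("order")) {
                    records.add(in.getAttributeValue(null, "id") + ":" + region);
                }
            }
            in.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
        return records;
    }

    public static void main(String[] args) {
        System.out.println(read(false)); // [1:east, 2:east, 3:east]
        System.out.println(read(true));  // [1:east, 2:east, 3:null]
    }
}
```

    In XmlReader terms, read(false) corresponds roughly to addField("region", "//region/@name", true) and read(true) to addField("region", "//region/@name", "//region"), both with addRecordBreak("//order"). That mapping follows the parameter descriptions above and has not been checked against the library's actual output.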
  - addField
```
public XmlReader addField(String name,
                          String locationPathAsString)
```
    Identifies a new field using XPath in the XML stream.
    
    Parameters:
    
    name - the field name to create
    
    locationPathAsString - the XPath to match to populate this field
  - getDuplicateFieldPolicy
```
public XmlReader.DuplicateFieldPolicy getDuplicateFieldPolicy()
```
  - setDuplicateFieldPolicy
```
public XmlReader setDuplicateFieldPolicy(XmlReader.DuplicateFieldPolicy duplicateFieldPolicy)
```
  - isIgnoreNamespaces
```
public boolean isIgnoreNamespaces()
```
    Indicates if namespaces on elements and attributes are ignored when matching expressions (defaults to true).
  - setIgnoreNamespaces
```
public XmlReader setIgnoreNamespaces(boolean ignoreNamespaces)
```
    Indicates if namespaces on elements and attributes are ignored when matching expressions (defaults to true).
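    
    One way to picture this flag is as a choice between comparing a path step with an element's local name only or with its prefixed name. The sketch below is a guess at that idea in plain Java, not XmlReader's matching code. The class NamespaceMatchSketch and the method matches are hypothetical, and the behaviour of the non-ignoring branch (requiring the prefix in the path step) is an assumption.

```java
import javax.xml.namespace.QName;

// Hypothetical sketch; XmlReader's real matching rules may differ.
final class NamespaceMatchSketch {

    // Compares a single path step such as "title" or "a:title" with an element name.
    static boolean matches(String step, QName element, boolean ignoreNamespaces) {
        String local = element.getLocalPart();
        if (ignoreNamespaces) {
            // Namespaces ignored: only the local name takes part in the comparison.
            return step.equals(local);
        }
        // Assumption: when namespaces are not ignored, the step must carry the prefix.
        String prefix = element.getPrefix();
        String qualified = prefix.isEmpty() ? local : prefix + ":" + local;
        return step.equals(qualified);
    }

    public static void main(String[] args) {
        QName title = new QName("urn:example", "title", "a");
        System.out.println(matches("title", title, true));    // true
        System.out.println(matches("title", title, false));   // false
        System.out.println(matches("a:title", title, false)); // true
    }
}
```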
  - createRecord
```
protected void createRecord()
```
  - setSaveLineage
```
public XmlReader setSaveLineage(boolean saveLineage)
```
    Overrides:
    
    setSaveLineage in class DataReader
  - setDescription
```
public XmlReader setDescription(String description)
```
    Overrides:
    
    setDescription in class Endpoint
  - readImpl
```
protected Record readImpl()
                   throws Throwable
```
    Description copied from class: DataReader
    
    Overridden by subclasses to read the next record from this DataReader. The default implementation of DataReader.read() ensures that this method will not be called again after it returns null.
    If no record is available, null will be returned.
    
    Specified by:
    
    readImpl in class DataReader
    
    Throws:
    
    Throwable
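    
    The contract between read() and readImpl() described above is a template-method arrangement: the public method guards the protected one. The sketch below shows that arrangement in plain Java with String records. SketchReader and CountingReader are hypothetical classes written for this example and do not reflect how DataReader is actually implemented.

```java
import java.util.Arrays;
import java.util.Iterator;

// Hypothetical base class modelling the read()/readImpl() contract.
abstract class SketchReader {
    private boolean exhausted;

    // Returns the next record, or null when none is left. After readImpl()
    // has returned null once, it is never invoked again.
    public final String read() {
        if (exhausted) {
            return null;
        }
        String record;
        try {
            record = readImpl();
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
        if (record == null) {
            exhausted = true;
        }
        return record;
    }

    // Subclasses supply the next record here, or null at end of input.
    protected abstract String readImpl() throws Throwable;
}

// Hypothetical subclass that counts how often readImpl() is called.
final class CountingReader extends SketchReader {
    private final Iterator<String> source = Arrays.asList("a", "b").iterator();
    int calls;

    @Override
    protected String readImpl() {
        calls++;
        return source.hasNext() ? source.next() : null;
    }

    public static void main(String[] args) {
        CountingReader reader = new CountingReader();
        System.out.println(reader.read()); // a
        System.out.println(reader.read()); // b
        System.out.println(reader.read()); // null
        System.out.println(reader.read()); // null
        System.out.println(reader.calls);  // 3
    }
}
```

    The fourth read() returns null without reaching readImpl(), so a subclass does not need its own end-of-input guard.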
  - expandAndPushRecords
```
protected void expandAndPushRecords(Record record)
```
  - expandListFieldsAsFields
```
protected void expandListFieldsAsFields(Record record)
```
  - setRecordField
```
protected void setRecordField(XmlNode node,
                              XmlField xmlField,
                              LocationPath path)
```
  - saveFieldValues
```
protected void saveFieldValues(XmlNode node)
```
  - saveAncestorAttributeFieldValues
```
protected void saveAncestorAttributeFieldValues(XmlNode node)
```
  - saveAncestorNodeFieldValues
```
protected void saveAncestorNodeFieldValues(XmlNode node)
```
  - addFieldValue
```
protected void addFieldValue(Field recordField,
                             Object value)
```
  - getFieldValues
```
protected void getFieldValues(long sequence)
```
  - isRecordBreak
```
protected boolean isRecordBreak(XmlNode node)
```
  - open
```
public void open()
          throws DataException
```
    Description copied from class: DataEndpoint
    
    Makes this endpoint ready for reading or writing.
    
    Overrides:
    
    open in class DataEndpoint
    
    Throws:
    
    DataException
  - close
```
public void close()
           throws DataException
```
    Description copied from class: DataEndpoint
    
    Indicates that this endpoint has finished reading or writing.
    
    Overrides:
    
    close in class DataEndpoint
    
    Throws:
    
    DataException
  - isLineageSupported
```
public boolean isLineageSupported()
```
    Overrides:
    
    isLineageSupported in class DataReader
  - addLineage
```
protected Record addLineage(Record record)
```
    Overrides:
    
    addLineage in class DataReader
  - addExceptionProperties
```
public DataException addExceptionProperties(DataException exception)
```
    Description copied from class: Endpoint
    
    Adds this endpoint's current state to a DataException. Since this method is called whenever an exception is thrown, subclasses should override it to add their specific information.
    
    Overrides:
    
    addExceptionProperties in class DataReader
  - isAutoCloseReader
```
public boolean isAutoCloseReader()
```
  - setAutoCloseReader
```
public XmlReader setAutoCloseReader(boolean autoCloseReader)
```
