A new release of Data Pipeline is now available for download: https://northconcepts.com/downloads/. This release includes a new Twitter search reader, custom aggregate operations, and much more.
Twitter Search Reader
Data Pipeline now provides a reader for searching Twitter. You can pull tweets into your pipelines for processing and/or conversion to Excel, database, or other formats.
See the TwitterSearchReader JavaDoc.
Aggregate Operations
You can now add your own on-the-fly aggregate operations. Create an instance of AggregateReader.AggregateOperation (as an anonymous class or a subclass) and pass it to AggregateReader.add(AggregateOperation ... operations). You’ll then be able to collect and compile stats about your data as it streams by.
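To make the idea of a streaming aggregate concrete, here is a minimal sketch in plain Java that does not use Data Pipeline itself. The StreamingOperation interface, RangeOperation class, and AggregateSketch helper are hypothetical stand-ins invented for this illustration; they are not the library's AggregateReader.AggregateOperation API, so check the JavaDoc for the actual methods to override.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for a custom aggregate operation; NOT the Data Pipeline API. */
interface StreamingOperation {
    void accept(double value); // called once per value as it streams by
    double result();           // current value of the aggregate
}

/** Tracks the spread (max - min) of the values seen so far. */
class RangeOperation implements StreamingOperation {
    private double min = Double.POSITIVE_INFINITY;
    private double max = Double.NEGATIVE_INFINITY;
    private long count;

    @Override
    public void accept(double value) {
        min = Math.min(min, value);
        max = Math.max(max, value);
        count++;
    }

    @Override
    public double result() {
        // With no input there is no spread to report.
        return count == 0 ? 0.0 : max - min;
    }
}

class AggregateSketch {
    /** Feeds every value through the operation, one at a time, and returns the result. */
    static double range(List<Double> values) {
        StreamingOperation operation = new RangeOperation();
        for (double value : values) {
            operation.accept(value);
        }
        return operation.result();
    }

    public static void main(String[] args) {
        System.out.println(range(Arrays.asList(12.5, 3.0, 20.0, 7.25))); // prints 17.0
    }
}
```

The operation keeps only a running minimum, maximum, and count, so it never needs to hold the whole data set in memory. That is the property that makes an aggregate suitable for streaming.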
XPath Debugging
All XPath-based readers (XmlReader, JsonReader, and JavaBeanReader) now support a debug flag.
Setting the flag forces the readers to log, in real time, every XPath they see in your data stream, regardless of whether it matches your field and break rules.
This allows you to quickly build new data extraction jobs or fix broken ones by seeing exactly what the engine sees.
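For a sense of what such a trace might look like, the sketch below walks an XML document with the JDK's standard StAX parser and collects the path of every element and attribute it encounters. It illustrates the concept only: XPathTraceSketch and its paths method are hypothetical names, and the code neither uses Data Pipeline nor reflects how its debug flag is implemented or what its log output looks like.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

/** Hypothetical helper: lists every element and attribute path in an XML stream. */
class XPathTraceSketch {
    static List<String> paths(String xml) {
        List<String> seen = new ArrayList<>();
        Deque<String> stack = new ArrayDeque<>(); // names of the currently open elements
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    stack.addLast(reader.getLocalName());
                    String path = "/" + String.join("/", stack);
                    seen.add(path);
                    // Attributes are reported as children of the element's path.
                    for (int i = 0; i < reader.getAttributeCount(); i++) {
                        seen.add(path + "/@" + reader.getAttributeLocalName(i));
                    }
                } else if (event == XMLStreamConstants.END_ELEMENT) {
                    stack.removeLast();
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Malformed XML", e);
        }
        return seen;
    }

    public static void main(String[] args) {
        String xml = "<orders><order id=\"1\"><total>9.50</total></order></orders>";
        for (String path : paths(xml)) {
            System.out.println(path);
        }
    }
}
```

For the sample document this prints /orders, /orders/order, /orders/order/@id, and /orders/order/total, one per line. Comparing a listing like that against your field and break expressions is usually the fastest way to spot a path that never matches.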
Transformations
This release includes transformations to:
- split datetime fields (BasicFieldTransformer.dateTimeToDate() and dateTimeToTime()).
- set a default value on failure instead of throwing an exception (FieldTransformer.setValueOnException(Object valueOnException), getValueOnException(), and hasValueOnException()).
- move fields around relative to other fields (added com.northconcepts.datapipeline.transform.MoveFieldBefore and MoveFieldAfter).
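The three behaviours in the list above can be sketched in plain Java with java.time and the collections classes. TransformSketch and its methods are hypothetical names made up for this illustration; they show what each transformation does conceptually and are not the signatures or implementation of BasicFieldTransformer, FieldTransformer, or MoveFieldBefore.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.LocalTime;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical illustration of the three transformations; NOT the Data Pipeline API. */
class TransformSketch {
    /** Keeps only the date part of a datetime. */
    static LocalDate dateOf(LocalDateTime value) {
        return value.toLocalDate();
    }

    /** Keeps only the time part of a datetime. */
    static LocalTime timeOf(LocalDateTime value) {
        return value.toLocalTime();
    }

    /** Runs a transformation and returns a fallback value if it throws. */
    static <T> T valueOrDefault(Supplier<T> transform, T valueOnException) {
        try {
            return transform.get();
        } catch (RuntimeException e) {
            return valueOnException;
        }
    }

    /** Returns a copy of the field list with one field moved directly before another. */
    static List<String> moveBefore(List<String> fields, String name, String before) {
        List<String> result = new ArrayList<>(fields);
        if (name.equals(before) || !result.contains(name) || !result.contains(before)) {
            return result; // nothing to move
        }
        result.remove(name);
        result.add(result.indexOf(before), name);
        return result;
    }

    public static void main(String[] args) {
        LocalDateTime created = LocalDateTime.of(2013, 5, 1, 14, 30, 15);
        System.out.println(dateOf(created)); // prints 2013-05-01
        System.out.println(timeOf(created)); // prints 14:30:15

        // "n/a" is not a number, so the fallback is used instead of an exception.
        System.out.println(valueOrDefault(() -> Integer.parseInt("n/a"), -1)); // prints -1

        System.out.println(moveBefore(Arrays.asList("id", "name", "created"), "created", "name"));
        // prints [id, created, name]
    }
}
```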
License
The software license has been updated to reflect the new plans and make the grant irrevocable (except as set forth in the termination provisions). See https://northconcepts.com/license/commercial/ for the full text.
Other Changes
- added RecordList.addAll(DataReader reader)
- added RecordList.RecordList(DataReader reader) constructor
- JavaBeanReader now returns node/field names separate from values; use “//firstName/text()” instead of “//firstName” for values
- Field.getValueAsString() now returns the first and last bytes for BLOBs instead of the array’s default toString()
- replaced “\n” with OS line separator in DataException.getPropertiesAsString() and getMessageWithProperties()
- Record.clone now returns Record instead of Object
- added Record.moveFieldBefore(String fieldName, String beforeFieldName) and moveFieldAfter(String columnName, String afterFieldName)
- added CSVReader.getLineText() and getLineParser()
- added filter rule: com.northconcepts.datapipeline.filter.rule.IsNull
- added examples from blogs: com.northconcepts.datapipeline.examples.cookbook.blog.*
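On the JavaBeanReader change listed above, the difference between selecting a node and selecting its text can be seen with the JDK's own XPath engine. The sketch below runs against a small XML document rather than a Java bean, and TextNodeSketch is a hypothetical helper; it demonstrates standard XPath semantics only and makes no claim about how JavaBeanReader evaluates expressions internally.

```java
import java.io.StringReader;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

/** Hypothetical helper showing element nodes versus text nodes in standard XPath. */
class TextNodeSketch {
    /** Returns the first node matched by the expression, or null if nothing matches. */
    static Node select(String xml, String expression) {
        try {
            return (Node) XPathFactory.newInstance().newXPath()
                    .evaluate(expression, new InputSource(new StringReader(xml)), XPathConstants.NODE);
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Bad XPath or XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<person><firstName>Ada</firstName></person>";

        // Selecting the element yields the node itself: it has a name but no value.
        Node element = select(xml, "//firstName");
        System.out.println(element.getNodeName() + " -> " + element.getNodeValue());
        // prints firstName -> null

        // Selecting text() yields the text node that holds the value.
        Node text = select(xml, "//firstName/text()");
        System.out.println(text.getNodeName() + " -> " + text.getNodeValue());
        // prints #text -> Ada
    }
}
```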
Bug Fixes
- XPath engine now matches root and ancestor attributes
- JobTemplateImpl no longer tries to reopen supplied endpoints if they are already open
- DataReaderLookup no longer tries to reopen supplied endpoints if they are already open
- DeMux now prevents sinks/readers from blocking forever if open fails
- DataException.getRecord() no longer throws ClassCastException
You can download your copy of Data Pipeline at https://northconcepts.com/downloads/.