Data Pipeline 2.3.4 Now Available

A new release of Data Pipeline is now available for download: https://northconcepts.com/downloads/. This release includes a new Twitter search reader, custom aggregate operations, and much more.

Twitter Search Reader

Data Pipeline now provides a reader for searching Twitter. You can pull tweets into your pipelines for processing and/or conversion to Excel, database, or other formats.

See the TwitterSearchReader JavaDoc.

Aggregate Operations

You can now add your own on-the-fly aggregate operations. Create an instance of AggregateReader.AggregateOperation (as an anonymous class or a subclass) and pass it to AggregateReader.add(AggregateOperation ... operations). You’ll then be able to collect and compile stats about your data as it streams by.
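
The idea of an on-the-fly aggregate can be sketched without the library. The example below is a standalone illustration, not Data Pipeline's API: StreamingAggregate and RangeAggregate are hypothetical names, and the sketch assumes an operation is handed one value at a time and can report a running result. The actual AggregateReader.AggregateOperation contract may differ, so check its JavaDoc before porting the idea.

```java
import java.util.List;

class AggregateSketch {

    // Hypothetical stand-in for a custom aggregate operation. This is NOT the
    // Data Pipeline interface, only an illustration of streaming aggregation.
    interface StreamingAggregate {
        void accept(double value); // called once per value as data streams by
        double result();           // current aggregate value
    }

    // Tracks the range (max - min) of the values seen so far in constant memory.
    static final class RangeAggregate implements StreamingAggregate {
        private double min = Double.POSITIVE_INFINITY;
        private double max = Double.NEGATIVE_INFINITY;
        private long count;

        @Override
        public void accept(double value) {
            min = Math.min(min, value);
            max = Math.max(max, value);
            count++;
        }

        @Override
        public double result() {
            // With no values seen, report 0 instead of negative infinity.
            return count == 0 ? 0.0 : max - min;
        }
    }

    // Feeds every value through the aggregate and returns the final result.
    static double rangeOf(List<Double> values) {
        StreamingAggregate range = new RangeAggregate();
        for (double value : values) {
            range.accept(value);
        }
        return range.result();
    }

    public static void main(String[] args) {
        System.out.println(rangeOf(List.of(12.5, 3.0, 7.25, 20.0)));
    }
}
```

Because the aggregate keeps only a minimum, a maximum, and a count, it never needs to buffer the stream, which is the property that makes this style of operation suitable for data that is processed as it passes through.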

XPath Debugging

All XPath-based readers (XmlReader, JsonReader, and JavaBeanReader) now support a debug flag.

Setting the flag forces the readers to log, in real time, every XPath they see in your data stream, regardless of whether it matches your field and break rules.

This allows you to quickly build new data extraction jobs or fix broken ones by seeing exactly what the engine sees.
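
To give a feel for what such a trace contains, here is a minimal sketch that uses only the JDK's StAX parser. It is not how Data Pipeline implements its debug flag (the release notes do not describe that); XPathTraceSketch and tracePaths are hypothetical names, and the path format shown is an assumption.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class XPathTraceSketch {

    // Streams through the XML once and returns every element and attribute
    // path in document order, whether or not any rule would match it.
    static List<String> tracePaths(String xml) {
        List<String> seen = new ArrayList<>();
        Deque<String> stack = new ArrayDeque<>(); // open elements, root first
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    stack.addLast(reader.getLocalName());
                    String path = "/" + String.join("/", stack);
                    seen.add(path);
                    // Attributes are reported right after their owning element.
                    for (int i = 0; i < reader.getAttributeCount(); i++) {
                        seen.add(path + "/@" + reader.getAttributeLocalName(i));
                    }
                } else if (event == XMLStreamConstants.END_ELEMENT) {
                    stack.removeLast();
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
        return seen;
    }

    public static void main(String[] args) {
        String xml = "<customers><customer id=\"1\">"
                + "<firstName>Ann</firstName></customer></customers>";
        tracePaths(xml).forEach(System.out::println);
    }
}
```

Comparing a listing like this against the expressions in a job makes mismatches, such as a misspelled element name or a wrong nesting level, easy to spot.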

Transformations

This release includes transformations to:

split datetime fields (BasicFieldTransformer.dateTimeToDate() and dateTimeToTime()).
set a default value on failure instead of throwing an exception (FieldTransformer.setValueOnException(Object valueOnException), getValueOnException(), and hasValueOnException()).
move fields around relative to other fields (added com.northconcepts.datapipeline.transform.MoveFieldBefore and MoveFieldAfter).
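
The three behaviours listed above can be illustrated with plain JDK code. This is a sketch of the ideas only and does not call Data Pipeline: TransformSketch and its methods are hypothetical, and a record's field order is modelled here as a simple list of field names.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.LocalTime;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

class TransformSketch {

    // Date part of a datetime value.
    static LocalDate dateOf(LocalDateTime value) {
        return value.toLocalDate();
    }

    // Time part of a datetime value.
    static LocalTime timeOf(LocalDateTime value) {
        return value.toLocalTime();
    }

    // Runs a conversion and returns the fallback if it throws,
    // instead of letting the exception propagate.
    static <T> T valueOrDefault(Supplier<T> conversion, T valueOnException) {
        try {
            return conversion.get();
        } catch (RuntimeException e) {
            return valueOnException;
        }
    }

    // Returns a copy of fieldNames with fieldName placed immediately before
    // beforeFieldName. The copy is unchanged if either name is missing or
    // both names are the same.
    static List<String> moveBefore(List<String> fieldNames, String fieldName,
                                   String beforeFieldName) {
        List<String> result = new ArrayList<>(fieldNames);
        if (fieldName.equals(beforeFieldName)
                || !result.contains(fieldName)
                || !result.contains(beforeFieldName)) {
            return result;
        }
        result.remove(fieldName);
        result.add(result.indexOf(beforeFieldName), fieldName);
        return result;
    }

    public static void main(String[] args) {
        LocalDateTime created = LocalDateTime.of(2014, 3, 5, 13, 45, 10);
        System.out.println(dateOf(created));
        System.out.println(timeOf(created));
        System.out.println(valueOrDefault(() -> Integer.parseInt("n/a"), -1));
        System.out.println(moveBefore(List.of("id", "name", "email"), "email", "name"));
    }
}
```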

License

The software license has been updated to reflect the new plans and make the grant irrevocable (except as set forth in the termination provisions). The full text is at https://northconcepts.com/license/commercial/.

Other Changes

added RecordList.addAll(DataReader reader)
added RecordList.RecordList(DataReader reader) constructor
JavaBeanReader now returns node/field names separate from values; use “//firstName/text()” instead of “//firstName” for values
Field.getValueAsString() now returns the first and last bytes for BLOBs instead of the array’s default toString()
replaced “\n” with OS line separator in DataException.getPropertiesAsString() and getMessageWithProperties()
Record.clone now returns Record instead of Object
added Record.moveFieldBefore(String fieldName, String beforeFieldName) and moveFieldAfter(String columnName, String afterFieldName)
added CSVReader.getLineText() and getLineParser()
added filter rule: com.northconcepts.datapipeline.filter.rule.IsNull
added examples from blogs: com.northconcepts.datapipeline.examples.cookbook.blog.*
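
The JavaBeanReader change above mirrors a distinction that standard XPath already makes between a node and its text. The sketch below shows that distinction with the JDK's own XPath engine over XML. It does not use JavaBeanReader, and XPathTextSketch and describe are hypothetical names chosen for this illustration.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class XPathTextSketch {

    // Evaluates an XPath expression against the XML and describes the first
    // matching node: its name for an element, its value for a text node.
    static String describe(String xml, String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            Node node = (Node) xpath.evaluate(
                    expression, new InputSource(new StringReader(xml)), XPathConstants.NODE);
            if (node == null) {
                return "no match";
            }
            switch (node.getNodeType()) {
                case Node.ELEMENT_NODE:
                    return "element:" + node.getNodeName();
                case Node.TEXT_NODE:
                    return "text:" + node.getNodeValue();
                default:
                    return "other:" + node.getNodeName();
            }
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Invalid XPath: " + expression, e);
        }
    }

    public static void main(String[] args) {
        String xml = "<customer><firstName>Ann</firstName></customer>";
        System.out.println(describe(xml, "//firstName"));
        System.out.println(describe(xml, "//firstName/text()"));
    }
}
```

The first expression selects the firstName element itself, while the second selects the text inside it, which is the form to use when the value is what you want.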

Bug Fixes

XPath engine now matches root and ancestor attributes
JobTemplateImpl no longer tries to reopen supplied endpoints if they are already open
DataReaderLookup no longer tries to reopen supplied endpoints if they are already open
DeMux now prevents sinks/readers from blocking forever if open fails
DataException.getRecord() no longer throws ClassCastException

You can download your copy of Data Pipeline at https://northconcepts.com/downloads/.
