What’s New in DataPipeline 6.0?

We’re pleased to announce the release of DataPipeline version 6.0. This release includes our new DataPipeline Foundations add-on, which brings decisioning, source-target data mapping, and other new features to your software.

DataPipeline Foundations

Last year we decided to release some of the tools we use to build apps as part of DataPipeline. The goal was to provide you with building blocks that are common to many data processing applications. DataPipeline Foundations is the culmination of that effort.

The new library includes:

  1. Data Mapping
  2. Decision Tables
  3. Decision Trees
  4. Declarative Pipelines
  5. Schema-Based Data Validation
  6. JDBC Metadata API

You can learn more about DataPipeline Foundations and get started with it here: https://northconcepts.com/docs/foundations/.

BigDecimal and BigInteger Field Types

DataPipeline now natively supports large, arbitrary-precision numbers across the framework. All readers, writers, and transformers have been updated to handle the new BIG_INTEGER and BIG_DECIMAL field types. Where a format does not support large numbers (Excel, for example), the new types are saved as strings and must be explicitly converted from strings when reading.

See an example of how to read big decimals and big integers in Excel.
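The explicit string conversion mentioned above can be sketched with plain JDK classes. This is a minimal illustration only: it does not use the DataPipeline API, and the class and method names (BigNumberStrings, toBigDecimal, toBigInteger) are hypothetical helpers invented for this example.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

// Hypothetical helper, not part of DataPipeline: converts values that were
// stored as text (for example in a spreadsheet cell) back into big numbers.
class BigNumberStrings {

    // Returns null for null or blank input instead of throwing.
    static BigDecimal toBigDecimal(String cell) {
        if (cell == null || cell.trim().isEmpty()) {
            return null;
        }
        return new BigDecimal(cell.trim());
    }

    // Returns null for null or blank input instead of throwing.
    static BigInteger toBigInteger(String cell) {
        if (cell == null || cell.trim().isEmpty()) {
            return null;
        }
        return new BigInteger(cell.trim());
    }

    public static void main(String[] args) {
        // These values have more digits than a double or a long can hold,
        // which is why they round-trip through strings.
        BigDecimal decimal = toBigDecimal("12345678901234567890.123456789");
        BigInteger integer = toBigInteger("123456789012345678901234567890");

        System.out.println(decimal.toPlainString());
        System.out.println(integer.add(BigInteger.ONE));
    }
}
```

Parsing from a string keeps every digit, whereas passing the value through a double first would silently round it.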

Record Changes

The Record class has been updated with several new conveniences.

  1. Retrieving a field’s value at any level of nesting is now easier. The getFieldValueAsXXX(fieldPathExpression, defaultValue) methods return a field’s value in a single call if the field exists and is not null; otherwise, they return the default value. See getFieldValueAsInteger and getFieldValueAsBoolean for examples.
  2. Checking whether a field with a non-null value exists at any level of nesting can now be done in a single call using the new containsNonNullField(fieldPathExpression) method.
  3. All fields in a record can now be retrieved in a safe-to-modify ArrayList using the new getFields() method.
  4. The default Java serialization for Record has been overridden to improve performance and reduce the size of data emitted. See how to serialize and deserialize records.
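The "value if present and non-null, otherwise the default" behavior described in items 1 and 2 can be sketched with nested maps standing in for a record. This is an assumption-laden illustration, not the Record implementation: the class NestedLookupSketch, its methods, and the simple dot-separated path syntax are all hypothetical, and real field path expressions may support more than this.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a nested record, built from plain maps.
class NestedLookupSketch {

    // Walks a dot-separated path through nested maps.
    // Returns null if any step is missing, null, or not a map.
    static Object find(Map<String, Object> root, String path) {
        Object current = root;
        for (String name : path.split("\\.")) {
            if (!(current instanceof Map)) {
                return null;
            }
            current = ((Map<?, ?>) current).get(name);
        }
        return current;
    }

    // True only when the path resolves to a non-null value.
    static boolean containsNonNull(Map<String, Object> root, String path) {
        return find(root, path) != null;
    }

    // Returns the numeric value at the path, or defaultValue when the path
    // is missing, null, or (in this sketch) not a number.
    static int getAsInteger(Map<String, Object> root, String path, int defaultValue) {
        Object value = find(root, path);
        return (value instanceof Number) ? ((Number) value).intValue() : defaultValue;
    }

    public static void main(String[] args) {
        Map<String, Object> customer = new HashMap<>();
        customer.put("age", 42);
        customer.put("nickname", null);

        Map<String, Object> record = new HashMap<>();
        record.put("customer", customer);

        System.out.println(getAsInteger(record, "customer.age", -1));         // 42
        System.out.println(getAsInteger(record, "customer.address.zip", -1)); // -1
        System.out.println(containsNonNull(record, "customer.nickname"));     // false
    }
}
```

The single call replaces a chain of existence and null checks at each level of nesting, which is the convenience the new Record methods aim for.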

Troubleshooting

We’ve made some improvements to the Diagnostic class to help troubleshoot your setup. Just call the log method to see the data on your console, or call toString() to capture the contents programmatically. See how to log diagnostic info.

Expression Language

DataPipeline 6 adds improvements to the dynamic expression language (DPEL) around usability and safety.

  1. DPEL now lets you control which methods can be called through method blacklisting and whitelisting. It also blacklists potentially harmful packages and classes, such as System and Runtime, by default. You can whitelist them to allow calls if needed. See how to blacklist and whitelist functions in the DataPipeline expression language.
  2. Error messages have been improved and standardized across the board. Messages have also been moved to a central class as a precursor to externalization and translation (I18N) via resource bundles in the future.
  3. One specific area of improvement is error messages for method calls in the expression language. Messages now clearly distinguish between the following causes, calling out the method name and expression:
    1. No method found matching arguments.
    2. No class name specified for a method call. (DPEL requires fully-qualified names or aliases to fully-qualified names.)
    3. Method is blacklisted.
    4. Exception invoking method in expression.
  4. DPEL now treats null as zero in cases where it’s appropriate to do so (like addition and subtraction: 1 + null == 1). Otherwise, it returns null for expressions containing a null instead of throwing a NullPointerException (like divide and multiply: null / 1 == null).
  5. A few new functions were added to the DPEL (like capitalize, uncapitalize, and swapCase).
  6. The matchesRegex(String string, String regex, int flags) method was replaced with the easier-to-use matchesRegex(String string, String regex, boolean ignoreCase, boolean dotAll, boolean multiLine).
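The null handling in item 4 and the boolean regex options in item 6 can be sketched in plain Java. This is not DPEL's implementation: the class DpelSemanticsSketch is hypothetical, the choice of Integer operands is an assumption, and so is the whole-string matching behavior of the regex helper.

```java
import java.util.regex.Pattern;

// Hypothetical illustration of the semantics described above.
class DpelSemanticsSketch {

    // Addition treats a null operand as zero, so 1 + null == 1.
    // Assumption of this sketch: null + null yields 0.
    static Integer add(Integer a, Integer b) {
        return (a == null ? 0 : a) + (b == null ? 0 : b);
    }

    // Division propagates null instead of throwing, so null / 1 == null.
    static Integer divide(Integer a, Integer b) {
        if (a == null || b == null) {
            return null;
        }
        return a / b;
    }

    // Maps three readable booleans onto java.util.regex.Pattern flag bits.
    // Assumption of this sketch: the whole string must match the regex.
    static boolean matchesRegex(String string, String regex,
                                boolean ignoreCase, boolean dotAll, boolean multiLine) {
        int flags = 0;
        if (ignoreCase) {
            flags |= Pattern.CASE_INSENSITIVE;
        }
        if (dotAll) {
            flags |= Pattern.DOTALL;
        }
        if (multiLine) {
            flags |= Pattern.MULTILINE;
        }
        return Pattern.compile(regex, flags).matcher(string).matches();
    }

    public static void main(String[] args) {
        System.out.println(add(1, null));     // 1
        System.out.println(divide(null, 1));  // null
        System.out.println(matchesRegex("HELLO", "hello", true, false, false)); // true
        System.out.println(matchesRegex("a\nb", "a.b", false, true, false));    // true
    }
}
```

Named booleans spare callers from combining Pattern constants with bitwise OR, which is the usability gain the new signature offers over a raw int flags argument.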

Older Changes – What’s New in DataPipeline 5.2

DataPipeline 5.2 was soft-released to several customers on an as-needed basis. Those changes are included in 6.0 and are listed here.

Jackson Upgrade

The Jackson JSON library dependency has been upgraded from version 1.x to version 2.x. DataPipeline 6 depends on the latest 2.x version of Jackson (2.11.3).

JDBC

JdbcWriter now supports configurable insert strategies via the IInsert interface. You can see the insert strategies and examples on the JDBC / Relational Databases page along with the list of supported databases.

Similarly, the JdbcUpsertWriter class now includes specialized upsert strategies via the IUpsert interface for Oracle, PostgreSQL and Sybase. The list of supported upsert strategies is also on the JDBC / Relational Databases page.
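To show why upserts benefit from database-specific strategies, here is a rough sketch of the SQL that two databases expect for the same logical operation. It does not use or reproduce the IUpsert interface or JdbcUpsertWriter: the UpsertSql interface, the class name, and the table and column names are all hypothetical, and the statements the library actually generates may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical strategy sketch: one interface, one implementation per database.
class UpsertSqlSketch {

    interface UpsertSql {
        String build(String table, List<String> keyColumns, List<String> valueColumns);
    }

    // Key columns followed by value columns, in one list.
    static List<String> concat(List<String> keys, List<String> values) {
        List<String> all = new ArrayList<>(keys);
        all.addAll(values);
        return all;
    }

    // PostgreSQL: INSERT ... ON CONFLICT ... DO UPDATE.
    static final UpsertSql POSTGRESQL = (table, keys, values) -> {
        List<String> all = concat(keys, values);
        String params = all.stream().map(c -> "?").collect(Collectors.joining(", "));
        String updates = values.stream().map(c -> c + " = EXCLUDED." + c)
                .collect(Collectors.joining(", "));
        return "INSERT INTO " + table + " (" + String.join(", ", all) + ") VALUES (" + params + ")"
                + " ON CONFLICT (" + String.join(", ", keys) + ") DO UPDATE SET " + updates;
    };

    // Oracle: MERGE with a single-row source selected from dual.
    static final UpsertSql ORACLE = (table, keys, values) -> {
        List<String> all = concat(keys, values);
        String source = all.stream().map(c -> "? AS " + c).collect(Collectors.joining(", "));
        String on = keys.stream().map(c -> "t." + c + " = s." + c)
                .collect(Collectors.joining(" AND "));
        String updates = values.stream().map(c -> "t." + c + " = s." + c)
                .collect(Collectors.joining(", "));
        String inserted = all.stream().map(c -> "s." + c).collect(Collectors.joining(", "));
        return "MERGE INTO " + table + " t USING (SELECT " + source + " FROM dual) s ON (" + on + ")"
                + " WHEN MATCHED THEN UPDATE SET " + updates
                + " WHEN NOT MATCHED THEN INSERT (" + String.join(", ", all) + ") VALUES (" + inserted + ")";
    };

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("id");
        List<String> values = Arrays.asList("name", "email");
        System.out.println(POSTGRESQL.build("users", keys, values));
        System.out.println(ORACLE.build("users", keys, values));
    }
}
```

Both statements take the same three parameters in the same order, so a writer could bind values identically and swap only the SQL text per database.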

GitHub Examples

We’ve moved all our examples onto GitHub. See https://github.com/NorthConcepts/DataPipeline-Examples.

We’ll likely stop shipping examples in the zip file in the near future and may do away with the zip distribution altogether. Please email us if this change will present a problem for you.

Changelog

You can see the full set of changes in the change log: https://northconcepts.com/changelog.

Contact us

Please email us if you have questions or would like to discuss your specific use-cases and needs.

Happy coding!

About The DataPipeline Team

We make Data Pipeline, a lightweight ETL framework for Java. Use it to filter, transform, and aggregate data on-the-fly in your web, mobile, and desktop apps. Learn more about it at northconcepts.com.
