DataPipeline 8.2 Released

Last month we released version 8.2.0 of DataPipeline. Here’s what you can expect.

Core Changes

  1. SQL builder classes now support unions and sub-queries.
  2. Added FieldList.contains(List<String> fieldNames), containedWithin(String... fieldNames), and containedWithin(List<String> fieldNames).
  3. The expression languages can now produce expressions even when they contain syntax errors. This is used in the Foundations library in DataMapping, DecisionTable, DecisionTree, and others to allow work-in-progress to be saved and restored from XML and JSON.
  4. The expression languages now improve ClassCastException messages by adding the candidate’s value and type.
  5. MergeUpsert now automatically terminates with a semicolon for Microsoft SQL Server and SAP ASE (Sybase) databases.
  6. SybaseUpsert now terminates with a semicolon by default.
  7. Added JdbcConnectionFactory.close() for scenarios where the factory needs to handle lifecycle termination. For example, when the factory holds a single, pre-connected database connection.
  8. Job now emits fewer debug logs.
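
To give a feel for the improved ClassCastException messages in item 4, here is a minimal, self-contained sketch of a cast failure that reports the candidate's value and type. CastMessages and castOrExplain are hypothetical names made up for this illustration, and the message wording is an assumption; it is not the expression language's actual code or output format.

```java
// Hypothetical helper, not part of the DataPipeline API: illustrates a cast
// failure message that names the offending value and its runtime type.
final class CastMessages {

    private CastMessages() {
    }

    // Returns the candidate as targetType, or throws a ClassCastException
    // whose message includes the candidate's value and class name.
    static <T> T castOrExplain(Object candidate, Class<T> targetType) {
        if (candidate == null || targetType.isInstance(candidate)) {
            return targetType.cast(candidate); // null passes through as null
        }
        throw new ClassCastException(
                "Cannot cast value [" + candidate + "] of type "
                        + candidate.getClass().getName()
                        + " to " + targetType.getName());
    }

    public static void main(String[] args) {
        try {
            castOrExplain("42", Integer.class);
        } catch (ClassCastException e) {
            // prints: Cannot cast value [42] of type java.lang.String to java.lang.Integer
            System.out.println(e.getMessage());
        }
    }
}
```

Compared with the JDK's default message, which names only the two classes, including the value usually makes it much quicker to find the offending record.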

Foundations Changes

  1. DataMapping and FieldMapping now return expressions with syntax errors as problems. See how to retrieve DataMapping problems.
  2. Added Dataset.isDataLoading(), getDataLoadException(), getMaxRecordsToLoad(), and afterLoad() to better support real-time, interactive, and UI-based use-cases.
  3. Added DataMappingEditor for interactive, UI-based use-cases (in SimpleDataHub) that involve data mapping, sorting, and pagination of a source dataset.
  4. Added DataMappingReader/DataMappingWriter.onFailure(Record record, DataException exception, DataMappingResult result) to allow error handling to be intercepted/overridden.
  5. Added DataMappingResult.getDataMappingResult(Record record) and setDataMappingResult(Record record, DataMappingResult result) to attach DataMappingResult to Record as session properties.
  6. Added DataMappingValidator.checkValidExpression() to look for expressions with syntax errors.
  7. Added Column.getInferredTextualValueCount() and getTextual() to count text values in columns.
  8. Improved the type detection algorithm used in Column.getInferredFieldType().
  9. BUGFIX: DatasetReader.readImpl() will now wait until the dataset produces records or closes instead of returning null when no records are immediately available.
  10. BUGFIX: SchemaValidator.checkEntityRelationshipCardinality now correctly looks for invalid one-to-many relationships to return in the problems list.
  11. Added initial release of DetectPrimaryKeysInDataset tool.
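
The DatasetReader fix in item 9 comes down to a blocking read: wait for a record or for the source to close, and return null only at the end of the data. Below is a minimal sketch of that idea using only the JDK. BlockingRecordSource, publish(), and the CLOSED sentinel are hypothetical stand-ins invented for this example, not DataPipeline classes, and the real implementation may well differ.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical stand-in for a dataset-backed reader; none of these names come
// from the DataPipeline API. read() blocks until a record arrives or the
// source is closed, and returns null only at end of data.
final class BlockingRecordSource {

    // Sentinel marking "closed"; compared by identity, never handed to callers.
    private static final Object CLOSED = new Object();

    private final BlockingQueue<Object> queue = new LinkedBlockingQueue<>();

    void publish(String record) {
        queue.add(record);
    }

    void close() {
        queue.add(CLOSED);
    }

    String read() {
        try {
            Object next = queue.take(); // waits instead of returning null early
            if (next == CLOSED) {
                queue.add(CLOSED);      // keep the marker so later reads also end
                return null;
            }
            return (String) next;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for a record", e);
        }
    }

    public static void main(String[] args) {
        BlockingRecordSource source = new BlockingRecordSource();
        new Thread(() -> {
            source.publish("record-1");
            source.publish("record-2");
            source.close();
        }).start();

        String record;
        while ((record = source.read()) != null) {
            System.out.println(record);
        }
    }
}
```

With a reader that returned null whenever the queue happened to be empty, the loop in main could exit before the producer thread had published anything; the blocking take() avoids that race.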

Integration Changes

  1. Added JiraService.close() to explicitly and eagerly close the REST endpoint.

FileSystem Changes

  1. Added AmazonS3FileSystem.getEndpointConfiguration(), setEndpointConfiguration(EndpointConfiguration endpointConfiguration), and setClient(AmazonS3 client).
  2. getClient() now returns AmazonS3 instead of AmazonS3Client.
  3. BUGFIX: AmazonS3FileSystem can now write empty files to S3.

See the CHANGELOG for the full set of updates in DP 8.2.0.

Also see the JavaDocs and examples for more info.

Happy coding!

About The DataPipeline Team

We make Data Pipeline — a lightweight ETL framework for Java. Use it to filter, transform, and aggregate data on-the-fly in your web, mobile, and desktop apps. Learn more about it at
