Welcome to the DataPipeline 7.0 release. Since our last update, the DataPipeline team has been hard at work adding more declarative components, new integrations, new transformations, and generally making the framework easier to use. Our goal is to make simple use-cases easy and complex ones less difficult to implement.
Automatic Data Conversion And Validation
With this goal in mind, we’ve added schema-based data conversion and validation. The DataMapping and AbstractPipeline classes now have optional sourceEntity and targetEntity properties to validate and convert incoming and outgoing data. The EntityDef class can now convert data using one or more mapXXX() methods and validate it using the validateXXX() methods. You can also use EntityDef in a job with the new SchemaTransformer.
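As a rough sketch of the idea (the MiniEntityDef and FieldRule classes below are hypothetical stand-ins, not the actual DataPipeline API), schema-based validation boils down to declaring per-field rules and checking each record against them:

```java
import java.util.*;

// Toy sketch of schema-based validation -- NOT the real DataPipeline API.
// Hypothetical FieldRule and MiniEntityDef classes illustrate declaring
// per-field rules and validating incoming records against them.
public class SchemaSketch {

    // One rule per field: the expected Java type and whether a value is required
    record FieldRule(Class<?> type, boolean required) {}

    static class MiniEntityDef {
        private final Map<String, FieldRule> rules = new LinkedHashMap<>();

        MiniEntityDef field(String name, Class<?> type, boolean required) {
            rules.put(name, new FieldRule(type, required));
            return this;
        }

        // Returns validation problems; an empty list means the record conforms
        List<String> validate(Map<String, Object> rec) {
            List<String> problems = new ArrayList<>();
            rules.forEach((name, rule) -> {
                Object value = rec.get(name);
                if (value == null) {
                    if (rule.required()) problems.add(name + " is required");
                } else if (!rule.type().isInstance(value)) {
                    problems.add(name + " must be a " + rule.type().getSimpleName());
                }
            });
            return problems;
        }
    }

    public static void main(String[] args) {
        MiniEntityDef user = new MiniEntityDef()
                .field("id", Long.class, true)
                .field("email", String.class, true);

        System.out.println(user.validate(Map.of("id", 42L, "email", "a@b.com"))); // []
        System.out.println(user.validate(Map.of("email", "a@b.com")));            // [id is required]
    }
}
```

The real classes go further, converting values as well as validating them, but the declarative shape is the same.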
Save And Load Components To JSON And XML
This release ensures that the components in DataPipeline Foundations can be saved to and loaded from JSON and XML. This makes it easier to externalize parts of your app in a database or configuration file, including your own custom components.
Data Lineage
The new data lineage feature optionally adds metadata to your records and fields indicating where they were loaded from. This can help with audits, data reconciliation, and general troubleshooting.
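The idea can be sketched in plain Java (the SourcedRecord type below is hypothetical, not the real API): each record simply carries its origin alongside its fields:

```java
import java.util.Map;

// Toy illustration of record-level lineage (not the real DataPipeline API):
// each record carries metadata saying which file and line it came from.
public class LineageSketch {

    record SourcedRecord(Map<String, Object> fields, String sourceFile, long sourceLine) {}

    // Parse one "name,city" CSV line, tagging the result with its origin
    static SourcedRecord parseCsvLine(String line, String file, long lineNumber) {
        String[] parts = line.split(",");
        return new SourcedRecord(Map.of("name", parts[0], "city", parts[1]), file, lineNumber);
    }

    public static void main(String[] args) {
        SourcedRecord r = parseCsvLine("Ada,London", "users.csv", 17);
        // Lineage survives alongside the data for audits and troubleshooting
        System.out.println(r.sourceFile() + ":" + r.sourceLine() + " -> " + r.fields().get("name"));
    }
}
```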
LocalDateTime, LocalDate, and LocalTime Support
In addition to the classic date types (such as java.util.Date and java.sql.Timestamp), DataPipeline now supports LocalDateTime, LocalDate, and LocalTime. These new types are converted behind the scenes to and from the classic types. You’ll find overloaded support in the Field, Record, and transformation classes.
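The JDK itself provides the bridges such a conversion can lean on; the exact internals are an assumption, but these are the standard java.time conversions:

```java
import java.sql.Timestamp;
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.LocalTime;

// The standard JDK bridges between the classic date types and java.time.
public class TimeBridges {
    public static void main(String[] args) {
        LocalDateTime ldt = LocalDateTime.of(2023, 4, 5, 13, 30, 0);

        // java.time -> classic type
        Timestamp ts = Timestamp.valueOf(ldt);

        // classic type -> java.time
        LocalDateTime back = ts.toLocalDateTime();
        LocalDate date = back.toLocalDate();   // 2023-04-05
        LocalTime time = back.toLocalTime();   // 13:30

        System.out.println(date + "T" + time);
    }
}
```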
BigDecimal and BigInteger Field Types
Last release, we gave BigDecimal and BigInteger their own dedicated field types. This release, we’ve overloaded methods and constructors throughout DataPipeline and added specific handling to make BigDecimal and BigInteger first-class citizens.
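A quick JDK-only refresher on why dedicated decimal types matter: BigDecimal keeps exact decimal values where double drifts.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Exact decimal arithmetic with BigDecimal vs. binary floating-point drift.
public class DecimalDemo {
    public static void main(String[] args) {
        System.out.println(0.1 + 0.2);  // 0.30000000000000004 with doubles

        BigDecimal sum = new BigDecimal("0.1").add(new BigDecimal("0.2"));
        System.out.println(sum);        // exactly 0.3

        // Typical money-style rounding
        BigDecimal price = new BigDecimal("19.995").setScale(2, RoundingMode.HALF_UP);
        System.out.println(price);      // 20.00
    }
}
```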
Expression Language Functions
We’ve improved error messages in the dynamic expression language and added several new functions, including:
- recordContainsValue(), recordContainsNonNullValue(), recordContainsField(), recordContainsNonNullField(), getValue()
- toBoolean(), capitalize(), uncapitalize(), swapCase(), substring(String string, int beginIndex)
- toDate(), toTime(), and toDatetime() now accept Object instead of java.util.Date to convert from more types (LocalDate, LocalDateTime, LocalTime, ZonedDateTime, OffsetDateTime, Instant, and String)
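One plausible shape for the widened toDate(Object) conversion, sketched with plain JDK calls (the real function's dispatch logic is an assumption):

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.OffsetDateTime;
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.util.Date;

// Sketch of a toDate(Object) that accepts many temporal types -- an
// assumption about how such a function could dispatch, using JDK bridges.
public class ToDateSketch {
    static Date toDate(Object value) {
        ZoneId zone = ZoneId.systemDefault();
        if (value instanceof Date d) return d;
        if (value instanceof Instant i) return Date.from(i);
        if (value instanceof LocalDate ld) return Date.from(ld.atStartOfDay(zone).toInstant());
        if (value instanceof LocalDateTime ldt) return Date.from(ldt.atZone(zone).toInstant());
        if (value instanceof ZonedDateTime zdt) return Date.from(zdt.toInstant());
        if (value instanceof OffsetDateTime odt) return Date.from(odt.toInstant());
        if (value instanceof String s) return Date.from(LocalDate.parse(s).atStartOfDay(zone).toInstant());
        throw new IllegalArgumentException("unsupported type: " + value.getClass());
    }

    public static void main(String[] args) {
        System.out.println(toDate(LocalDate.of(2023, 4, 5)));
        System.out.println(toDate("2023-04-05"));
        System.out.println(toDate(Instant.EPOCH));
    }
}
```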
JsonRecordWriter and XmlRecordWriter
The JSON and XML packages now include writers that maintain each record’s natural structure, as if you’d called toJson() or toXml() on each one.
Encryption And Decryption
We now have several readers to handle encryption and decryption. Both symmetric, secret-key encryption and asymmetric, public-private key encryption are supported. You can encrypt all fields in a record or specify which fields to encrypt.
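The symmetric case can be sketched with the JDK's own javax.crypto (this is not DataPipeline's API, just the underlying idea; ECB mode keeps the sketch short, but real code should prefer an authenticated mode such as AES/GCM with a random IV):

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import java.nio.charset.StandardCharsets;

// Symmetric (secret-key) encryption of a single field value using the JDK.
public class FieldCrypto {
    // Generate a fresh 128-bit AES secret key
    static SecretKey newKey() {
        try {
            KeyGenerator gen = KeyGenerator.getInstance("AES");
            gen.init(128);
            return gen.generateKey();
        } catch (Exception e) { throw new IllegalStateException(e); }
    }

    static byte[] encrypt(SecretKey key, String plaintext) {
        try {
            Cipher cipher = Cipher.getInstance("AES/ECB/PKCS5Padding");
            cipher.init(Cipher.ENCRYPT_MODE, key);
            return cipher.doFinal(plaintext.getBytes(StandardCharsets.UTF_8));
        } catch (Exception e) { throw new IllegalStateException(e); }
    }

    static String decrypt(SecretKey key, byte[] ciphertext) {
        try {
            Cipher cipher = Cipher.getInstance("AES/ECB/PKCS5Padding");
            cipher.init(Cipher.DECRYPT_MODE, key);
            return new String(cipher.doFinal(ciphertext), StandardCharsets.UTF_8);
        } catch (Exception e) { throw new IllegalStateException(e); }
    }

    public static void main(String[] args) {
        SecretKey key = newKey();
        byte[] encryptedEmail = encrypt(key, "jane@example.com");
        System.out.println(decrypt(key, encryptedEmail));  // jane@example.com
    }
}
```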
New Integrations
DataPipeline adds the following integrations in this release:
- Apache Parquet: ParquetDataReader, ParquetDataWriter
- Apache Orc: OrcDataReader, OrcDataWriter
- Jira: JiraEpicReader, JiraIssueReader, JiraProjectReader, JiraSprintReader
- Bloomberg: BloombergMessage and BloombergMessageReader
- Twitter v2: TwitterFilterStreamReader, TwitterFollowerReader, TwitterFollowingReader, TwitterSearchReader, TwitterTimelineMentionsReader, TwitterTimelineTweetsReader
- TemplateWriter: now evaluates nested expressions directly, with a constructor option to disable this and revert to the previous behaviour
This release is one of our biggest yet and has been a long time coming. The list above covers just the highlights; you can see the full change log at the link below.
Your feedback is always appreciated and encouraged. Please email us with your requests, questions and comments.