Data Pipeline 3.0 Now Available

We’re pleased to announce the release of version 3.0 of our Data Pipeline engine.

This release includes the new Sliding Window Aggregations feature to perform continuous SQL group-by operations on streaming data.

We’ve improved the performance of the XPath-based readers (JsonReader, XmlReader, and JavaBeanReader), included new conveniences to reduce your code size, and added several new transformers and filters.

We’re also now offering a free 30-day trial for you to take the premium and enterprise features out for a test drive.

Sliding Window Aggregations

  • GroupByReader now provides sliding window aggregations via its new setNewWindowStrategy(NewWindowStrategy) and setCloseWindowStrategy(CloseWindowStrategy) methods
  • Added convenience methods to GroupByReader
  • Updated GroupAverage and GroupSum to hold their running total and return value as BigDecimal
  • Updated GroupFirst to allow null first elements
  • Renamed GroupField to GroupOperationField
  • Added GroupOperation.getFilter(), setFilter(Filter filter), getExcludeNulls(), setExcludeNulls(boolean excludeNulls), getTargetFieldName(), and setTargetFieldName(String targetFieldName)
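
To make the idea concrete, here is a minimal sketch of a count-based sliding window in plain Java. It does not use the Data Pipeline API: SlidingWindowSum, openEvery, and closeAfter are hypothetical names standing in for a "new window" rule and a "close window" rule, and the running total is kept as a BigDecimal in the spirit of the GroupSum change above.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.List;

/** Count-based sliding-window sum; illustrative only, not the Data Pipeline API. */
class SlidingWindowSum {

    /** One open window: how many values it has seen and their running total. */
    private static final class Window {
        int count;
        BigDecimal total = BigDecimal.ZERO;
    }

    /**
     * Opens a new window every openEvery values and closes each window once
     * it has received closeAfter values, emitting that window's sum.
     * Windows still open when the input ends are discarded.
     */
    static List<BigDecimal> sums(List<BigDecimal> values, int openEvery, int closeAfter) {
        List<Window> open = new ArrayList<>();
        List<BigDecimal> closedSums = new ArrayList<>();
        for (int i = 0; i < values.size(); i++) {
            if (i % openEvery == 0) {           // "new window" rule
                open.add(new Window());
            }
            for (Window w : open) {             // every open window sees the value
                w.total = w.total.add(values.get(i));
                w.count++;
            }
            // "close window" rule: windows open in order, so the oldest fills first
            while (!open.isEmpty() && open.get(0).count >= closeAfter) {
                closedSums.add(open.remove(0).total);
            }
        }
        return closedSums;
    }

    public static void main(String[] args) {
        List<BigDecimal> values = new ArrayList<>();
        for (int v = 1; v <= 6; v++) {
            values.add(BigDecimal.valueOf(v));
        }
        // Windows of 3 values, with a new window starting every 2 values
        System.out.println(sums(values, 2, 3)); // [6, 12]
    }
}
```

Because a new window opens before the previous one closes, the value 3 is counted in both sums. That overlap is what separates a sliding window from a plain group-by.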

Performance

  • Improved performance of XPath-based readers (JsonReader, XmlReader, and JavaBeanReader)

Transformations

  • Added com.northconcepts.datapipeline.filter.rule.DateIsBefore, DateIsAfter, and DateIs
  • Added SplitArrayField transformer
  • Added convenience methods SortingReader.asc(String fieldName) and desc(String fieldName)
  • Added FieldFilter.isThrowExceptionOnMissingField() and setThrowExceptionOnMissingField(boolean)
  • Added convenience methods to FieldFilter
  • Added com.northconcepts.datapipeline.transform.BasicFieldTransformer.replaceAll() and replaceFirst()
  • Added BasicFieldTransformer.intervalToXXX convenience methods
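
The new replaceAll() and replaceFirst() helpers share their names with methods on java.lang.String. Their signatures are not listed in this post, so the sketch below uses only the JDK's String methods to show how the two operations differ. Whether the transformer treats its pattern as a regular expression in the same way is an assumption, and ReplaceDemo and its method names are made up for illustration.

```java
/** Contrasts replace-all with replace-first using java.lang.String only. */
class ReplaceDemo {

    /** Removes every run of non-digit characters. */
    static String digitsOnly(String value) {
        return value.replaceAll("\\D+", "");
    }

    /** Removes only the first hyphen, leaving later ones in place. */
    static String dropFirstHyphen(String value) {
        return value.replaceFirst("-", "");
    }

    public static void main(String[] args) {
        System.out.println(digitsOnly("555-010-9999"));      // 5550109999
        System.out.println(dropFirstHyphen("555-010-9999")); // 555010-9999
    }
}
```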

Tracking and Monitoring

  • Added DataEndpoint.getElapsedTime(), isCaptureElapsedTime(), and setCaptureElapsedTime(boolean captureElapsedTime)
  • Added DataEndpoint.getOpenedOn() and getClosedOn() to return system time in milliseconds when readers and writers were opened and closed
  • Added DataEndpoint.getOpenElapsedTime() to return the number of milliseconds this endpoint was (or has been) open
  • Exceptions now include timestamps for when endpoints were opened and closed if relevant
  • Added Record.getCreatedOn() to return the time in milliseconds when the record was created
  • Updated JobTemplateImpl to log original and secondary exceptions as errors; original behaviour is not affected
  • Updated DataEndpoint.toString() to include record count, opened time, closed time, and description if available
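
The "was (or has been) open" wording covers both closed endpoints and ones still in use. A rough sketch of that bookkeeping follows. TimedEndpoint is a hypothetical stand-in, not the DataEndpoint source, and the injectable clock is our own addition so the behaviour can be checked without sleeping.

```java
import java.util.function.LongSupplier;

/** Records open and close timestamps; illustrative only, not the DataEndpoint source. */
class TimedEndpoint {
    private final LongSupplier clock;   // supplies the current time in milliseconds
    private boolean opened;
    private boolean closed;
    private long openedOn;
    private long closedOn;

    TimedEndpoint(LongSupplier clock) {
        this.clock = clock;
    }

    void open() {
        openedOn = clock.getAsLong();
        opened = true;
    }

    void close() {
        closedOn = clock.getAsLong();
        closed = true;
    }

    /** Milliseconds the endpoint was open, or has been open so far if not yet closed. */
    long getOpenElapsedTime() {
        if (!opened) {
            return 0L;
        }
        long end = closed ? closedOn : clock.getAsLong();
        return end - openedOn;
    }

    public static void main(String[] args) {
        TimedEndpoint endpoint = new TimedEndpoint(System::currentTimeMillis);
        endpoint.open();
        endpoint.close();
        System.out.println("open for " + endpoint.getOpenElapsedTime() + " ms");
    }
}
```

Once close() has run, the elapsed time stops growing, so a value read later still describes how long the endpoint was actually open.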

Other Changes

  • Added Field.getValueAsBigDecimal(), getValueAsBigInteger(), setValue(BigDecimal value), and setValue(BigInteger value)
  • Updated Field.toString() and getValueAsString() to strip insignificant trailing zeros after the decimal point in BigDecimal values
  • Added DataEndpoint.assertClosed() for operations that need to ensure the endpoint has finished reading
  • Updated SimpleJsonWriter to write date, time, and datetime fields using the following patterns respectively: "yyyy-MM-dd", "HH:mm:ss.SSS", "yyyy-MM-dd'T'HH:mm:ss.SSS"
  • Added TailProxyReader to keep the last n records in memory for asynchronous retrieval as data flows through
  • Added TailWriter to keep the last n records written to it in memory for asynchronous retrieval
  • DataEndpoint.VENDOR, PRODUCT, and PRODUCT_VERSION are now public
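
For readers curious how a "last n records" buffer can work, here is a minimal sketch built on the JDK's ArrayDeque. TailBuffer is a hypothetical class written for this post, not the TailWriter or TailProxyReader implementation. The synchronized methods are one simple way to let another thread read while data is still flowing.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Keeps the most recent items up to a fixed capacity; illustrative only. */
class TailBuffer<T> {
    private final int capacity;
    private final Deque<T> items = new ArrayDeque<>();

    TailBuffer(int capacity) {
        if (capacity < 1) {
            throw new IllegalArgumentException("capacity must be at least 1");
        }
        this.capacity = capacity;
    }

    /** Called by the thread moving data; ArrayDeque rejects null items. */
    synchronized void add(T item) {
        if (items.size() == capacity) {
            items.removeFirst();        // drop the oldest to make room
        }
        items.addLast(item);
    }

    /** Safe to call from any thread; returns a copy, oldest first. */
    synchronized List<T> snapshot() {
        return new ArrayList<>(items);
    }

    public static void main(String[] args) {
        TailBuffer<Integer> tail = new TailBuffer<>(3);
        for (int i = 1; i <= 5; i++) {
            tail.add(i);
        }
        System.out.println(tail.snapshot()); // [3, 4, 5]
    }
}
```

Returning a copy from snapshot() means the caller can iterate over it at leisure without blocking the pipeline or seeing the buffer change underneath it.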

Bug Fixes

  • Fixed ClassCastException in Record.copySessionPropertiesFrom(Session)
  • CSVWriter now quotes values starting with zero (0) to prevent other apps from trimming leading zeros from numbers
  • GroupMaximum and GroupMinimum no longer throw ClassCastException when comparing different types of numbers (e.g. Double vs Long, BigDecimal vs short, etc.)
  • GroupMaximum and GroupMinimum no longer throw ClassCastException when comparing Strings with other types (e.g. String vs Long); both values are compared as Strings
  • SimpleJsonWriter now writes well-formed JSON, even when an exception occurs
  • XmlReader now matches unclosed nodes early on record break if they are used in the current record
  • TransformingReader now skips subsequent transformers in the same transformation-set if records are marked as deleted by earlier operators
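
The ClassCastException in the GroupMaximum and GroupMinimum fixes is the kind you get from calling compareTo across boxed types, for example a Double against a Long. One common way around it, sketched here with only the JDK, is to widen both sides to BigDecimal before comparing. NumberCompare is a hypothetical helper; the post does not say how the library itself resolves the comparison.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

/** Compares numbers of different boxed types; illustrative only. */
class NumberCompare {

    /** Widens any Number to BigDecimal. NaN and infinite values are not supported. */
    static BigDecimal toBigDecimal(Number n) {
        if (n instanceof BigDecimal) {
            return (BigDecimal) n;
        }
        if (n instanceof BigInteger) {
            return new BigDecimal((BigInteger) n);
        }
        if (n instanceof Double || n instanceof Float) {
            return BigDecimal.valueOf(n.doubleValue());
        }
        return BigDecimal.valueOf(n.longValue());   // Byte, Short, Integer, Long
    }

    /** Negative, zero, or positive, like Comparable.compareTo; ignores scale. */
    static int compare(Number a, Number b) {
        return toBigDecimal(a).compareTo(toBigDecimal(b));
    }

    /** Returns the larger argument unchanged, so its original type is kept. */
    static Number max(Number a, Number b) {
        return compare(a, b) >= 0 ? a : b;
    }

    public static void main(String[] args) {
        System.out.println(compare(2.5, 2L));                          // 1
        System.out.println(compare(new BigDecimal("2.0"), (short) 2)); // 0
        System.out.println(max(1.5, 3L));                              // 3
    }
}
```

Using compareTo rather than equals matters for BigDecimal, since equals treats 2.0 and 2 as different values while compareTo does not.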

Let us know which new features you are looking for next.

About The DataPipeline Team

We make Data Pipeline — a lightweight ETL framework for Java. Use it to filter, transform, and aggregate data on-the-fly in your web, mobile, and desktop apps. Learn more about it at northconcepts.com.
