Data Pipeline
Data Pipeline's embedded data transformation engine makes it easy for Java applications to convert, manipulate, and transform data with only a few lines of code.
The engine has readers and writers for common data formats like XML, CSV, and Excel. It also has operators to filter, validate, lookup, and more.
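The reader, operator, and writer arrangement described above can be sketched in plain Java. This is only an illustration of the pattern using JDK classes. The class name MiniPipeline and its run method are hypothetical and are not the Data Pipeline API, and the comma-in, pipe-out record layout is an assumption chosen to keep the example short.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;
import java.util.Locale;
import java.util.function.Predicate;
import java.util.function.UnaryOperator;

// Hypothetical sketch of a reader -> operators -> writer pipeline (not the Data Pipeline API).
class MiniPipeline {

    // Streams comma-separated lines through a filter and a transform to the writer,
    // one record at a time, so the whole input never has to fit in memory.
    // Returns the number of records written.
    static int run(Reader in, Predicate<String[]> filter,
                   UnaryOperator<String[]> transform, Writer out) throws IOException {
        int written = 0;
        BufferedReader reader = new BufferedReader(in);
        String line;
        while ((line = reader.readLine()) != null) {
            String[] record = line.split(",", -1); // naive split: no quoting support
            if (!filter.test(record)) {
                continue; // record rejected by the filter
            }
            String[] result = transform.apply(record);
            out.write(String.join("|", result));
            out.write("\n");
            written++;
        }
        out.flush();
        return written;
    }

    public static void main(String[] args) throws IOException {
        StringWriter out = new StringWriter();
        run(new StringReader("alice,30\nbob,17\n"),
            r -> Integer.parseInt(r[1]) >= 18,                            // keep adults only
            r -> new String[] { r[0].toUpperCase(Locale.ROOT), r[1] },    // upper-case the name
            out);
        System.out.print(out);
    }
}
```

Because each record is read, processed, and written before the next one is touched, the same loop works for inputs much larger than available memory.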
Benefits
Embedded
Plugs into your product or application. No servers, installation, setup, configuration, or network hops.
Write transformations using Java
Use the language and tools you already know.
Small footprint
Low memory and disk overhead.
Large data sets
Handle gigabytes of data with ease.
Real‐time
Process data as it comes in, no delays.
Fully customizable
Enhance the toolkit to fit your unique needs; plug in your own logic or modify existing behaviour.
Easy to use
Get started quickly.
Supported
Get the help you need.
Open, visible source
No guessing: know exactly what the toolkit is doing.
Structured design
Simple to understand, use, and extend.
Pre‐built components
Leverage built‐in endpoints and operations.
Flexible data representation
Choose how best to structure your data.
Formats
Comma Separated Values (CSV)
Supports delimited values with user-defined delimiters.
Excel
Supports Excel formats 97, 2003, 2007, and 2010.
XML/XPath
Streaming XML reader using XPath queries. Template-based XML writer using the built-in expression language.
JDBC
Fixed-Width/Fixed Length Records (FLR)
RTF
Web Server Logs
File System
In-Memory
Native
Built-in serialization format.
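As an illustration of what user-defined delimiters mean for the CSV format listed above, here is a minimal sketch of splitting one line on a caller-chosen delimiter and quote character. DelimitedLineParser is a hypothetical name and this is not how the toolkit's own reader is implemented. It assumes a quoted field may contain the delimiter and that a doubled quote inside quotes stands for one literal quote.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only (not the Data Pipeline API).
class DelimitedLineParser {

    // Splits a single line into fields using the given delimiter and quote character.
    static List<String> parse(String line, char delimiter, char quote) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == quote) {
                    if (i + 1 < line.length() && line.charAt(i + 1) == quote) {
                        current.append(quote); // doubled quote is a literal quote
                        i++;
                    } else {
                        inQuotes = false;      // closing quote
                    }
                } else {
                    current.append(c);         // delimiters are ordinary text inside quotes
                }
            } else if (c == quote) {
                inQuotes = true;               // opening quote
            } else if (c == delimiter) {
                fields.add(current.toString());
                current.setLength(0);          // start the next field
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());        // last field has no trailing delimiter
        return fields;
    }
}
```

For example, parse("a|\"b|c\"|d", '|', '"') yields the three fields a, b|c, and d, since the pipe inside the quotes is treated as data.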
Features
Filter
Rule‐based record filtering.
Validation
Rule‐based data validation.
Transformations
Built-in or user-defined transformations.
Lookups / Joins
Sort large datasets
Remove duplicates (Dedup)
Metering
Throttling
Aggregation
Streaming data
Expression language
Multithreading
Job management
Detailed exception reporting
Out‐of‐band data
Attach temporary, transient data to any field or record.
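Three of the features above (rule-based filtering, rule-based validation, and duplicate removal) can be sketched with plain JDK collections. The class RecordRules, its method names, and the choice to model a record as a Map of field name to string value are all assumptions made for this example. They do not reflect the toolkit's actual classes.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical helper for illustration only (not the Data Pipeline API).
class RecordRules {

    // Rule-based filtering: keep only the records that satisfy the rule.
    static List<Map<String, String>> filter(List<Map<String, String>> records,
                                            Predicate<Map<String, String>> rule) {
        List<Map<String, String>> kept = new ArrayList<>();
        for (Map<String, String> record : records) {
            if (rule.test(record)) {
                kept.add(record);
            }
        }
        return kept;
    }

    // Rule-based validation: return the names of the rules the record fails,
    // in the iteration order of the rules map. An empty list means the record is valid.
    static List<String> validate(Map<String, String> record,
                                 Map<String, Predicate<Map<String, String>>> rules) {
        List<String> failures = new ArrayList<>();
        for (Map.Entry<String, Predicate<Map<String, String>>> rule : rules.entrySet()) {
            if (!rule.getValue().test(record)) {
                failures.add(rule.getKey());
            }
        }
        return failures;
    }

    // Dedup: keep the first record seen for each distinct value of keyField.
    static List<Map<String, String>> dedup(List<Map<String, String>> records, String keyField) {
        Set<String> seen = new HashSet<>();
        List<Map<String, String>> unique = new ArrayList<>();
        for (Map<String, String> record : records) {
            if (seen.add(record.get(keyField))) { // add() returns false for a repeated key
                unique.add(record);
            }
        }
        return unique;
    }
}
```

Note that this dedup keeps every distinct key in memory, so it suits modest key counts. Removing duplicates from very large inputs would typically be paired with sorting or external storage.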