Data Pipeline

Lightweight ETL Framework for Java

Why Use Data Pipeline

Build ETL in Java

Code your extract, transform, load (ETL) pipelines in a high-performance language that fits your team's skills, has a mature toolset, and is easy to understand and maintain.

Embedded or Standalone

Integrate pipelines into your web, mobile, desktop, and batch applications or run them as separate, standalone jobs.

Run Pipelines Locally

Develop and test pipelines locally on your desktop using your existing development and debugging tools.

Script Transformations

When you need to move quickly, skip the compile step and script transformations in JavaScript, Groovy, and other languages that run on the Java Virtual Machine.

Manage Change

Track changes in Git or other source control systems, review ETL logic with your team, and plug pipeline development into your CI/CD process.

Stream Real-Time or Batch

Run your pipelines on a schedule, when data becomes available, or when an event or manual trigger occurs. You can also run them continuously to gain insight in real time.
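
To make the scheduled case concrete, here is a minimal sketch using only the JDK's `ScheduledExecutorService`. It is not Data Pipeline's own API. The `ScheduleSketch` class and `runOnSchedule` helper are hypothetical names chosen for this example, and the `Runnable` stands in for whatever code actually launches a pipeline.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: plain JDK scheduling, not the Data Pipeline API.
class ScheduleSketch {

    // Runs `job` at a fixed rate and blocks until it has completed at least
    // `runs` times. Returns false if the wait timed out or was interrupted.
    static boolean runOnSchedule(Runnable job, int runs, long periodMillis) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        CountDownLatch remaining = new CountDownLatch(runs);
        try {
            scheduler.scheduleAtFixedRate(() -> {
                job.run();              // stand-in for one pipeline run
                remaining.countDown();
            }, 0, periodMillis, TimeUnit.MILLISECONDS);
            // Generous timeout so a slow machine does not cause a false failure.
            return remaining.await(periodMillis * runs + 5_000, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            scheduler.shutdownNow();    // stop scheduling further runs
        }
    }

    public static void main(String[] args) {
        AtomicInteger counter = new AtomicInteger();
        boolean finished = runOnSchedule(
                () -> System.out.println("pipeline run #" + counter.incrementAndGet()),
                3, 100);
        System.out.println("finished: " + finished);
    }
}
```

Note that `scheduleAtFixedRate` suppresses later executions if the job throws, so a real job would normally catch and report its own failures.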

Customize

Enhance the engine to fit your unique needs. Plug in your own logic or modify existing behavior to your specific requirements.

Get Meaningful Errors

When exceptions occur, get the exact line of code, the data that was being processed, and a readable description of all the transformations in the pipeline.
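
One way to picture this kind of error reporting is to wrap each transformation so that a failure carries the step name and the record being processed. The sketch below is plain Java written for illustration and does not show how Data Pipeline itself implements it. `ErrorContextSketch`, `RecordException`, and `applyStep` are hypothetical names.

```java
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch: plain Java, not the Data Pipeline API.
class ErrorContextSketch {

    // Exception that records which step failed and on which record.
    static class RecordException extends RuntimeException {
        RecordException(String step, Object record, Throwable cause) {
            super("Step '" + step + "' failed on record [" + record + "]: "
                    + cause.getMessage(), cause);
        }
    }

    // Applies one transformation and attaches context to any failure.
    // The original exception is kept as the cause, so its stack trace
    // still points at the exact line that threw.
    static <T, R> R applyStep(String step, Function<T, R> fn, T record) {
        try {
            return fn.apply(record);
        } catch (RuntimeException e) {
            throw new RecordException(step, record, e);
        }
    }

    public static void main(String[] args) {
        Function<String, Integer> parse = Integer::parseInt;
        for (String raw : List.of("10", "20", "oops")) {
            try {
                System.out.println(applyStep("parse-int", parse, raw));
            } catch (RecordException e) {
                // The message names the step and the offending record.
                System.out.println(e.getMessage());
            }
        }
    }
}
```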

Process In-Memory

Processing data one piece at a time as it moves through the pipeline can be more than 100x faster than first storing it to disk to query or process later.
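
As a rough illustration of record-at-a-time processing, the sketch below reads lines lazily, transforms each one as it passes through, and keeps only a running total in memory. It uses plain JDK classes rather than Data Pipeline's own API. The `StreamingSketch` class, the `sumAmounts` helper, and the two-column `name,amount` layout are assumptions made for this example.

```java
import java.io.BufferedReader;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical sketch: plain Java, not the Data Pipeline API.
class StreamingSketch {

    // Sums the second column of "name,amount" lines.
    // BufferedReader.lines() is lazy, so each line is read, parsed, and
    // discarded in turn; the full input is never held in memory at once.
    // The caller remains responsible for closing the reader.
    static long sumAmounts(Reader source) {
        return new BufferedReader(source).lines()
                .filter(line -> !line.isBlank())               // skip empty lines
                .mapToLong(line -> Long.parseLong(line.split(",")[1].trim()))
                .sum();
    }

    public static void main(String[] args) {
        String csv = "alice,10\nbob,20\ncarol,30\n";
        System.out.println(sumAmounts(new StringReader(csv))); // prints 60
    }
}
```

The same method works unchanged on a file or network reader, which is the point of the streaming style: the source can be far larger than available memory.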

Trusted by Many

Mobile Analytics