Category Archives: Java

24 ETL Tools for Java Developers

ETL is a process for performing data extraction, transformation and loading. The process extracts data from a variety of sources and formats, transforms it into a standard structure, and loads it into a database, file, web service, or other system for analysis, visualization, machine learning, etc.

ETL tools come in a wide variety of shapes. Some run on your desktop or on-premises servers, while others run as SaaS in the cloud. Some are code-based, built on standard programming languages that many developers already know. Others are built on a custom DSL (domain-specific language) in an attempt to express intent more directly and require less code. Still others are completely graphical, offering programming interfaces only for complex transformations.

What follows is a list of ETL tools that developers already familiar with Java and the JVM (Java Virtual Machine) can use to clean, validate, filter, and prepare data for use.

Spring Batch vs Data Pipeline – ETL Job Example

I was reading a blog post at Java Code Geeks on how to create a Spring Batch ETL job. What struck me about the example was the amount of code the framework required for such a routine task. In this post, you’ll see how to accomplish the same task, summarizing a million stock trades to find the open, close, high, and low prices for each symbol, using our Data Pipeline framework.
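
To make the task itself concrete, here is a minimal plain-Java sketch of the open/high/low/close summary. It uses neither Spring Batch nor Data Pipeline, and it is not the code from either post. The `Trade` and `Ohlc` classes, their fields, and the sample symbols are hypothetical, and the sketch assumes trades arrive in time order.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class TradeSummary {

    /** One trade; the field names are hypothetical, not taken from the original post. */
    static final class Trade {
        final String symbol;
        final double price;

        Trade(String symbol, double price) {
            this.symbol = symbol;
            this.price = price;
        }
    }

    /** Running open/high/low/close prices for a single symbol. */
    static final class Ohlc {
        final double open;
        double high;
        double low;
        double close;

        Ohlc(double firstPrice) {
            this.open = firstPrice;
            this.high = firstPrice;
            this.low = firstPrice;
            this.close = firstPrice;
        }

        void add(double price) {
            high = Math.max(high, price);
            low = Math.min(low, price);
            close = price; // the latest trade seen so far is the close
        }
    }

    /** Summarizes trades in a single pass; assumes the input is in time order. */
    static Map<String, Ohlc> summarize(Iterable<Trade> trades) {
        Map<String, Ohlc> bySymbol = new LinkedHashMap<>();
        for (Trade t : trades) {
            Ohlc ohlc = bySymbol.get(t.symbol);
            if (ohlc == null) {
                bySymbol.put(t.symbol, new Ohlc(t.price)); // first trade sets the open
            } else {
                ohlc.add(t.price);
            }
        }
        return bySymbol;
    }

    public static void main(String[] args) {
        // Made-up sample data, for illustration only.
        List<Trade> trades = Arrays.asList(
                new Trade("AAA", 10.0), new Trade("BBB", 5.0),
                new Trade("AAA", 12.5), new Trade("AAA", 9.0),
                new Trade("BBB", 6.0));

        for (Map.Entry<String, Ohlc> e : summarize(trades).entrySet()) {
            Ohlc o = e.getValue();
            System.out.println(e.getKey() + " open=" + o.open + " high=" + o.high
                    + " low=" + o.low + " close=" + o.close);
        }
    }
}
```

Because the summary keeps only one small object per symbol, memory use depends on the number of symbols rather than the number of trades, so the same loop could be fed from a file or stream of a million records.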

How to speed up JDBC inserts?

One question I like to ask in interviews is: how would you speed up inserts when using JDBC?

This simple question usually shows me how knowledgeable the developer is about databases in general and JDBC specifically.

If you ever find yourself needing to insert data quickly into a SQL database (and not just being asked about it in an interview), here are some options to consider.

How to Query Java Objects with XPath

Data Pipeline’s query engine allows you to use XPath to query XML, JSON, and Java objects. This walkthrough will show you how to query Java objects using XPath and save the results to a CSV file. While the reading and writing will be done with the JavaBeanReader and CSVWriter classes, you can swap out the CSVWriter for any other endpoint or transformation that Data Pipeline supports.