Category Archives: Data Pipeline

18 ETL Tools for Java Developers (Updated 2023)


Updated: May 2023

ETL (extract, transform, load) is a process that extracts data from a variety of sources and formats, transforms it into a standard structure, and loads it into a database, file, web service, or other system for analysis, visualization, machine learning, and similar uses.
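The three stages can be sketched in plain Java. This is only an illustrative outline, not the API of any tool on the list: the class name `EtlSketch`, the two-column name/email input, and the tab-separated output are all assumptions made for the example, and the CSV parsing is deliberately naive (no quoted fields). It assumes Java 11 or later.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical, minimal ETL outline; names and data layout are assumptions.
public class EtlSketch {

    // Extract: split raw CSV text into rows of fields, skipping blank lines.
    static List<String[]> extract(String csv) {
        List<String[]> rows = new ArrayList<>();
        for (String line : csv.split("\\R")) {
            if (!line.isBlank()) {
                rows.add(line.split(",", -1));
            }
        }
        return rows;
    }

    // Transform: trim fields, lower-case the email column,
    // and drop rows whose second field does not look like an email.
    static List<String[]> transform(List<String[]> rows) {
        List<String[]> out = new ArrayList<>();
        for (String[] row : rows) {
            if (row.length < 2) {
                continue;
            }
            String name = row[0].trim();
            String email = row[1].trim().toLowerCase(Locale.ROOT);
            if (email.contains("@")) {
                out.add(new String[] {name, email});
            }
        }
        return out;
    }

    // Load: write tab-separated lines to a String, standing in for
    // a database, file, or web service target.
    static String load(List<String[]> rows) {
        StringBuilder sb = new StringBuilder();
        for (String[] row : rows) {
            sb.append(row[0]).append('\t').append(row[1]).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String source = "Ada , ADA@Example.com\nBob,not-an-email\n\nCy,cy@example.org\n";
        System.out.print(load(transform(extract(source))));
    }
}
```

Keeping each stage as a separate method means the source or the target can be swapped without touching the transformation logic, which is the same separation most ETL tools build on.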

ETL tools take many forms. Some run on your desktop or on-premises servers, while others run as SaaS in the cloud. Some are code-based, built on standard programming languages that many developers already know. Others are built on a custom DSL (domain-specific language) in an attempt to be more intentional and require less code. Others still are completely graphical, offering programming interfaces only for complex transformations.

What follows is a list of ETL tools that developers already familiar with Java and the JVM (Java Virtual Machine) can use to clean, validate, filter, and prepare data for use.

Continue reading

How to Export Emails from Gmail to Excel with Data Pipeline


Updated: July 2021

If you have ever tried to export emails to Excel for analysis, you know it is not exactly straightforward. Maybe you need to find the top companies contacting you and your sales team. Maybe you need to perform text or sentiment analysis on the contents of your messages. Or maybe you’re creating visualizations to better understand who’s emailing you. This easy guide will show you how to use Data Pipeline to search and read emails from Gmail or G Suite, process them any way you like, and store them in Excel.

Continue reading