How to Export Emails from Gmail to Excel with Data Pipeline

Export emails from Gmail and G Suite to Excel

Updated: July 2021

If you have ever tried to export emails to Excel for analysis, you know it is not exactly straightforward.  Maybe you need to find the top companies contacting you and your sales team.  Maybe you need to perform text or sentiment analysis on the contents of your messages.  Or maybe you’re creating visualizations to better understand who’s emailing you.  This east guide will show you how you can use Data Pipeline to search and read emails from Gmail or G Suite, process them any way you like, and store them in Excel.

 

1. Create an Application Specific Password in Gmail

Before we start, create a separate password for this new app to access your emails.  This way you won’t have to use your main password, which can be difficult if you have 2-factor authentication turned on.  It also lets you remove access by deleting the password when you’re done exporting the data without affecting anything else.

  1. Go to My Account.Open my Google account
  2. Select Signing in to Google.Sign in to Google
  3. Select App passwords. Gmail Application specific passwords
  4. Choose Other (Custom name). Select custom application name
  5. Give it a name like Export emails to CSV. Export emails to CSV app
  6. Retrieve your new password. Save geerated password

2. Read the latest 10 emails from your inbox

This first step will read the most recent emails from your inbox and write them to System.out. One benefit of starting here is it gives you an idea of how much data you can get for each message.

All of the examples that follow will rely on the following constants.

Set USER to your email address and PASSWD to the new password you generated above.

You should always use a LimitReader with some sane value since email boxes can be huge and you don’t want to spend all day on your first test.  However, you can always remove the LimitReader line if you’d like to get all emails.

3. Read the latest 10 sent emails

By default, EmailReader pulls emails from your Inbox folder.  You can change this by setting the folder name to anything you like.  In this case GmailFolders.SENT_MAIL to read your latest sent messages only.

4. Search for emails by subject

Reading from your Inbox and Sent folders are useful, but what if you need to find all emails with “Trial License” in their subject?

First, you’ll need to set the search term to a SubjectTerm, with your query, and then you’ll need to set the folder name to All Mail to pull matching messages from all folders.

Since EmailReader uses the javax.mail classes behind the scenes, it will try to run the search efficiently on your email server if it supports it.  This way you’re not waiting for all messages to be sent to the app for filtering.

5. Search for emails by date and subject

Once you start searching, you’ll likely want to add dates into the mix.  You can use the AndTerm to combine searches together.  In this case, you’re looking for messages with “Trial License” in the subject sent on or after midnight July 1st, 2017.

GE in ReceivedDateTerm.GE means greater than or equal (>=).  You’ll find all the other operators you need for all your comparisons there.

We’ll also add one more constant to make parsing the date easier.

6. Export emails to Excel

This last example is the main code you’ll need to export your Gmail emails to Excel.

Like before, it searches for Trial Licenses from July onward and limits the results to 100 emails.  Next, it selects just the fields you want to save – from, to, sentDate, subject, and content – out of all the eligible fields.

The content field is flattened since emails can have multiple contents (plain text and HTML).  The content field is then limited to the first 32767 characters – the maximum size of a cell in Excel.

Finally, the Excel document is written by the pipeline and saved to disk.

 

Happy coding!

 

 

 

About Dele Taylor

We make Data Pipeline — a lightweight ETL framework for Java. Use it to filter, transform, and aggregate data on-the-fly in your web, mobile, and desktop apps. Learn more about it at northconcepts.com.

Leave a Reply

Your email address will not be published. Required fields are marked *
You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code class="" title="" data-url=""> <del datetime=""> <em> <i> <q cite=""> <s> <strike> <strong> <pre class="" title="" data-url=""> <span class="" title="" data-url="">