Export CSV and Excel from Java web apps With Data Pipeline

Data Pipeline is typically used in non-GUI applications to move data from one place and/or format to another. However, Data Pipeline can just as easily be plugged into your web, mobile, and desktop applications. Its small size and Java-centric approach makes it perfect for handling your data conversion and data manipulation use-cases.

This article will demonstrate how to download CSV (comma-separated values) and Excel data from Java web applications using Data Pipeline. It will also show one way to plug Data Pipeline into your JSPs.

Overview

The example we use here is a simple, read-only web app to display a list of events (conferences, meetings, appointments, etc.). The app’s functionality can be divided into 3 groups:

  1. Create and populate the database
  2. Read and transform data
  3. Export / render data

The overall design is based on the Model–View–Controller (MVC) architecture. Data Pipeline’s Record and RecordList classes will constitute the model. A single JSP will present the view. A RESTful resource (using JBoss’ RESTEasy framework) will act as the controller.

The full source and configuration is included with the download for this blog.

1. Create and populate the database

The first step is to setup a database and fill it with test data. To save time, we use a HyperSQL, in-memory database with no persistence to disk.

event-table

The data model is minimal, just a single event table that we create and populate each time the app is loaded.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
    static {
        // Create EVENT table
        db.executeUpdate("CREATE TABLE EVENT (" + 
        		"    EVENT_ID INTEGER GENERATED BY DEFAULT AS IDENTITY (START WITH 1, INCREMENT BY 1) not null, " + 
        		"    EVENT_TYPE VARCHAR(20)," + 
        		"    TITLE VARCHAR(200)," + 
        		"    SUMMARY LONGVARCHAR," + 
        		"    LOCATION VARCHAR(8000)," + 
                "    START_TIME DATETIME," + 
        		"    DURATION_MINUTES NUMERIC(6,0)" + 
        		")");
 
        // Populate EVENT table with data
        for (int i = 0; i < 10; i++) {
            db.executeUpdate("INSERT INTO EVENT(EVENT_TYPE, TITLE, SUMMARY, LOCATION, START_TIME, DURATION_MINUTES) " + 
            		"  VALUES('Business', 'Title "+i+"', 'Summary "+i+"', 'Location "+i+"', '2012-12-03 8:45:00', 135)");
        }
    }

2. Read and transform events

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
    private DataReader getEvents() {
        DataReader reader = new JdbcReader(db.getConnection(), "SELECT * FROM EVENT");
 
        reader = new TransformingReader(reader)
            .add(new CopyField("DURATION_MINUTES", "DURATION", true))
            .add(new BasicFieldTransformer("DURATION").numberToMinutes())
            ;
 
        reader = new TransformingReader(reader)
            .add(new SetCalculatedField("END_TIME", "START_TIME + DURATION"))
            ;
 
        reader = new TransformingReader(reader)
            .add(new CopyField("START_TIME", "START", true))
            .add(new BasicFieldTransformer("START").dateTimeToString("EE MMM d, yyyy 'at' h:mm a"))
            ;
 
        reader = new TransformingReader(reader)
            .add(new CopyField("END_TIME", "END", true))
            .add(new BasicFieldTransformer("END").dateTimeToString("h:mm a"))
            ;
 
        return reader;
    }

The next step is to read the events data using getEvents(). This method will be used internally by the data exporters and JSP render. Its job is to retrieve a stream of records from the database and enrich it with new fields. The retrieval step is done by instantiating a JdbcReader and passing it a database connection and a SQL query. The JdbcReader is then wrapped by several TransformingReaders that do the following:

  1. Copy the existing DURATION_MINUTES field to a new field (DURATION), then convert the new field from a number to an interval (as minutes). Data Pipeline intervals, like Oracle intervals, hold a quantity of time that can be used in datetime expressions.
  2. Create a new calculated field (END_TIME) by adding START_TIME and DURATION
  3. Copy START_TIME to START then convert it from a datetime to a formatted string
  4. copy END_TIME to END then convert it from a datetime to a formatted string

This demo uses several transformers for clarity, but the add(Transformer...) calls could just have easily been made on a single TransformingReader instance.

3. Export CSV

1
2
3
4
5
6
7
8
9
10
11
12
13
14
    @GET
    @Path("/events.csv")
    public Response getEventsAsCsv(@Context HttpServletResponse response) throws Throwable {
        DataReader reader = getEvents();
 
        response.setContentType("text/csv");
        response.setHeader("Content-Disposition", "attachment; filename=\"events.csv\"");
        PrintWriter printWriter = response.getWriter();
        DataWriter writer = new CSVWriter(printWriter).setFieldNamesInFirstRow(true);
 
        JobTemplate.DEFAULT.transfer(reader, writer);
 
        return Response.ok().build();
    }

Next we serve up the first file using getEventsAsCsv(). This method takes the record stream from getEvents() and converts it to a CSV text stream on-the-fly. getEventsAsCsv() can be used to return huge datasets since the CSVWriter is writing records directly to the HTTP response stream without any buffering or batching.

4. Export Excel

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
    @GET
    @Path("/events.excel")
    public Response getEventsAsExcel(@Context HttpServletResponse response) throws Throwable {
        DataReader reader = getEvents();
 
        ExcelDocument excelDocument = new ExcelDocument();
        DataWriter writer = new ExcelWriter(excelDocument).setFieldNamesInFirstRow(true);
 
        JobTemplate.DEFAULT.transfer(reader, writer);
 
        response.setContentType("application/vnd.ms-excel");
        response.setHeader("Content-Disposition", "attachment; filename=\"events.xls\"");
        OutputStream outputStream = response.getOutputStream();
        excelDocument.save(outputStream);
 
        return Response.ok().build();
    }

The second file is served up using getEventsAsExcel(). This method takes the record stream from getEvents() and converts it to an in-memory Excel document before sending the entire thing to the HTTP response stream. Unlike getEventsAsCsv(), this method has to allocate memory to buffer all of the Excel document’s records.

5. Render JSP

The final step is to present the following data grid and download links for the CSV and Excel files. Since this is an example, we’ll display all records in the table and omit any paging, sorting, or filtering functions.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
    @GET
    @Path("/events")
    public void getEventsAsHtml(
            @Context HttpServletRequest request,
            @Context HttpServletResponse response) throws Throwable {
 
        DataReader reader = getEvents();
        MemoryWriter writer = new  MemoryWriter();
        JobTemplate.DEFAULT.transfer(reader, writer);
 
        RecordList recordList = writer.getRecordList();
        request.setAttribute("recordList", recordList);
        request.getRequestDispatcher("/WEB-INF/jsp/events.jsp").forward(request, response);
    }

Like the previous methods, getEventsAsHtml() reads the record stream from getEvents(), but in this case collects them into a RecordList. The collected records are then passed onto the JSP for rendering.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
<%@ page language="java" contentType="text/html; charset=UTF-8" pageEncoding="UTF-8"%>
<%@ page import="com.northconcepts.datapipeline.core.Record" %>
<%@ page import="com.northconcepts.datapipeline.core.RecordList" %>
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<title>Events</title>
</head>
<body>
Export as <a href="/events1/events.csv">CSV</a> | <a href="/events1/events.excel">Excel</a>
<% RecordList recordList = (RecordList)request.getAttribute("recordList"); %>
<table cellpadding="4" cellspacing="0" border="1">
    <thead>
        <tr>
        <% for(String fieldName : recordList.get(0).getFieldNames()) { %>
            <td><%=fieldName%></td>
        <% } %>
        </tr>
    </thead>
    <tbody>
        <% for(int row=0; row<recordList.getRecordCount(); row++) { %>
        <tr>
            <% Record record = recordList.get(row);
               for(int col=0; col<record.getFieldCount(); col++) {
            %>
                <td><%=record.getField(col).getValue()%></td>
            <% } %>
        </tr>
        <% } %>
    </tbody>
</table>
 
</body>
</html>

The JSP renders the 2 download links followed by the data grid. Since the grid is built dynamically, it should handle most tables (as long as there’s data and any HTML in the fields are escaped).

Running the example

That’s it. To try the example app for yourself:

  1. Download the project.
  2. Run the ant build script.
  3. Deploy the generated war to your server of choice.
  4. Visit http://localhost:8080/events1/ (adjust the port as needed).

 

Happy coding!

Download

The web app download contains the entire source code (including Eclipse project and Ant build). The example’s source code is licensed under the terms of the Apache License, Version 2.0.

Java Exception Tracking

If you enjoyed this, I’ve got an exception tracking tool for Java coming out. Sign-up at StackHunter.com to be notified when it does.

About Dele Taylor

Dele Taylor is the founder of North Concepts: http://NorthConcepts.com. You can follow him on Twitter @DeleTaylor and Google +DeleTaylor.
CSV, Data Pipeline, Excel, Java, Web | permalink

Comments

  1. nandha

    Hi ,

    I need to export data to excel from html (excel content type) which has more than 90k rows.
    this leads to memory and server crash issue.
    Can this product help me to streamline this process
    thanks,
    nandha

  2. Pingback: North Concepts

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>