Read from an Excel File

Updated: Feb 21, 2022

This example demonstrates how to read from an excel file in Java using the ExcelDocument and ExcelReader classes.

The demo code reads an excel file and displays its contents in the standard console. However, the contents of the excel file may be sent to other output types like CSV, Xml,Database,etc.

There are other examples which demo how to write to an excel file.

Input Excel file

Java Code Listing

package com.northconcepts.datapipeline.examples.cookbook;

import java.io.File;

import org.apache.log4j.Logger;

import com.northconcepts.datapipeline.core.DataEndpoint;
import com.northconcepts.datapipeline.core.DataReader;
import com.northconcepts.datapipeline.core.Record;
import com.northconcepts.datapipeline.excel.ExcelDocument;
import com.northconcepts.datapipeline.excel.ExcelReader;

public class ReadFromAnExcelFile {

    public static final Logger log = DataEndpoint.log;

    public static void main(String[] args) throws Throwable {
        ExcelDocument document = new ExcelDocument().open(new File("data/output/test.xlsx"));
        DataReader reader = new ExcelReader(document)
                //                .setSheetName("credit-balance")
                .setFieldNamesInFirstRow(true);

        reader.open();
        try {
            Record record;
            while ((record = reader.read()) != null) {
                log.info(record);
            }
        } finally {
            reader.close();
        }
    }
}

Code walkthrough

  1. An ExcelDocument is created corresponding to the input file.
  2. The input file is opened via ExcelDocument.open method using the file path of the input file credit-balance-01.xls.
  3. A new ExcelReader is created and opened via ExcelReader.open.
  4. A while loop iterates through the input data.
  5. Each row in the excel sheet is read as a Record object and written to the standard console via the datapipeline logger.
  6. After the while loop completes, the ExcelReader is closed via reader.close method.

ExcelReader and ExcelDocument

ExcelReader is an input reader which can be used to read an excel file. ExcelDocument is a class that encapsulates an excel document. Normally, when CSV or other types of files are read, the input file is opened directly via the reader class. For ex. you can refer to the read a CSV file example. However in the case of excel documents, the files need to be opened/saved via ExcelDocument class. ExcelReader is used purely for reading the excel file.
ExcelReader has a method setSheetName which can be used to set the name of the spreadsheet in the excel document. The ExcelReader.setFieldNamesInFirstRow(true) method can be used to specify that the names specified in the first row of the input data should be used as field names. If this method is not invoked, the fields would be named as A1, A2, etc. If those fields names need to be changed, a rename transformation can be added on top of ExcelReader or any other type (Refer Rename a field for example).

Console output

16:32:27,427  INFO [main] datapipeline:34 - Record {
    0:[Account]:INT=[101]:Integer
    1:[LastName]:STRING=[Reeves]:String
    2:[FirstName]:STRING=[Keanu]:String
    3:[Balance]:DOUBLE=[9315.45]:Double
    4:[CreditLimit]:INT=[10000]:Integer
    5:[AccountCreated]:STRING=[1/17/1998]:String
}

16:32:27,437  INFO [main] datapipeline:34 - Record {
    0:[Account]:INT=[312]:Integer
    1:[LastName]:STRING=[Butler]:String
    2:[FirstName]:STRING=[Gerard]:String
    3:[Balance]:INT=[90]:Integer
    4:[CreditLimit]:INT=[1000]:Integer
    5:[AccountCreated]:DATE=[2003-06-08]:Date
}

16:32:27,439  INFO [main] datapipeline:34 - Record {
    0:[Account]:INT=[868]:Integer
    1:[LastName]:STRING=[Hewitt]:String
    2:[FirstName]:STRING=[Jennifer Love]:String
    3:[Balance]:INT=[0]:Integer
    4:[CreditLimit]:INT=[17000]:Integer
    5:[AccountCreated]:STRING=[5/25/1985]:String
}

16:32:27,439  INFO [main] datapipeline:34 - Record {
    0:[Account]:INT=[761]:Integer
    1:[LastName]:STRING=[Pinkett-Smith]:String
    2:[FirstName]:STRING=[Jada]:String
    3:[Balance]:DOUBLE=[49654.87]:Double
    4:[CreditLimit]:INT=[100000]:Integer
    5:[AccountCreated]:DATE=[2006-05-12]:Date
}

16:32:27,440  INFO [main] datapipeline:34 - Record {
    0:[Account]:INT=[317]:Integer
    1:[LastName]:STRING=[Murray]:String
    2:[FirstName]:STRING=[Bill]:String
    3:[Balance]:DOUBLE=[789.65]:Double
    4:[CreditLimit]:INT=[5000]:Integer
    5:[AccountCreated]:DATE=[2007-05-02]:Date
}
Mobile Analytics