Read from an Excel File
This example demonstrates how to read from an excel file in Java using the ExcelDocument and ExcelReader classes.
The demo code reads an excel file and displays its contents in the standard console. However, the contents of the excel file may be sent to other output types like CSV, Xml,Database,etc.
There are other examples which demo how to write to an excel file.
Input Excel file
Java Code Listing
package com.northconcepts.datapipeline.examples.cookbook; import java.io.File; import org.apache.log4j.Logger; import com.northconcepts.datapipeline.core.DataEndpoint; import com.northconcepts.datapipeline.core.DataReader; import com.northconcepts.datapipeline.core.Record; import com.northconcepts.datapipeline.excel.ExcelDocument; import com.northconcepts.datapipeline.excel.ExcelReader; public class ReadFromAnExcelFile { public static final Logger log = DataEndpoint.log; public static void main(String[] args) throws Throwable { ExcelDocument document = new ExcelDocument().open(new File("data/output/test.xlsx")); DataReader reader = new ExcelReader(document) // .setSheetName("credit-balance") .setFieldNamesInFirstRow(true); reader.open(); try { Record record; while ((record = reader.read()) != null) { log.info(record); } } finally { reader.close(); } } }
Code walkthrough
- An ExcelDocument is created corresponding to the input file.
- The input file is opened via ExcelDocument.open
method using the file path of the input file
credit-balance-01.xls
. - A new ExcelReader is created and opened via ExcelReader.open.
- A while loop iterates through the input data.
- Each row in the excel sheet is read as a Record object and written to the standard console via the datapipeline logger.
- After the while loop completes, the ExcelReader is closed via reader.close method.
ExcelReader and ExcelDocument
ExcelReader is an input reader which can be used to read an excel file.
ExcelDocument is a class that encapsulates an excel document.
Normally, when CSV or other types of files are read, the input file is opened directly via the reader class. For ex. you can refer to the
read a CSV file example.
However in the case of excel documents, the files need to be opened/saved via ExcelDocument
class. ExcelReader is used purely for reading the excel file.
ExcelReader has a method
setSheetName which can be used to set the name
of the spreadsheet in the excel document. The ExcelReader.setFieldNamesInFirstRow(true)
method can be used to specify that the names specified in the first row of the input data should be used as field names. If this method is not invoked, the fields would be named
as A1, A2, etc. If those fields names need to be changed, a rename transformation can be added on top of
ExcelReader or any other type
(Refer Rename a field for example).
Console output
16:32:27,427 INFO [main] datapipeline:34 - Record { 0:[Account]:INT=[101]:Integer 1:[LastName]:STRING=[Reeves]:String 2:[FirstName]:STRING=[Keanu]:String 3:[Balance]:DOUBLE=[9315.45]:Double 4:[CreditLimit]:INT=[10000]:Integer 5:[AccountCreated]:STRING=[1/17/1998]:String } 16:32:27,437 INFO [main] datapipeline:34 - Record { 0:[Account]:INT=[312]:Integer 1:[LastName]:STRING=[Butler]:String 2:[FirstName]:STRING=[Gerard]:String 3:[Balance]:INT=[90]:Integer 4:[CreditLimit]:INT=[1000]:Integer 5:[AccountCreated]:DATE=[2003-06-08]:Date } 16:32:27,439 INFO [main] datapipeline:34 - Record { 0:[Account]:INT=[868]:Integer 1:[LastName]:STRING=[Hewitt]:String 2:[FirstName]:STRING=[Jennifer Love]:String 3:[Balance]:INT=[0]:Integer 4:[CreditLimit]:INT=[17000]:Integer 5:[AccountCreated]:STRING=[5/25/1985]:String } 16:32:27,439 INFO [main] datapipeline:34 - Record { 0:[Account]:INT=[761]:Integer 1:[LastName]:STRING=[Pinkett-Smith]:String 2:[FirstName]:STRING=[Jada]:String 3:[Balance]:DOUBLE=[49654.87]:Double 4:[CreditLimit]:INT=[100000]:Integer 5:[AccountCreated]:DATE=[2006-05-12]:Date } 16:32:27,440 INFO [main] datapipeline:34 - Record { 0:[Account]:INT=[317]:Integer 1:[LastName]:STRING=[Murray]:String 2:[FirstName]:STRING=[Bill]:String 3:[Balance]:DOUBLE=[789.65]:Double 4:[CreditLimit]:INT=[5000]:Integer 5:[AccountCreated]:DATE=[2007-05-02]:Date }