Write Excel to Amazon S3

Updated: Aug 14, 2023

This example writes an Excel file to Amazon S3 on the fly. This can be used when generating reports, managing datasets, and collaborating on spreadsheet-based projects where the convenience of cloud storage enhances data accessibility and team collaboration. By leveraging DataPipeline, you can ensure your Excel files are securely stored in Amazon S3 and easily accessible from anywhere.

 

Java Code Listing

package com.northconcepts.datapipeline.examples.amazons3;

import java.io.File;
import java.io.OutputStream;

import com.northconcepts.datapipeline.amazons3.AmazonS3FileSystem;
import com.northconcepts.datapipeline.core.DataReader;
import com.northconcepts.datapipeline.core.DataWriter;
import com.northconcepts.datapipeline.excel.ExcelDocument;
import com.northconcepts.datapipeline.excel.ExcelReader;
import com.northconcepts.datapipeline.excel.ExcelWriter;
import com.northconcepts.datapipeline.job.Job;

public class WriteExcelToAmazonS3 {
    
    private static final String ACCESS_KEY = "YOUR ACCESS KEY";
    private static final String SECRET_KEY = "YOUR SECRET KEY";

    public static void main(String[] args) throws Throwable {
        AmazonS3FileSystem s3 = new AmazonS3FileSystem();
        s3.setBasicAWSCredentials(ACCESS_KEY, SECRET_KEY);
        s3.open();

        try {
            OutputStream outputStream = s3.writeMultipartFile("datapipeline-test-01", "output/call-center-inbound-call.xlsx");

            ExcelDocument document = new ExcelDocument(ExcelDocument.ProviderType.POI_XSSF_SAX)
                    .open(new File("example/data/input/call-center-inbound-call.xlsx"));

            DataReader reader = new ExcelReader(document)
                    .setFieldNamesInFirstRow(true);

            DataWriter writer = new ExcelWriter(document).setSheetName("inbound-calls");

            Job.run(reader, writer);

            document.save(outputStream);
        } finally {
            s3.close();
        }
    }

}

 

Code Walkthrough

  1. First, AWS S3 access credentials are introduced and connection to the service is established.
  2. OutputStream instance is created with a bucket and object location paths within S3.
  3. An ExcelDocument is created corresponding to the input file, and the Excel provider type is also specified.
  4. The input file is opened via ExcelDocument.open() method using the file path of the input file call-center-inbound-call.xlsx.
  5. A new ExcelReader that wraps the above ExcelDocument object is created.
  6. ExcelWriter is created to write records to an Excel file.
  7. Job.run(reader, writer) is used to transfer the data from ExcelReader to ExcelWriter.
  8. The document is saved to the OutputStream and S3 connection is closed.

 

ExcelReader and ExcelDocument

ExcelReader is an input reader which can be used to read an Excel file. ExcelDocument is a class that encapsulates an Excel document. Normally, when CSV or other types of files are read, the input file is opened directly via the reader class. For example, you can refer to the read a CSV file example. However, in the case of Excel documents, the files need to be opened/saved via ExcelDocument class.

Mobile Analytics