Write to Amazon S3 Using Multipart Streaming

Updated: Apr 2, 2022

This example shows how to read a local CSV file and write it to an Amazon S3 bucket using continuous, multipart streaming.

CSV input (truncated)

stock,time,price,shares
JHX,09:30:00.00,57,95
JNJ,09:30:00.00,91.14,548
OPK,09:30:00.00,8.3,300
OPK,09:30:00.00,8.3,63
OMC,09:30:00.00,74.53,100
OMC,09:30:00.00,74.53,24
TWTR,09:30:00.00,64.89,100
TWTR,09:30:00.00,64.89,25
TWTR,09:30:00.00,64.89,245
TWTR,09:30:00.00,64.89,55
USB,09:30:00.00,39.71,400
USB,09:30:00.00,39.71,359
USB,09:30:00.00,39.71,41
USB,09:30:00.00,39.71,259
USB,09:30:00.00,39.71,100
VALE,09:30:00.00,14.88,900
VALE,09:30:00.00,14.88,1000
VALE,09:30:00.00,14.88,100
VALE,09:30:00.00,14.88,1000
VALE,09:30:00.00,14.88,260
VALE,09:30:00.00,14.88,100
BSBR,09:30:00.00,5.87,1100
BSBR,09:30:00.00,5.87,800
BRK.B,09:30:00.00,118.35,422

Java Code

/*
 * Copyright (c) 2006-2022 North Concepts Inc.  All rights reserved.
 * Proprietary and Confidential.  Use is subject to license terms.
 * 
 * https://northconcepts.com/data-pipeline/licensing/
 */
package com.northconcepts.datapipeline.examples.cookbook;

import java.io.File;
import java.io.OutputStream;
import java.io.OutputStreamWriter;

import com.northconcepts.datapipeline.amazons3.AmazonS3FileSystem;
import com.northconcepts.datapipeline.core.DataReader;
import com.northconcepts.datapipeline.core.DataWriter;
import com.northconcepts.datapipeline.csv.CSVReader;
import com.northconcepts.datapipeline.csv.CSVWriter;
import com.northconcepts.datapipeline.job.Job;

public class WriteToAmazonS3UsingMultipartStreaming {
    
    private static final String ACCESS_KEY = "YOUR ACCESS KEY";
    private static final String SECRET_KEY = "YOUR SECRET KEY";

    public static void main(String[] args) throws Throwable {
        AmazonS3FileSystem s3 = new AmazonS3FileSystem();
        s3.setBasicAWSCredentials(ACCESS_KEY, SECRET_KEY);
//        s3.setDebug(true);
        s3.open();
        try {
            // Create AWS S3 streaming, multi-part OutputStream 
            OutputStream outputStream = s3.writeMultipartFile("datapipeline-test-01", "output/trades.csv");

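            // Read the local trades.csv file, treating the first row as field names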
            DataReader reader = new CSVReader(new File("example/data/input/trades.csv"))
                    .setFieldNamesInFirstRow(true);
                
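            // Wrap the S3 OutputStream in a CSVWriter so records stream to the bucket as CSV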
            DataWriter writer = new CSVWriter(new OutputStreamWriter(outputStream, "utf-8"))
                    .setFieldNamesInFirstRow(true);
            
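            // Transfer all records from the reader to the writer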
            Job.run(reader, writer);
            
            System.out.println("Done.");
        } finally {
            s3.close();
        }
    }

}


Code Walkthrough

  1. A multipart OutputStream is created to upload the output file to the Amazon S3 bucket datapipeline-test-01 under the key output/trades.csv (a compressed variation of this stream is sketched after this list).
  2. A CSVReader is created to read records from the local file trades.csv, using the first row as field names.
  3. The S3 OutputStream is wrapped in an OutputStreamWriter, which is passed to a CSVWriter so each record is written as a CSV row and streamed to the bucket.
  4. Job.run transfers the records from the reader to the writer.
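
If the output should also be compressed while it is uploaded, the same multipart OutputStream can be wrapped in a standard java.util.zip.GZIPOutputStream before it is handed to the CSVWriter. The sketch below is a hypothetical variation of the example above, not part of the original cookbook: the class name WriteCompressedCsvToAmazonS3 and the key output/trades.csv.gz are placeholders, and the only Data Pipeline S3 call it relies on is the writeMultipartFile method already shown.

package com.northconcepts.datapipeline.examples.cookbook;

import java.io.File;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.util.zip.GZIPOutputStream;

import com.northconcepts.datapipeline.amazons3.AmazonS3FileSystem;
import com.northconcepts.datapipeline.core.DataReader;
import com.northconcepts.datapipeline.core.DataWriter;
import com.northconcepts.datapipeline.csv.CSVReader;
import com.northconcepts.datapipeline.csv.CSVWriter;
import com.northconcepts.datapipeline.job.Job;

// Hypothetical variation of the example above: compresses the CSV while streaming it to S3
public class WriteCompressedCsvToAmazonS3 {

    public static void main(String[] args) throws Throwable {
        AmazonS3FileSystem s3 = new AmazonS3FileSystem();
        s3.setBasicAWSCredentials("YOUR ACCESS KEY", "YOUR SECRET KEY");
        s3.open();
        try {
            // Same multipart S3 stream as above, written to a .gz key (placeholder bucket/key)
            OutputStream outputStream = s3.writeMultipartFile("datapipeline-test-01", "output/trades.csv.gz");

            // Compress bytes on the fly as they are streamed to S3
            GZIPOutputStream gzipStream = new GZIPOutputStream(outputStream);

            DataReader reader = new CSVReader(new File("example/data/input/trades.csv"))
                    .setFieldNamesInFirstRow(true);

            DataWriter writer = new CSVWriter(new OutputStreamWriter(gzipStream, "utf-8"))
                    .setFieldNamesInFirstRow(true);

            Job.run(reader, writer);
        } finally {
            s3.close();
        }
    }

}

This assumes, as the original example does, that the job closes the writer chain when it finishes, so the GZIP and multipart streams are finalized before s3.close() is called.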
