Write to Amazon S3 Using Multipart Streaming
Updated: Apr 2, 2022
This example shows how to read a local CSV file and write it to an Amazon S3 bucket using continuous, multipart streaming.
CSV input (truncated)
stock,time,price,shares
JHX,09:30:00.00,57.95,95
JNJ,09:30:00.00,91.14,548
OPK,09:30:00.00,8.3,300
OPK,09:30:00.00,8.3,63
OMC,09:30:00.00,74.53,100
OMC,09:30:00.00,74.53,24
TWTR,09:30:00.00,64.89,100
TWTR,09:30:00.00,64.89,25
TWTR,09:30:00.00,64.89,245
TWTR,09:30:00.00,64.89,55
USB,09:30:00.00,39.71,400
USB,09:30:00.00,39.71,359
USB,09:30:00.00,39.71,41
USB,09:30:00.00,39.71,259
USB,09:30:00.00,39.71,100
VALE,09:30:00.00,14.88,900
VALE,09:30:00.00,14.88,1000
VALE,09:30:00.00,14.88,100
VALE,09:30:00.00,14.88,1000
VALE,09:30:00.00,14.88,260
VALE,09:30:00.00,14.88,100
BSBR,09:30:00.00,5.87,1100
BSBR,09:30:00.00,5.87,800
BRK.B,09:30:00.00,118.35,422
Java Code
/*
 * Copyright (c) 2006-2022 North Concepts Inc. All rights reserved.
 * Proprietary and Confidential. Use is subject to license terms.
 *
 * https://northconcepts.com/data-pipeline/licensing/
 */
package com.northconcepts.datapipeline.examples.cookbook;

import java.io.File;
import java.io.OutputStream;
import java.io.OutputStreamWriter;

import com.northconcepts.datapipeline.amazons3.AmazonS3FileSystem;
import com.northconcepts.datapipeline.core.DataReader;
import com.northconcepts.datapipeline.core.DataWriter;
import com.northconcepts.datapipeline.csv.CSVReader;
import com.northconcepts.datapipeline.csv.CSVWriter;
import com.northconcepts.datapipeline.job.Job;

public class WriteToAmazonS3UsingMultipartStreaming {

    private static final String ACCESS_KEY = "YOUR ACCESS KEY";
    private static final String SECRET_KEY = "YOUR SECRET KEY";

    public static void main(String[] args) throws Throwable {
        AmazonS3FileSystem s3 = new AmazonS3FileSystem();
        s3.setBasicAWSCredentials(ACCESS_KEY, SECRET_KEY);
//        s3.setDebug(true);
        s3.open();
        try {
            // Create AWS S3 streaming, multi-part OutputStream
            OutputStream outputStream = s3.writeMultipartFile("datapipeline-test-01", "output/trades.csv");

            DataReader reader = new CSVReader(new File("example/data/input/trades.csv"))
                    .setFieldNamesInFirstRow(true);

            DataWriter writer = new CSVWriter(new OutputStreamWriter(outputStream, "utf-8"))
                    .setFieldNamesInFirstRow(true);

            Job.run(reader, writer);

            System.out.println("Done.");
        } finally {
            s3.close();
        }
    }
}
Code Walkthrough
- A multipart OutputStream is created to upload the output file to the Amazon S3 bucket datapipeline-test-01.
- A CSVReader is created to read from the local file trades.csv.
- An OutputStreamWriter is then created and wrapped in a CSVWriter to write the data as CSV.
- Job.run(reader, writer) transfers records from the reader to the writer, streaming each part to S3 as it is produced.
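The key point of the pattern above is that the writer side is destination-agnostic: the CSVWriter only sees a java.io.OutputStream, so the S3 multipart stream can be swapped for any other sink. The following minimal sketch illustrates that idea using only the JDK, with a hypothetical ByteArrayOutputStream standing in for the stream returned by s3.writeMultipartFile(...); it is not part of the Data Pipeline API.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

public class StreamingWriteSketch {

    // Writes CSV rows through a Writer layered on an OutputStream.
    // The destination (a local buffer here, an S3 multipart stream in
    // the article's example) is interchangeable.
    public static String writeRows(String[][] rows) throws Exception {
        ByteArrayOutputStream out = new ByteArrayOutputStream(); // stand-in for the S3 stream
        try (Writer writer = new OutputStreamWriter(out, StandardCharsets.UTF_8)) {
            for (String[] row : rows) {
                writer.write(String.join(",", row)); // naive CSV: no quoting/escaping
                writer.write("\n");
            }
        } // try-with-resources flushes and closes the underlying stream
        return out.toString(StandardCharsets.UTF_8.name());
    }

    public static void main(String[] args) throws Exception {
        String[][] rows = {
                { "stock", "time", "price", "shares" },
                { "JNJ", "09:30:00.00", "91.14", "548" },
        };
        System.out.print(writeRows(rows));
    }
}
```

Closing the Writer is what completes the upload in the real example: closing the stream returned by writeMultipartFile signals S3 to assemble the uploaded parts into the final object.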