Measure Performance of Reader and Writer

Updated: Jun 4, 2023

This example shows how to measure the processing time of the individual readers and writers in a pipeline.

To accomplish this, the static Endpoint.setCaptureElapsedTime(true) flag tells readers and writers to record their processing time.
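
The idea of per-endpoint timing can be illustrated without the library. The sketch below is plain Java and assumes that "self time" means the time spent inside an endpoint, excluding the time spent in the endpoint it wraps. That definition is an assumption, and TimedStage and its methods are hypothetical names for illustration, not part of the Data Pipeline API. A fake clock is injected so the numbers are deterministic.

```java
import java.util.function.LongSupplier;

class SelfTimeSketch {

    /** Hypothetical stand-in for an endpoint that may wrap another endpoint. */
    static class TimedStage {
        private final TimedStage nested;   // wrapped stage, or null for the innermost one
        private final LongSupplier clock;  // time source, injectable so tests are deterministic
        private final Runnable work;       // this stage's own per-record work
        private long totalTime;            // elapsed time including the nested stage

        TimedStage(TimedStage nested, LongSupplier clock, Runnable work) {
            this.nested = nested;
            this.clock = clock;
            this.work = work;
        }

        /** Processes one record: the nested stage runs first, then this stage's own work. */
        void process() {
            long start = clock.getAsLong();
            if (nested != null) {
                nested.process();
            }
            work.run();
            totalTime += clock.getAsLong() - start;
        }

        long getTotalTime() {
            return totalTime;
        }

        /** Assumed definition: own elapsed time minus the nested stage's elapsed time. */
        long getSelfTime() {
            return nested == null ? totalTime : totalTime - nested.totalTime;
        }
    }

    public static void main(String[] args) {
        long[] now = {0};                  // fake clock value, advanced by each stage's work
        LongSupplier clock = () -> now[0];

        TimedStage inner = new TimedStage(null, clock, () -> now[0] += 300);
        TimedStage outer = new TimedStage(inner, clock, () -> now[0] += 100);

        for (int i = 0; i < 3; i++) {
            outer.process();
        }

        System.out.println("outer total: " + outer.getTotalTime()); // 1200
        System.out.println("outer self:  " + outer.getSelfTime());  // 300
        System.out.println("inner self:  " + inner.getSelfTime());  // 900
    }
}
```

The outer stage's total includes the inner stage's work, so only the self time isolates what each stage contributes on its own.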

Input CSV file

stock,time,price,shares
JHX,09:30:00.00,57,95
JNJ,09:30:00.00,91.14,548
OPK,09:30:00.00,8.3,300
OPK,09:30:00.00,8.3,63
OMC,09:30:00.00,74.53,100
OMC,09:30:00.00,74.53,24
TWTR,09:30:00.00,64.89,100
.
.
.
IR,09:38:14.05,61.4201,100
TE,09:38:14.05,17.16,100
TE,09:38:14.05,17.16,100
MSI,09:38:14.05,67.02,82
TE,09:38:14.05,17.16,600
TE,09:38:14.05,17.16,100
TE,09:38:14.05,17.16,100
TE,09:38:14.05,17.16,200

Java code listing

package com.northconcepts.datapipeline.examples.cookbook;

import java.io.File;

import com.northconcepts.datapipeline.core.DataReader;
import com.northconcepts.datapipeline.core.DataWriter;
import com.northconcepts.datapipeline.core.Endpoint;
import com.northconcepts.datapipeline.csv.CSVReader;
import com.northconcepts.datapipeline.csv.CSVWriter;
import com.northconcepts.datapipeline.job.Job;
import com.northconcepts.datapipeline.transform.BasicFieldTransformer;
import com.northconcepts.datapipeline.transform.CopyField;
import com.northconcepts.datapipeline.transform.RenameField;
import com.northconcepts.datapipeline.transform.TransformingReader;
import com.northconcepts.datapipeline.transform.TransformingWriter;

public class MeasurePerformanceOfReaderAndWriter {

	public static void main(String[] args) {
		// Enable elapsed-time capture so that each reader and writer records its processing time.
		Endpoint.setCaptureElapsedTime(true);
		
		DataReader reader = new CSVReader(new File("example/data/input/trades.csv"))
				.setFieldNamesInFirstRow(true);
		
		reader = new TransformingReader(reader)
				.add(new BasicFieldTransformer("price").stringToDouble())
				.add(new BasicFieldTransformer("time").stringToTime("HH:mm:ss"));

		DataWriter writer = new CSVWriter(new File("example/data/output/writer_performance_measurement.csv"));

		writer = new TransformingWriter(writer)
				.add(new CopyField("stock", "stock_name"))
				.add(new RenameField("shares", "share_count"));
		
		Job job = Job.run(reader, writer);

		System.out.println("Job Running Time:- " + job.getRunningTimeAsString());
		System.out.println("Total Records transferred:- " + job.getRecordsTransferred());
		
		System.out.println("Time taken by TransformingReader:- " + reader.getSelfTimeAsString());
		System.out.println("Time taken by CSVReader:- " + reader.getNestedEndpoint().getSelfTimeAsString());
		
		System.out.println("Time taken by TransformingWriter:- " + writer.getSelfTimeAsString());
		System.out.println("Time taken by CSVWriter:- " + writer.getNestedEndpoint().getSelfTimeAsString());
		
	}
	
}

Code walkthrough

  1. Endpoint.setCaptureElapsedTime(true) enables recording of processing time for readers and writers.
  2. A CSVReader is created for the input file trades.csv.
  3. CSVReader.setFieldNamesInFirstRow(true) specifies that the values in the first row should be used as field names.
  4. A TransformingReader wraps the CSVReader and converts string data to the required types. For example, .add(new BasicFieldTransformer("price").stringToDouble()) converts the price field from a string to a double.
  5. A CSVWriter is created for the output file writer_performance_measurement.csv.
  6. A TransformingWriter wraps the CSVWriter and applies transformations to records passing through it.
    • .add(new CopyField("stock", "stock_name")) creates a new field, stock_name, and copies the value of stock into it.
    • .add(new RenameField("shares", "share_count")) renames the shares field to share_count.
  7. Data is transferred from the reader to the writer via the Job.run() method.
  8. The job's running time and record count are printed, followed by the self time of each reader and writer. getNestedEndpoint() returns the endpoint wrapped by the transforming reader or writer (here, the CSVReader and the CSVWriter).
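
The effect of the two writer transformations can be sketched in plain Java. SimpleRecord, copyField and renameField below are hypothetical helpers written for this illustration, not the library's classes. The sketch assumes that a copy appends the new field at the end and that a rename keeps the field in its original position, which is inferred from the column order of the example's CSV output.

```java
import java.util.ArrayList;
import java.util.List;

class FieldTransformSketch {

    /** Hypothetical record model: parallel lists keep the field order explicit. */
    static class SimpleRecord {
        final List<String> names = new ArrayList<>();
        final List<Object> values = new ArrayList<>();

        /** Sets an existing field, or appends a new one at the end. */
        SimpleRecord set(String name, Object value) {
            int index = names.indexOf(name);
            if (index >= 0) {
                values.set(index, value);
            } else {
                names.add(name);
                values.add(value);
            }
            return this;
        }

        Object get(String name) {
            int index = names.indexOf(name);
            if (index < 0) {
                throw new IllegalArgumentException("No such field: " + name);
            }
            return values.get(index);
        }
    }

    /** Copies the source field's value into the target field (appended if it is new). */
    static void copyField(SimpleRecord record, String source, String target) {
        record.set(target, record.get(source));
    }

    /** Renames a field in place, so its position among the columns is unchanged (assumed). */
    static void renameField(SimpleRecord record, String oldName, String newName) {
        int index = record.names.indexOf(oldName);
        if (index < 0) {
            throw new IllegalArgumentException("No such field: " + oldName);
        }
        record.names.set(index, newName);
    }

    public static void main(String[] args) {
        SimpleRecord record = new SimpleRecord()
                .set("stock", "JHX")
                .set("time", "09:30:00")
                .set("price", 57.0)
                .set("shares", 95);

        copyField(record, "stock", "stock_name");
        renameField(record, "shares", "share_count");

        // Prints: stock,time,price,share_count,stock_name
        System.out.println(String.join(",", record.names));
        System.out.println(record.values);
    }
}
```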

CSV Output

stock,time,price,share_count,stock_name
JHX,09:30:00,57.0,95,JHX
JNJ,09:30:00,91.14,548,JNJ
OPK,09:30:00,8.3,300,OPK
OPK,09:30:00,8.3,63,OPK
.
.
.
IR,09:38:14,61.4201,100,IR
TE,09:38:14,17.16,100,TE
TE,09:38:14,17.16,100,TE
MSI,09:38:14,67.02,82,MSI
TE,09:38:14,17.16,600,TE
TE,09:38:14,17.16,100,TE
TE,09:38:14,17.16,100,TE
TE,09:38:14,17.16,200,TE
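
In the output above, the time values no longer carry fractional seconds and the price 57 appears as 57.0. The following plain-Java sketch shows one way such results can arise. It assumes a java.text.SimpleDateFormat-style parser with the pattern HH:mm:ss and the standard Double conversion. How the library performs these conversions internally is not stated in this example, so the sketch is an illustration only, and ParseSketch and its methods are hypothetical names.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

class ParseSketch {

    /** Parses a time string and writes it back using the same pattern. */
    static String reformatTime(String raw) {
        SimpleDateFormat format = new SimpleDateFormat("HH:mm:ss");
        format.setTimeZone(TimeZone.getTimeZone("UTC")); // fixed zone keeps the round trip stable
        try {
            // parse(String) may stop before the end of the input, so a trailing ".00" is ignored
            Date parsed = format.parse(raw);
            return format.format(parsed);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Unparseable time: " + raw, e);
        }
    }

    /** Converts a numeric string to a double and back to its string form. */
    static String reformatPrice(String raw) {
        return String.valueOf(Double.parseDouble(raw));
    }

    public static void main(String[] args) {
        System.out.println(reformatTime("09:30:00.00")); // 09:30:00
        System.out.println(reformatPrice("57"));         // 57.0
        System.out.println(reformatPrice("61.4201"));    // 61.4201
    }
}
```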

Console Output

Job Running Time:- 4 Seconds, 58 Milliseconds
Total Records transferred:- 999999
Time taken by TransformingReader:- 1 Second, 415 Milliseconds
Time taken by CSVReader:- 1 Second, 257 Milliseconds
Time taken by TransformingWriter:- 333 Milliseconds
Time taken by CSVWriter:- 965 Milliseconds
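
The four self times above add up to 3,970 milliseconds, which is 88 milliseconds less than the job's running time of 4,058 milliseconds. A small sketch of that arithmetic, using the figures from this console output, is shown below. TimingBreakdownSketch and its helper methods are hypothetical, and the sketch does not try to explain what the remaining time was spent on.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

class TimingBreakdownSketch {

    /** Adds up the self times of all endpoints, in milliseconds. */
    static long sumMillis(Map<String, Long> selfTimes) {
        long sum = 0;
        for (long millis : selfTimes.values()) {
            sum += millis;
        }
        return sum;
    }

    /** Returns part as a percentage of whole. */
    static double percentOf(long part, long whole) {
        return 100.0 * part / whole;
    }

    public static void main(String[] args) {
        long jobMillis = 4_058; // 4 Seconds, 58 Milliseconds

        // Self times copied from the console output, in milliseconds
        Map<String, Long> selfTimes = new LinkedHashMap<>();
        selfTimes.put("TransformingReader", 1_415L);
        selfTimes.put("CSVReader", 1_257L);
        selfTimes.put("TransformingWriter", 333L);
        selfTimes.put("CSVWriter", 965L);

        for (Map.Entry<String, Long> entry : selfTimes.entrySet()) {
            System.out.println(String.format(Locale.ROOT, "%-20s %5d ms  %5.1f%% of job time",
                    entry.getKey(), entry.getValue(), percentOf(entry.getValue(), jobMillis)));
        }

        long accounted = sumMillis(selfTimes);
        System.out.println("Sum of self times: " + accounted + " ms");
        System.out.println("Not attributed to any endpoint: " + (jobMillis - accounted) + " ms");
    }
}
```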