Measure Performance of Reader and Writer
Updated: Jun 4, 2023
This example shows how to measure the performance of both a reader and a writer.
To do this, a static flag on Endpoint tells the readers and writers to record their processing time.
Input CSV file
stock,time,price,shares
JHX,09:30:00.00,57,95
JNJ,09:30:00.00,91.14,548
OPK,09:30:00.00,8.3,300
OPK,09:30:00.00,8.3,63
OMC,09:30:00.00,74.53,100
OMC,09:30:00.00,74.53,24
TWTR,09:30:00.00,64.89,100
. . .
IR,09:38:14.05,61.4201,100
TE,09:38:14.05,17.16,100
TE,09:38:14.05,17.16,100
MSI,09:38:14.05,67.02,82
TE,09:38:14.05,17.16,600
TE,09:38:14.05,17.16,100
TE,09:38:14.05,17.16,100
TE,09:38:14.05,17.16,200
Java code listing
package com.northconcepts.datapipeline.examples.cookbook;

import java.io.File;

import com.northconcepts.datapipeline.core.DataReader;
import com.northconcepts.datapipeline.core.DataWriter;
import com.northconcepts.datapipeline.core.Endpoint;
import com.northconcepts.datapipeline.csv.CSVReader;
import com.northconcepts.datapipeline.csv.CSVWriter;
import com.northconcepts.datapipeline.job.Job;
import com.northconcepts.datapipeline.transform.BasicFieldTransformer;
import com.northconcepts.datapipeline.transform.CopyField;
import com.northconcepts.datapipeline.transform.RenameField;
import com.northconcepts.datapipeline.transform.TransformingReader;
import com.northconcepts.datapipeline.transform.TransformingWriter;

public class MeasurePerformanceOfReaderAndWriter {

    public static void main(String[] args) {
        //set flag - captureElapsedTime to measure performance.
        Endpoint.setCaptureElapsedTime(true);

        DataReader reader = new CSVReader(new File("example/data/input/trades.csv"))
                .setFieldNamesInFirstRow(true);

        reader = new TransformingReader(reader)
                .add(new BasicFieldTransformer("price").stringToDouble())
                .add(new BasicFieldTransformer("time").stringToTime("hh:mm:s"));

        DataWriter writer = new CSVWriter(new File("example/data/output/writer_performance_measurement.csv"));

        writer = new TransformingWriter(writer)
                .add(new CopyField("stock", "stock_name"))
                .add(new RenameField("shares", "share_count"));

        Job job = Job.run(reader, writer);

        System.out.println("Job Running Time:- " + job.getRunningTimeAsString());
        System.out.println("Total Records transferred:- " + job.getRecordsTransferred());
        System.out.println("Time taken by TranformingReader:- " + reader.getSelfTimeAsString());
        System.out.println("Time taken by CSVReader:- " + reader.getNestedEndpoint().getSelfTimeAsString());
        System.out.println("Time taken by TransformingWriter:- " + writer.getSelfTimeAsString());
        System.out.println("Time taken by CSVWriter:- " + writer.getNestedEndpoint().getSelfTimeAsString());
    }

}
Code walkthrough
- Endpoint.setCaptureElapsedTime(true) enables recording of processing time.
- A CSVReader is created using the file path of the input file trades.csv.
- The CSVReader.setFieldNamesInFirstRow(true) method is invoked to specify that the values in the first row should be used as field names.
- TransformingReader converts fields from strings to their required types, e.g. .add(new BasicFieldTransformer("price").stringToDouble()) converts the price field from string to double.
- A CSVWriter is created for the output file writer_performance_measurement.csv.
- TransformingWriter is a proxy that applies transformations to records passing through. .add(new CopyField("stock", "stock_name")) creates a new field, stock_name, and copies the data in stock to it. .add(new RenameField("shares", "share_count")) renames the field shares to share_count.
- Data is transferred from the reader to the writer via the Job.run() method.
- After the job finishes, getSelfTimeAsString() is called on each reader and writer to print its recorded processing time; getNestedEndpoint() reaches the wrapped CSVReader and CSVWriter.
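The listing prints a separate time for each wrapper and for the endpoint nested inside it. As a rough illustration of how a per-endpoint "self" time can be separated from the time spent in a nested endpoint, here is a minimal sketch in plain Java. It is not Data Pipeline's implementation: TimedSource, ListSource, MappingSource, and SelfTimeSketch are hypothetical names invented for this example, and the assumption is simply that a wrapper's self time is its total time minus the nested endpoint's total time.

```java
import java.util.Iterator;
import java.util.List;
import java.util.function.Function;

/** Drives the hypothetical chain and prints each layer's own time. */
final class SelfTimeSketch {
    public static void main(String[] args) {
        TimedSource csv = new ListSource(List.of("jhx,57", "jnj,91.14"));
        MappingSource upper = new MappingSource(csv, String::toUpperCase);
        while (upper.read() != null) {
            // drain the source; timings accumulate inside read()
        }
        System.out.println("wrapper self ns: " + upper.selfNanos());
        System.out.println("nested self ns:  " + csv.selfNanos());
    }
}

/** Hypothetical stand-in for a reader; not a Data Pipeline type. */
interface TimedSource {
    String read();       // next record, or null at end of input
    long totalNanos();   // time spent inside read(), nested calls included
    long selfNanos();    // totalNanos() minus the nested source's total
}

/** Innermost source: all of its time is its own. */
final class ListSource implements TimedSource {
    private final Iterator<String> rows;
    private long total;

    ListSource(List<String> rows) {
        this.rows = rows.iterator();
    }

    public String read() {
        long start = System.nanoTime();
        try {
            return rows.hasNext() ? rows.next() : null;
        } finally {
            total += System.nanoTime() - start;
        }
    }

    public long totalNanos() { return total; }
    public long selfNanos() { return total; }
}

/** Wrapper that transforms each record read from a nested source. */
final class MappingSource implements TimedSource {
    private final TimedSource nested;
    private final Function<String, String> transform;
    private long total;

    MappingSource(TimedSource nested, Function<String, String> transform) {
        this.nested = nested;
        this.transform = transform;
    }

    public String read() {
        long start = System.nanoTime();
        try {
            String row = nested.read();
            return row == null ? null : transform.apply(row);
        } finally {
            total += System.nanoTime() - start;
        }
    }

    public long totalNanos() { return total; }

    // Subtract the nested source's time so only the wrapper's work remains.
    public long selfNanos() { return total - nested.totalNanos(); }
}
```

Under this model the self times of a wrapper and its nested endpoint add up to the wrapper's total, which is why reporting them separately shows where the time actually goes.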
CSV Output
stock,time,price,share_count,stock_name
JHX,09:30:00,57.0,95,JHX
JNJ,09:30:00,91.14,548,JNJ
OPK,09:30:00,8.3,300,OPK
OPK,09:30:00,8.3,63,OPK
. . .
IR,09:38:14,61.4201,100,IR
TE,09:38:14,17.16,100,TE
TE,09:38:14,17.16,100,TE
MSI,09:38:14,67.02,82,MSI
TE,09:38:14,17.16,600,TE
TE,09:38:14,17.16,100,TE
TE,09:38:14,17.16,100,TE
TE,09:38:14,17.16,200,TE
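In the output above, share_count sits where shares was and stock_name appears as the last column. One way to picture that ordering is a copy that appends a new field and a rename that keeps the field's position. The sketch below models a record as two parallel lists in plain Java; FieldOrderSketch, copyField, and renameField are hypothetical helpers written for this illustration and say nothing about how CopyField and RenameField are implemented internally.

```java
import java.util.ArrayList;
import java.util.List;

final class FieldOrderSketch {

    /** Appends a copy of field 'from' to the end of the record under the name 'to'. */
    static void copyField(List<String> names, List<String> values, String from, String to) {
        int index = names.indexOf(from); // assumes the source field exists
        names.add(to);
        values.add(values.get(index));
    }

    /** Renames a field in place, so its column position does not change. */
    static void renameField(List<String> names, String from, String to) {
        names.set(names.indexOf(from), to); // assumes the field exists
    }

    public static void main(String[] args) {
        // First record of the example, after the reader-side type conversions.
        List<String> names = new ArrayList<>(List.of("stock", "time", "price", "shares"));
        List<String> values = new ArrayList<>(List.of("JHX", "09:30:00", "57.0", "95"));

        copyField(names, values, "stock", "stock_name");
        renameField(names, "shares", "share_count");

        System.out.println(String.join(",", names));
        System.out.println(String.join(",", values));
    }
}
```

Run on the first record, the sketch prints the same header and first row as the CSV output shown above.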
Console Output
Job Running Time:- 4 Seconds, 58 Milliseconds
Total Records transferred:- 999999
Time taken by TranformingReader:- 1 Second, 415 Milliseconds
Time taken by CSVReader:- 1 Second, 257 Milliseconds
Time taken by TransformingWriter:- 333 Milliseconds
Time taken by CSVWriter:- 965 Milliseconds
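To read these figures, it can help to turn them into a throughput and to compare the sum of the four endpoint times with the job's running time. The sketch below does that arithmetic using only the values from this sample run. TimingBreakdownSketch and its helpers are hypothetical names, and treating the leftover time as work outside the four endpoints is an assumption, not something the output states.

```java
final class TimingBreakdownSketch {

    /** Records processed per second for a given elapsed time in milliseconds. */
    static double recordsPerSecond(long records, long elapsedMillis) {
        return records * 1000.0 / elapsedMillis;
    }

    /** Sum of several durations in milliseconds. */
    static long sum(long... millis) {
        long total = 0;
        for (long m : millis) {
            total += m;
        }
        return total;
    }

    public static void main(String[] args) {
        long jobMillis = 4_058;   // "4 Seconds, 58 Milliseconds"
        long records = 999_999;   // "Total Records transferred"

        // TransformingReader, CSVReader, TransformingWriter, CSVWriter
        long endpointMillis = sum(1_415, 1_257, 333, 965);

        System.out.printf("throughput: %.0f records/s%n", recordsPerSecond(records, jobMillis));
        System.out.println("endpoint time: " + endpointMillis + " ms");
        System.out.println("remaining:     " + (jobMillis - endpointMillis) + " ms");
    }
}
```

For this run the job moved roughly 246,000 records per second, the four endpoints account for 3,970 ms of the 4,058 ms total, and reading (2,672 ms across both readers) took about twice as long as writing (1,298 ms across both writers).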