Manipulate Fields
Updated: Jun 23, 2023
In this example, you will learn how to apply various operations to column data, including modifying field types, parsing values, rounding numbers, and more using DataPipeline. This is a flexible and efficient way to transform and manipulate data within columns to achieve your desired data format and structure.
The library provides capabilities for cleaning and normalizing column data. Users can remove unwanted characters, trim leading or trailing spaces, convert case (e.g., to lowercase or uppercase), and perform other data cleansing operations to ensure data consistency and quality.
Java Code Listing
package com.northconcepts.datapipeline.examples.cookbook; import java.sql.Date; import com.northconcepts.datapipeline.core.DataReader; import com.northconcepts.datapipeline.core.DataWriter; import com.northconcepts.datapipeline.core.FieldType; import com.northconcepts.datapipeline.core.Record; import com.northconcepts.datapipeline.core.RecordList; import com.northconcepts.datapipeline.core.StreamWriter; import com.northconcepts.datapipeline.job.Job; import com.northconcepts.datapipeline.memory.MemoryReader; import com.northconcepts.datapipeline.transform.BasicFieldTransformer; import com.northconcepts.datapipeline.transform.TransformingReader; import com.northconcepts.datapipeline.transform.format.Rounder; public class ManipulateFields { public static void main(String[] args) { // Setup test data //-------------------------------------------------------------- RecordList recordList = new RecordList(); Record record1 = new Record(); record1.setField("Account", "JW19850512AST"); record1.setField("Name", "John Wayne"); record1.setField("Balance", 156.35); record1.setField("LastPaymentDate", new Date(2007-1900, 1, 13)); recordList.add(record1); Record record2 = new Record(); record2.getField("Account", true).setValue("PP20010204PIJ"); record2.getField("Name", true).setValue("Peter Parker"); record2.getField("Balance", true).setValue(-120.85); record2.getField("LastPaymentDate", true).setNull(FieldType.DATE); recordList.add(record2); DataReader reader = new MemoryReader(recordList); DataWriter writer = new StreamWriter(System.out); // Setup transformations //-------------------------------------------------------------- TransformingReader transformingReader = new TransformingReader(reader); transformingReader.add(new BasicFieldTransformer("LastPaymentDate") .dateToString("EEE MMM d, yyyy") // 'LastPaymentDate' to string .nullToValue("No payments") // convert any null 'LastPaymentDate' to 'No payments' ); transformingReader.add(new BasicFieldTransformer("Balance") .round(new Rounder(Rounder.RoundingPolicy.HALF_ODD, 1)) // round 'Balance' to 1 decimal .numberToString("$#,##0.00;($#,##0.00)") // format 'Balance' as string ); transformingReader.add(new BasicFieldTransformer("Account") .substring(2, 10) // extract date-string from 'Account' .stringToDate("yyyyMMdd") // parse date-string .dateTimeToString("'Created' EEE MMM d, yyyy") // format date as string ); // Run transformation job //-------------------------------------------------------------- Job.run(transformingReader, writer); } }
Code walkthrough
- Test data is set up. RecordList object is created and two records are added to it.
- In order to add fields to each record,
Record.setField()
method is used. It acceptsfieldName
andvalue
arguments. For two records, values are added for four fields: Account, Name, Balance, LastPaymentDate. - MemoryReader instance is created to obtain records from an in-memory RecordList.
- TransformingReader is a proxy that applies transformations to records passing through.
- In order to apply transformation logic, BasicFieldTransformer is used in the following scenarios:
- "LastPaymentDate" values are parsed from Date to String type. Null values are changed to "No payments" text.
- The "Balance" field is rounded to one decimal point. Then the values are formatted with the following pattern to String:
"$#,##0.00"
. - When it comes to "Account" field, the date string is first e7xtracted from the given value. The result is then parsed into Date type and, then reformatted with a custom pattern to String type.
- Data is transferred from the
transformingReader
to theCSVWriter
via Job.run() method.
Output
----------------------------------------------- 0 - Record (MODIFIED) { 0:[Account]:STRING=[Created Sun May 12, 1985]:String 1:[Name]:STRING=[John Wayne]:String 2:[Balance]:STRING=[$156.30]:String 3:[LastPaymentDate]:STRING=[Tue Feb 13, 2007]:String } ----------------------------------------------- 1 - Record (MODIFIED) { 0:[Account]:STRING=[Created Sun Feb 4, 2001]:String 1:[Name]:STRING=[Peter Parker]:String 2:[Balance]:STRING=[($120.90)]:String 3:[LastPaymentDate]:STRING=[No payments]:String } ----------------------------------------------- 2 records