Set a Calculated Field at Runtime

In this example, you will learn how you can use DataPipeline to add calculated fields to a dataset at runtime, allowing you to derive new values based on existing data. It provides a flexible and dynamic way to perform calculations on the dataset, enhancing data analysis and processing capabilities.

Users can leverage the example to add calculated fields that enrich the dataset with additional information. For example, users can calculate derived metrics, perform mathematical operations, or apply business rules to generate new fields that provide deeper insights into the data.

Input CSV file

Account,LastName,FirstName,Balance,CreditLimit,AccountCreated,Rating
101,Reeves,Keanu,9315.45,10000.00,1/17/1998,A
312,Butler,Gerard,90.00,1000.00,8/6/2003,B
868,Hewitt,Jennifer Love,0,17000.00,5/25/1985,B
761,Pinkett-Smith,Jada,49654.87,100000.00,12/5/2006,A
317,Murray,Bill,789.65,5000.00,2/5/2007,C

Java Code Listing

package com.northconcepts.datapipeline.examples.cookbook;

import java.io.File;

import com.northconcepts.datapipeline.core.DataReader;
import com.northconcepts.datapipeline.core.StreamWriter;
import com.northconcepts.datapipeline.csv.CSVReader;
import com.northconcepts.datapipeline.job.Job;
import com.northconcepts.datapipeline.transform.SetCalculatedField;
import com.northconcepts.datapipeline.transform.TransformingReader;

public class SetACalculatedFieldAtRuntime {
    
    public static void main(String[] args) throws Throwable {
        DataReader reader = new CSVReader(new File("example/data/input/credit-balance-01.csv"))
            .setFieldNamesInFirstRow(true);
        
        // Since CSV fields are strings, they need to be parsed before subtraction
        reader = new TransformingReader(reader)
                .add(new SetCalculatedField("AvailableCredit", "parseDouble(CreditLimit) - parseDouble(Balance)"));
        
        Job.run(reader, new StreamWriter(System.out));
    }
    
}

Code walkthrough

  1. CSVReader is created corresponding to the input file credit-balance-01.csv.
  2. The CSVReader.setFieldNamesInFirstRow(true) method is invoked to specify that the names specified in the first row should be used as field names.
  3. A TransformingReader is created to apply one or more transformations to the incoming data sequentially.
  4. SetCalculatedField object accepts a name argument specifying a new field name in the output, and an expression argument containing calculation operations. In the given example, the "AvailableCredit" field is being added to the dataset. When it comes to the expression part, String values in the "CreditLimit" and "Balance" fields are first parsed to double type and then subtracted from each other: "parseDouble(CreditLimit) - parseDouble(Balance)".
  5.  Data is transferred from the reader to the StreamWriter(System.out) via Job.run() method.

Output

-----------------------------------------------
0 - Record (MODIFIED) {
    0:[Account]:STRING=[101]:String
    1:[LastName]:STRING=[Reeves]:String
    2:[FirstName]:STRING=[Keanu]:String
    3:[Balance]:STRING=[9315.45]:String
    4:[CreditLimit]:STRING=[10000.00]:String
    5:[AccountCreated]:STRING=[1/17/1998]:String
    6:[Rating]:STRING=[A]:String
    7:[AvailableCredit]:DOUBLE=[684.5499999999993]:Double
}

-----------------------------------------------
1 - Record (MODIFIED) {
    0:[Account]:STRING=[312]:String
    1:[LastName]:STRING=[Butler]:String
    2:[FirstName]:STRING=[Gerard]:String
    3:[Balance]:STRING=[90.00]:String
    4:[CreditLimit]:STRING=[1000.00]:String
    5:[AccountCreated]:STRING=[8/6/2003]:String
    6:[Rating]:STRING=[B]:String
    7:[AvailableCredit]:DOUBLE=[910.0]:Double
}

-----------------------------------------------
2 - Record (MODIFIED) {
    0:[Account]:STRING=[868]:String
    1:[LastName]:STRING=[Hewitt]:String
    2:[FirstName]:STRING=[Jennifer Love]:String
    3:[Balance]:STRING=[0]:String
    4:[CreditLimit]:STRING=[17000.00]:String
    5:[AccountCreated]:STRING=[5/25/1985]:String
    6:[Rating]:STRING=[B]:String
    7:[AvailableCredit]:DOUBLE=[17000.0]:Double
}

-----------------------------------------------
3 - Record (MODIFIED) {
    0:[Account]:STRING=[761]:String
    1:[LastName]:STRING=[Pinkett-Smith]:String
    2:[FirstName]:STRING=[Jada]:String
    3:[Balance]:STRING=[49654.87]:String
    4:[CreditLimit]:STRING=[100000.00]:String
    5:[AccountCreated]:STRING=[12/5/2006]:String
    6:[Rating]:STRING=[A]:String
    7:[AvailableCredit]:DOUBLE=[50345.13]:Double
}

-----------------------------------------------
4 - Record (MODIFIED) {
    0:[Account]:STRING=[317]:String
    1:[LastName]:STRING=[Murray]:String
    2:[FirstName]:STRING=[Bill]:String
    3:[Balance]:STRING=[789.65]:String
    4:[CreditLimit]:STRING=[5000.00]:String
    5:[AccountCreated]:STRING=[2/5/2007]:String
    6:[Rating]:STRING=[C]:String
    7:[AvailableCredit]:DOUBLE=[4210.35]:Double
}

-----------------------------------------------
5 records
Mobile Analytics