Conditionally Transform Data

This example shows how to use DataPipeline to read a JSON file and transform field (column) values only in records that meet a specified condition. Conditional transformations let you modify and update values dynamically as records pass through the pipeline.

The same approach can be used for data validation on JSON data. By defining conditions and associated actions, you can identify and handle erroneous or inconsistent values, which helps keep the data consistent.

Input JSON file

[{"stageName":"John Wayne","realName":"Marion Robert Morrison","gender":"male","city":"Winterset","balance":156.35},
{"stageName":"Spiderman","realName":"Peter Parker","gender":"male","city":"New York","balance":-0.96}]

Java Code Listing

package com.northconcepts.datapipeline.examples.cookbook;

import java.io.File;

import com.northconcepts.datapipeline.core.DataReader;
import com.northconcepts.datapipeline.core.DataWriter;
import com.northconcepts.datapipeline.core.StreamWriter;
import com.northconcepts.datapipeline.filter.FilterExpression;
import com.northconcepts.datapipeline.job.Job;
import com.northconcepts.datapipeline.json.SimpleJsonReader;
import com.northconcepts.datapipeline.transform.SetField;
import com.northconcepts.datapipeline.transform.TransformingReader;

public class ConditionallyTransformData {

    public static void main(String[] args) {
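        // Read records from the JSON input file
        DataReader reader = new SimpleJsonReader(
                new File("example/data/input/simple-json-input.json"));
        
        // Apply the transformation only to records whose balance is negative
        reader = new TransformingReader(reader)
                .setCondition(new FilterExpression("balance < 0"))
                .add(new SetField("balance", 0.0));
        
        // Write each record to the console
        DataWriter writer = new StreamWriter(System.out);
        
        // Run the pipeline: transfer every record from the reader to the writer
        Job.run(reader, writer);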
    }
/* input
[{"stageName":"John Wayne","realName":"Marion Robert Morrison","gender":"male","city":"Winterset","balance":156.35},
{"stageName":"Spiderman","realName":"Peter Parker","gender":"male","city":"New York","balance":-0.96}]
*/
// output - Spiderman's balance becomes 0.0
}

Code walkthrough

  1. A SimpleJsonReader is created for the input file simple-json-input.json.
  2. TransformingReader is a proxy that applies transformations to the records passing through it.
  3. A condition is set with FilterExpression. In the example, the expression "balance < 0" selects records whose balance is negative.
  4. The next step defines what happens when the condition is met. In the example, "balance" is set to 0.0 with the SetField instance; the balance of records that do not match is left as it was.
  5. Data is transferred from the reader to the StreamWriter, which prints to System.out, via the Job.run() method.
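
The condition-plus-action idea from the walkthrough can also be sketched in plain Java, without the DataPipeline library. This is only an illustration of the pattern, not how DataPipeline is implemented: the class ConditionalTransformSketch and its methods transformWhere and record are hypothetical names invented for this sketch, and records are modeled as simple maps.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;
import java.util.function.Predicate;

// Hypothetical plain-Java illustration; none of these names come from DataPipeline.
public class ConditionalTransformSketch {

    // Applies the action only to records that match the condition.
    // Records that do not match are copied through unchanged.
    public static List<Map<String, Object>> transformWhere(
            List<Map<String, Object>> records,
            Predicate<Map<String, Object>> condition,
            Consumer<Map<String, Object>> action) {
        List<Map<String, Object>> result = new ArrayList<>();
        for (Map<String, Object> record : records) {
            // Work on a copy so the input records are left untouched
            Map<String, Object> copy = new LinkedHashMap<>(record);
            if (condition.test(copy)) {
                action.accept(copy);
            }
            result.add(copy);
        }
        return result;
    }

    // Builds a small record with two of the fields from the sample input.
    public static Map<String, Object> record(String stageName, double balance) {
        Map<String, Object> r = new LinkedHashMap<>();
        r.put("stageName", stageName);
        r.put("balance", balance);
        return r;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> input = Arrays.asList(
                record("John Wayne", 156.35),
                record("Spiderman", -0.96));

        List<Map<String, Object>> output = transformWhere(
                input,
                r -> ((Double) r.get("balance")) < 0, // role of FilterExpression("balance < 0")
                r -> r.put("balance", 0.0));          // role of SetField("balance", 0.0)

        // Prints:
        // {stageName=John Wayne, balance=156.35}
        // {stageName=Spiderman, balance=0.0}
        output.forEach(System.out::println);
    }
}
```

Keeping the condition and the action as separate arguments mirrors the split between setCondition() and add() in the listing above: either one can be swapped without touching the other.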

Output

-----------------------------------------------
0 - Record (MODIFIED) {
    0:[stageName]:STRING=[John Wayne]:String
    1:[realName]:STRING=[Marion Robert Morrison]:String
    2:[gender]:STRING=[male]:String
    3:[city]:STRING=[Winterset]:String
    4:[balance]:DOUBLE=[156.35]:Double
}

-----------------------------------------------
1 - Record (MODIFIED) {
    0:[stageName]:STRING=[Spiderman]:String
    1:[realName]:STRING=[Peter Parker]:String
    2:[gender]:STRING=[male]:String
    3:[city]:STRING=[New York]:String
    4:[balance]:DOUBLE=[0.0]:Double
}

-----------------------------------------------
2 records

 
