Compare Records Using Diff
Updated: Feb 7, 2024
This example shows how to use DataPipeline to compare two records and determine which fields were added, modified, or removed. Because records can be nested or contain arrays, the entire data tree of each record is compared. This is useful for auditing and for tracking the transformations applied to individual records.
Java Code
package com.northconcepts.datapipeline.foundations.examples.difference;

import java.math.BigDecimal;
import java.time.LocalDate;

import com.northconcepts.datapipeline.core.Record;
import com.northconcepts.datapipeline.foundations.difference.RecordDiff;

public class CompareRecords {

    public static void main(String[] args) {
        Record oldRecord = new Record()
                .setField("name", "John Doe")
                .setField("dob", LocalDate.parse("2000-01-01"))
                .setField("languages", new String[] {"English", "French"})
                .setField("height", 1.70)
                .setField("netIncome", new BigDecimal(100_286.99));

        Record newRecord = new Record()
                .setField("name", "John Doe")
                .setField("age", 24)
                .setField("languages", new String[] {"English", "French", "Spanish"})
                .setField("height", 1.73)
                .setField("netIncome", new BigDecimal(120_286.99));

        // The trailing arguments name the fields to exclude from the comparison.
        RecordDiff diff = RecordDiff.diff("userRecord", oldRecord, newRecord, "height", "netIncome");

        // Diff will report the following changes:
        // name - NONE
        // dob - REMOVED
        // age - ADDED
        // English - NONE
        // French - NONE
        // Spanish - ADDED
        // height and netIncome will not be reported as they are excluded from comparison.
        System.out.println("Record Diff: " + diff);
    }

}
Code Walkthrough
- An oldRecord is created and initialized with the following fields: name, dob, languages, height, and netIncome.
- A newRecord is created and initialized with the following fields: name, age, languages, height, and netIncome.
- A RecordDiff instance is created by calling the RecordDiff.diff() method with the following arguments: the name of the diff, the old record, the new record, and the names of the fields to exclude from the comparison.
- The differences between the oldRecord and the newRecord are reported as:
  - CHANGED - if the value of an existing property was updated.
  - ADDED - if a new property was added (e.g., the age field exists only in the newRecord).
  - REMOVED - if an existing property was removed (e.g., dob exists in the oldRecord but not in the newRecord).
  - NONE - if the property did not change (e.g., the name field is identical in both records).
- Any changes to fields excluded from the comparison (i.e., height and netIncome) are not reported.
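To make the four change types and the exclusion rule concrete, here is a minimal sketch in plain Java that classifies the top-level fields of two maps. It does not use DataPipeline and is not how RecordDiff is implemented: FieldDiffSketch, ChangeType, and diff are hypothetical names chosen for illustration, and the sketch compares values with equals only, without walking nested records or arrays.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

// Hypothetical illustration only; not part of the DataPipeline API.
public class FieldDiffSketch {

    // Mirrors the four change types described in the walkthrough.
    enum ChangeType { NONE, ADDED, REMOVED, CHANGED }

    // Classifies every top-level field found in either map, skipping excluded names.
    static Map<String, ChangeType> diff(Map<String, Object> oldFields,
                                        Map<String, Object> newFields,
                                        String... excludedFields) {
        Set<String> excluded = new HashSet<>(Arrays.asList(excludedFields));

        // Visit the old field names first, then names that exist only in the new map.
        Set<String> names = new LinkedHashSet<>(oldFields.keySet());
        names.addAll(newFields.keySet());

        Map<String, ChangeType> result = new LinkedHashMap<>();
        for (String name : names) {
            if (excluded.contains(name)) {
                continue; // excluded fields are never reported
            }
            if (!oldFields.containsKey(name)) {
                result.put(name, ChangeType.ADDED);
            } else if (!newFields.containsKey(name)) {
                result.put(name, ChangeType.REMOVED);
            } else if (Objects.equals(oldFields.get(name), newFields.get(name))) {
                result.put(name, ChangeType.NONE);
            } else {
                result.put(name, ChangeType.CHANGED);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Object> oldFields = new LinkedHashMap<>();
        oldFields.put("name", "John Doe");
        oldFields.put("dob", "2000-01-01");
        oldFields.put("height", 1.70);

        Map<String, Object> newFields = new LinkedHashMap<>();
        newFields.put("name", "John Doe");
        newFields.put("age", 24);
        newFields.put("height", 1.73);

        // prints {name=NONE, dob=REMOVED, age=ADDED}
        System.out.println(diff(oldFields, newFields, "height"));
    }
}
```

Calling the same method without the "height" argument would also report height as CHANGED, which is the case the example above hides by excluding that field.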
Console Output
Record Diff: {
  "children" : [ {
    "name" : "name",
    "type" : "NONE"
  }, {
    "name" : "dob",
    "type" : "REMOVED"
  }, {
    "name" : "English",
    "type" : "NONE"
  }, {
    "name" : "French",
    "type" : "NONE"
  }, {
    "name" : "Spanish",
    "type" : "ADDED"
  }, {
    "name" : "age",
    "type" : "ADDED"
  } ],
  "name" : "userRecord",
  "type" : "CHANGED"
}