Add a Sequence Number Column when Values Change
            Updated: Jun 20, 2023 
        
        
        
    In this example, you will learn how you can use DataPipeline to transform data on the fly by adding a new Sequence Number Column that tracks the changes in a specified field. It provides users with the capability to assign a unique sequence number to each record based on the changes in the specified field, facilitating data tracking and analysis.
The example can be used to create versions of the data based on changes in the specified field. By assigning sequence numbers to track changes, users can analyze the historical progression of the data and perform a comparative analysis between different versions, facilitating historical trend analysis and understanding data evolution.
Input CSV file
Account,LastName,FirstName,Balance,CreditLimit,AccountCreated,Rating 101,Reeves,Keanu,9315.45,10000.00,1/17/1998,A 312,Butler,Gerard,90.00,1000.00,8/6/2003,B 868,Hewitt,Jennifer Love,0,17000.00,5/25/1985,B 761,Pinkett-Smith,Jada,49654.87,100000.00,12/5/2006,A 317,Murray,Bill,789.65,5000.00,2/5/2007,C
Java code listing
package com.northconcepts.datapipeline.examples.cookbook;
import java.io.File;
import com.northconcepts.datapipeline.core.DataReader;
import com.northconcepts.datapipeline.core.StreamWriter;
import com.northconcepts.datapipeline.csv.CSVReader;
import com.northconcepts.datapipeline.job.Job;
import com.northconcepts.datapipeline.transform.SetBatchSequenceNumberField;
import com.northconcepts.datapipeline.transform.TransformingReader;
public class AddASequenceNumberColumnWhenValuesChange {
    public static void main(String[] args) {
        DataReader reader = new CSVReader(new File("example/data/input/credit-balance-01.csv"))
                .setFieldNamesInFirstRow(true);
        TransformingReader transformingReader = new TransformingReader(reader);
        transformingReader.add(new SetBatchSequenceNumberField("BatchNumber", 10, 5, "Rating"));
        Job.run(transformingReader, new StreamWriter(System.out));
    }
}
Code walkthrough
- CSVReader is created corresponding to the input file credit-balance-01.csv.
- The CSVReader.setFieldNamesInFirstRow(true)method is invoked to specify that the names specified in the first row should be used as field names.
- A TransformingReader is created to sequentially apply one or more transformations to the incoming data.
- transformingReader.add(new SetBatchSequenceNumberField("BatchNumber", 10, 5, "Rating"))is used to add a new sequence number column to the data from the- reader.
- SetBatchSequenceNumberField("BatchNumber", 10, 5, "Rating")accepts four arguments i.e. name(- "BatchNumber"), start(- 10), step(- 5) and watchFields(- "Rating"). This method uses var args, meaning that for the last watchFields field, a variable number of String arguments can be specified.
- SetBatchSequenceNumberFieldis used to include a sequential field (auto increment) with the specified increment (step) when the values in the specified- watchFieldsdiffer from the previous record. For this particular example, the- BatchNumberfield initiates with a value of- 10for the first record and monitors changes in the- Ratingfield. If the value is different from the previous record's- Ratingfield, the- BatchNumberis incremented by- 5i.e. 10, 15, 20 ...
- Data is transferred from the transformingReaderto theStreamWriter(System.out)via Job.run() method.
Console Output
-----------------------------------------------
0 - Record (MODIFIED) {
    0:[Account]:STRING=[101]:String
    1:[LastName]:STRING=[Reeves]:String
    2:[FirstName]:STRING=[Keanu]:String
    3:[Balance]:STRING=[9315.45]:String
    4:[CreditLimit]:STRING=[10000.00]:String
    5:[AccountCreated]:STRING=[1/17/1998]:String
    6:[Rating]:STRING=[A]:String
    7:[BatchNumber]:LONG=[10]:Long
}
-----------------------------------------------
1 - Record (MODIFIED) {
    0:[Account]:STRING=[312]:String
    1:[LastName]:STRING=[Butler]:String
    2:[FirstName]:STRING=[Gerard]:String
    3:[Balance]:STRING=[90.00]:String
    4:[CreditLimit]:STRING=[1000.00]:String
    5:[AccountCreated]:STRING=[8/6/2003]:String
    6:[Rating]:STRING=[B]:String
    7:[BatchNumber]:LONG=[15]:Long
}
-----------------------------------------------
2 - Record (MODIFIED) {
    0:[Account]:STRING=[868]:String
    1:[LastName]:STRING=[Hewitt]:String
    2:[FirstName]:STRING=[Jennifer Love]:String
    3:[Balance]:STRING=[0]:String
    4:[CreditLimit]:STRING=[17000.00]:String
    5:[AccountCreated]:STRING=[5/25/1985]:String
    6:[Rating]:STRING=[B]:String
    7:[BatchNumber]:LONG=[15]:Long
}
-----------------------------------------------
3 - Record (MODIFIED) {
    0:[Account]:STRING=[761]:String
    1:[LastName]:STRING=[Pinkett-Smith]:String
    2:[FirstName]:STRING=[Jada]:String
    3:[Balance]:STRING=[49654.87]:String
    4:[CreditLimit]:STRING=[100000.00]:String
    5:[AccountCreated]:STRING=[12/5/2006]:String
    6:[Rating]:STRING=[A]:String
    7:[BatchNumber]:LONG=[20]:Long
}
-----------------------------------------------
4 - Record (MODIFIED) {
    0:[Account]:STRING=[317]:String
    1:[LastName]:STRING=[Murray]:String
    2:[FirstName]:STRING=[Bill]:String
    3:[Balance]:STRING=[789.65]:String
    4:[CreditLimit]:STRING=[5000.00]:String
    5:[AccountCreated]:STRING=[2/5/2007]:String
    6:[Rating]:STRING=[C]:String
    7:[BatchNumber]:LONG=[25]:Long
}
-----------------------------------------------
5 records
        
