Serialize and Deserialize Data
Updated: Aug 8, 2023
This example shows how data can be saved to DataPipeline's internal binary format and read back later. This serialization and deserialization can be used for a variety of purposes, including to:
- Stage data in a multi-step process
- Back up and restore data in a database or other source
- Transmit data over a network
- Store arbitrary data in a database or S3 file
Input File
Account,LastName,FirstName,Balance,CreditLimit,AccountCreated,Rating
101,Reeves,Keanu,9315.45,10000.00,1/17/1998,A
312,Butler,Gerard,90.00,1000.00,8/6/2003,B
868,Hewitt,Jennifer Love,0,17000.00,5/25/1985,B
761,Pinkett-Smith,Jada,49654.87,100000.00,12/5/2006,A
317,Murray,Bill,789.65,5000.00,2/5/2007,C
Java Code
package com.northconcepts.datapipeline.examples.cookbook;

import java.io.File;

import com.northconcepts.datapipeline.core.DataReader;
import com.northconcepts.datapipeline.core.DataWriter;
import com.northconcepts.datapipeline.core.StreamWriter;
import com.northconcepts.datapipeline.csv.CSVReader;
import com.northconcepts.datapipeline.file.FileReader;
import com.northconcepts.datapipeline.file.FileWriter;
import com.northconcepts.datapipeline.job.Job;

public class SerializeDeserializeData {

    public static void main(String[] args) {
        File csvFile = new File("example/data/input/credit-balance-01.csv");
        File binaryFile = new File("example/data/output/credit-balance-01.bin");

        DataReader reader;
        DataWriter writer;

        // Serialize records to binary file
        reader = new CSVReader(csvFile).setFieldNamesInFirstRow(true);
        writer = new FileWriter(binaryFile);
        Job.run(reader, writer);

        // Deserialize records from binary file
        reader = new FileReader(binaryFile);
        writer = new StreamWriter(System.out);
        Job.run(reader, writer);
    }

}
Code Walkthrough
- A CSVReader is created to read from the local file credit-balance-01.csv, with the first row treated as field names.
- To store the serialized records, a binary file credit-balance-01.bin is created and passed to a FileWriter instance.
- Data is transferred from the reader to the writer declared in the previous step via Job.run().
- A FileReader instance is used to deserialize data from the binary file.
- StreamWriter(System.out) is used to print the output to the console in a human-readable format.
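To make the write-then-read round trip in the walkthrough concrete, here is a minimal sketch that uses only the Java standard library. It is not DataPipeline's internal binary format, which this page does not describe. The class RoundTripSketch, its serialize and deserialize methods, and the byte layout (a record count, then a field count and name/value pairs per record) are all assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: this is NOT DataPipeline's binary format.
final class RoundTripSketch {

    // Writes the record count, then for each record its field count
    // followed by (name, value) pairs as length-prefixed UTF strings.
    static byte[] serialize(List<Map<String, String>> records) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(records.size());
            for (Map<String, String> record : records) {
                out.writeInt(record.size());
                for (Map.Entry<String, String> field : record.entrySet()) {
                    out.writeUTF(field.getKey());
                    out.writeUTF(field.getValue());
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Reads the records back in the same order they were written.
    static List<Map<String, String>> deserialize(byte[] data) {
        List<Map<String, String>> records = new ArrayList<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            int recordCount = in.readInt();
            for (int i = 0; i < recordCount; i++) {
                int fieldCount = in.readInt();
                // LinkedHashMap preserves the original field order.
                Map<String, String> record = new LinkedHashMap<>();
                for (int j = 0; j < fieldCount; j++) {
                    String name = in.readUTF();
                    String value = in.readUTF();
                    record.put(name, value);
                }
                records.add(record);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return records;
    }

    public static void main(String[] args) {
        Map<String, String> record = new LinkedHashMap<>();
        record.put("Account", "101");
        record.put("LastName", "Reeves");
        record.put("Balance", "9315.45");

        byte[] data = serialize(List.of(record));
        System.out.println(deserialize(data));
    }
}
```

Because the sketch works on a byte array rather than a file, the same bytes could equally be staged on disk, sent over a network, or stored as a blob, which mirrors the uses listed at the top of this page.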
Console Output
-----------------------------------------------
0 - Record (MODIFIED) {
0:[Account]:STRING=[101]:String
1:[LastName]:STRING=[Reeves]:String
2:[FirstName]:STRING=[Keanu]:String
3:[Balance]:STRING=[9315.45]:String
4:[CreditLimit]:STRING=[10000.00]:String
5:[AccountCreated]:STRING=[1/17/1998]:String
6:[Rating]:STRING=[A]:String
}
-----------------------------------------------
1 - Record (MODIFIED) {
0:[Account]:STRING=[312]:String
1:[LastName]:STRING=[Butler]:String
2:[FirstName]:STRING=[Gerard]:String
3:[Balance]:STRING=[90.00]:String
4:[CreditLimit]:STRING=[1000.00]:String
5:[AccountCreated]:STRING=[8/6/2003]:String
6:[Rating]:STRING=[B]:String
}
-----------------------------------------------
2 - Record (MODIFIED) {
0:[Account]:STRING=[868]:String
1:[LastName]:STRING=[Hewitt]:String
2:[FirstName]:STRING=[Jennifer Love]:String
3:[Balance]:STRING=[0]:String
4:[CreditLimit]:STRING=[17000.00]:String
5:[AccountCreated]:STRING=[5/25/1985]:String
6:[Rating]:STRING=[B]:String
}
-----------------------------------------------
3 - Record (MODIFIED) {
0:[Account]:STRING=[761]:String
1:[LastName]:STRING=[Pinkett-Smith]:String
2:[FirstName]:STRING=[Jada]:String
3:[Balance]:STRING=[49654.87]:String
4:[CreditLimit]:STRING=[100000.00]:String
5:[AccountCreated]:STRING=[12/5/2006]:String
6:[Rating]:STRING=[A]:String
}
-----------------------------------------------
4 - Record (MODIFIED) {
0:[Account]:STRING=[317]:String
1:[LastName]:STRING=[Murray]:String
2:[FirstName]:STRING=[Bill]:String
3:[Balance]:STRING=[789.65]:String
4:[CreditLimit]:STRING=[5000.00]:String
5:[AccountCreated]:STRING=[2/5/2007]:String
6:[Rating]:STRING=[C]:String
}
-----------------------------------------------
5 records
