Serialize and Deserialize Data
Updated: Aug 8, 2023
This example shows how data can be saved to DataPipeline's internal binary format and read back later. This serialization and deserialization can be used for a variety of purposes, including to:
- Stage data in a multi-step process
- Back up and restore data in a database or other source
- Transmit data over a network
- Store arbitrary data in a database or S3 file
Input File
Account,LastName,FirstName,Balance,CreditLimit,AccountCreated,Rating
101,Reeves,Keanu,9315.45,10000.00,1/17/1998,A
312,Butler,Gerard,90.00,1000.00,8/6/2003,B
868,Hewitt,Jennifer Love,0,17000.00,5/25/1985,B
761,Pinkett-Smith,Jada,49654.87,100000.00,12/5/2006,A
317,Murray,Bill,789.65,5000.00,2/5/2007,C
Java Code
package com.northconcepts.datapipeline.examples.cookbook;

import java.io.File;

import com.northconcepts.datapipeline.core.DataReader;
import com.northconcepts.datapipeline.core.DataWriter;
import com.northconcepts.datapipeline.core.StreamWriter;
import com.northconcepts.datapipeline.csv.CSVReader;
import com.northconcepts.datapipeline.file.FileReader;
import com.northconcepts.datapipeline.file.FileWriter;
import com.northconcepts.datapipeline.job.Job;

public class SerializeDeserializeData {

    public static void main(String[] args) {
        File csvFile = new File("example/data/input/credit-balance-01.csv");
        File binaryFile = new File("example/data/output/credit-balance-01.bin");

        DataReader reader;
        DataWriter writer;

        // Serialize records to binary file
        reader = new CSVReader(csvFile).setFieldNamesInFirstRow(true);
        writer = new FileWriter(binaryFile);
        Job.run(reader, writer);

        // Deserialize records from binary file
        reader = new FileReader(binaryFile);
        writer = new StreamWriter(System.out);
        Job.run(reader, writer);
    }

}
Code Walkthrough
- A CSVReader is created to read from the local file credit-balance-01.csv, with the first row treated as field names.
- A FileWriter is created for the binary file credit-balance-01.bin, which will store the serialized records.
- Job.run() transfers the data from the reader to the writer, serializing each record to the binary file.
- A FileReader is then created to deserialize the records from the binary file.
- StreamWriter(System.out) is used to print the output to the console in a human-readable format.
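The write-then-read round trip described above can also be sketched using only JDK classes, which may help when reasoning about what serialization and deserialization do. This is not DataPipeline's internal binary format or API: the class name RoundTripSketch, its serialize and deserialize methods, and the length-prefixed layout are all assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; unrelated to DataPipeline's real binary format.
public class RoundTripSketch {

    // Assumed layout: record count, then for each record a field count
    // followed by each field value written with writeUTF.
    static byte[] serialize(List<List<String>> records) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(records.size());
            for (List<String> record : records) {
                out.writeInt(record.size());
                for (String field : record) {
                    out.writeUTF(field);
                }
            }
        }
        return bytes.toByteArray();
    }

    // Reads the same layout back into a list of records.
    static List<List<String>> deserialize(byte[] data) throws IOException {
        List<List<String>> records = new ArrayList<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            int recordCount = in.readInt();
            for (int i = 0; i < recordCount; i++) {
                int fieldCount = in.readInt();
                List<String> record = new ArrayList<>(fieldCount);
                for (int j = 0; j < fieldCount; j++) {
                    record.add(in.readUTF());
                }
                records.add(record);
            }
        }
        return records;
    }

    public static void main(String[] args) throws IOException {
        List<List<String>> records = Arrays.asList(
                Arrays.asList("101", "Reeves", "Keanu", "9315.45"),
                Arrays.asList("312", "Butler", "Gerard", "90.00"));

        byte[] binary = serialize(records);                  // serialize step
        List<List<String>> restored = deserialize(binary);   // deserialize step

        // Reports whether the restored records equal the originals.
        System.out.println(restored.equals(records));
    }
}
```

In this sketch an in-memory byte array stands in for the binary file. The same bytes could equally be written to disk or sent over a network, which is the kind of staging and transmission use the example is aimed at.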
Console Output
-----------------------------------------------
0 - Record (MODIFIED) {
    0:[Account]:STRING=[101]:String
    1:[LastName]:STRING=[Reeves]:String
    2:[FirstName]:STRING=[Keanu]:String
    3:[Balance]:STRING=[9315.45]:String
    4:[CreditLimit]:STRING=[10000.00]:String
    5:[AccountCreated]:STRING=[1/17/1998]:String
    6:[Rating]:STRING=[A]:String
}
-----------------------------------------------
1 - Record (MODIFIED) {
    0:[Account]:STRING=[312]:String
    1:[LastName]:STRING=[Butler]:String
    2:[FirstName]:STRING=[Gerard]:String
    3:[Balance]:STRING=[90.00]:String
    4:[CreditLimit]:STRING=[1000.00]:String
    5:[AccountCreated]:STRING=[8/6/2003]:String
    6:[Rating]:STRING=[B]:String
}
-----------------------------------------------
2 - Record (MODIFIED) {
    0:[Account]:STRING=[868]:String
    1:[LastName]:STRING=[Hewitt]:String
    2:[FirstName]:STRING=[Jennifer Love]:String
    3:[Balance]:STRING=[0]:String
    4:[CreditLimit]:STRING=[17000.00]:String
    5:[AccountCreated]:STRING=[5/25/1985]:String
    6:[Rating]:STRING=[B]:String
}
-----------------------------------------------
3 - Record (MODIFIED) {
    0:[Account]:STRING=[761]:String
    1:[LastName]:STRING=[Pinkett-Smith]:String
    2:[FirstName]:STRING=[Jada]:String
    3:[Balance]:STRING=[49654.87]:String
    4:[CreditLimit]:STRING=[100000.00]:String
    5:[AccountCreated]:STRING=[12/5/2006]:String
    6:[Rating]:STRING=[A]:String
}
-----------------------------------------------
4 - Record (MODIFIED) {
    0:[Account]:STRING=[317]:String
    1:[LastName]:STRING=[Murray]:String
    2:[FirstName]:STRING=[Bill]:String
    3:[Balance]:STRING=[789.65]:String
    4:[CreditLimit]:STRING=[5000.00]:String
    5:[AccountCreated]:STRING=[2/5/2007]:String
    6:[Rating]:STRING=[C]:String
}
-----------------------------------------------
5 records