Read Concatenated JSON
Updated: Sep 27, 2022
This example shows you how to read a concatenated JSON file using JsonRecordReader.
This example can be easily modified to show how to Write Concatenated JSON With Nested Data.
JSON input
{"id":"0001","type":"donut","name":"Cake","ppu":0.55,"batters":{"batter":[{"id":"1001","type":"Regular"},{"id":"1002","type":"Chocolate"},{"id":"1003","type":"Blueberry"},{"id":"1004","type":"Devil's Food"}]},"topping":[{"id":"5001","type":"None"},{"id":"5002","type":"Glazed"},{"id":"5005","type":"Sugar"},{"id":"5007","type":"Powdered Sugar"},{"id":"5006","type":"Chocolate with Sprinkles"},{"id":"5003","type":"Chocolate"},{"id":"5004","type":"Maple"}]}{"id":"0002","type":"donut","name":"Raised","ppu":0.55,"batters":{"batter":[{"id":"1001","type":"Regular"}]},"topping":[{"id":"5001","type":"None"},{"id":"5002","type":"Glazed"},{"id":"5005","type":"Sugar"},{"id":"5003","type":"Chocolate"},{"id":"5004","type":"Maple"}]}
Java Code Listing
package com.northconcepts.datapipeline.examples.cookbook;

import com.northconcepts.datapipeline.core.DataWriter;
import com.northconcepts.datapipeline.job.Job;
import com.northconcepts.datapipeline.json.JsonLinesWriter;
import com.northconcepts.datapipeline.json.JsonRecordReader;

import java.io.File;

public class ReadConcatenatedJson {

    public static void main(String[] args) {
        // Read the concatenated JSON input; "/object" is the path at which a new record starts
        JsonRecordReader reader = new JsonRecordReader(new File("data/input/concatenated-json.jsonl"))
                .addRecordBreak("/object");

        // Write each record as one line of JSON
        DataWriter writer = new JsonLinesWriter(new File("data/output/output-json.jsonl"));

        // Transfer all records from the reader to the writer
        Job.run(reader, writer);
    }

}
Code Walkthrough
- A JsonRecordReader is created for the input file concatenated-json.jsonl.
- The .addRecordBreak() method is used to separate the records; /object specifies the path to the record you want to read. To see the paths to the records, run the code in debug mode, i.e. setDebug(true). Debug mode is false by default.
- A JsonLinesWriter object is created for the output file output-json.jsonl.
- Data is then transferred from the reader to the output file via Job.run().
JsonRecordReader
JsonRecordReader is an input reader that reads records from a JSON input stream. Its JsonRecordReader.addRecordBreak method tells the reader to return a new record using whatever fields have been assigned so far. In other words, it marks the boundaries between records.
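To make the idea of demarcating records more concrete, the following is a minimal, library-free sketch of how back-to-back JSON objects could be separated by tracking brace depth. It is not how JsonRecordReader is implemented; the class name ConcatenatedJsonSplitter and its split method are hypothetical and exist only for illustration. The sketch assumes every top-level value is a JSON object and does not validate the JSON.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of Data Pipeline.
public class ConcatenatedJsonSplitter {

    // Returns one string per top-level JSON object found in the input.
    public static List<String> split(String input) {
        List<String> records = new ArrayList<>();
        int depth = 0;             // current nesting depth of { }
        int start = -1;            // index where the current top-level object began
        boolean inString = false;  // true while inside a JSON string literal
        boolean escaped = false;   // true right after a backslash inside a string

        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);

            if (inString) {
                // Braces inside string values must not affect the depth
                if (escaped) {
                    escaped = false;
                } else if (c == '\\') {
                    escaped = true;
                } else if (c == '"') {
                    inString = false;
                }
                continue;
            }

            if (c == '"') {
                inString = true;
            } else if (c == '{') {
                if (depth == 0) {
                    start = i;     // a new record begins here
                }
                depth++;
            } else if (c == '}' && depth > 0) {
                depth--;
                if (depth == 0) {
                    // Closing brace of a top-level object: this is the record break
                    records.add(input.substring(start, i + 1));
                    start = -1;
                }
            }
        }
        return records;
    }

    public static void main(String[] args) {
        String input = "{\"id\":\"0001\",\"name\":\"Cake\"}{\"id\":\"0002\",\"name\":\"Raised\"}";
        for (String record : split(input)) {
            System.out.println(record);  // one JSON object per line
        }
    }
}
```

Running main prints the two objects on separate lines, which mirrors the shape of the JSON Lines output in this example. In practice the reader handles this for you, and addRecordBreak additionally lets you choose the path at which records are emitted.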
Output
{"id":"0001","type":"donut","name":"Cake","ppu":0.55,"batters":{"batter":[{"id":"1001","type":"Regular"},{"id":"1002","type":"Chocolate"},{"id":"1003","type":"Blueberry"},{"id":"1004","type":"Devil's Food"}]},"topping":[{"id":"5001","type":"None"},{"id":"5002","type":"Glazed"},{"id":"5005","type":"Sugar"},{"id":"5007","type":"Powdered Sugar"},{"id":"5006","type":"Chocolate with Sprinkles"},{"id":"5003","type":"Chocolate"},{"id":"5004","type":"Maple"}]}
{"id":"0002","type":"donut","name":"Raised","ppu":0.55,"batters":{"batter":[{"id":"1001","type":"Regular"}]},"topping":[{"id":"5001","type":"None"},{"id":"5002","type":"Glazed"},{"id":"5005","type":"Sugar"},{"id":"5003","type":"Chocolate"},{"id":"5004","type":"Maple"}]}
