Read Concatenated JSON
Updated: Sep 27, 2022
This example shows you how to read a concatenated JSON file using JsonRecordReader.
This example can be easily modified to show how to Write Concatenated JSON With Nested Data.
JSON input
{"id":"0001","type":"donut","name":"Cake","ppu":0.55,"batters":{"batter":[{"id":"1001","type":"Regular"},{"id":"1002","type":"Chocolate"},{"id":"1003","type":"Blueberry"},{"id":"1004","type":"Devil's Food"}]},"topping":[{"id":"5001","type":"None"},{"id":"5002","type":"Glazed"},{"id":"5005","type":"Sugar"},{"id":"5007","type":"Powdered Sugar"},{"id":"5006","type":"Chocolate with Sprinkles"},{"id":"5003","type":"Chocolate"},{"id":"5004","type":"Maple"}]}{"id":"0002","type":"donut","name":"Raised","ppu":0.55,"batters":{"batter":[{"id":"1001","type":"Regular"}]},"topping":[{"id":"5001","type":"None"},{"id":"5002","type":"Glazed"},{"id":"5005","type":"Sugar"},{"id":"5003","type":"Chocolate"},{"id":"5004","type":"Maple"}]}
Java Code Listing
package com.northconcepts.datapipeline.examples.cookbook;

import com.northconcepts.datapipeline.core.DataWriter;
import com.northconcepts.datapipeline.job.Job;
import com.northconcepts.datapipeline.json.JsonLinesWriter;
import com.northconcepts.datapipeline.json.JsonRecordReader;

import java.io.File;

public class ReadConcatenatedJson {

    public static void main(String[] args) {
        JsonRecordReader reader = new JsonRecordReader(new File("data/input/concatenated-json.jsonl"))
                .addRecordBreak("/object");

        DataWriter writer = new JsonLinesWriter(new File("data/output/output-json.jsonl"));

        Job.run(reader, writer);
    }

}
Code Walkthrough
- A JsonRecordReader is created for the input file concatenated-json.jsonl.
- The .addRecordBreak() method is used to separate the records. /object specifies the path to the record you want to read. To see the paths to the records, run the code in debug mode, i.e. setDebug(true); it is false by default.
- A JsonLinesWriter object is created for the output file output-json.jsonl.
- Data is then transferred from the reader to the output file via Job.run().
JsonRecordReader
JsonRecordReader is an input reader that reads records from a JSON input stream. Its JsonRecordReader.addRecordBreak method tells the reader to return a new record using whatever fields have been assigned so far. In other words, it marks where one record ends and the next begins.
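To make the idea of demarcating records in concatenated JSON more concrete, here is a minimal plain-Java sketch that splits back-to-back JSON objects into one string per top-level object. It does not use DataPipeline and is not a description of how JsonRecordReader works internally; the class name ConcatenatedJsonSplitter and its split method are hypothetical names chosen for illustration. The sketch assumes well-formed input whose top-level values are objects.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of DataPipeline.
public class ConcatenatedJsonSplitter {

    // Returns one string per top-level JSON object found in the input.
    // Assumes well-formed input whose top-level values are objects.
    public static List<String> split(String input) {
        List<String> records = new ArrayList<>();
        int depth = 0;            // current nesting level of curly braces
        boolean inString = false; // true while inside a JSON string literal
        boolean escaped = false;  // true right after a backslash inside a string
        int start = -1;           // index where the current top-level object began

        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);

            if (inString) {
                // Braces inside string literals must not affect the depth count.
                if (escaped) {
                    escaped = false;
                } else if (c == '\\') {
                    escaped = true;
                } else if (c == '"') {
                    inString = false;
                }
                continue;
            }

            if (c == '"') {
                inString = true;
            } else if (c == '{') {
                if (depth == 0) {
                    start = i; // a new top-level object begins here
                }
                depth++;
            } else if (c == '}') {
                depth--;
                if (depth == 0 && start >= 0) {
                    // The top-level object just closed: this is the record break.
                    records.add(input.substring(start, i + 1));
                    start = -1;
                }
            }
        }
        return records;
    }

    public static void main(String[] args) {
        String input = "{\"id\":\"0001\",\"name\":\"Cake\"}{\"id\":\"0002\",\"name\":\"Raised\"}";
        // Prints each top-level object on its own line.
        for (String record : split(input)) {
            System.out.println(record);
        }
    }
}
```

Tracking string state alongside brace depth matters because a value such as "}{" inside a string would otherwise be mistaken for a record boundary.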
Output
{"id":"0001","type":"donut","name":"Cake","ppu":0.55,"batters":{"batter":[{"id":"1001","type":"Regular"},{"id":"1002","type":"Chocolate"},{"id":"1003","type":"Blueberry"},{"id":"1004","type":"Devil's Food"}]},"topping":[{"id":"5001","type":"None"},{"id":"5002","type":"Glazed"},{"id":"5005","type":"Sugar"},{"id":"5007","type":"Powdered Sugar"},{"id":"5006","type":"Chocolate with Sprinkles"},{"id":"5003","type":"Chocolate"},{"id":"5004","type":"Maple"}]}
{"id":"0002","type":"donut","name":"Raised","ppu":0.55,"batters":{"batter":[{"id":"1001","type":"Regular"}]},"topping":[{"id":"5001","type":"None"},{"id":"5002","type":"Glazed"},{"id":"5005","type":"Sugar"},{"id":"5003","type":"Chocolate"},{"id":"5004","type":"Maple"}]}