Use partial wildcards in XPath when reading JSON
Updated: Mar 23, 2025
This example shows how you can use partial wildcards when reading JSON files using JsonReader. Wildcards allow you to match nodes dynamically, making it easier to extract data when node names follow a pattern but are not identical.
Input JSON file
{
  "products": [
    { "id": 1, "name": "Laptop", "Price ($)": 1200, "currency*": "USD" },
    { "id": 2, "name": "Smartphone", "Price (€)": 900, "currency*": "EUR" },
    { "id": 3, "name": "Tablet", "Price (£)": 750, "currency*": "GBP" },
    { "id": 4, "name": "Headphones", "Price (¥)": 5000, "currency*": "JPY" }
  ]
}
Java code listing
package com.northconcepts.datapipeline.examples.cookbook;

import com.northconcepts.datapipeline.core.DataReader;
import com.northconcepts.datapipeline.core.DataWriter;
import com.northconcepts.datapipeline.core.StreamWriter;
import com.northconcepts.datapipeline.job.Job;
import com.northconcepts.datapipeline.json.JsonReader;

import java.io.File;

public class ReadJsonFileUsingWildcard {

    public static void main(String[] args) {
        DataReader reader = new JsonReader(new File("example/data/input/products.json"))
                .addRecordBreak("//array/object")
                .addField("Id", "//array/object/id")
                .addField("Name", "//array/object/name")
                // Backticks escape special symbols; * matches any characters inside the parentheses of the price nodes
                .addField("Price", "//array/object/`price (*)`")
                // The second * escapes the literal * in the original field name
                .addField("Currency", "//array/object/currency**")
                ;

        DataWriter writer = new StreamWriter(System.out);

        Job.run(reader, writer);
    }

}
Code walkthrough
- First, a JsonReader is created for the input file products.json.
- addRecordBreak("//array/object") is invoked to demarcate records.
- The Id, Name, Price, and Currency fields are populated via JsonReader.addField.
- A wildcard is used in addField("Price", "//array/object/`price (*)`") to match nodes whose names begin with price followed by any text in parentheses, such as "Price ($)", "Price (€)", and similar variations. Since parentheses have special meaning in XPath, backticks are used to escape them.
- The currency* node in the data contains an asterisk (*) in its name, so the path uses currency**, where the additional asterisk escapes the literal one.
- StreamWriter is used to write the data to the console.
- Data is transferred from the reader to the writer via the Job.run() method. See how to compile and run data pipeline jobs.
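To make the two asterisk rules above concrete, here is a minimal standalone sketch of how such a name pattern could be interpreted. It is not Data Pipeline's implementation: the class WildcardNameSketch and its methods toRegex and matches are hypothetical names introduced for illustration. It assumes a single * matches any run of characters, a doubled ** stands for one literal *, and names are compared case-insensitively (as the lowercase `price (*)` matching "Price ($)" in this example suggests). The backticks are assumed to have been removed already, since they only protect the name inside the XPath expression.

```java
import java.util.regex.Pattern;

public class WildcardNameSketch {

    // Hypothetical helper: turns a node-name pattern into a regular expression.
    // Assumption: "*" is a wildcard and "**" is an escaped, literal asterisk.
    static Pattern toRegex(String namePattern) {
        StringBuilder regex = new StringBuilder();
        StringBuilder literal = new StringBuilder();
        for (int i = 0; i < namePattern.length(); i++) {
            char c = namePattern.charAt(i);
            boolean doubled = c == '*'
                    && i + 1 < namePattern.length()
                    && namePattern.charAt(i + 1) == '*';
            if (doubled) {
                literal.append('*');   // "**" contributes one literal asterisk
                i++;                   // skip the second asterisk
            } else if (c == '*') {
                appendQuoted(regex, literal);
                regex.append(".*");    // single "*" matches any characters
            } else {
                literal.append(c);     // everything else is matched literally
            }
        }
        appendQuoted(regex, literal);
        // Assumption: node names are compared without regard to case.
        return Pattern.compile(regex.toString(),
                Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);
    }

    // Quotes the pending literal text so "(", ")" and "$" lose their regex meaning.
    private static void appendQuoted(StringBuilder regex, StringBuilder literal) {
        if (literal.length() > 0) {
            regex.append(Pattern.quote(literal.toString()));
            literal.setLength(0);
        }
    }

    static boolean matches(String namePattern, String nodeName) {
        return toRegex(namePattern).matcher(nodeName).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches("price (*)", "Price ($)"));   // true
        System.out.println(matches("price (*)", "Price (€)"));   // true
        System.out.println(matches("price (*)", "Price"));       // false
        System.out.println(matches("currency**", "currency*"));  // true
        System.out.println(matches("currency**", "currencyX"));  // false
    }
}
```

Under these assumptions, `price (*)` accepts every price node in the input file, while currency** accepts only the node literally named currency*.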
Console Output
-----------------------------------------------
0 - Record (MODIFIED) {
    0:[Id]:LONG=[1]:Long
    1:[Name]:STRING=[Laptop]:String
    2:[Price]:LONG=[1200]:Long
    3:[Currency]:STRING=[USD]:String
}

-----------------------------------------------
1 - Record (MODIFIED) {
    0:[Id]:LONG=[2]:Long
    1:[Name]:STRING=[Smartphone]:String
    2:[Price]:LONG=[900]:Long
    3:[Currency]:STRING=[EUR]:String
}

-----------------------------------------------
2 - Record (MODIFIED) {
    0:[Id]:LONG=[3]:Long
    1:[Name]:STRING=[Tablet]:String
    2:[Price]:LONG=[750]:Long
    3:[Currency]:STRING=[GBP]:String
}

-----------------------------------------------
3 - Record (MODIFIED) {
    0:[Id]:LONG=[4]:Long
    1:[Name]:STRING=[Headphones]:String
    2:[Price]:LONG=[5000]:Long
    3:[Currency]:STRING=[JPY]:String
}

-----------------------------------------------
4 records