Use partial wildcards in XPath when reading JSON
Updated: Mar 23, 2025
This example shows how you can use partial wildcards when reading JSON files using JsonReader. Wildcards allow you to match nodes dynamically, making it easier to extract data when node names follow a pattern but are not identical.
Input JSON file
{
  "products": [
    {
      "id": 1,
      "name": "Laptop",
      "Price ($)": 1200,
      "currency*": "USD"
    },
    {
      "id": 2,
      "name": "Smartphone",
      "Price (€)": 900,
      "currency*": "EUR"
    },
    {
      "id": 3,
      "name": "Tablet",
      "Price (£)": 750,
      "currency*": "GBP"
    },
    {
      "id": 4,
      "name": "Headphones",
      "Price (¥)": 5000,
      "currency*": "JPY"
    }
  ]
}
Java code listing
package com.northconcepts.datapipeline.examples.cookbook;

import com.northconcepts.datapipeline.core.DataReader;
import com.northconcepts.datapipeline.core.DataWriter;
import com.northconcepts.datapipeline.core.StreamWriter;
import com.northconcepts.datapipeline.job.Job;
import com.northconcepts.datapipeline.json.JsonReader;

import java.io.File;

public class ReadJsonFileUsingWildcard {

    public static void main(String[] args) {
        DataReader reader = new JsonReader(new File("example/data/input/products.json"))
                .addRecordBreak("//array/object")
                .addField("Id", "//array/object/id")
                .addField("Name", "//array/object/name")
                // Backticks escape special symbols; * matches all price nodes
                // that have any characters inside the parentheses
                .addField("Price", "//array/object/`price (*)`")
                // The second * escapes the literal * in the original field name
                .addField("Currency", "//array/object/currency**")
                ;

        DataWriter writer = new StreamWriter(System.out);

        Job.run(reader, writer);
    }
}
Code walkthrough
- First, a JsonReader is created for the input file products.json.
- addRecordBreak("//array/object") is invoked to demarcate records.
- The Id, Name, Price, and Currency fields are populated via JsonReader.addField.
- A wildcard is used in addField("Price", "//array/object/`price (*)`") to match price nodes that have any characters inside the parentheses, such as "Price ($)", "Price (€)", and similar variations. Since parentheses have special meaning in XPath, backticks are used to escape them.
- The currency* node in the data contains an asterisk (*) in its name, so an additional asterisk is used to escape it.
- StreamWriter is used to write the data to the console.
- Data is transferred from the reader to the writer via the Job.run() method. See how to compile and run data pipeline jobs.
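The wildcard and escaping rules in the walkthrough can be illustrated with a small, library-independent sketch. It is not Data Pipeline's implementation: the class WildcardSketch and its methods toRegex and matches are hypothetical names, and it only handles a node-name pattern after the backticks have been removed. It assumes that a single * matches any run of characters, that ** stands for one literal asterisk, and that matching ignores case (inferred from the example, where price (*) is described as matching "Price ($)").

```java
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of the Data Pipeline API.
class WildcardSketch {

    // Translates a node-name pattern into a regular expression:
    //   *   -> matches any run of characters (assumed wildcard rule)
    //   **  -> one literal asterisk (assumed escape rule)
    // All other characters, including parentheses, are matched literally.
    static Pattern toRegex(String pattern) {
        StringBuilder regex = new StringBuilder();
        StringBuilder literal = new StringBuilder();
        for (int i = 0; i < pattern.length(); i++) {
            char c = pattern.charAt(i);
            if (c != '*') {
                literal.append(c);
            } else if (i + 1 < pattern.length() && pattern.charAt(i + 1) == '*') {
                literal.append('*'); // escaped asterisk
                i++;                 // skip the second '*'
            } else {
                // Flush the literal text collected so far, then add the wildcard
                regex.append(Pattern.quote(literal.toString())).append(".*");
                literal.setLength(0);
            }
        }
        regex.append(Pattern.quote(literal.toString()));
        // Case-insensitive matching is an assumption drawn from the example
        return Pattern.compile(regex.toString(), Pattern.CASE_INSENSITIVE);
    }

    // Returns true when the whole node name matches the pattern
    static boolean matches(String pattern, String nodeName) {
        return toRegex(pattern).matcher(nodeName).matches();
    }

    public static void main(String[] args) {
        List<String> names = List.of("Price ($)", "Price (€)", "name", "currency*", "currencyCode");
        for (String name : names) {
            System.out.println(name
                    + " | price (*): " + matches("price (*)", name)
                    + " | currency**: " + matches("currency**", name));
        }
    }
}
```

Under these assumptions, price (*) accepts every "Price (...)" name in the input file, while currency** accepts only the exact name currency* and rejects a name such as currencyCode. Without the doubled asterisk, the pattern currency* would act as a wildcard and accept both.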
Console output
-----------------------------------------------
0 - Record (MODIFIED) {
0:[Id]:LONG=[1]:Long
1:[Name]:STRING=[Laptop]:String
2:[Price]:LONG=[1200]:Long
3:[Currency]:STRING=[USD]:String
}
-----------------------------------------------
1 - Record (MODIFIED) {
0:[Id]:LONG=[2]:Long
1:[Name]:STRING=[Smartphone]:String
2:[Price]:LONG=[900]:Long
3:[Currency]:STRING=[EUR]:String
}
-----------------------------------------------
2 - Record (MODIFIED) {
0:[Id]:LONG=[3]:Long
1:[Name]:STRING=[Tablet]:String
2:[Price]:LONG=[750]:Long
3:[Currency]:STRING=[GBP]:String
}
-----------------------------------------------
3 - Record (MODIFIED) {
0:[Id]:LONG=[4]:Long
1:[Name]:STRING=[Headphones]:String
2:[Price]:LONG=[5000]:Long
3:[Currency]:STRING=[JPY]:String
}
-----------------------------------------------
4 records
