Use partial wildcards in XPath when reading JSON

Updated: Mar 23, 2025

This example shows how to use partial wildcards in XPath expressions when reading JSON files with JsonReader. Wildcards match nodes dynamically, which makes it easier to extract data when node names follow a pattern but are not identical.

 

Input JSON file

{
  "products": [
    {
      "id": 1,
      "name": "Laptop",
      "Price ($)": 1200,
      "currency*": "USD"
    },
    {
      "id": 2,
      "name": "Smartphone",
      "Price (€)": 900,
      "currency*": "EUR"
    },
    {
      "id": 3,
      "name": "Tablet",
      "Price (£)": 750,
      "currency*": "GBP"
    },
    {
      "id": 4,
      "name": "Headphones",
      "Price (¥)": 5000,
      "currency*": "JPY"
    }
  ]
}

 

Java code listing

package com.northconcepts.datapipeline.examples.cookbook;

import com.northconcepts.datapipeline.core.DataReader;
import com.northconcepts.datapipeline.core.DataWriter;
import com.northconcepts.datapipeline.core.StreamWriter;
import com.northconcepts.datapipeline.job.Job;
import com.northconcepts.datapipeline.json.JsonReader;

import java.io.File;

public class ReadJsonFileUsingWildcard {

    public static void main(String[] args) {
        DataReader reader = new JsonReader(new File("example/data/input/products.json"))
                .addRecordBreak("//array/object")
                .addField("Id", "//array/object/id")
                .addField("Name", "//array/object/name")
                // Backticks escape special symbols in the node name; the * matches any
                // characters inside the parentheses, so every price node is matched.
                .addField("Price", "//array/object/`price (*)`")
                // The JSON field is named "currency*"; the extra * escapes the literal asterisk.
                .addField("Currency", "//array/object/currency**")
                ;
        DataWriter writer = new StreamWriter(System.out);

        Job.run(reader, writer);
    }
}

 

Code walkthrough

  • First, a JsonReader is created for the input file products.json.
  • addRecordBreak("//array/object") is invoked to mark where each record ends.
  • The Id, Name, Price, and Currency fields are populated via JsonReader.addField().
  • A wildcard is used in addField("Price", "//array/object/`price (*)`") to match every price node regardless of the symbol inside the parentheses, such as "Price ($)", "Price (€)", and similar variations. Since parentheses have special meaning in XPath, backticks are used to escape them.
  • The currency* node name in the data contains a literal asterisk (*), so an additional asterisk is used to escape it (currency**).
  • StreamWriter is used to write the data to the console.
  • Data is transferred from the reader to the writer via the Job.run() method. See how to compile and run data pipeline jobs.
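
The matching rules described in the walkthrough can be illustrated without the library. The following is a minimal, self-contained sketch and not Data Pipeline's actual implementation: the class WildcardNameSketch and its methods toRegex and matches are hypothetical names. It assumes that a single * matches any run of characters, that ** stands for one literal asterisk, and that names are compared case-insensitively (the listing above matches "Price ($)" with a lowercase pattern). Backtick escaping is left out of the sketch.

```java
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of the Data Pipeline API.
class WildcardNameSketch {

    // Translates a wildcard pattern into a regular expression.
    // Assumption: "**" means a literal asterisk, a single "*" means "any characters".
    static Pattern toRegex(String wildcard) {
        StringBuilder regex = new StringBuilder();
        StringBuilder literal = new StringBuilder();
        for (int i = 0; i < wildcard.length(); i++) {
            char c = wildcard.charAt(i);
            boolean doubled = c == '*'
                    && i + 1 < wildcard.length()
                    && wildcard.charAt(i + 1) == '*';
            if (doubled) {
                literal.append('*'); // escaped asterisk is kept as plain text
                i++;                 // skip the second asterisk of the pair
            } else if (c == '*') {
                appendQuoted(regex, literal);
                regex.append(".*");  // wildcard: any run of characters
            } else {
                literal.append(c);
            }
        }
        appendQuoted(regex, literal);
        // Assumption: node names are compared without regard to case.
        return Pattern.compile(regex.toString(),
                Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);
    }

    // Quotes pending literal text so characters such as "(" and "$" have no regex meaning.
    private static void appendQuoted(StringBuilder regex, StringBuilder literal) {
        if (literal.length() > 0) {
            regex.append(Pattern.quote(literal.toString()));
            literal.setLength(0);
        }
    }

    // Returns true when the whole node name matches the wildcard pattern.
    static boolean matches(String wildcard, String nodeName) {
        return toRegex(wildcard).matcher(nodeName).matches();
    }

    public static void main(String[] args) {
        // Node names taken from the input JSON file above.
        List<String> names = List.of("id", "name", "Price ($)", "Price (€)", "currency*");
        for (String name : names) {
            System.out.println(name
                    + " | price (*): " + matches("price (*)", name)
                    + " | currency**: " + matches("currency**", name));
        }
    }
}
```

Running main prints, for each node name from the input file, whether it matches the price pattern and the escaped currency pattern. In the sketch, the doubled asterisk is consumed before the wildcard check so that currency** matches only the exact name currency* and not every name starting with currency.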

 

Console output

-----------------------------------------------
0 - Record (MODIFIED) {
    0:[Id]:LONG=[1]:Long
    1:[Name]:STRING=[Laptop]:String
    2:[Price]:LONG=[1200]:Long
    3:[Currency]:STRING=[USD]:String
}

-----------------------------------------------
1 - Record (MODIFIED) {
    0:[Id]:LONG=[2]:Long
    1:[Name]:STRING=[Smartphone]:String
    2:[Price]:LONG=[900]:Long
    3:[Currency]:STRING=[EUR]:String
}

-----------------------------------------------
2 - Record (MODIFIED) {
    0:[Id]:LONG=[3]:Long
    1:[Name]:STRING=[Tablet]:String
    2:[Price]:LONG=[750]:Long
    3:[Currency]:STRING=[GBP]:String
}

-----------------------------------------------
3 - Record (MODIFIED) {
    0:[Id]:LONG=[4]:Long
    1:[Name]:STRING=[Headphones]:String
    2:[Price]:LONG=[5000]:Long
    3:[Currency]:STRING=[JPY]:String
}

-----------------------------------------------
4 records