Use partial wildcards in XPath when reading XML
Updated: Mar 23, 2025
This example shows how to use partial wildcards in XPath location paths when reading XML files with XmlReader. Wildcards let you match elements dynamically, which makes it easier to extract data when element names follow a pattern but are not identical.
Input XML file
<?xml version="1.0" encoding="ISO-8859-1"?>
<products>
    <product>
        <id>1</id>
        <name>Laptop</name>
        <Price_Usd>1200</Price_Usd>
        <currency>USD</currency>
    </product>
    <product>
        <id>2</id>
        <name>Smartphone</name>
        <Price_Eur>900</Price_Eur>
        <currency>EUR</currency>
    </product>
    <product>
        <id>3</id>
        <name>Tablet</name>
        <Price_Gbp>750</Price_Gbp>
        <currency>GBP</currency>
    </product>
    <product>
        <id>4</id>
        <name>Headphones</name>
        <Price_Jpy>5000</Price_Jpy>
        <currency>JPY</currency>
    </product>
</products>
Java code listing
package com.northconcepts.datapipeline.examples.cookbook;

import com.northconcepts.datapipeline.core.DataReader;
import com.northconcepts.datapipeline.core.DataWriter;
import com.northconcepts.datapipeline.core.StreamWriter;
import com.northconcepts.datapipeline.job.Job;
import com.northconcepts.datapipeline.xml.XmlReader;

import java.io.File;

public class ReadAnXmlFileUsingWildcard {

    public static void main(String[] args) {
        DataReader reader = new XmlReader(new File("example/data/input/products.xml"))
                .addRecordBreak("//product")
                .addField("Id", "//product/id")
                .addField("Name", "//product/name")
                .addField("Price", "//product/price_*") // This location path matches all nodes whose names start with price_
                .addField("Currency", "//product/currency")
                ;

        DataWriter writer = new StreamWriter(System.out);

        Job.run(reader, writer);
    }
}
Code walkthrough
- First, an XmlReader is created for the input file products.xml.
- addRecordBreak("//product") is invoked to return a new record whenever a product element ends.
- The Id, Name, Price, and Currency fields are populated via XmlReader.addField.
- A wildcard is used in addField("Price", "//product/price_*") to match elements whose names begin with price_, such as <Price_Usd>, <Price_Eur>, and similar variations.
- StreamWriter is used to write the data to the console.
- Data is transferred from the reader to the writer via the Job.run() method. See how to compile and run data pipeline jobs.
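For comparison, standard XPath 1.0 (as exposed by the JDK's javax.xml.xpath package) has no partial-name wildcard: a name test is either a complete name or *. A rough equivalent of a prefix pattern there is a predicate on the element name. The sketch below is not part of DataPipeline and does not describe how XmlReader evaluates its location paths. The class name PrefixMatchWithStandardXPath and the helper pricesByPrefix are hypothetical, and the inline XML is a trimmed copy of the input file.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, JDK only; not a DataPipeline API.
class PrefixMatchWithStandardXPath {

    // Returns the text of every child of <product> whose element name starts with prefix.
    // The prefix is pasted into the expression, so it must not contain a single quote.
    static List<String> pricesByPrefix(String xml, String prefix) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        XPath xpath = XPathFactory.newInstance().newXPath();

        // XPath 1.0 cannot say "Price_*", so select all children and filter by name.
        String expression = "//product/*[starts-with(name(), '" + prefix + "')]";
        NodeList nodes = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);

        List<String> values = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            values.add(nodes.item(i).getTextContent());
        }
        return values;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<products>"
                + "<product><id>1</id><name>Laptop</name>"
                + "<Price_Usd>1200</Price_Usd><currency>USD</currency></product>"
                + "<product><id>2</id><name>Smartphone</name>"
                + "<Price_Eur>900</Price_Eur><currency>EUR</currency></product>"
                + "</products>";

        System.out.println(pricesByPrefix(xml, "Price_")); // prints [1200, 900]
    }
}
```

Note that starts-with in standard XPath is case-sensitive, so in this sketch the prefix has to be written exactly as it appears in the element names (Price_, not price_).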
Console Output
-----------------------------------------------
0 - Record (MODIFIED) {
    0:[Id]:STRING=[1]:String
    1:[Name]:STRING=[Laptop]:String
    2:[Price]:STRING=[1200]:String
    3:[Currency]:STRING=[USD]:String
}

-----------------------------------------------
1 - Record (MODIFIED) {
    0:[Id]:STRING=[2]:String
    1:[Name]:STRING=[Smartphone]:String
    2:[Price]:STRING=[900]:String
    3:[Currency]:STRING=[EUR]:String
}

-----------------------------------------------
2 - Record (MODIFIED) {
    0:[Id]:STRING=[3]:String
    1:[Name]:STRING=[Tablet]:String
    2:[Price]:STRING=[750]:String
    3:[Currency]:STRING=[GBP]:String
}

-----------------------------------------------
3 - Record (MODIFIED) {
    0:[Id]:STRING=[4]:String
    1:[Name]:STRING=[Headphones]:String
    2:[Price]:STRING=[5000]:String
    3:[Currency]:STRING=[JPY]:String
}

-----------------------------------------------
4 records