How to Query Java Objects with XPath

Data Pipeline’s query engine allows you to use XPath to query XML, JSON, and Java objects. This walkthrough will show you how to query Java objects using XPath and save the results to a CSV file. While the reading and writing will be done with the JavaBeanReader and CSVWriter classes, you can swap out the CSVWriter for any other endpoint or transformation that Data Pipeline supports.

Java object model

The object model used as input is an ArrayList of Java beans called Signal. The Signal class has a variety of private fields, along with their corresponding getter and setter methods.

public class Signal {

    private String source;
    private String name;
    private String component;
    private int occurrence;
    private int size;
    private float bandwidth;
    private int sizewithHeader;
    private float bandwidthWithHeader;

    public Signal() {
    }

    public String getSource() {
        return source;
    }

    public void setSource(String source) {
        this.source = source;
    }

    ...

}

public class Signal {

private String source;

private String name;

private String component;

private int occurrence;

private int size;

private float bandwidth;

private int sizewithHeader;

private float bandwidthWithHeader;

public Signal() {

}

public String getSource() {

return source;

}

public void setSource(String source) {

this.source = source;

}

...

}

The XPath engine doesn’t require any change to read your classes — no interfaces to implement and no special annotations. The only restriction is that your model needs to be some combination of:

Java beans
Arrays
Collections (anything implementing java.lang.Iterable)
Maps

All other types, including strings, dates, primitives, boxed types, etc., are treated as single-valued types (leaves, instead of branch nodes).

Resulting CSV file

The transfer job will create a CSV file named output.csv with the following contents.

source,name,component,occurrence,size,bandwidth,sizewithHeader,bandwidthWithHeader
TestSource1,TestName1,TestComponent1,1,1,1.0,1,1.0
TestSource2,TestName2,TestComponent2,2,2,2.0,2,2.0

source,name,component,occurrence,size,bandwidth,sizewithHeader,bandwidthWithHeader

TestSource1,TestName1,TestComponent1,1,1,1.0,1,1.0

TestSource2,TestName2,TestComponent2,2,2,2.0,2,2.0

Data Pipeline code

Now once you have your object model and decided on the output format (CSV in this case), it’s time to write the Data Pipeline job code.

import java.io.File;
import java.util.ArrayList;
import java.util.List;

import com.northconcepts.datapipeline.core.DataWriter;
import com.northconcepts.datapipeline.csv.CSVWriter;
import com.northconcepts.datapipeline.javabean.JavaBeanReader;
import com.northconcepts.datapipeline.job.JobTemplate;

public class WriteJavaObjectToCsv {

    public static void main(String[] args) {

        List<Signal> messages = getTestData();

        JavaBeanReader reader = new JavaBeanReader("messages", messages);

        reader.addField("source", "//source");
        reader.addField("name", "//name");
        reader.addField("component", "//component");
        reader.addField("occurrence", "//occurrence");
        reader.addField("size", "//size");
        reader.addField("bandwidth", "//bandwidth");
        reader.addField("sizewithHeader", "//sizewithHeader");
        reader.addField("bandwidthWithHeader", "//bandwidthWithHeader");

        reader.addRecordBreak("//Signal");

        DataWriter writer = new CSVWriter(new File("example/data/output/output.csv"));

        JobTemplate.DEFAULT.transfer(reader, writer);
    }

    private static List<Signal> getTestData() {
        List<Signal> messages = new ArrayList<Signal>();

        messages.add(new Signal("TestSource1", "TestName1", "TestComponent1", 1, 1, 1.0f, 1, 1.0f));
        messages.add(new Signal("TestSource2", "TestName2", "TestComponent2", 2, 2, 2.0f, 2, 2.0f));

        return messages;
    }

}

import java.io.File;

import java.util.ArrayList;

import java.util.List;

import com.northconcepts.datapipeline.core.DataWriter;

import com.northconcepts.datapipeline.csv.CSVWriter;

import com.northconcepts.datapipeline.javabean.JavaBeanReader;

import com.northconcepts.datapipeline.job.JobTemplate;

public class WriteJavaObjectToCsv {

public static void main(String[] args) {

List<Signal> messages = getTestData();

JavaBeanReader reader = new JavaBeanReader("messages", messages);

reader.addField("source", "//source");

reader.addField("name", "//name");

reader.addField("component", "//component");

reader.addField("occurrence", "//occurrence");

reader.addField("size", "//size");

reader.addField("bandwidth", "//bandwidth");

reader.addField("sizewithHeader", "//sizewithHeader");

reader.addField("bandwidthWithHeader", "//bandwidthWithHeader");

reader.addRecordBreak("//Signal");

DataWriter writer = new CSVWriter(new File("example/data/output/output.csv"));

JobTemplate.DEFAULT.transfer(reader, writer);

}

private static List<Signal> getTestData() {

List<Signal> messages = new ArrayList<Signal>();

messages.add(new Signal("TestSource1", "TestName1", "TestComponent1", 1, 1, 1.0f, 1, 1.0f));

messages.add(new Signal("TestSource2", "TestName2", "TestComponent2", 2, 2, 2.0f, 2, 2.0f));

return messages;

}

There are basically only three parts to the above code.

1. Create your JavaBeanReader

Instantiate a new JavaBeanReader with your arbitrary document/root name and your object model.

List<Signal> messages = getTestData();
JavaBeanReader reader = new JavaBeanReader("messages", messages);

1 2	List<Signal> messages = getTestData(); JavaBeanReader reader = new JavaBeanReader("messages", messages);

Think of the "messages" param as the root node in an XML document. It can be used as part of your XPath query if needed: /messages//source.

The second param — your input object model — can be any of the types previously mentioned: array, collection, Java bean, etc.

2. Specify your fields and record breaks

Use the addField and addRecordBreak methods to tell the JavaBeanReader how to create records from your object model.

reader.addField("source", "//source");
reader.addField("name", "//name");
reader.addField("component", "//component");
reader.addField("occurrence", "//occurrence");
reader.addField("size", "//size");
reader.addField("bandwidth", "//bandwidth");
reader.addField("sizewithHeader", "//sizewithHeader");
reader.addField("bandwidthWithHeader", "//bandwidthWithHeader");

reader.addRecordBreak("//Signal");

reader.addField("source", "//source");

reader.addField("name", "//name");

reader.addField("component", "//component");

reader.addField("occurrence", "//occurrence");

reader.addField("size", "//size");

reader.addField("bandwidth", "//bandwidth");

reader.addField("sizewithHeader", "//sizewithHeader");

reader.addField("bandwidthWithHeader", "//bandwidthWithHeader");

reader.addRecordBreak("//Signal");

The addField method takes the name of the new, target field and an XPath 1.0 location path to identify each field of the record.

The addRecordBreak method takes another XPath 1.0 location path to identify each record’s boundary. The reader returns a new record whenever this pattern is matched. The returned record contains all fields (from addField matches) that have been captured up to that point.

3. Run the job

Create the desired target DataWriter and run the reader-to-writer transfer.

DataWriter writer = new CSVWriter(new File("example/data/output/output.csv"));

JobTemplate.DEFAULT.transfer(reader, writer);

DataWriter writer = new CSVWriter(new File("example/data/output/output.csv"));

JobTemplate.DEFAULT.transfer(reader, writer);

At this point, you can replace the CSVWriter with another DataWriter to produce a different output format. Here are several other examples of endpoints you can use:

XPath queries

The XPath 1.0 location paths used in the addField and addRecordBreak methods are a subset of the full spec.

This limitation is primarily due to Data Pipeline being built to stream data. To ensure the XML and JSON parsers run with low memory overhead, only forward matching, XPath expression are supported. You can see the list of supported expression on JavaBeanReader’s Javadoc.

Download Data Pipeline

The Data Pipeline library, including the online examples, are available for immediate download. Once you have it, see the getting started guide to start running the examples.

How to Query Java Objects with XPath

Java object model

Resulting CSV file

Data Pipeline code

1. Create your JavaBeanReader

2. Specify your fields and record breaks

3. Run the job

XPath queries

Download Data Pipeline

About Dele Taylor

One thought on “How to Query Java Objects with XPath”

Leave a Reply Cancel reply

Data Pipeline

Docs

Company

Tools