Read XML Records From File
In this example you are going to learn how to read records from an XML file using Data Pipeline.
This demo code reads records from an XML file via XmlRecordReader and prints them to the console using StreamWriter.
This example can be modified to show you how to write records to an XML file.
Input XML file
<?xml version="1.0" ?>
<records>
    <record>
        <field name="stageName">John Wayne</field>
        <field name="realName">Marion Robert Morrison</field>
        <field name="gender">male</field>
        <field name="city">Winterset</field>
        <field name="balance">156.35</field>
    </record>
    <record>
        <field name="stageName">Spiderman</field>
        <field name="realName">Peter Parker</field>
        <field name="gender">male</field>
        <field name="city">New York</field>
        <field name="balance">-0.96</field>
    </record>
</records>
Java Code Listing
package com.northconcepts.datapipeline.examples.cookbook;

import java.io.File;

import com.northconcepts.datapipeline.core.DataReader;
import com.northconcepts.datapipeline.core.DataWriter;
import com.northconcepts.datapipeline.core.StreamWriter;
import com.northconcepts.datapipeline.job.Job;
import com.northconcepts.datapipeline.xml.XmlRecordReader;

public class ReadXmlRecordsFromFile {

    public static void main(String[] args) {
        // Read the input file, ending one record at each /records/record element
        DataReader reader = new XmlRecordReader(new File("example/data/input/simple-xml-input.xml"))
                .addRecordBreak("/records/record");

        // Print each record to the console
        DataWriter writer = StreamWriter.newSystemOutWriter();

        // Transfer all records from the reader to the writer
        Job.run(reader, writer);
    }

}
Code Walkthrough
- First, an XmlRecordReader object is created for the input file simple-xml-input.xml.
- Data is transferred from the XmlRecordReader to the console via the Job.run() method.
See how to compile and run data pipeline jobs.
XmlRecordReader and XmlRecordWriter
The XmlRecordReader and XmlRecordWriter classes read records from and write records to XML files. This demo uses only XmlRecordReader, but XmlRecordWriter can be used to write XML using the structure of each record's natural representation.
The XmlRecordReader.addRecordBreak method tells the reader to return a new record using whatever fields have been assigned. It is used to demarcate records: in this example the break is set at /records/record, so each record element in the input yields one record.
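The record-break idea can be illustrated without the library. The sketch below is not how XmlRecordReader is implemented. It uses only the JDK's StAX parser, and the names RecordBreakSketch and readRecords are hypothetical, chosen for illustration. It tracks the current element path and closes a record each time the element at the break path ends, simplifying each field to a name and text pair.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical illustration of a "record break"; not part of Data Pipeline.
class RecordBreakSketch {

    // Collects <field name="...">text</field> pairs and closes one record
    // whenever the element whose path equals breakPath ends.
    static List<Map<String, String>> readRecords(String xml, String breakPath) {
        List<Map<String, String>> records = new ArrayList<>();
        Map<String, String> current = new LinkedHashMap<>();
        Deque<String> path = new ArrayDeque<>();   // element names from root to current
        StringBuilder text = new StringBuilder();
        String fieldName = null;
        try {
            XMLStreamReader parser =
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            while (parser.hasNext()) {
                int event = parser.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    path.addLast(parser.getLocalName());
                    fieldName = parser.getAttributeValue(null, "name"); // null if absent
                    text.setLength(0);
                } else if (event == XMLStreamConstants.CHARACTERS) {
                    text.append(parser.getText());
                } else if (event == XMLStreamConstants.END_ELEMENT) {
                    if (fieldName != null) {
                        current.put(fieldName, text.toString().trim());
                        fieldName = null;
                    }
                    // The record break: emit whatever fields have been assigned
                    if (("/" + String.join("/", path)).equals(breakPath)) {
                        records.add(current);
                        current = new LinkedHashMap<>();
                    }
                    path.removeLast();
                }
            }
            parser.close();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Malformed XML", e);
        }
        return records;
    }

    public static void main(String[] args) {
        String xml = "<records>"
                + "<record><field name=\"stageName\">John Wayne</field>"
                + "<field name=\"balance\">156.35</field></record>"
                + "<record><field name=\"stageName\">Spiderman</field>"
                + "<field name=\"balance\">-0.96</field></record>"
                + "</records>";
        for (Map<String, String> record : readRecords(xml, "/records/record")) {
            System.out.println(record);
        }
    }
}
```

Changing the break path to /records in this sketch would return a single record holding the last value seen for each field name, which shows why the break must sit on the element that encloses exactly one record.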
Console output
00:55:31,964 DEBUG [main] datapipeline:37 - DataPipeline v7.2.0-SNAPSHOT by North Concepts Inc.
00:55:35,953 DEBUG [main] datapipeline:615 - Job[1,job-1,Mon May 23 00:55:35 EAT 2022]::Start
-----------------------------------------------
0 - Record (MODIFIED) (has child records) {
    0:[field]:ARRAY of RECORD=[[
        Record (MODIFIED) (is child record) {
            0:[@name]:STRING=[stageName]:String
            1:[$text]:STRING=[John Wayne]:String
        },
        Record (MODIFIED) (is child record) {
            0:[@name]:STRING=[realName]:String
            1:[$text]:STRING=[Marion Robert Morrison]:String
        },
        Record (MODIFIED) (is child record) {
            0:[@name]:STRING=[gender]:String
            1:[$text]:STRING=[male]:String
        },
        Record (MODIFIED) (is child record) {
            0:[@name]:STRING=[city]:String
            1:[$text]:STRING=[Winterset]:String
        },
        Record (MODIFIED) (is child record) {
            0:[@name]:STRING=[balance]:String
            1:[$text]:STRING=[156.35]:String
        }]]:ArrayValue
}
-----------------------------------------------
1 - Record (MODIFIED) (has child records) {
    0:[field]:ARRAY of RECORD=[[
        Record (MODIFIED) (is child record) {
            0:[@name]:STRING=[stageName]:String
            1:[$text]:STRING=[Spiderman]:String
        },
        Record (MODIFIED) (is child record) {
            0:[@name]:STRING=[realName]:String
            1:[$text]:STRING=[Peter Parker]:String
        },
        Record (MODIFIED) (is child record) {
            0:[@name]:STRING=[gender]:String
            1:[$text]:STRING=[male]:String
        },
        Record (MODIFIED) (is child record) {
            0:[@name]:STRING=[city]:String
            1:[$text]:STRING=[New York]:String
        },
        Record (MODIFIED) (is child record) {
            0:[@name]:STRING=[balance]:String
            1:[$text]:STRING=[-0.96]:String
        }]]:ArrayValue
}
-----------------------------------------------
2 records
00:55:37,944 DEBUG [main] datapipeline:661 - job::Success
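In the output, each record holds a single field array whose child records carry the attribute value under @name and the element text under $text, all as strings. The sketch below shows, in plain Java and without the Data Pipeline API, one way such a shape could be flattened into name and value pairs. The class FlattenFieldsSketch, the method flatten, and the use of maps to stand in for child records are assumptions made for illustration only.

```java
import java.math.BigDecimal;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; maps stand in for the child records in the output.
class FlattenFieldsSketch {

    // Turns a list of {"@name": ..., "$text": ...} entries into one name -> text map,
    // keeping the original field order.
    static Map<String, String> flatten(List<Map<String, String>> fieldRecords) {
        Map<String, String> flat = new LinkedHashMap<>();
        for (Map<String, String> child : fieldRecords) {
            flat.put(child.get("@name"), child.get("$text"));
        }
        return flat;
    }

    public static void main(String[] args) {
        List<Map<String, String>> fields = List.of(
                Map.of("@name", "stageName", "$text", "John Wayne"),
                Map.of("@name", "city", "$text", "Winterset"),
                Map.of("@name", "balance", "$text", "156.35"));

        Map<String, String> flat = flatten(fields);
        System.out.println(flat);

        // Values arrive as strings, so numeric fields need explicit conversion
        BigDecimal balance = new BigDecimal(flat.get("balance"));
        System.out.println(balance.add(BigDecimal.ONE));
    }
}
```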