Read an Avro File
Updated: Jun 4, 2023
This example shows how to use AvroReader from DataPipeline to read an Avro file.
Avro is an open source project that provides data serialization and data exchange services for Apache Hadoop.
This example can be easily modified to show you how to write an Avro File.
Java Code listing
package com.northconcepts.datapipeline.examples.cookbook;

import java.io.File;

import com.northconcepts.datapipeline.avro.AvroReader;
import com.northconcepts.datapipeline.core.DataReader;
import com.northconcepts.datapipeline.core.StreamWriter;
import com.northconcepts.datapipeline.job.Job;

public class ReadAnAvroFile {

    public static void main(String[] args) {
        DataReader reader = new AvroReader(new File("example/data/input/twitter.avro"));
        Job.run(reader, new StreamWriter(System.out));
    }

/* output
-----------------------------------------------
0 - Record (MODIFIED) {
    0:[username]:STRING=[miguno]:String
    1:[tweet]:STRING=[Rock: Nerf paper, scissors is fine.]:String
    2:[timestamp]:LONG=[1366150681]:Long
}
-----------------------------------------------
1 - Record (MODIFIED) {
    0:[username]:STRING=[BlizzardCS]:String
    1:[tweet]:STRING=[Works as intended. Terran is IMBA.]:String
    2:[timestamp]:LONG=[1366154481]:Long
}
-----------------------------------------------
2 records
*/

}
Code walkthrough
- An AvroReader is created for the input file twitter.avro. It reads the Avro file and converts its contents into records that can be accessed through reader.
- Job.run(reader, new StreamWriter(System.out)) then copies the data from the reader to the StreamWriter.
- StreamWriter writes the records in a human-readable format. In this case, the output is written to the console.
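The copy step in the walkthrough can be pictured as a simple pull-and-push loop. The sketch below is only a conceptual illustration using the JDK alone: SimpleReader, SimpleWriter, and copy are hypothetical names invented here, not DataPipeline classes, and the code makes no claim about how Job.run is actually implemented.

```java
import java.util.Iterator;
import java.util.List;

public class CopySketch {

    /** Hypothetical stand-in for a record source; returns null when no records remain. */
    public interface SimpleReader {
        String read();
    }

    /** Hypothetical stand-in for a record sink. */
    public interface SimpleWriter {
        void write(String record);
    }

    /** Pulls records from the reader one at a time and pushes each to the writer. */
    public static int copy(SimpleReader reader, SimpleWriter writer) {
        int count = 0;
        String record;
        while ((record = reader.read()) != null) {
            writer.write(record);
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Two placeholder "records" standing in for rows read from a file.
        Iterator<String> source = List.of("miguno", "BlizzardCS").iterator();
        SimpleReader reader = () -> source.hasNext() ? source.next() : null;
        SimpleWriter writer = System.out::println; // console sink

        int copied = copy(reader, writer);
        System.out.println(copied + " records");
    }
}
```

Because the reader and the writer only meet inside the copy loop, either side can be swapped without touching the other, which is why the same reader could feed a different writer.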
AvroReader
Used to read an Apache Avro file and convert the contents into Records.
To read Avro data from a stream instead of a file, use the AvroReader(InputStream inputStream) constructor.
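When the data arrives as a stream of unknown origin, it can help to confirm that it looks like an Avro object container file before handing it to a reader. The sketch below relies on the Avro specification, which says such files begin with the four bytes 'O', 'b', 'j', 1. It uses only the JDK; AvroMagicCheck and looksLikeAvro are hypothetical names introduced for illustration and are not part of DataPipeline.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.PushbackInputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;

public class AvroMagicCheck {

    // Header of an Avro object container file: ASCII "Obj" followed by the byte 1.
    private static final byte[] MAGIC = {'O', 'b', 'j', 1};

    /**
     * Returns true if the stream starts with the Avro magic bytes.
     * The bytes read are pushed back, so the stream is left unconsumed.
     * The PushbackInputStream must have a pushback buffer of at least 4 bytes.
     */
    public static boolean looksLikeAvro(PushbackInputStream in) {
        try {
            byte[] header = new byte[MAGIC.length];
            int n = in.readNBytes(header, 0, header.length);
            if (n > 0) {
                in.unread(header, 0, n); // restore whatever was read
            }
            return n == MAGIC.length && Arrays.equals(header, MAGIC);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // In-memory sample: the magic bytes followed by one arbitrary byte.
        byte[] sample = {'O', 'b', 'j', 1, 0};
        PushbackInputStream in =
                new PushbackInputStream(new ByteArrayInputStream(sample), MAGIC.length);
        System.out.println(looksLikeAvro(in));
    }
}
```

Since the header bytes are pushed back after the check, the same PushbackInputStream could then be passed to the AvroReader(InputStream inputStream) constructor with nothing missing from the start of the data.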
Console Output
-----------------------------------------------
0 - Record (MODIFIED) {
0:[username]:STRING=[miguno]:String
1:[tweet]:STRING=[Rock: Nerf paper, scissors is fine.]:String
2:[timestamp]:LONG=[1366150681]:Long
}
-----------------------------------------------
1 - Record (MODIFIED) {
0:[username]:STRING=[BlizzardCS]:String
1:[tweet]:STRING=[Works as intended. Terran is IMBA.]:String
2:[timestamp]:LONG=[1366154481]:Long
}
-----------------------------------------------
2 records