Simple Aggregation
The simplest form of aggregation summarizes an entire stream at once. It behaves like a SQL GROUP BY: records that share the same key values are collapsed into a single summary record.
The following example reads the input records from a CSV file and outputs one record for each unique combination of year and month, with the count, sum, average, minimum, and maximum calculated for each combination.
    DataReader reader = new CSVReader(new File("example/data/input/purchases.csv"))
        .setFieldNamesInFirstRow(true);

    // Convert strings to numbers
    reader = new TransformingReader(reader)
        .add(new BasicFieldTransformer("year").stringToInt())
        .add(new BasicFieldTransformer("month").stringToInt())
        .add(new BasicFieldTransformer("unit_price").stringToDouble())
        .add(new BasicFieldTransformer("qty").stringToInt())
        .add(new BasicFieldTransformer("total").stringToDouble())
        ;

    // Group by year and month
    reader = new GroupByReader(reader, "year", "month")
        .count("transactions")
        .sum("total", "total")
        .sum("qty", "total_qty")
        .avg("total", "avg_purchase")
        .min("total", "min_purchase")
        .max("total", "max_purchase")
        ;

    DataWriter writer = new StreamWriter(System.out);

    Job job = new Job(reader, writer);
    job.run();
The output looks something like this.
    -----------------------------------------------
    0 - Record (MODIFIED) {
        0:[year]:INT=[2014]:Integer
        1:[month]:INT=[4]:Integer
        2:[transactions]:LONG=[4]:Long
        3:[total]:DOUBLE=[137.89]:BigDecimal
        4:[total_qty]:DOUBLE=[11]:BigDecimal
        5:[avg_purchase]:DOUBLE=[34.4725]:BigDecimal
        6:[min_purchase]:DOUBLE=[17.99]:Double
        7:[max_purchase]:DOUBLE=[71.96]:Double
    }

    -----------------------------------------------
    1 - Record (MODIFIED) {
        0:[year]:INT=[2015]:Integer
        1:[month]:INT=[6]:Integer
        2:[transactions]:LONG=[9]:Long
        3:[total]:DOUBLE=[1199.72]:BigDecimal
        4:[total_qty]:DOUBLE=[28]:BigDecimal
        5:[avg_purchase]:DOUBLE=[133.30222]:BigDecimal
        6:[min_purchase]:DOUBLE=[11.98]:Double
        7:[max_purchase]:DOUBLE=[749.97]:Double
    }

    -----------------------------------------------
    ...
    -----------------------------------------------
    23 - Record (MODIFIED) {
        0:[year]:INT=[2014]:Integer
        1:[month]:INT=[9]:Integer
        2:[transactions]:LONG=[3]:Long
        3:[total]:DOUBLE=[271.96]:BigDecimal
        4:[total_qty]:DOUBLE=[4]:BigDecimal
        5:[avg_purchase]:DOUBLE=[90.65333]:BigDecimal
        6:[min_purchase]:DOUBLE=[9.99]:Double
        7:[max_purchase]:DOUBLE=[249.99]:Double
    }

    -----------------------------------------------
    24 records
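For readers who think in terms of plain Java collections, the grouping semantics of the example can be approximated with the standard streams API. The following is only a sketch of the idea, not how GroupByReader is implemented: the GroupBySketch class, the Purchase type, the "year-month" string key, and the sample rows are all hypothetical and were made up for illustration.

```java
import java.util.Arrays;
import java.util.DoubleSummaryStatistics;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class GroupBySketch {

    // Hypothetical stand-in for one input record; only the fields used below.
    static final class Purchase {
        final int year;
        final int month;
        final double total;

        Purchase(int year, int month, double total) {
            this.year = year;
            this.month = month;
            this.total = total;
        }
    }

    // Groups purchases by a "year-month" key and collects
    // count, sum, average, min, and max of the total field per group.
    static Map<String, DoubleSummaryStatistics> summarize(List<Purchase> purchases) {
        return purchases.stream().collect(Collectors.groupingBy(
                p -> p.year + "-" + p.month,
                TreeMap::new,
                Collectors.summarizingDouble(p -> p.total)));
    }

    public static void main(String[] args) {
        // Made-up sample rows, not taken from purchases.csv.
        List<Purchase> purchases = Arrays.asList(
                new Purchase(2014, 4, 10.0),
                new Purchase(2014, 4, 30.0),
                new Purchase(2015, 6, 5.0));

        // One line of output per unique year/month combination.
        summarize(purchases).forEach((key, stats) -> System.out.println(
                key + " transactions=" + stats.getCount()
                        + " total=" + stats.getSum()
                        + " avg=" + stats.getAverage()
                        + " min=" + stats.getMin()
                        + " max=" + stats.getMax()));
    }
}
```

Unlike the output shown above, where the sum and average values are reported as BigDecimal, this sketch works with double throughout, so results on real currency data may differ slightly in the last digits.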