Simple Aggregation

The simplest form of aggregation summarizes an entire stream. The summary works like a SQL GROUP BY: records that share the same values in the grouping fields are collapsed into a single output record.

The following example reads records from a CSV file and outputs one record for each unique combination of year and month. Each output record contains the count, sum, average, minimum, and maximum calculated for that combination.

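import java.io.File;

import com.northconcepts.datapipeline.core.DataReader;
import com.northconcepts.datapipeline.core.DataWriter;
import com.northconcepts.datapipeline.core.StreamWriter;
import com.northconcepts.datapipeline.csv.CSVReader;
import com.northconcepts.datapipeline.group.GroupByReader;
import com.northconcepts.datapipeline.job.Job;
import com.northconcepts.datapipeline.transform.BasicFieldTransformer;
import com.northconcepts.datapipeline.transform.TransformingReader;

// Read the purchases, taking field names from the first row
DataReader reader = new CSVReader(new File("example/data/input/purchases.csv"))
    .setFieldNamesInFirstRow(true);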
 
// Convert strings to numbers
reader = new TransformingReader(reader)
    .add(new BasicFieldTransformer("year").stringToInt())
    .add(new BasicFieldTransformer("month").stringToInt())
    .add(new BasicFieldTransformer("unit_price").stringToDouble())
    .add(new BasicFieldTransformer("qty").stringToInt())
    .add(new BasicFieldTransformer("total").stringToDouble())
    ;
 
// Group by year and month
reader = new GroupByReader(reader, "year", "month")
    .count("transactions")
    .sum("total", "total")
    .sum("qty", "total_qty")
    .avg("total", "avg_purchase")
    .min("total", "min_purchase")
    .max("total", "max_purchase")
    ;
 
DataWriter writer = new StreamWriter(System.out);
 
Job job = new Job(reader, writer);
job.run();

The output looks something like this (the records between the second and the last are elided).

-----------------------------------------------
0 - Record (MODIFIED) {
    0:[year]:INT=[2014]:Integer
    1:[month]:INT=[4]:Integer
    2:[transactions]:LONG=[4]:Long
    3:[total]:DOUBLE=[137.89]:BigDecimal
    4:[total_qty]:DOUBLE=[11]:BigDecimal
    5:[avg_purchase]:DOUBLE=[34.4725]:BigDecimal
    6:[min_purchase]:DOUBLE=[17.99]:Double
    7:[max_purchase]:DOUBLE=[71.96]:Double
}
 
-----------------------------------------------
1 - Record (MODIFIED) {
    0:[year]:INT=[2015]:Integer
    1:[month]:INT=[6]:Integer
    2:[transactions]:LONG=[9]:Long
    3:[total]:DOUBLE=[1199.72]:BigDecimal
    4:[total_qty]:DOUBLE=[28]:BigDecimal
    5:[avg_purchase]:DOUBLE=[133.30222]:BigDecimal
    6:[min_purchase]:DOUBLE=[11.98]:Double
    7:[max_purchase]:DOUBLE=[749.97]:Double
}
 
-----------------------------------------------
...
-----------------------------------------------
23 - Record (MODIFIED) {
    0:[year]:INT=[2014]:Integer
    1:[month]:INT=[9]:Integer
    2:[transactions]:LONG=[3]:Long
    3:[total]:DOUBLE=[271.96]:BigDecimal
    4:[total_qty]:DOUBLE=[4]:BigDecimal
    5:[avg_purchase]:DOUBLE=[90.65333]:BigDecimal
    6:[min_purchase]:DOUBLE=[9.99]:Double
    7:[max_purchase]:DOUBLE=[249.99]:Double
}
 
-----------------------------------------------
24 records
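
To make the group-by semantics concrete, the sketch below reproduces the same kind of summary with plain Java streams, without using the library at all. The `GroupBySketch` class, the `Purchase` type, and the sample values are hypothetical and exist only for illustration; they are not part of the example above or of its input file. The sketch is not meant to show how GroupByReader is implemented, only what a count, sum, avg, min, and max per year and month combination amount to.

```java
import java.util.DoubleSummaryStatistics;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class GroupBySketch {

    // Hypothetical stand-in for one input row; only the fields used for grouping and summing.
    public static final class Purchase {
        final int year;
        final int month;
        final double total;

        public Purchase(int year, int month, double total) {
            this.year = year;
            this.month = month;
            this.total = total;
        }
    }

    // Groups purchases by a "year-month" key (in first-seen order) and collects
    // count, sum, average, min, and max of the total field for each group.
    public static Map<String, DoubleSummaryStatistics> summarize(List<Purchase> purchases) {
        return purchases.stream().collect(Collectors.groupingBy(
                p -> p.year + "-" + p.month,
                LinkedHashMap::new,
                Collectors.summarizingDouble(p -> p.total)));
    }

    public static void main(String[] args) {
        // Made-up sample data, not taken from purchases.csv
        List<Purchase> purchases = List.of(
                new Purchase(2014, 4, 10.0),
                new Purchase(2014, 4, 30.0),
                new Purchase(2015, 6, 5.0));

        summarize(purchases).forEach((key, stats) -> System.out.println(
                key
                + " transactions=" + stats.getCount()
                + " total=" + stats.getSum()
                + " avg_purchase=" + stats.getAverage()
                + " min_purchase=" + stats.getMin()
                + " max_purchase=" + stats.getMax()));
    }
}
```

As in the example above, the number of output entries equals the number of distinct year and month combinations, not the number of input rows: the three sample purchases here produce two summaries.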