Read a Web Server Log

Updated: May 31, 2022

This example shows you how to read web server access log files in Combined Log Format using Data Pipeline.

The demo code uses CombinedLogReader to read records from a web server access log file.

Input Log file

127.0.0.1 - - [10/Apr/2016:04:37:54 +0800] "GET / HTTP/1.1" 200 16400
0:0:0:0:0:0:0:1 - - [10/Apr/2016:04:54:10 +0800] "GET /docs/expression-language/ HTTP/1.1" 200 27484
0:0:0:0:0:0:0:1 - - [10/Apr/2016:04:54:12 +0800] "GET /css/sandstone/bootstrap.min.css HTTP/1.1" 304 -
0:0:0:0:0:0:0:1 - - [10/Apr/2016:04:54:12 +0800] "GET /fancybox/source/jquery.fancybox.css?v=2.1.5 HTTP/1.1" 304 -
0:0:0:0:0:0:0:1 - - [10/Apr/2016:04:54:12 +0800] "GET /fancybox/source/helpers/jquery.fancybox-buttons.css?v=1.0.5 HTTP/1.1" 304 -
0:0:0:0:0:0:0:1 - - [10/Apr/2016:04:54:12 +0800] "GET /fancybox/source/helpers/jquery.fancybox-thumbs.css?v=1.0.7 HTTP/1.1" 304 -

Only the first few lines of the log file are shown, for clarity.

Java Code Listing

/*
 * Copyright (c) 2006-2022 North Concepts Inc.  All rights reserved.
 * Proprietary and Confidential.  Use is subject to license terms.
 * 
 * https://northconcepts.com/data-pipeline/licensing/
 */
package com.northconcepts.datapipeline.examples.cookbook;

import java.io.File;

import com.northconcepts.datapipeline.core.DataReader;
import com.northconcepts.datapipeline.core.StreamWriter;
import com.northconcepts.datapipeline.job.Job;
import com.northconcepts.datapipeline.weblog.CombinedLogReader;

public class ReadWebServerLogs {
    
    public static void main(String[] args) throws Throwable {
        DataReader reader = new CombinedLogReader(
                new File("example/data/input/localhost_access_log.2016-04-10.txt"));
        
        Job.run(reader, new StreamWriter(System.out));
    }

}

Code Walkthrough

  1. A CombinedLogReader is created to read from the log file localhost_access_log.2016-04-10.txt.
  2. Job.run() transfers the records from the CombinedLogReader to a StreamWriter, which prints them to the console. See how to compile and run data pipeline jobs.

CombinedLogReader

CombinedLogReader reads records from a web server's access log file written in Combined Log Format.
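
To make the format concrete, here is a minimal sketch of how a single access log line can be split into fields using only the Java standard library. It is not CombinedLogReader's implementation: the class name CombinedLogLineSketch, the regular expression, and the helper methods are assumptions made for illustration. The map keys reuse the field names from the console output, and the referrer and user agent are treated as optional because the sample lines above end after the byte count.

```java
import java.time.Instant;
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of Data Pipeline.
class CombinedLogLineSketch {

    // host ident user [timestamp] "request" status bytes, optionally followed by
    // "referrer" "user agent". This pattern is an assumption, not the library's.
    private static final Pattern LINE = Pattern.compile(
            "^(\\S+) (\\S+) (\\S+) \\[([^\\]]+)\\] \"([^\"]*)\" (\\d{3}) (\\S+)"
            + "(?: \"([^\"]*)\" \"([^\"]*)\")?$");

    // Timestamp layout used in the sample log, e.g. 10/Apr/2016:04:37:54 +0800
    private static final DateTimeFormatter TIMESTAMP =
            DateTimeFormatter.ofPattern("dd/MMM/yyyy:HH:mm:ss Z", Locale.ENGLISH);

    // Splits one log line into named fields; missing optional fields are null.
    static Map<String, String> parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not an access log line: " + line);
        }
        Map<String, String> record = new LinkedHashMap<>();
        record.put("remoteHost", m.group(1));
        record.put("clientUsername", m.group(2));
        record.put("authenticatedUsername", m.group(3));
        record.put("date", m.group(4));
        record.put("request", m.group(5));
        record.put("status", m.group(6));
        record.put("bytes", m.group(7));
        record.put("referrer", m.group(8));
        record.put("userAgent", m.group(9));
        return record;
    }

    // Converts the bracketed timestamp into a zone-independent instant.
    static Instant toInstant(String timestamp) {
        return OffsetDateTime.parse(timestamp, TIMESTAMP).toInstant();
    }

    public static void main(String[] args) {
        String line = "127.0.0.1 - - [10/Apr/2016:04:37:54 +0800] \"GET / HTTP/1.1\" 200 16400";
        Map<String, String> record = parse(line);
        System.out.println(record);
        System.out.println(toInstant(record.get("date")));
    }
}
```

For the first sample line, toInstant yields 2016-04-09T20:37:54Z. That is the same moment the console output below shows as Sat Apr 09 23:37:54 EAT 2016, because the log timestamp carries a +0800 offset while the output is printed in the local time zone of the machine that ran the job.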

Console output

04:05:12,075 DEBUG [main] datapipeline:37 - DataPipeline v7.2.0-SNAPSHOT by North Concepts Inc.
04:05:12,848 DEBUG [main] datapipeline:615 - Job[1,job-1,Wed May 25 04:05:12 EAT 2022]::Start
-----------------------------------------------
0 - Record {
    0:[remoteHost]:STRING=[127.0.0.1]:String
    1:[clientUsername]:STRING=[-]:String
    2:[authenticatedUsername]:STRING=[-]:String
    3:[date]:DATETIME=[Sat Apr 09 23:37:54 EAT 2016]:Date
    4:[request]:STRING=[GET / HTTP/1.1]:String
    5:[status]:STRING=[200]:String
    6:[bytes]:STRING=[16400]:String
    7:[referrer]:STRING=[null]
    8:[userAgent]:STRING=[null]
    9:[cookies]:STRING=[null]
}

-----------------------------------------------
1 - Record {
    0:[remoteHost]:STRING=[0:0:0:0:0:0:0:1]:String
    1:[clientUsername]:STRING=[-]:String
    2:[authenticatedUsername]:STRING=[-]:String
    3:[date]:DATETIME=[Sat Apr 09 23:54:10 EAT 2016]:Date
    4:[request]:STRING=[GET /docs/expression-language/ HTTP/1.1]:String
    5:[status]:STRING=[200]:String
    6:[bytes]:STRING=[27484]:String
    7:[referrer]:STRING=[null]
    8:[userAgent]:STRING=[null]
    9:[cookies]:STRING=[null]
}

-----------------------------------------------
2 - Record {
    0:[remoteHost]:STRING=[0:0:0:0:0:0:0:1]:String
    1:[clientUsername]:STRING=[-]:String
    2:[authenticatedUsername]:STRING=[-]:String
    3:[date]:DATETIME=[Sat Apr 09 23:54:12 EAT 2016]:Date
    4:[request]:STRING=[GET /css/sandstone/bootstrap.min.css HTTP/1.1]:String
    5:[status]:STRING=[304]:String
    6:[bytes]:STRING=[-]:String
    7:[referrer]:STRING=[null]
    8:[userAgent]:STRING=[null]
    9:[cookies]:STRING=[null]
}
.
.
.
.
.
248 - Record {
    0:[remoteHost]:STRING=[0:0:0:0:0:0:0:1]:String
    1:[clientUsername]:STRING=[-]:String
    2:[authenticatedUsername]:STRING=[-]:String
    3:[date]:DATETIME=[Sun Apr 10 03:54:04 EAT 2016]:Date
    4:[request]:STRING=[GET /img/datapipeline_logo300x60.png HTTP/1.1]:String
    5:[status]:STRING=[304]:String
    6:[bytes]:STRING=[-]:String
    7:[referrer]:STRING=[null]
    8:[userAgent]:STRING=[null]
    9:[cookies]:STRING=[null]
}

-----------------------------------------------
249 records
04:05:13,737 DEBUG [main] datapipeline:661 - job::Success
