Get a Query Param from a URL

Updated: Jun 25, 2023

In this example, you will learn how to use DataPipeline to extract query parameters from URLs. Each URL is parsed and the value of a named parameter is stored in a new field, where it is available for further processing or analysis.

Input CSV file

"watch",
"http://www.youtube.com/watch?v=kBO5dh9qrIQ",
"http://www.youtube.com/watch?v=pImoir9Yux4",
"http://www.youtube.com/watch?v=Qa-eXwKz4QA",
"http://www.youtube.com/watch?v=c26s1xQkCLY"
...

Java Code Listing

package com.northconcepts.datapipeline.examples.cookbook;

import java.io.File;

import com.northconcepts.datapipeline.core.DataReader;
import com.northconcepts.datapipeline.core.StreamWriter;
import com.northconcepts.datapipeline.csv.CSVReader;
import com.northconcepts.datapipeline.job.Job;
import com.northconcepts.datapipeline.transform.TransformingReader;
import com.northconcepts.datapipeline.transform.net.GetUrlQueryParam;

public class GetAQueryParamFromAUrl {

    public static void main(String[] args) {
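        // Read the CSV file, using the first row as field names
        DataReader reader = new CSVReader(new File("example/data/input/watch-list.csv"))
                .setFieldNamesInFirstRow(true);

        TransformingReader transformingReader = new TransformingReader(reader);

        // Extract the "v" query parameter from the URL in "watch" and store it in "param"
        transformingReader.add(new GetUrlQueryParam("watch", "v", "param"));

        // Run the pipeline, writing each record to the console
        Job.run(transformingReader, new StreamWriter(System.out));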
    }
}

Code walkthrough

  1. A CSVReader is created for the input file watch-list.csv.
  2. The CSVReader.setFieldNamesInFirstRow(true) method is invoked so that the values in the first row are used as field names.
  3. A TransformingReader is created to apply one or more transformations to the incoming data sequentially.
  4. GetUrlQueryParam retrieves all values of the named query parameter from a URL and stores them in the target field. Here it reads URLs from the watch field, extracts the v parameter, and writes it to a new param field. The constructor arguments for this object are:
    • urlFieldName - the field where URLs are stored,
    • queryParamName - the query parameter's name,
    • targetFieldName - the target field to store the parameter value.
  5. Data is transferred from the transformingReader to the StreamWriter(System.out) via the Job.run() method.
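
To make step 4 more concrete, the following is a minimal sketch of the kind of lookup such a transformer performs, written with the standard Java library only. It is not DataPipeline's implementation; the class name QueryParamSketch and the method getQueryParamValues are hypothetical names chosen for illustration, and the sketch assumes Java 10 or later and UTF-8 encoded parameters.

```java
import java.net.URI;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of DataPipeline.
class QueryParamSketch {

    // Returns every value of the named query parameter, in order of appearance.
    static List<String> getQueryParamValues(String url, String name) {
        List<String> values = new ArrayList<>();

        // The raw (still percent-encoded) query string, or null if the URL has none
        String query = URI.create(url).getRawQuery();
        if (query == null) {
            return values;
        }

        for (String pair : query.split("&")) {
            // Split each "key=value" pair on the first '='; a bare key has an empty value
            int eq = pair.indexOf('=');
            String rawKey = eq >= 0 ? pair.substring(0, eq) : pair;
            String rawValue = eq >= 0 ? pair.substring(eq + 1) : "";

            // Decode after splitting so encoded '&' or '=' inside values survive
            String key = URLDecoder.decode(rawKey, StandardCharsets.UTF_8);
            if (key.equals(name)) {
                values.add(URLDecoder.decode(rawValue, StandardCharsets.UTF_8));
            }
        }
        return values;
    }

    public static void main(String[] args) {
        // prints [kBO5dh9qrIQ]
        System.out.println(getQueryParamValues("http://www.youtube.com/watch?v=kBO5dh9qrIQ", "v"));
    }
}
```

The sketch returns a list because a parameter name may appear more than once in a URL; a URL without the requested parameter yields an empty list.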

Output

-----------------------------------------------
0 - Record (MODIFIED) {
    0:[watch]:STRING=[http://www.youtube.com/watch?v=kBO5dh9qrIQ]:String
    1:[B]:STRING=[null]
    2:[param]:STRING=[kBO5dh9qrIQ]:String
}

-----------------------------------------------
1 - Record (MODIFIED) {
    0:[watch]:STRING=[http://www.youtube.com/watch?v=pImoir9Yux4]:String
    1:[B]:STRING=[null]
    2:[param]:STRING=[pImoir9Yux4]:String
}

-----------------------------------------------
2 - Record (MODIFIED) {
    0:[watch]:STRING=[http://www.youtube.com/watch?v=Qa-eXwKz4QA]:String
    1:[B]:STRING=[null]
    2:[param]:STRING=[Qa-eXwKz4QA]:String
}

-----------------------------------------------
3 - Record (MODIFIED) {
    0:[watch]:STRING=[http://www.youtube.com/watch?v=c26s1xQkCLY]:String
    1:[B]:STRING=[null]
    2:[param]:STRING=[c26s1xQkCLY]:String
}
...