How To Transfer Columns From One CSV File Into Another Using Java

This blog will show you how to pull selected columns from a CSV file containing IP geolocation data and save them into a second CSV file using our Data Pipeline Java library. As part of the transformation, you’ll also have the option to rearrange the order of the resulting columns.

Download Data Pipeline

IP Geolocation CSV file

The code reads an input CSV file created from the free IP geolocation database at http://dev.maxmind.com/geoip/legacy/geolite/.

1.0.0.0,1.0.0.255,16777216,16777471,AU,Australia
1.0.1.0,1.0.3.255,16777472,16778239,CN,China
1.0.4.0,1.0.7.255,16778240,16779263,AU,Australia
1.0.8.0,1.0.15.255,16779264,16781311,CN,China
1.0.16.0,1.0.31.255,16781312,16785407,JP,Japan
1.0.32.0,1.0.63.255,16785408,16793599,CN,China
1.0.64.0,1.0.127.255,16793600,16809983,JP,Japan
1.0.128.0,1.0.255.255,16809984,16842751,TH,Thailand

1.0.0.0,1.0.0.255,16777216,16777471,AU,Australia

1.0.1.0,1.0.3.255,16777472,16778239,CN,China

1.0.4.0,1.0.7.255,16778240,16779263,AU,Australia

1.0.8.0,1.0.15.255,16779264,16781311,CN,China

1.0.16.0,1.0.31.255,16781312,16785407,JP,Japan

1.0.32.0,1.0.63.255,16785408,16793599,CN,China

1.0.64.0,1.0.127.255,16793600,16809983,JP,Japan

1.0.128.0,1.0.255.255,16809984,16842751,TH,Thailand

Column Extraction Code

The Java code below creates a new CSV file containing the last two columns from the input CSV file.

package com.northconcepts.datapipeline.examples.cookbook;

import java.io.File;

import com.northconcepts.datapipeline.core.DataEndpoint;
import com.northconcepts.datapipeline.core.DataReader;
import com.northconcepts.datapipeline.core.DataWriter;
import com.northconcepts.datapipeline.csv.CSVReader;
import com.northconcepts.datapipeline.csv.CSVWriter;
import com.northconcepts.datapipeline.job.JobTemplate;
import com.northconcepts.datapipeline.transform.IncludeFields;
import com.northconcepts.datapipeline.transform.TransformingReader;

public class SelectivelyTransferCsvFile2 {

	public static void main(String[] args) {

		//Read the input CSV file
		DataReader reader = new CSVReader(new File(
				"example/data/input/GeoliteDataInput.csv"))
				.setFieldNamesInFirstRow(false);
		
		//Add a transforming reader to only include the necessary fields
		 TransformingReader transformingReader = new TransformingReader(reader);
		 transformingReader.add(new IncludeFields("E","F"));
		
		//Create output CSV file
		 DataWriter writer = new CSVWriter(new File("example/data/output/GeoliteDataOutput.csv"))
         .setFieldNamesInFirstRow(false);
		 
		 //Transfer data from the input to output file
		 JobTemplate.DEFAULT.transfer(transformingReader, writer);
		
	}
}

package com.northconcepts.datapipeline.examples.cookbook;

import java.io.File;

import com.northconcepts.datapipeline.core.DataEndpoint;

import com.northconcepts.datapipeline.core.DataReader;

import com.northconcepts.datapipeline.core.DataWriter;

import com.northconcepts.datapipeline.csv.CSVReader;

import com.northconcepts.datapipeline.csv.CSVWriter;

import com.northconcepts.datapipeline.job.JobTemplate;

import com.northconcepts.datapipeline.transform.IncludeFields;

import com.northconcepts.datapipeline.transform.TransformingReader;

public class SelectivelyTransferCsvFile2 {

public static void main(String[] args) {

//Read the input CSV file

DataReader reader = new CSVReader(new File(

"example/data/input/GeoliteDataInput.csv"))

.setFieldNamesInFirstRow(false);

//Add a transforming reader to only include the necessary fields

TransformingReader transformingReader = new TransformingReader(reader);

transformingReader.add(new IncludeFields("E","F"));

//Create output CSV file

DataWriter writer = new CSVWriter(new File("example/data/output/GeoliteDataOutput.csv"))

.setFieldNamesInFirstRow(false);

//Transfer data from the input to output file

JobTemplate.DEFAULT.transfer(transformingReader, writer);

}

CSVReader

A new CSVReader is created to parse the GeoliteDataInput.csv input file. The setFieldNamesInFirstRow(false) method call causes CSVReader to create synthetic names for each column starting with “A”, “B”, “C”, etc. Similar to MS Excel’s column naming. If true had been passed in, then the actual column names in the file would have been used.

DataReader reader = new CSVReader(new File("example/data/input/GeoliteDataInput.csv"))
  .setFieldNamesInFirstRow(false);

1 2	DataReader reader = new CSVReader(new File("example/data/input/GeoliteDataInput.csv")) .setFieldNamesInFirstRow(false);

Select Fields Data Transformation

A TransformingReader is wrapped around the initial CSVReader to allow the data to be manipulated on-the-fly. The actual modifications are specified by adding Transformer subclasses — IncludeFields in this case — to the TransformingReader.

TransformingReader transformingReader = new TransformingReader(reader);
transformingReader.add(new IncludeFields("E","F"));

1 2	TransformingReader transformingReader = new TransformingReader(reader); transformingReader.add(new IncludeFields("E","F"));

The IncludeFields class specifies which columns from the input CSV file need to be included in the output CSV file. All other columns are discarded from this point forward. IncludeFields also arranges the columns in the specified order. Passing in “F”, then “E” would have swapped the country name and country code columns.

ExcludeFields

An alternative to selecting the fields to retain with IncludeFields would be to specify the fields to drop using ExcludeFields. Unlike IncludeFields, this transformer cannot rearrange columns. Deciding which to use usually comes down to which is easier — whitelisting or blacklisting — and whether or not columns need to be rearranged.

CSVWriter

A CSVWriter is created to generate the new GeoliteDataOutput.csv output file. The setFieldNamesInFirstRow(false) method is called in this case to to prevent CSVWriter from saving the synthetic field names added during the read step.

DataWriter writer = new CSVWriter(new File("example/data/output/GeoliteDataOutput.csv"))
.setFieldNamesInFirstRow(false);

1 2	DataWriter writer = new CSVWriter(new File("example/data/output/GeoliteDataOutput.csv")) .setFieldNamesInFirstRow(false);

Running the Transfer Job

The actual data is transfered from the reader to writer using the default implementation of JobTemplate.

JobTemplate.DEFAULT.transfer(transformingReader, writer);

1	JobTemplate.DEFAULT.transfer(transformingReader, writer);

Output CSV file

AU,Australia
CN,China
AU,Australia
CN,China
JP,Japan
CN,China
JP,Japan
TH,Thailand

AU,Australia

CN,China

AU,Australia

CN,China

JP,Japan

CN,China

JP,Japan

TH,Thailand

Download Data Pipeline

You can get started with Data Pipeline by downloading it and reviewing the getting started page.

You can also view additional examples by visiting https://northconcepts.com/data-pipeline/examples/.

Download Data Pipeline

By haneef - September 29, 2014 1:40 AM Reply

This above code generate csv file with limit filesize .Ex:-i am giving input file size=21.4 MB
but it generates output file 224kb only.how to resolve this.
- By Dele Taylor - September 29, 2014 7:19 AM Reply
  
  Haneef, I’ll follow-up with you over email to get more info on what you need.
Pingback: How To Transfer Columns From One CSV File Into Another Using Java | Dinesh Ram Kali.
By dia - September 14, 2017 3:55 PM Reply

when I am adding this code in eclipse I am getting errors for all import statements
like import ********** can not be resolved can any one help to resolve this issue plz ASAP
- By Dele Taylor - October 11, 2017 12:44 AM Reply
  
  Dia — you’ll need to download our Data Pipeline library to resolve the imports. Cheers.

How To Transfer Columns From One CSV File Into Another Using Java

IP Geolocation CSV file

Column Extraction Code

CSVReader

Select Fields Data Transformation

ExcludeFields

CSVWriter

Running the Transfer Job

Output CSV file

Download Data Pipeline

About The DataPipeline Team

5 thoughts on “How To Transfer Columns From One CSV File Into Another Using Java”

Leave a Reply Cancel reply

Data Pipeline

Docs

Company

Tools