North Concepts

Frequently Asked Questions

Overview

If your question isn't answered here or in the Examples section, please look in the discussion forums and create a new topic if necessary.

1. General

1.1 What does Data Pipeline do?
1.2 What if I need additional features?
1.3 Where do I go with further questions?
1.4 How do I get support?
1.5 Are you continuing to develop Data Pipeline?

2. Licensing

2.1 What are the licensing options?
2.2 Can Data Pipeline be used for educational or non-commercial purposes?

3. Development

3.1 How do I get started?
3.2 Why does my program stall?
3.3 Why do I get an out-of-memory exception?
3.4 Which version of Java does Data Pipeline support?
3.5 Are there any debugging or logging functions?
3.6 What libraries are required?
3.7 How do I cache data?
3.8 Does Data Pipeline support searching?
3.9 Does Data Pipeline support filtering?
3.10 How do I load MS Excel files that use VLookup?

1. General

1.1 What does Data Pipeline do?
Data Pipeline makes it easy to add data conversion, processing, and transformation to Java applications.

The toolkit has readers and writers for physical files (CSV, Excel, Fixed-width), plus chainable secondary readers for processing and transforming data (filter, remove duplicates, lookups, validation).
1.2 What if I need additional features?
Data Pipeline is actively updated with new features that reflect our clients' needs. If you have an idea for a new feature, please post a message in the discussion forums and we'll attempt to include it in a future release.

If your request is urgent, please email us. We would be happy to provide a quote to enhance or customize Data Pipeline for your needs.
1.3 Where do I go with further questions?
We host discussion forums where you can post questions for the community. We will read and answer forum questions as time permits.

We also provide commercial support through email for an additional annual fee. Commercial support gives you the most responsive help from our experts. It features:
  • Priority treatment
  • A dedicated email address for our Technical Support Team
  • Response times of less than 24 hours on business days
If you've purchased a Commercial Support Package from us, please send your questions to support@northconcepts.com. Please include the version number and as much detail about your issue as possible.
1.4 How do I get support?
We provide a Commercial Support Package, which can be purchased on an annual basis, or you may raise questions for the community through our discussion forums.
1.5 Are you continuing to develop Data Pipeline?
Yes. We are actively updating Data Pipeline with new features that reflect our clients' needs.

2. Licensing

2.1 What are the licensing options?
Data Pipeline has two licensing options:

Open Source (GPL)

Please see the Open Source License Agreement for complete details.

Commercial

Please see the Purchase page and the Commercial License Agreement for complete details.

If you are interested in an alternative licensing option (such as for OEM or educational purposes), please email us your needs.
2.2 Can Data Pipeline be used for educational or non-commercial purposes?
The Open Source License should suffice in most such cases. You can always purchase a Commercial License or email us your specific requirements.

3. Development

3.1 How do I get started?
Our Getting Started Guide and Examples will have you up and running quickly.
3.2 Why does my program stall?
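Data Pipeline may stall if DataReader.read() is called before DataReader.open(). Always call open() before the first read(), and call close() when you are finished, as the Excel example under 3.10 does.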
3.3 Why do I get an out-of-memory exception?
Java's default maximum heap size of 64 MB may not be adequate for your application. Try using the -Xmx Java option to set a higher maximum.
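One way to confirm that an -Xmx setting took effect is to print the maximum heap size the JVM reports. The sketch below uses only the standard Runtime API. The class name HeapCheck and the 512m value in the comment are illustrative choices, not part of Data Pipeline.

```java
public class HeapCheck {

    /** Returns the maximum heap the JVM will attempt to use, in megabytes. */
    public static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // Run as, for example: java -Xmx512m HeapCheck
        System.out.println("Maximum heap: " + maxHeapMegabytes() + " MB");
    }
}
```

The reported figure can be somewhat lower than the value passed to -Xmx, so treat it as a sanity check rather than an exact readback.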
3.4 Which version of Java does Data Pipeline support?
Data Pipeline supports Sun Java Standard Edition SDK 1.5 or greater.
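If you are unsure which Java version an application is running on, the standard system property java.specification.version can be inspected at start-up. The following is a minimal sketch. The class name JavaVersionCheck and the helper majorVersion are hypothetical and not part of Data Pipeline.

```java
public class JavaVersionCheck {

    /** Maps a specification version to its major number: "1.5" gives 5, "17" gives 17. */
    public static int majorVersion(String specVersion) {
        String[] parts = specVersion.split("\\.");
        // Releases up to Java 8 report "1.x"; later releases report the major number alone
        if (parts[0].equals("1") && parts.length > 1) {
            return Integer.parseInt(parts[1]);
        }
        return Integer.parseInt(parts[0]);
    }

    public static void main(String[] args) {
        String spec = System.getProperty("java.specification.version");
        System.out.println("Java specification version: " + spec);
        if (majorVersion(spec) < 5) {
            System.out.println("This runtime is older than Java 1.5.");
        }
    }
}
```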
3.5 Are there any debugging or logging functions?
Yes. DebugReader can be layered onto another DataReader to log everything it reads.
3.6 What libraries are required?
Data Pipeline is distributed with all the necessary libraries (except for JDBC drivers):
  • ANTLR (antlr-2.7.5.jar)
  • Apache POI (poi-3.0-alpha2-20060616.jar, poi-contrib-3.0-alpha2-20060616.jar, poi-scratchpad-3.0-alpha2-20060616.jar)
  • JExcel (jxl.jar)
3.7 How do I cache data?
Data can be cached in a RecordList in one of two ways:
1. Automatically, by writing to a MemoryWriter then calling getRecordList().
/*
* Copyright (c) 2006-2008 North Concepts Inc.  All rights reserved.
* Proprietary and Confidential.  Use is subject to license terms.
*
* http://northconcepts.com/data-pipeline/licensing/
*
*/
package com.northconcepts.datapipeline.examples.cookbook;

import java.io.File;

import org.apache.log4j.Logger;

import com.northconcepts.datapipeline.core.DataEndpoint;
import com.northconcepts.datapipeline.core.DataReader;
import com.northconcepts.datapipeline.core.RecordList;
import com.northconcepts.datapipeline.csv.CSVReader;
import com.northconcepts.datapipeline.job.JobTemplate;
import com.northconcepts.datapipeline.memory.MemoryWriter;

public class WriteToMemory {

    public static final Logger log = DataEndpoint.log;

    public static void main(String[] args) {
        // Read the CSV file, taking field names from the first row
        DataReader reader = new CSVReader(new File("credit-balance.csv"))
            .setFieldNamesInFirstRow(true);

        MemoryWriter memoryWriter = new MemoryWriter();

        // Transfer the records from the reader to the in-memory writer
        JobTemplate.DEFAULT.transfer(reader, memoryWriter);

        // The cached records are now available as a RecordList
        RecordList recordList = memoryWriter.getRecordList();

        for (int i = 0; i < recordList.getRecordCount(); i++) {
            log.debug(recordList.get(i));
        }
    }

}

2. Manually, by creating a RecordList and adding Records to it.
/*
* Copyright (c) 2006-2008 North Concepts Inc.  All rights reserved.
* Proprietary and Confidential.  Use is subject to license terms.
*
* http://northconcepts.com/data-pipeline/licensing/
*
*/
package com.northconcepts.datapipeline.examples.cookbook;

import java.io.File;

import com.northconcepts.datapipeline.core.DataReader;
import com.northconcepts.datapipeline.core.DataWriter;
import com.northconcepts.datapipeline.core.Record;
import com.northconcepts.datapipeline.core.RecordList;
import com.northconcepts.datapipeline.csv.CSVWriter;
import com.northconcepts.datapipeline.job.JobTemplate;
import com.northconcepts.datapipeline.memory.MemoryReader;

public class ReadFromMemory {

    public static void main(String[] args) {
        // Build two records by hand
        Record record1 = new Record();
        record1.getField("name", true).setValue("John Wayne");
        record1.getField("balance", true).setValue(156.35);

        Record record2 = new Record();
        record2.getField("name", true).setValue("Peter Parker");
        record2.getField("balance", true).setValue(0.96);

        // Cache the records in a RecordList
        RecordList recordList = new RecordList();
        recordList.add(record1);
        recordList.add(record2);

        // Read the cached records back and write them to a CSV file
        DataReader reader = new MemoryReader(recordList);
        DataWriter writer = new CSVWriter(new File("credit-balance2.csv"));

        JobTemplate.DEFAULT.transfer(reader, writer);
    }

}

These examples, and others, can be found in the Examples section.
3.8 Does Data Pipeline support searching?
Yes. RecordLists can be searched by calling findAll(), findFirst(), or findLast(). Searches can also be performed by filtering incoming data.
3.9 Does Data Pipeline support filtering?
Yes. FilteringReader provides a means of selecting records either programmatically or using the run-time expression language.
/*
* Copyright (c) 2006-2008 North Concepts Inc.  All rights reserved.
* Proprietary and Confidential.  Use is subject to license terms.
*
* http://northconcepts.com/data-pipeline/licensing/
*
*/
package com.northconcepts.datapipeline.examples.cookbook;

import java.io.File;

import com.northconcepts.datapipeline.core.DataReader;
import com.northconcepts.datapipeline.core.DataWriter;
import com.northconcepts.datapipeline.core.StreamWriter;
import com.northconcepts.datapipeline.csv.CSVReader;
import com.northconcepts.datapipeline.filter.FieldFilter;
import com.northconcepts.datapipeline.filter.FilterExpression;
import com.northconcepts.datapipeline.filter.FilteringReader;
import com.northconcepts.datapipeline.filter.rule.IsJavaType;
import com.northconcepts.datapipeline.filter.rule.IsNotNull;
import com.northconcepts.datapipeline.filter.rule.PatternMatch;
import com.northconcepts.datapipeline.filter.rule.ValueMatch;
import com.northconcepts.datapipeline.job.JobTemplate;

public class FilterRecords {

    public static void main(String[] args) throws Throwable {
        // Read the CSV file, taking field names from the first row
        DataReader reader = new CSVReader(new File("credit-balance.csv"))
            .setFieldNamesInFirstRow(true);

        FilteringReader filteringReader = new FilteringReader(reader);

        // Programmatic rules for the Rating field
        filteringReader.add(new FieldFilter("Rating")
            .addRule(new IsNotNull())
            .addRule(new IsJavaType(String.class))
            .addRule(new ValueMatch().add("A").add("B").add("C")));

        // Programmatic rules for the Account field
        filteringReader.add(new FieldFilter("Account")
            .addRule(new IsNotNull())
            .addRule(new IsJavaType(String.class))
            .addRule(new PatternMatch("[0-9]*")));

        // Run-time expression over the CreditLimit and Balance fields
        filteringReader.add(new FilterExpression(
            "parseDouble(CreditLimit) >= 0 && parseDouble(CreditLimit) <= 100000 and parseDouble(Balance) <= parseDouble(CreditLimit)"));

        // Write the selected records to standard output
        DataWriter writer = new StreamWriter(System.out);

        JobTemplate.DEFAULT.transfer(filteringReader, writer);
    }

}

The above examples, and more, can be found in the Examples section.
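To make the FilterExpression in the FilterRecords example easier to read, here is the same condition restated in plain Java. This is only a sketch and does not use Data Pipeline. It assumes the expression language's "and" means the same as "&&" and that both fields have already been parsed to double. The class and method names are hypothetical.

```java
public class CreditFilterSketch {

    /** Plain-Java restatement of the expression's condition. */
    public static boolean accept(double creditLimit, double balance) {
        return creditLimit >= 0
            && creditLimit <= 100000
            && balance <= creditLimit;
    }

    public static void main(String[] args) {
        System.out.println(accept(5000, 1200.50));  // true: limit in range, balance within limit
        System.out.println(accept(5000, 7500));     // false: balance exceeds the limit
        System.out.println(accept(250000, 100));    // false: limit above 100000
    }
}
```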
3.10 How do I load MS Excel files that use VLookup?
Simply reading an MS Excel file with the evaluateExpressions property set to true (the default) will cause VLookups to be evaluated.
/*
* Copyright (c) 2006-2008 North Concepts Inc.  All rights reserved.
* Proprietary and Confidential.  Use is subject to license terms.
*
* http://northconcepts.com/data-pipeline/licensing/
*
*/
package com.northconcepts.datapipeline.examples.cookbook;

import java.io.File;

import org.apache.log4j.Logger;

import com.northconcepts.datapipeline.core.DataEndpoint;
import com.northconcepts.datapipeline.core.DataReader;
import com.northconcepts.datapipeline.core.Record;
import com.northconcepts.datapipeline.excel.ExcelDocument;
import com.northconcepts.datapipeline.excel.ExcelReader;

public class ReadFromAnExcelFile {

    public static final Logger log = DataEndpoint.log;

    public static void main(String[] args) throws Throwable {
        ExcelDocument document = new ExcelDocument().open(new File("credit-balance.xls"));
        DataReader reader = new ExcelReader(document)
            .setSheetName("credit-balance")
            .setFieldNamesInFirstRow(true);

        // Open the reader before the first read()
        reader.open();
        try {
            Record record;
            // read() returns null once there are no more records
            while ((record = reader.read()) != null) {
                log.debug(record);
            }
        } finally {
            // Close the reader even if reading fails
            reader.close();
        }
    }
}

The above examples, along with many more, can be found in the Examples section.
© 2007, 2008 North Concepts Inc.   All rights reserved.