Continue After an Error
This example shows how to keep Data Pipeline code running after an error occurs by handling DataException. The demo code reads a badly formatted CSV file. When an error occurs on a line, the code does not halt; instead, it skips that line and continues reading subsequent records. The same error handling can be applied to other types of data transfers too.
There are other examples that show how to debug code or handle exceptions.
Input CSV file
Account,LastName,FirstName,Balance,CreditLimit,AccountCreated,Rating
101,Reeves,Keanu,9315.45,10000.00,1/17/1998,A
312,"Butler,Gerard,90.00,1000.00,8/6/2003,B
868,Hewitt,Jennifer Love,0,17000.00,5/25/1985,B
761,Pinkett-Smith,Jada,49654.87,100000.00,12/5/2006,A
317,Murray,Bill,789.65,5000.00,2/5/2007,C
The second row in this CSV (line 3) has an unterminated quote.
Java Code Listing
package com.northconcepts.datapipeline.examples.cookbook;

import java.io.File;

import org.apache.log4j.Logger;

import com.northconcepts.datapipeline.core.DataEndpoint;
import com.northconcepts.datapipeline.core.DataException;
import com.northconcepts.datapipeline.core.Record;
import com.northconcepts.datapipeline.csv.CSVReader;

public class ContinueAfterAnError {

    public static final Logger log = DataEndpoint.log;

    public static void main(String[] args) {
        // open a CSV file with an error in the second record (unterminated quote)
        CSVReader csvReader = new CSVReader(new File("example/data/input/bad-credit-balance-01.csv")) {
            // override CSVReader.read() to ignore exceptions
            public Record read() throws DataException {
                try {
                    return super.read();
                } catch (DataException e) {
                    log.warn(e, e);
                    // e.printStackTrace();
                    return read(); // read the next line
                }
            }
        };
        csvReader.setFieldNamesInFirstRow(true);
        csvReader.open();
        try {
            Record record;
            while ((record = csvReader.read()) != null) {
                log.info(record);
            }
        } finally {
            csvReader.close();
        }
    }
}
Code walkthrough
- An instance of CSVReader is created using the file path of the input file bad-credit-balance-01.csv.
- The CSVReader.read method is overridden to handle exceptions in case of a read error.
- setFieldNamesInFirstRow(true) tells the reader to use the first row of the file as field names.
- The CSV file is then opened via the CSVReader.open() method.
- A while loop iterates through the input data.
- Each record is read as a Record object and printed to the console via the Data Pipeline logger.
- The CSVReader is closed via the CSVReader.close() method in a finally block, so it is closed whether the while loop completes or throws.
Exception handling
In this demo code, exceptions are handled by overriding the CSVReader.read method. The overridden version invokes the superclass read method in a try block, with a catch block for DataException. Because the input is badly formatted, a DataException is thrown; it is caught and logged as a warning via the Data Pipeline logger. The method then calls itself to read the next line.
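The retry logic of the override can be illustrated without the library. The sketch below is a minimal stand-in using only standard Java: LineReader and BadLineException are hypothetical names that play the roles of CSVReader and DataException, and the quote check is a simplification, not the library's parser. It also swaps the recursive call for a loop, since in Java each recursive retry adds a stack frame when many bad lines occur in a row.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class RetryingReaderSketch {

    // Hypothetical stand-in for DataException.
    static class BadLineException extends RuntimeException {
        BadLineException(String message) { super(message); }
    }

    // Hypothetical stand-in for CSVReader: one parsed line per call, null at end of input.
    static class LineReader {
        private final Iterator<String> lines;

        LineReader(List<String> lines) { this.lines = lines.iterator(); }

        String[] read() {
            if (!lines.hasNext()) {
                return null;
            }
            String line = lines.next(); // the line is consumed even if it turns out to be bad
            // An odd number of quote characters means a quoted field was never closed.
            if (line.chars().filter(c -> c == '"').count() % 2 != 0) {
                throw new BadLineException("unterminated string: " + line);
            }
            return line.split(","); // naive split, enough for this illustration
        }
    }

    // Same idea as the override in the listing, but retrying in a loop instead of recursively.
    static LineReader tolerantReader(List<String> lines) {
        return new LineReader(lines) {
            @Override
            String[] read() {
                while (true) {
                    try {
                        return super.read();
                    } catch (BadLineException e) {
                        System.err.println("Skipping: " + e.getMessage());
                    }
                }
            }
        };
    }

    public static void main(String[] args) {
        LineReader reader = tolerantReader(Arrays.asList(
                "101,Reeves,Keanu",
                "312,\"Butler,Gerard",
                "868,Hewitt,Jennifer Love"));
        String[] record;
        while ((record = reader.read()) != null) {
            System.out.println(String.join(" | ", record));
        }
    }
}
```

Because the failing line has already been consumed when the exception is thrown, the next call to super.read() moves on to the following line, which is what lets the caller's loop stay unchanged.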
Instead of overriding the CSVReader.read method, the exception can also be handled by wrapping the call to this method in a try-catch block inside the while loop. Either approach works.
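The in-loop alternative can be sketched as follows. This is a minimal illustration in standard Java, not Data Pipeline code: parseLine and BadLineException are hypothetical stand-ins for CSVReader.read and DataException, and the quote check is a deliberately simple approximation of a CSV parser.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SkipBadLinesSketch {

    // Hypothetical stand-in for DataException.
    static class BadLineException extends RuntimeException {
        BadLineException(String message) { super(message); }
    }

    // Hypothetical stand-in for CSVReader.read(): rejects lines with an unclosed quote.
    static String[] parseLine(String line) {
        long quotes = line.chars().filter(c -> c == '"').count();
        if (quotes % 2 != 0) {
            throw new BadLineException("unterminated string: " + line);
        }
        return line.split(","); // naive split, enough for this illustration
    }

    // The try-catch sits inside the loop, so one bad line does not end the iteration.
    static List<String[]> readAll(List<String> lines) {
        List<String[]> records = new ArrayList<>();
        for (String line : lines) {
            try {
                records.add(parseLine(line));
            } catch (BadLineException e) {
                System.err.println("Skipping: " + e.getMessage());
            }
        }
        return records;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "101,Reeves,Keanu,9315.45,10000.00,1/17/1998,A",
                "312,\"Butler,Gerard,90.00,1000.00,8/6/2003,B",
                "868,Hewitt,Jennifer Love,0,17000.00,5/25/1985,B");
        for (String[] record : readAll(lines)) {
            System.out.println(String.join(" | ", record));
        }
    }
}
```

Compared with overriding read, this form keeps the error policy visible at the call site, at the cost of repeating the try-catch wherever the reader is consumed.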
Console output
12:18:57,370 INFO [main] datapipeline:43 - Record {
    0:[Account]:STRING=[101]:String
    1:[LastName]:STRING=[Reeves]:String
    2:[FirstName]:STRING=[Keanu]:String
    3:[Balance]:STRING=[9315.45]:String
    4:[CreditLimit]:STRING=[10000.00]:String
    5:[AccountCreated]:STRING=[1/17/1998]:String
    6:[Rating]:STRING=[A]:String
}
12:18:57,374 WARN [main] datapipeline:31 - com.northconcepts.datapipeline.core.DataException: unterminated string: expected ", found end-of-file
com.northconcepts.datapipeline.core.DataException: unterminated string: expected ", found end-of-file
-------------------------------
AbstractReader.fieldNames=[Account,LastName,FirstName,Balance,CreditLimit,AccountCreated,Rating]
AbstractReader.fieldNamesInFirstRow=[true]
AbstractReader.firstRow=[false]
AbstractReader.lastRow=[-1]
AbstractReader.startingRow=[0]
CSVReader.allowMultiLineText=[false]
CSVReader.column=[43]
CSVReader.fieldSeparator=[,]
CSVReader.lineText=[312,"Butler,Gerard,90.00,1000.00,8/6/2003,B]
CSVReader.newLine=[ ]
CSVReader.quoteChar=["]
CSVReader.trimFields=[true]
DataEndpoint.description=[null]
DataEndpoint.state=[OPENED]
DataEndpoint.thread=[main]
DataEndpoint.timestamp=[2014.04.09-12:18:57.373]
DataReader.bufferSize=[0]
DataReader.recordCount=[1]
TextReader.file=[example\data\input\bad-credit-balance-01.csv]
TextReader.line=[3]
char=[?]
printable char=[end-of-file]
record=[Record (MODIFIED) {
    0:[Account]:STRING=[312]:String
    1:[LastName]:STRING=[null]
    2:[FirstName]:STRING=[null]
    3:[Balance]:STRING=[null]
    4:[CreditLimit]:STRING=[null]
    5:[AccountCreated]:STRING=[null]
    6:[Rating]:STRING=[null]
...]
-------------------------------
    at com.northconcepts.datapipeline.core.StringParser.match(StringParser.java:69)
    at com.northconcepts.datapipeline.csv.CSVReader.matchString(CSVReader.java:222)
    at com.northconcepts.datapipeline.csv.CSVReader.matchValue(CSVReader.java:189)
    at com.northconcepts.datapipeline.csv.CSVReader.fillRecord(CSVReader.java:151)
    at com.northconcepts.datapipeline.core.AbstractReader.readImpl(AbstractReader.java:107)
    at com.northconcepts.datapipeline.core.TextReader.readImpl(TextReader.java:95)
    at com.northconcepts.datapipeline.core.DataReader.read(DataReader.java:141)
    at com.northconcepts.datapipeline.core.AbstractReader.read(AbstractReader.java:96)
    at com.northconcepts.datapipeline.examples.cookbook.ContinueAfterAnError$1.read(ContinueAfterAnError.java:29)
    at com.northconcepts.datapipeline.examples.cookbook.ContinueAfterAnError.main(ContinueAfterAnError.java:42)
12:18:57,377 INFO [main] datapipeline:43 - Record {
    0:[Account]:STRING=[868]:String
    1:[LastName]:STRING=[Hewitt]:String
    2:[FirstName]:STRING=[Jennifer Love]:String
    3:[Balance]:STRING=[0]:String
    4:[CreditLimit]:STRING=[17000.00]:String
    5:[AccountCreated]:STRING=[5/25/1985]:String
    6:[Rating]:STRING=[B]:String
}
12:18:57,378 INFO [main] datapipeline:43 - Record {
    0:[Account]:STRING=[761]:String
    1:[LastName]:STRING=[Pinkett-Smith]:String
    2:[FirstName]:STRING=[Jada]:String
    3:[Balance]:STRING=[49654.87]:String
    4:[CreditLimit]:STRING=[100000.00]:String
    5:[AccountCreated]:STRING=[12/5/2006]:String
    6:[Rating]:STRING=[A]:String
}
12:18:57,378 INFO [main] datapipeline:43 - Record {
    0:[Account]:STRING=[317]:String
    1:[LastName]:STRING=[Murray]:String
    2:[FirstName]:STRING=[Bill]:String
    3:[Balance]:STRING=[789.65]:String
    4:[CreditLimit]:STRING=[5000.00]:String
    5:[AccountCreated]:STRING=[2/5/2007]:String
    6:[Rating]:STRING=[C]:String
}
As the output shows, the first record is read properly. An exception occurs while reading the second record, and its stack trace is logged to the console as a warning. The subsequent records are still read and printed, so the exception does not cause the program to exit.