Truncate large Excel cell values

Updated: Mar 24, 2025

A cell in an Excel file can have a maximum of 32,767 characters. If a source (CSV, Parquet, Database etc) has a field that contains more than 32,767 characters, then ExcelWriter can handle this scenario in the following ways:

  1. LargeCellHandler.FAIL - If a field from the source has more than 32,767 characters, then the process is interrupted with an exception message as Value is too long for Excel cell in description field; expected maximum 32767, found 33650.
  2. LargeCellHandler.TRUNCATE - If a field from the source has more than 32,767 characters, then the field value is truncated to 32,767 characters.
  3. LargeCellHandler.SKIP - If a field from the source has more than 32,767 characters, then a null value is considered for such fields.

 

This example demonstrates how to handle large cell values while writing an Excel file using LargeCellHandler.

Java Code Listing

/*
 * Copyright (c) 2006-2025 North Concepts Inc.  All rights reserved.
 * Proprietary and Confidential.  Use is subject to license terms.
 * 
 * https://northconcepts.com/data-pipeline/licensing/
 */
package com.northconcepts.datapipeline.examples.cookbook;

import java.io.File;

import com.northconcepts.datapipeline.core.DataReader;
import com.northconcepts.datapipeline.core.DataWriter;
import com.northconcepts.datapipeline.core.StreamWriter;
import com.northconcepts.datapipeline.csv.CSVReader;
import com.northconcepts.datapipeline.excel.ExcelDocument;
import com.northconcepts.datapipeline.excel.ExcelDocument.ProviderType;
import com.northconcepts.datapipeline.excel.ExcelReader;
import com.northconcepts.datapipeline.excel.ExcelWriter;
import com.northconcepts.datapipeline.excel.LargeCellHandler;
import com.northconcepts.datapipeline.job.Job;

public class TruncateLargeExcelCellValues {

    private static final String INPUT_FILE_PATH = "data/input/csv_with_large_cell_value.csv";
    private static final String OUTPUT_EXCEL_FILE_PATH = "data/output/truncate-large-excel-cell-values.xlsx";
    
    public static void main(String[] args) throws Throwable {
        DataReader reader = new CSVReader(new File(INPUT_FILE_PATH))
                .setFieldNamesInFirstRow(true);
        
        ExcelDocument document = new ExcelDocument(ProviderType.POI_SXSSF);
        DataWriter writer = new ExcelWriter(document)
                                  .setFieldNamesInFirstRow(true)
                                  // The default is LargeCellHandler.FAIL
                                  // Other possible options are TRUNCATE & SKIP
                                  .setLargeCellHandler(LargeCellHandler.TRUNCATE);

        Job.run(reader, writer);
        document.save(new File(OUTPUT_EXCEL_FILE_PATH));
        
        readExcelAndPrintToConsole();
    }
    
    private static void readExcelAndPrintToConsole() {
        ExcelDocument document = new ExcelDocument(ProviderType.POI_XSSF_SAX)
                .open(new File(OUTPUT_EXCEL_FILE_PATH));
        DataReader reader = new ExcelReader(document)
                                  .setFieldNamesInFirstRow(true)
                                  .setAutoCloseDocument(true);
        
        Job.run(reader, StreamWriter.newSystemOutWriter());
    }
}

Code walkthrough

  1. CSVReader is created using the file path of the input CSV file csv_with_large_cell_value.csv.
  2. ExcelDocument and ExcelWriter instances are created.
  3. Set LargeCellHandler to the desired enum. Here, we used LargeCellHandler.TRUNCATE that will truncate the values after 32,767 characters.
  4. Data is transferred from the input CSV file to the output Excel file.
  5. Finally, the Excel document is saved via the ExcelDocument.save method using the file path of the output file truncate-large-excel-cell-values.xlsx.
  6. Try using other enums - LargeCellHandler.FAIL (default - will fail the process) and LargeCellHandler.SKIP (skip writing such fields to output excel).

ExcelReader and ExcelDocument

ExcelReader is an input reader which can be used to read an excel file. ExcelDocument is a class that encapsulates an excel document. Normally, when CSV or other types of files are read, the input file is opened directly via the reader class. For ex. you can refer to the read a CSV file example. However in the case of excel documents, the files need to be opened/saved via ExcelDocument class. ExcelReader is used purely for reading the excel file.
ExcelReader has a method setSheetName which can be used to set the name of the spreadsheet in the excel document. The ExcelReader.setFieldNamesInFirstRow(true) method can be used to specify that the names specified in the first row of the input data should be used as field names. If this method is not invoked, the fields would be named as A1, A2, etc. If those fields names need to be changed, a rename transformation can be added on top of ExcelReader or any other type (Refer Rename a field for example).

Console output

-----------------------------------------------
0 - Record {
    0:[id]:STRING=[1]:String
    1:[accountName]:STRING=[Test Account 1]:String
    2:[description]:STRING=[The test description for record 1.]:String
}

-----------------------------------------------
1 - Record {
    0:[id]:STRING=[2]:String
    1:[accountName]:STRING=[Test Account 2]:String
    2:[description]:STRING=[1 - The DPEL engine is used by several endpoints to parse and evaluate expressions at runtime. This allows expressions to be tre...32766]:String
}

-----------------------------------------------
2 - Record {
    0:[id]:STRING=[3]:String
    1:[accountName]:STRING=[Test Account 3]:String
    2:[description]:STRING=[test description]:String
}

-----------------------------------------------
3 records

 

Mobile Analytics