public class OrcDataWriter extends IntegrationWriter
Nested classes inherited from superclasses:
DataEndpoint.State
Fields inherited from superclasses:
lastRecord, PRODUCT, PRODUCT_VERSION, VENDOR, XML_INPUT_FACTORY_KEY
BUFFER_SIZE, captureElapsedTime, DEFAULT_READ_BUFFER_SIZE
id, log, name, TIMESTAMP_FORMAT
Constructor | Description |
---|---|
OrcDataWriter(File file) | Writes ORC data to a file. |
OrcDataWriter(Path path) | Writes ORC data to a path, such as one on S3 or Hadoop. |
Modifier and Type | Method | Description |
---|---|---|
DataException | addExceptionProperties(DataException exception) | Adds this endpoint's current state to a DataException. |
void | close() | Indicates that this endpoint has finished reading or writing. |
int | getBatchSize() | Indicates the maximum number of records to buffer when writing ORC data (default 1024). |
int | getBigDecimalPrecision() | Returns the default precision for BigDecimal values. |
int | getBigDecimalScale() | Indicates the scale used when writing BigDecimal values (default 10). |
CompressionKind | getCompressionKind() | Indicates the compression used for writing (default NONE). |
Configuration | getConfig() | Returns the ORC configuration parameters. |
Path | getPath() | Returns the Path of the ORC file being written. |
RoundingMode | getRoundingMode() | Indicates the rounding algorithm used for all BigDecimal values (default is RoundingMode.HALF_UP). |
RowBatchVersion | getRowBatchVersion() | Returns the RowBatchVersion used to store ORC data. |
TypeDescription | getSchema() | Indicates the schema used to write the file. |
boolean | isOverwrite() | If true, the file is overwritten; otherwise, an exception is thrown if the file already exists. |
boolean | isUseTimestampInstant() | Indicates if Instants should be used to store datetime values. |
boolean | isUTC() | Indicates if datetime values are adjusted to UTC. |
boolean | isWriteUuidAsString() | Indicates whether UUID values are written as strings or bytes. |
void | open() | Makes this endpoint ready for reading or writing. |
OrcDataWriter | setBatchSize(int batchSize) | Sets the maximum number of records to buffer when writing ORC data (default 1024). |
OrcDataWriter | setBigDecimalPrecision(int bigDecimalPrecision) | Sets the default precision for BigDecimal values. |
OrcDataWriter | setBigDecimalScale(int bigDecimalScale) | Sets the scale used when writing BigDecimal values (default 10). |
OrcDataWriter | setCompressionKind(CompressionKind compressionKind) | Sets the compression used for writing (default NONE). |
OrcDataWriter | setConfig(Configuration config) | Sets the ORC configuration parameters. |
OrcDataWriter | setOverwrite(boolean overwrite) | If true, the file is overwritten; otherwise, an exception is thrown if the file already exists. |
OrcDataWriter | setRoundingMode(RoundingMode roundingMode) | Sets the rounding algorithm used for all BigDecimal values (default is RoundingMode.HALF_UP). |
OrcDataWriter | setRowBatchVersion(RowBatchVersion rowBatchVersion) | Sets the RowBatchVersion used to store ORC data. |
OrcDataWriter | setSchema(TypeDescription schema) | Sets the schema used to write the file. |
OrcDataWriter | setUseTimestampInstant(boolean useTimestampInstant) | Sets whether Instants should be used to store datetime values. |
OrcDataWriter | setUTC(boolean isUTC) | Sets whether datetime values are adjusted to UTC. |
OrcDataWriter | setWriteUuidAsString(boolean writeUuidAsString) | Sets whether UUID values are written as strings or bytes. |
protected void | writeImpl(Record record) | Overridden by subclasses to write the specified record to this DataWriter. |
Methods inherited from superclasses:
available, getNestedEndpoint, getNestedWriter, getRootEndpoint, getRootWriter, write
decrementRecordCount, enableJmx, getLastRecord, getRecordCount, getRecordCountAsBigInteger, getRecordCountAsString, incrementRecordCount, isRecordCountBigInteger, resetRecordCount, toString
addElapsedtime, assertClosed, assertNotOpened, assertOpened, finalize, getClosedOn, getDescription, getElapsedTime, getElapsedTimeAsString, getOpenedOn, getOpenElapsedTime, getOpenElapsedTimeAsString, getSelfTime, getSelfTimeAsString, getState, isCaptureElapsedTime, isClosed, isOpen, setCaptureElapsedTime, setDescription
public OrcDataWriter(File file)
public OrcDataWriter(Path path)
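protected void writeImpl(Record record) throws Throwable
Description copied from class: DataWriter
Overridden by subclasses to write the specified record to this DataWriter.
Overrides: writeImpl in class DataWriter
Throws: Throwable
public void open() throws DataException
Description copied from class: DataEndpoint
Makes this endpoint ready for reading or writing.
Overrides: open in class IntegrationWriter
Throws: DataException
public void close() throws DataException
Description copied from class: DataEndpoint
Indicates that this endpoint has finished reading or writing.
Overrides: close in class DataEndpoint
Throws: DataException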
public TypeDescription getSchema()
public OrcDataWriter setSchema(TypeDescription schema)
public Configuration getConfig()
public OrcDataWriter setConfig(Configuration config)
public int getBatchSize()
public OrcDataWriter setBatchSize(int batchSize)
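The batch size is described as the maximum number of records to buffer when writing ORC data (default 1024). This page does not describe how OrcDataWriter buffers internally, so the following is only a generic sketch of size-bounded batching in plain Java. The class BatchBufferSketch and all of its methods are hypothetical and are not part of the library.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of size-bounded buffering; not OrcDataWriter's code.
public class BatchBufferSketch {
    private final int batchSize;
    private final List<String> buffer = new ArrayList<>();
    private final List<Integer> flushedSizes = new ArrayList<>();

    public BatchBufferSketch(int batchSize) {
        this.batchSize = batchSize;
    }

    // Buffers one record and flushes as soon as the buffer is full.
    public void write(String record) {
        buffer.add(record);
        if (buffer.size() == batchSize) {
            flush();
        }
    }

    // Flushes whatever is left, mirroring what a writer would do on close.
    public void close() {
        if (!buffer.isEmpty()) {
            flush();
        }
    }

    // A real writer would hand the batch to the file here; the sketch only records its size.
    private void flush() {
        flushedSizes.add(buffer.size());
        buffer.clear();
    }

    public List<Integer> getFlushedSizes() {
        return flushedSizes;
    }
}
```

With a batch size of 1024, writing 2500 records produces two full batches and one final partial batch when the buffer is closed. A larger batch size means fewer flushes at the cost of more records held in memory.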
public boolean isUseTimestampInstant()
Indicates if Instants should be used to store datetime values.
public OrcDataWriter setUseTimestampInstant(boolean useTimestampInstant)
Sets whether Instants should be used to store datetime values.
public boolean isOverwrite()
public OrcDataWriter setOverwrite(boolean overwrite)
public DataException addExceptionProperties(DataException exception)
Description copied from class: Endpoint
Adds this endpoint's current state to a DataException. Since this method is called whenever an exception is thrown, subclasses should override it to add their specific information.
Overrides: addExceptionProperties in class DataWriter
public boolean isUTC()
public OrcDataWriter setUTC(boolean isUTC)
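The UTC and Instant options concern how datetime values are stored. This page does not say how the writer performs the adjustment, so the following is only a general illustration of converting a local datetime at a known offset into a UTC Instant with java.time. The class UtcAdjustSketch and its method are hypothetical names, not library API.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;

// Hypothetical illustration of UTC adjustment; not OrcDataWriter's internal logic.
public class UtcAdjustSketch {

    // Interprets the local datetime in the given zone and returns the matching UTC instant.
    public static Instant toUtcInstant(LocalDateTime local, ZoneId zone) {
        return local.atZone(zone).toInstant();
    }

    public static void main(String[] args) {
        LocalDateTime noon = LocalDateTime.of(2024, 1, 15, 12, 0);
        // Noon at UTC-5 is 17:00 in UTC.
        System.out.println(toUtcInstant(noon, ZoneOffset.ofHours(-5)));
    }
}
```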
public boolean isWriteUuidAsString()
Indicates whether UUID values are written as strings or bytes. If set to false, UUID values will be written as bytes.
Default value is true.
public OrcDataWriter setWriteUuidAsString(boolean writeUuidAsString)
Indicates whether UUID values are written as strings or bytes. If set to false, UUID values will be written as bytes.
Default value is true.
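To make the difference between the two UUID representations concrete, here is a minimal sketch using only the JDK. It assumes a 16-byte big-endian layout (most significant bits first) for the byte form; the layout OrcDataWriter actually uses is not stated on this page. The class UuidEncodingSketch and its methods are hypothetical.

```java
import java.nio.ByteBuffer;
import java.util.UUID;

// Hypothetical illustration of string vs. byte forms of a UUID; byte layout is an assumption.
public class UuidEncodingSketch {

    // String form: the canonical 36-character representation.
    public static String toText(UUID uuid) {
        return uuid.toString();
    }

    // Byte form: 16 bytes, most significant long followed by least significant long.
    public static byte[] toBytes(UUID uuid) {
        return ByteBuffer.allocate(16)
                .putLong(uuid.getMostSignificantBits())
                .putLong(uuid.getLeastSignificantBits())
                .array();
    }

    // Reverses toBytes under the same layout assumption.
    public static UUID fromBytes(byte[] bytes) {
        ByteBuffer buffer = ByteBuffer.wrap(bytes);
        long most = buffer.getLong();
        long least = buffer.getLong();
        return new UUID(most, least);
    }
}
```

The string form is easier to read in downstream tools, while the byte form takes 16 bytes instead of 36 characters.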
public RowBatchVersion getRowBatchVersion()
Returns the RowBatchVersion used to store ORC data. If a BigDecimal value has a precision greater than TypeDescription.MAX_DECIMAL64_PRECISION (18), DecimalColumnVector is used instead of Decimal64ColumnVector.
public OrcDataWriter setRowBatchVersion(RowBatchVersion rowBatchVersion)
Sets the RowBatchVersion used to store ORC data. If a BigDecimal value has a precision greater than TypeDescription.MAX_DECIMAL64_PRECISION (18), DecimalColumnVector is used instead of Decimal64ColumnVector.
public int getBigDecimalPrecision()
public OrcDataWriter setBigDecimalPrecision(int bigDecimalPrecision)
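The RowBatchVersion description refers to a precision limit of 18 (TypeDescription.MAX_DECIMAL64_PRECISION) above which DecimalColumnVector is used instead of Decimal64ColumnVector. The sketch below shows how that threshold can be checked for a given BigDecimal using only java.math. The constant is copied from the value quoted on this page, and the class Decimal64CheckSketch and its method are hypothetical, not library API.

```java
import java.math.BigDecimal;

// Hypothetical illustration of the precision threshold; not OrcDataWriter's code.
public class Decimal64CheckSketch {
    // Mirrors the value 18 quoted for TypeDescription.MAX_DECIMAL64_PRECISION.
    static final int MAX_DECIMAL64_PRECISION = 18;

    // True when the value's total digit count stays within the 64-bit decimal limit.
    public static boolean fitsDecimal64(BigDecimal value) {
        return value.precision() <= MAX_DECIMAL64_PRECISION;
    }

    public static void main(String[] args) {
        System.out.println(fitsDecimal64(new BigDecimal("123456789012345678")));    // 18 digits
        System.out.println(fitsDecimal64(new BigDecimal("1234567890123456789.5"))); // 20 digits
    }
}
```

Note that BigDecimal.precision() counts all significant digits, on both sides of the decimal point.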
public int getBigDecimalScale()
public OrcDataWriter setBigDecimalScale(int bigDecimalScale)
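The scale (default 10) and rounding mode (default RoundingMode.HALF_UP) together determine how BigDecimal values with more fractional digits than the scale are adjusted. As a sketch, and assuming the adjustment is equivalent to BigDecimal.setScale (this page does not confirm how the writer applies it), the effect looks like this. The class DecimalScaleSketch and its method are hypothetical.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Hypothetical illustration of scale plus rounding; assumes BigDecimal.setScale semantics.
public class DecimalScaleSketch {

    // Returns the value adjusted to the given number of fractional digits.
    public static BigDecimal applyScale(BigDecimal value, int scale, RoundingMode mode) {
        return value.setScale(scale, mode);
    }

    public static void main(String[] args) {
        BigDecimal value = new BigDecimal("1.23456789015"); // 11 fractional digits
        // With scale 10 and HALF_UP, the trailing 5 rounds the tenth digit up.
        System.out.println(applyScale(value, 10, RoundingMode.HALF_UP)); // prints 1.2345678902
    }
}
```

Values with fewer fractional digits than the scale are padded with zeros under the same call, so 2.5 at scale 10 becomes 2.5000000000.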
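public RoundingMode getRoundingMode()
Indicates the rounding algorithm used for all BigDecimal values (default is RoundingMode.HALF_UP).
public OrcDataWriter setRoundingMode(RoundingMode roundingMode)
Sets the rounding algorithm used for all BigDecimal values (default is RoundingMode.HALF_UP).
public CompressionKind getCompressionKind()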
public OrcDataWriter setCompressionKind(CompressionKind compressionKind)
public Path getPath()
Returns the Path of the ORC file being written.
Copyright (c) 2006-2024 North Concepts Inc. All Rights Reserved.