public class RemoveDuplicatesReader extends ProxyReader
Removes duplicate records, matching on either all fields (RemoveDuplicatesReader(DataReader)) or a subset of fields (RemoveDuplicatesReader(DataReader, FieldList)).

Nested classes inherited: DataEndpoint.State

Fields inherited: fieldLineage, recordLineage, lastRecord, PRODUCT, PRODUCT_VERSION, VENDOR, XML_INPUT_FACTORY_KEY, BUFFER_SIZE, captureElapsedTime, DEFAULT_READ_BUFFER_SIZE, id, log, name, TIMESTAMP_FORMAT

| Constructor and Description |
|---|
| RemoveDuplicatesReader(DataReader targetDataReader): Removes duplicate records matching all fields. |
| RemoveDuplicatesReader(DataReader targetDataReader, FieldList fields): Removes duplicate records based on the matching FieldList. |
| RemoveDuplicatesReader(DataReader targetDataReader, String... fields): Removes duplicate records based on their field names. |
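
The difference between matching on all fields and matching on a subset of fields can be illustrated with a small, library-independent sketch. It is not the reader's actual implementation: `DedupSketch`, `removeDuplicates`, and the use of plain maps in place of `Record` are all hypothetical stand-ins chosen so the example runs on the JDK alone.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in: records are plain maps, not the library's Record type.
class DedupSketch {

    // Keeps the first record seen for each distinct key.
    // Passing no key fields means "match on all fields".
    static List<Map<String, Object>> removeDuplicates(
            List<Map<String, Object>> records, String... keyFields) {
        Set<Object> seenKeys = new HashSet<>();
        List<Map<String, Object>> unique = new ArrayList<>();
        for (Map<String, Object> record : records) {
            Object key;
            if (keyFields.length == 0) {
                // Whole record is the key: names and values must all match.
                key = new HashMap<>(record);
            } else {
                // Only the listed fields take part in the comparison.
                List<Object> values = new ArrayList<>();
                for (String field : keyFields) {
                    values.add(record.get(field));
                }
                key = values;
            }
            if (seenKeys.add(key)) {   // add() returns false for a repeated key
                unique.add(record);
            }
        }
        return unique;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> rows = List.of(
                Map.of("id", 1, "city", "Oslo"),
                Map.of("id", 2, "city", "Oslo"),
                Map.of("id", 1, "city", "Oslo"));
        System.out.println(removeDuplicates(rows).size());          // match on all fields
        System.out.println(removeDuplicates(rows, "city").size());  // match on "city" only
    }
}
```

In this sketch, narrowing the key to fewer fields can only keep the same number of records or fewer, because records that differ elsewhere start to count as duplicates.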
| Modifier and Type | Method and Description |
|---|---|
| DataException | addExceptionProperties(DataException exception): Adds this endpoint's current state to a DataException. |
| void | close(): Indicates that this endpoint has finished reading or writing. |
| DataWriter | getDiscardWriter(): Returns the discard sink for discarded records or null if one was not assigned. |
| long | getDuplicateRecordCount(): Returns the number of duplicate records that have been read. |
| long | getUniqueRecordCount(): Returns the number of unique records that have been read. |
| protected Record | interceptRecord(Record record) |
| protected void | onDuplicate(Record record): Called for each duplicate record. |
| protected void | onUnique(Record record): Called for each unique record. |
| void | open(): Makes this endpoint ready for reading or writing. |
| RemoveDuplicatesReader | setDiscardWriter(DataWriter writer): Assigns a discard sink for duplicate records. |
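
The sketch below shows one way a discard sink and the two counters could fit together. It is a conceptual model only, assuming duplicates are both counted and forwarded to the sink when one is assigned. `DiscardingDedupSketch`, `setDiscardSink`, and `accept` are hypothetical names, and strings stand in for records.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Consumer;

// Hypothetical stand-in for a de-duplicating reader; not the library's implementation.
class DiscardingDedupSketch {
    private final Set<String> seen = new HashSet<>();
    private Consumer<String> discardSink;   // stays null when no sink is assigned
    private long uniqueCount;
    private long duplicateCount;

    // Fluent setter: returns this so calls can be chained after construction.
    DiscardingDedupSketch setDiscardSink(Consumer<String> sink) {
        this.discardSink = sink;
        return this;
    }

    // Returns the value if it is new, or null if it repeats an earlier value.
    String accept(String value) {
        if (seen.add(value)) {
            uniqueCount++;
            return value;
        }
        duplicateCount++;
        if (discardSink != null) {
            discardSink.accept(value);      // route the duplicate to the sink
        }
        return null;
    }

    long getUniqueCount() { return uniqueCount; }
    long getDuplicateCount() { return duplicateCount; }

    public static void main(String[] args) {
        List<String> discarded = new ArrayList<>();
        DiscardingDedupSketch dedup = new DiscardingDedupSketch().setDiscardSink(discarded::add);
        for (String value : new String[] {"a", "b", "a", "c", "b", "a"}) {
            dedup.accept(value);
        }
        System.out.println(dedup.getUniqueCount() + " unique, "
                + dedup.getDuplicateCount() + " duplicate, discarded=" + discarded);
    }
}
```

Under these assumptions the two counters always sum to the number of values offered, and the sink sees duplicates in the order they were read.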
Inherited methods:
available, getNestedReader, map, map, readImpl, setNestedDataReader, setNestedDataReader
addLineage, getBufferSize, getNestedEndpoint, getReader, getRootEndpoint, getRootReader, isExhausted, isLineageSupported, isSaveLineage, peek, pop, push, read, setSaveLineage, skip
decrementRecordCount, enableJmx, getLastRecord, getRecordCount, getRecordCountAsBigInteger, getRecordCountAsString, incrementRecordCount, isRecordCountBigInteger, resetRecordCount, toString
addElapsedtime, assertClosed, assertNotOpened, assertOpened, finalize, getClosedOn, getDescription, getElapsedTime, getElapsedTimeAsString, getOpenedOn, getOpenElapsedTime, getOpenElapsedTimeAsString, getSelfTime, getSelfTimeAsString, getState, isCaptureElapsedTime, isClosed, isOpen, setCaptureElapsedTime, setDescription

public RemoveDuplicatesReader(DataReader targetDataReader, FieldList fields)
Removes duplicate records based on the matching FieldList.
Parameters:
targetDataReader - source DataReader
fields - the FieldList to match

public RemoveDuplicatesReader(DataReader targetDataReader, String... fields)
Parameters:
targetDataReader - source DataReader
fields - the field names to be checked for duplicates

public RemoveDuplicatesReader(DataReader targetDataReader)
Parameters:
targetDataReader - source DataReader

public DataWriter getDiscardWriter()
Returns the discard sink for discarded records or null if one was not assigned.

public RemoveDuplicatesReader setDiscardWriter(DataWriter writer)
Parameters:
writer - the discard sink

public long getUniqueRecordCount()
public long getDuplicateRecordCount()
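public void open() throws DataException
Description copied from class: DataEndpoint
Makes this endpoint ready for reading or writing.
Overrides: open in class ProxyReader
Throws: DataException

public void close() throws DataException
Description copied from class: DataEndpoint
Indicates that this endpoint has finished reading or writing.
Overrides: close in class ProxyReader
Throws: DataException

protected Record interceptRecord(Record record) throws Throwable
Overrides: interceptRecord in class ProxyReader
Throws: Throwable

protected void onUnique(Record record)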
protected void onDuplicate(Record record)
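
Because onUnique and onDuplicate are protected callbacks, a subclass can observe each decision without reimplementing the matching logic. The following is a minimal sketch of that callback pattern in plain Java. `HookedDedupSketch` and `AuditingDedupSketch` are hypothetical classes, strings stand in for records, and returning null from `intercept` to mean "skip this record" is an assumption of the sketch rather than a documented contract.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical base class showing the callback pattern; not the library's code.
class HookedDedupSketch {
    private final Set<String> seen = new HashSet<>();

    // Default hooks do nothing; subclasses override the ones they need.
    protected void onUnique(String value) { }
    protected void onDuplicate(String value) { }

    // Passes first occurrences through and returns null for repeats (assumed "skip" signal).
    final String intercept(String value) {
        if (seen.add(value)) {
            onUnique(value);
            return value;
        }
        onDuplicate(value);
        return null;
    }
}

// Example subclass that records every value reported as a duplicate.
class AuditingDedupSketch extends HookedDedupSketch {
    final List<String> duplicates = new ArrayList<>();

    @Override
    protected void onDuplicate(String value) {
        duplicates.add(value);
    }

    public static void main(String[] args) {
        AuditingDedupSketch reader = new AuditingDedupSketch();
        for (String value : new String[] {"x", "y", "x"}) {
            reader.intercept(value);
        }
        System.out.println("duplicates seen: " + reader.duplicates);
    }
}
```

Keeping the hooks empty by default means a subclass overrides only the callback it cares about, as `AuditingDedupSketch` does with `onDuplicate`.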
public DataException addExceptionProperties(DataException exception)
Description copied from class: Endpoint
Adds this endpoint's current state to a DataException. Since this method is called whenever an exception is thrown, subclasses should override it to add their specific information.
Overrides: addExceptionProperties in class ProxyReader

Copyright (c) 2006-2025 North Concepts Inc. All Rights Reserved.