Records
Each record is a mutable data structure containing zero or more fields. Each field has a name, type, and value.
Field Names
Fields that are not explicitly named are assigned a default name automatically. The first field in a record is named A, the second is B, the 26th is Z, the 27th is AA, then AB, AC, and so on. If a field with the default name already exists, a number (starting with 2) is added to the name until a unique name is found.
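The naming rule above can be sketched in plain Java. This is only an illustration of the scheme as described, not Data Pipeline's actual implementation: the class DefaultFieldNames and its methods are hypothetical, and the sketch assumes the number is appended directly to the name (A2, A3, ...).

```java
import java.util.Set;

// Hypothetical helper for illustration only; not part of the Data Pipeline API.
final class DefaultFieldNames {

    // Converts a zero-based field position to a spreadsheet-style name:
    // 0 -> A, 1 -> B, 25 -> Z, 26 -> AA, 27 -> AB, ...
    static String baseName(int index) {
        if (index < 0) {
            throw new IllegalArgumentException("index must be >= 0");
        }
        StringBuilder name = new StringBuilder();
        int n = index + 1; // switch to 1-based counting (bijective base 26)
        while (n > 0) {
            n--; // shift so that the remainder 0..25 maps to A..Z
            name.append((char) ('A' + n % 26));
            n /= 26;
        }
        // Letters were produced least-significant first, so reverse them.
        return name.reverse().toString();
    }

    // Returns the default name for the given position, adding a number
    // (starting with 2) while the candidate collides with an existing name.
    // Assumption: the number is appended directly, e.g. "A2".
    static String uniqueName(int index, Set<String> existingNames) {
        String base = baseName(index);
        String candidate = base;
        int suffix = 2;
        while (existingNames.contains(candidate)) {
            candidate = base + suffix;
            suffix++;
        }
        return candidate;
    }
}
```

Under these assumptions, DefaultFieldNames.baseName(26) yields AA, and asking for a unique name at position 0 when A already exists yields A2.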
Field Values
Field values can contain:
- Single values (like a string, integer, boolean, or date)
- Byte arrays
- Any Java object
- Other records
- Arrays containing any combination of the above (including other arrays)
Records, fields, and values all extend a base Node class, which lets you work with tabular data (Excel, CSV, JDBC) and hierarchical data (JSON, XML) using the same API.
Since each record contains its own set of fields, it's possible for each record in a stream to contain a completely (or somewhat) different set of fields.
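To make the idea of differing field sets concrete, here is a minimal sketch that models records as ordered maps rather than using the Data Pipeline classes. Everything in it (HeterogeneousRecords, record, allFieldNames, sampleStream) is a hypothetical stand-in chosen for illustration.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustration only: records are modeled as ordered maps of name -> value.
// These names are hypothetical and are not part of the Data Pipeline API.
final class HeterogeneousRecords {

    // Builds one "record" from alternating field names and values.
    static Map<String, Object> record(Object... namesAndValues) {
        if (namesAndValues.length % 2 != 0) {
            throw new IllegalArgumentException("expected name/value pairs");
        }
        Map<String, Object> fields = new LinkedHashMap<>();
        for (int i = 0; i < namesAndValues.length; i += 2) {
            fields.put((String) namesAndValues[i], namesAndValues[i + 1]);
        }
        return fields;
    }

    // Collects every field name seen in the stream, in first-seen order.
    static Set<String> allFieldNames(List<Map<String, Object>> stream) {
        Set<String> names = new LinkedHashSet<>();
        for (Map<String, Object> rec : stream) {
            names.addAll(rec.keySet());
        }
        return names;
    }

    // A small stream in which no two records share the same set of fields.
    static List<Map<String, Object>> sampleStream() {
        return List.of(
            record("id", 1, "name", "Ada"),
            record("id", 2, "email", "b@example.com", "tags", List.of("x", "y")),
            record() // a record may also have zero fields
        );
    }
}
```

Code that consumes such a stream should not assume a field seen in one record exists in the next; collecting the union of names, as allFieldNames does, is one way to discover the overall shape.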
Field Types
Field types can be any one of the values defined in the FieldType enum.
If Data Pipeline cannot match a field's value to one of the enum values (for example, when the value is an instance of your own Customer class), it sets the type to FieldType.UNDEFINED.
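The fallback behavior can be pictured with a small stand-alone sketch. It does not use the real FieldType enum: SketchType is a hypothetical, much smaller stand-in, and the matching rules in typeOf are assumptions made for illustration rather than Data Pipeline's actual logic.

```java
import java.util.Date;

// Illustration only: SketchType is a hypothetical stand-in for FieldType,
// and typeOf is an assumed matching rule, not the library's implementation.
final class TypeFallbackSketch {

    enum SketchType { STRING, INT, BOOLEAN, DATETIME, UNDEFINED }

    // Maps a value to a known type, falling back to UNDEFINED when
    // nothing matches. In this sketch, null also falls through to UNDEFINED.
    static SketchType typeOf(Object value) {
        if (value instanceof String) {
            return SketchType.STRING;
        }
        if (value instanceof Integer) {
            return SketchType.INT;
        }
        if (value instanceof Boolean) {
            return SketchType.BOOLEAN;
        }
        if (value instanceof Date) {
            return SketchType.DATETIME;
        }
        return SketchType.UNDEFINED;
    }

    // An application-specific class that matches none of the known types.
    static final class Customer {
        final String name;

        Customer(String name) {
            this.name = name;
        }
    }
}
```

In this sketch, a String or Integer value resolves to a specific type, while a Customer instance resolves to UNDEFINED because no rule recognizes it.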