Records

Each record is a mutable data structure containing zero or more fields. Each field has a name, type, and value.

Field Names

Each field has a default name that is assigned automatically when a name is not explicitly set. The first field in a record is named A, the second is B, the 26th is Z, the 27th is AA, then AB, AC, and so on. If a field with the default name already exists, a number (starting with 2) is added to the name until a unique name is found.
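The naming rule above can be sketched in plain Java. This is an illustration only, not Data Pipeline's own code: the class DefaultFieldNames and its methods are hypothetical names, and appending the number directly to the name (B2, B3) is an assumption about how "a number is added".

```java
import java.util.Set;

// Illustrative sketch only; not Data Pipeline's actual implementation.
final class DefaultFieldNames {

    // Converts a zero-based field position to a spreadsheet-style name:
    // 0 -> A, 25 -> Z, 26 -> AA, 27 -> AB, and so on.
    static String columnName(int index) {
        StringBuilder name = new StringBuilder();
        for (int n = index + 1; n > 0; n = (n - 1) / 26) {
            name.insert(0, (char) ('A' + (n - 1) % 26));
        }
        return name.toString();
    }

    // Returns the default name for the given position, adding 2, 3, ...
    // (assumed to be appended directly) until the name is not already taken.
    static String uniqueName(int index, Set<String> existing) {
        String base = columnName(index);
        String candidate = base;
        for (int suffix = 2; existing.contains(candidate); suffix++) {
            candidate = base + suffix;
        }
        return candidate;
    }
}
```

Under these assumptions, a second field added to a record that already has fields named B and B2 would receive the name B3.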

Field Values

Field values can contain:

  • Single values (like a string, integer, boolean, or date)
  • Byte arrays
  • Any Java object
  • Other records
  • Arrays containing any combination of the above (including other arrays)

Records, fields, and values all extend a base Node class, allowing you to work with tabular data (Excel, CSV, JDBC) and hierarchical data (JSON, XML) using the same API.

Since each record contains its own set of fields, each record in a stream can contain a partly or completely different set of fields.
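To make the idea of per-record field sets concrete, here is a minimal stand-in model that uses only the Java standard library. A LinkedHashMap plays the role of a record and a List plays the role of an array value; none of this is the Data Pipeline API, and the class HeterogeneousRecords and the sample field names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Stand-in model only: a map of field name to value plays the role of a record.
final class HeterogeneousRecords {

    // Builds two "records" whose field sets only partly overlap.
    static List<Map<String, Object>> sampleStream() {
        Map<String, Object> first = new LinkedHashMap<>();
        first.put("id", 1);
        first.put("name", "Ada");

        Map<String, Object> address = new LinkedHashMap<>();
        address.put("city", "London");

        Map<String, Object> second = new LinkedHashMap<>();
        second.put("id", 2);
        second.put("tags", Arrays.asList("a", "b")); // array value
        second.put("address", address);              // nested record

        List<Map<String, Object>> stream = new ArrayList<>();
        stream.add(first);
        stream.add(second);
        return stream;
    }

    // Collects, in sorted order, every field name seen anywhere in the stream.
    static Set<String> allFieldNames(List<Map<String, Object>> stream) {
        Set<String> names = new TreeSet<>();
        for (Map<String, Object> record : stream) {
            names.addAll(record.keySet());
        }
        return names;
    }
}
```

Code that consumes such a stream cannot assume a field is present in every record, so it typically checks for a field before reading it.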

Field Types

A field's type is one of the values defined in the FieldType enum. If Data Pipeline cannot match a field's value to one of the enum values (for example, when the value is an instance of your own Customer class), it sets the type to FieldType.UNDEFINED.
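The fallback behavior can be pictured with a small sketch. SketchFieldType below is a hypothetical, reduced stand-in for the library's FieldType enum, and typeOf is an assumed helper; the real enum has more values and the library's matching rules may differ.

```java
// Hypothetical, reduced stand-in for the library's FieldType enum.
enum SketchFieldType { STRING, INT, BOOLEAN, UNDEFINED }

// Illustrative sketch only; not Data Pipeline's actual type resolution.
final class FieldTypeSketch {

    // Example application class that has no matching enum value.
    static final class Customer { }

    // Maps a value to a type, falling back to UNDEFINED when nothing matches.
    static SketchFieldType typeOf(Object value) {
        if (value instanceof String) {
            return SketchFieldType.STRING;
        }
        if (value instanceof Integer) {
            return SketchFieldType.INT;
        }
        if (value instanceof Boolean) {
            return SketchFieldType.BOOLEAN;
        }
        return SketchFieldType.UNDEFINED;
    }
}
```

In this sketch, a String maps to STRING, while a Customer instance falls through every check and is reported as UNDEFINED.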
