arrow_ipc_record_batch (arrow v0.1.0)

Provides a record and a function for working with RecordBatches.

A RecordBatch[1] represents a list of equal-length arrays and their corresponding buffers. This module provides a record and a function to manage all the metadata required to represent a RecordBatch. The metadata consists of:

  1. length: The number of rows or records, that is, the length of each array.
  2. nodes: A list of maps, where each map has the length and null count of an array.
  3. buffers: A list of maps, where each map has the length and the offset (from the beginning of the message body) of one buffer of an array.
  4. compression: The compression applied to the body of the RecordBatch. Can be either undefined (i.e. no compression), zstd for Zstandard[2], or lz4_frame for LZ4 Frame[3]. Defaults to undefined.

Compression is not currently supported; the field exists for forward compatibility.
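As an illustration of how the four metadata fields relate to each other, here is a minimal sketch. It does not use this library's #array{} or #record_batch{} records: the column shape (a map with len, null_count and buffers keys), the module name record_batch_sketch, and the helper names are all assumptions made up for this example. It also assumes that each buffer starts at an 8-byte aligned offset and that the recorded length excludes padding; the library itself may lay buffers out differently.

```erlang
-module(record_batch_sketch).
-export([metadata/1]).

%% Hypothetical column shape used only in this sketch (NOT the library's
%% #array{} record):
%%   #{len := Rows, null_count := Nulls, buffers := [binary()]}
metadata([#{len := Length} | _] = Columns) ->
    %% Every column in a RecordBatch must have the same number of rows.
    true = lists:all(fun(#{len := L}) -> L =:= Length end, Columns),
    %% One field node per column: its length and null count.
    Nodes = [#{length => L, null_count => N}
             || #{len := L, null_count := N} <- Columns],
    %% Flatten the buffers of all columns, keeping column order.
    Bins = lists:append([Bs || #{buffers := Bs} <- Columns]),
    %% Walk the buffers, threading the running body offset through.
    {Buffers, _BodyEnd} = lists:mapfoldl(fun buffer_entry/2, 0, Bins),
    #{length => Length,
      nodes => Nodes,
      buffers => Buffers,
      compression => undefined}.

%% Records where one buffer sits in the body, then advances the running
%% offset to the next multiple of 8 (alignment assumed for this sketch).
buffer_entry(Bin, Offset) ->
    Size = byte_size(Bin),
    {#{offset => Offset, length => Size}, pad8(Offset + Size)}.

pad8(N) -> ((N + 7) div 8) * 8.
```

For two three-row columns, one with a single 12-byte values buffer and one with a 1-byte validity bitmap plus a 12-byte values buffer, the sketch yields two nodes and three buffers at offsets 0, 16 and 24.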

RecordBatches are described in the Arrow specification[4].

[1]: https://github.com/apache/arrow/blob/16328f0ccc73b7df665b4a18feb6adf26b7aa0e2/format/Message.fbs#L81-L102

[2]: https://facebook.github.io/zstd/

[3]: https://android.googlesource.com/platform/external/lz4/+/HEAD/doc/lz4_Frame_format.md

[4]: https://arrow.apache.org/docs/format/Columnar.html#recordbatch-message

Summary

Functions

Creates a RecordBatch given a list of arrays

Types

-type buffer() :: #{offset => non_neg_integer(), length => pos_integer()}.
-type field_node() :: #{length => pos_integer(), null_count => non_neg_integer()}.
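
To make the meaning of a buffer() map concrete: because the offset is measured from the beginning of the message body, a reader can recover a buffer's bytes by slicing the body. The sketch below assumes the body is available as a single binary; the module buffer_slice_sketch and the function slice/2 are hypothetical helpers for illustration, not part of arrow_ipc_record_batch.

```erlang
-module(buffer_slice_sketch).
-export([slice/2]).

%% Returns the bytes that a buffer() map points at inside Body.
%% slice/2 is a hypothetical helper, not part of this library.
slice(Body, #{offset := Offset, length := Length}) when is_binary(Body) ->
    %% binary:part/3 raises badarg if the range falls outside Body.
    binary:part(Body, Offset, Length).
```

For example, slicing the body <<1,2,3,4,5>> with #{offset => 2, length => 2} returns <<3,4>>.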

Functions

-spec from_erlang(Arrays :: [#array{}]) -> RecordBatch :: #record_batch{}.
Creates a RecordBatch given a list of arrays