About AxisArray
######################

``AxisArray`` is a specialized message format used within the `ezmsg` framework to represent multi-dimensional data structures. In simple terms, an ``AxisArray`` is an **N-dimensional array with labeled axes and optional metadata**. It is designed for applications in signal processing, scientific computing, and data analysis, where both the data and its context are important. ``AxisArray`` originally took inspiration from the functionality of the `xarray `_ library.

``AxisArray`` is built in to ezmsg. Import it using:

.. code-block:: python

   from ezmsg.core.util import AxisArray

.. warning::
   Importing ``AxisArray`` from ``ezmsg.core.util`` also imports the `numpy` library. For this reason, ezmsg is implemented so that `numpy` is not imported unless you import ``AxisArray``. This is ideal for users who want very lightweight applications and have no need for the functionality of `numpy`.

|ezmsg_logo_small| Why use AxisArray?
****************************************

``AxisArray`` is included in ezmsg so that the framework does not have to support a vast array of different message types, which is a major cause of bloat in other similar messaging platforms. For a single format to be useful, we have designed `AxisArray` to be convenient and flexible enough to cover the many different use cases we have encountered. At its core, it stores a numpy N-dimensional array, an array of axis labels, and multiple metadata attributes.

|ezmsg_logo_small| Description
*********************************

An ``AxisArray`` is a multi-dimensional array with named axes. Each axis can have a name and a set of labels for its elements. This allows for more intuitive indexing and manipulation of the data. An `AxisArray` has the following attributes:

- ``data``: a numpy ndarray containing the actual data.
- ``dims``: a list of axis names.
- ``axes``: a dictionary mapping axis names to their label information.
- ``attrs``: a dictionary for storing additional metadata.
- ``key``: a unique identifier for the array.

All of these must be self-consistent: the number of axis names in ``dims`` must match the number of dimensions in ``data``, and the axis names in ``axes`` should match those in ``dims``.

The label information in ``axes`` refers to the 'value' of each axis index; e.g., for a time axis, the labels might be timestamps. We provide three commonly used axis types:

- ``LinearAxis``: a linear axis with evenly spaced values. You only need to provide the ``offset`` (start value) and the ``gain`` (step size). Examples are a simple numerical index (offset=0, gain=1) or regularly spaced time samples (offset=start time, gain=1/sampling rate).
- ``TimeAxis``: a ``LinearAxis`` that represents a time axis. Its ``unit`` attribute is set to seconds (s) by default.
- ``CoordinateAxis``: our continuous/dense axis, which can represent any continuous variable, such as frequency or spatial coordinates. You provide the actual value for each index in a ``data`` array.

The `AxisArray` class provides several methods for manipulating and accessing the data, designed to allow standard numpy array manipulation without losing track of how the axes are affected. For a list of such methods, see :ref:`axisarray_methods`.

|ezmsg_logo_small| Recommended Use
**********************************************

When to use AxisArray
=======================

If any of the following are true, then `AxisArray` is an appropriate messaging format:

- The underlying data is in array form.
- It is prudent to keep information about the data array axes (e.g., labels, units, spacing).
- You want to store the data with additional metadata (e.g., data origin device name, study name, date, attempt number).
- You plan to use the signal processing extension package `ezmsg-sigproc`.

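
As an illustration of the kind of axis information worth keeping (units and spacing), the sketch below shows how an evenly spaced axis described by an ``offset`` and a ``gain`` maps sample indices to values. ``EvenAxis`` is a hypothetical stand-in written for this example; it is not ezmsg's ``LinearAxis``, and it only assumes the "start value plus step size" behaviour described above.

.. code-block:: python

   from dataclasses import dataclass


   @dataclass(frozen=True)
   class EvenAxis:
       """Hypothetical stand-in for an evenly spaced axis (not ezmsg's LinearAxis)."""

       offset: float = 0.0  # value at index 0
       gain: float = 1.0  # step between consecutive indices
       unit: str = ""

       def value(self, index: int) -> float:
           """Axis value at an integer index: offset + gain * index."""
           return self.offset + self.gain * index


   # Regularly spaced time samples: start at t = 2.0 s, sampled at 100 Hz.
   time_axis = EvenAxis(offset=2.0, gain=1.0 / 100.0, unit="s")
   timestamps = [time_axis.value(i) for i in range(4)]

Storing only two numbers and a unit is enough to recover the timestamp of every sample, which is why an evenly spaced axis is cheaper to carry in a message than an explicit array of values.
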
For anything that is inherently multi-dimensional and needs to be processed in a multi-dimensional context, ``AxisArray`` is the preferred message format in ezmsg. We have designed the ezmsg signal processing extension package `ezmsg-sigproc` with `AxisArray` in mind.

We do not recommend using `AxisArray` for very simple data types like integers, floats, or strings. `AxisArray` has the flexibility to store this kind of data, either by sending the data as a zero-dimensional array or by sending an empty data field and storing the simple data in the ``attrs`` field. However, for the sake of both memory and time efficiency, we recommend simply using the basic types (int, float, str) as the message type instead.

.. tip::
   One convenient use of the metadata components of `AxisArray` is with adaptive transformers, which need to receive trigger messages telling them to change state or configuration in some way. We can do this by setting an element of ``attrs`` called ``"trigger"`` to ``True``, along with whatever parameters of the transformer we would like to alter (and any data needed to compute the new parameters). The transformer can then check for this ``"trigger"`` attribute in the incoming `AxisArray` messages and change its state accordingly.

.. _axisarray_methods:

`AxisArray` utility methods
=============================

There are many built-in methods for manipulating `AxisArray` objects, such as:

- `as2d`: Get a 2D view of the data with the specified dimension as the first axis.
- `shape2d`: Calculate the 2D shape when viewing the array with the specified axis first.
- `slice_along_axis`: Slice the input array along a specified axis using the given slice object or integer index.
- `sliding_win_oneaxis`: Generate a view of an array using a sliding window of specified length along a specified axis of the input array.
- `view2d`: Context manager providing a 2D view of the data.

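
To make the idea of a "2D shape with a specified axis first" concrete, here is a minimal sketch in plain Python. ``shape_as_2d`` is a hypothetical helper written for this illustration; it assumes the 2D view keeps the chosen axis as rows and flattens all remaining axes into columns, and the library's own ``shape2d`` may handle edge cases differently.

.. code-block:: python

   import math
   from typing import Sequence, Tuple


   def shape_as_2d(shape: Sequence[int], axis: int = 0) -> Tuple[int, int]:
       """2D shape obtained by moving ``axis`` first and flattening the rest."""
       axis = axis % len(shape)  # allow negative axis indices
       remaining = [n for i, n in enumerate(shape) if i != axis]
       # math.prod of an empty list is 1, so a 1-D array becomes (n, 1).
       return shape[axis], math.prod(remaining)


   # A (time=100, channel=8, frequency=5) array viewed with the channel axis first.
   rows, cols = shape_as_2d((100, 8, 5), axis=1)

Here ``rows`` is the length of the chosen axis and ``cols`` is the product of the other axis lengths, so the total number of elements is unchanged.
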
The following are AxisArray class methods for manipulating the underlying data:

- `concatenate`: Concatenate multiple AxisArray objects along a specified dimension.
- `isel`: Select data using integer-based indexing along specified dimensions.
- `iter_over_axis`: Iterate over slices of the data along a specified axis.
- `sel`: Select data using label-based indexing along specified dimensions.
- `to_xr_dataarray`: Convert the AxisArray to an xarray DataArray.
- `transpose`: Transpose the dimensions of an AxisArray.

Further utility methods (for getting data views, axis information, and shapes):

- `as2d`: Get a 2D view of the data with the specified dimension as the first axis.
- `ax`: Get `AxisInfo` for a specified dimension.
- `axis_idx`: Get the axis index for a given dimension name, or pass the value through if it is already an int.
- `get_axis`: Get the axis coordinate system for a specified dimension.
- `get_axis_idx`: Get the axis index for a given dimension name.
- `get_axis_name`: Get the axis name for a given axis index.
- `shape`: Get the shape of the data array.
- `shape_2d`: Get the shape of the 2D view of the data with the specified dimension as the first axis.
- `view_2d`: Context manager providing a 2D view of the data.

How to return an AxisArray object
=================================

To return an ``AxisArray`` object, you can create an instance of the ``AxisArray`` class and populate it with your data. However, creating an ``AxisArray`` object has some time cost. For performance-critical code, the preferred way is to use ``replace`` (imported from ``ezmsg.util.messages.axisarray``) on an existing message, changing only the fields that differ, and return the result. This is similar to using ``dataclasses.replace`` to create a new instance of a dataclass with some attributes changed.

.. code-block:: python

   from ezmsg.util.messages.axisarray import LinearAxis, replace

   # ``message`` is the incoming AxisArray; ``some_processing_function``
   # is a placeholder for your own processing step.
   new_data = some_processing_function(message.data)

   # Define the newly created axis
   axis = LinearAxis(offset=0.0, gain=0.01, unit='s')

   # Create a new AxisArray message by replacing data and axes
   msg_out = replace(
       message,
       data=new_data,
       axes={
           **message.axes,
           # Placeholder axis name; it should match a name in ``dims``
           "new_axis": axis,
       },
   )

Calling ``ezmsg.util.messages.axisarray.replace()`` calls the utility function ``fast_replace`` (in `ezmsg.util.messages.util`), which is an optimized version of the standard Python ``dataclasses.replace`` function. The speed-up comes from skipping certain validation checks that would normally occur when initialising a dataclass. Specifically, unlike ``dataclasses.replace``, this function does not check for type compatibility, nor does it check that the passed-in fields are valid fields of the dataclass and not flagged as ``init=False``.

If this reduced safety is a concern, set the environment variable ``EZMSG_DISABLE_FAST_REPLACE=1``. The imported ``replace`` function will then simply be ``dataclasses.replace`` from the Python standard `dataclasses` module.

.. note::
   Use of this purpose-made ``replace`` function is not limited to ``AxisArray`` objects. It can be used to create any dataclass object given an instance of said class, including user-defined dataclasses.

|ezmsg_logo_small| See Also
********************************

#. `AxisArray API Reference `_

.. |ezmsg_logo_small| image:: ../_static/_images/ezmsg_logo.png
   :width: 40
   :alt: ezmsg logo