xr_utils

Small helpers for working with xarray.Dataset and xarray.DataArray objects used throughout ethograph.

ethograph.utils.xr_utils.sel_valid(da, sel_kwargs)

Select a slice of a DataArray, silently ignoring dimensions it doesn’t have.

Useful when the GUI holds a single sel_kwargs dict (e.g. {"keypoints": "nose", "space": "x"}) but different features have different subsets of those dimensions. Dimensions with labelled coordinates are selected with .sel(); dimensions without coordinates fall back to .isel(). The result is squeezed and transposed so time is always the first axis.

Parameters:

  • da (xarray.DataArray) – Source array. Must contain at least one dimension whose name includes "time".

  • sel_kwargs (dict) – Candidate selections. Keys that don’t match any dimension in da are silently dropped.

Returns:

  • data (numpy.ndarray) – Selected values with shape (n_time,) or (n_time, n_other).

  • used_kwargs (dict) – The subset of sel_kwargs that was actually applied via .sel(), i.e. label-based selections only; integer-based .isel() selections are not included. Handy for building plot titles.

Raises:

ValueError – If da has no dimension containing "time".

Examples

>>> import numpy as np
>>> import xarray as xr
>>> import ethograph as eto
>>> da = xr.DataArray(
...     np.random.randn(100, 3),
...     dims=["time", "space"],
...     coords={"time": np.linspace(0, 10, 100), "space": ["x", "y", "z"]},
... )
>>> data, used = eto.sel_valid(da, {"space": "x", "individuals": "mouse1"})
>>> data.shape
(100,)
>>> used
{'space': 'x'}
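
The selection rules described for sel_valid can be sketched in plain xarray. This is a minimal illustration under stated assumptions, not ethograph's actual implementation: the name sel_valid_sketch is hypothetical, the first dimension whose name contains "time" is assumed to be the time axis, and only singleton non-time dimensions are squeezed.

```python
import numpy as np
import xarray as xr


def sel_valid_sketch(da, sel_kwargs):
    """Hypothetical sketch of the documented behaviour (not ethograph's code)."""
    # Assumption: the first dimension containing "time" is the time axis.
    time_dims = [d for d in da.dims if "time" in str(d)]
    if not time_dims:
        raise ValueError("DataArray has no dimension containing 'time'")
    time_dim = time_dims[0]

    label_sel, index_sel = {}, {}
    for dim, value in sel_kwargs.items():
        if dim not in da.dims:
            continue  # dimension absent from this array: drop silently
        if dim in da.coords:
            label_sel[dim] = value  # labelled coordinate: use .sel()
        else:
            index_sel[dim] = value  # no coordinate: fall back to .isel()

    out = da.sel(label_sel).isel(index_sel)

    # Squeeze length-1 dimensions, but never the time axis itself.
    singleton = [d for d in out.dims if d != time_dim and out.sizes[d] == 1]
    if singleton:
        out = out.squeeze(singleton)

    # Move time to the front; "..." keeps the remaining dims in their order.
    out = out.transpose(time_dim, ...)
    return out.values, label_sel


# Usage: the input has time as its second axis and no "individuals" dimension.
da = xr.DataArray(
    np.random.randn(3, 100),
    dims=["space", "time"],
    coords={"time": np.linspace(0, 10, 100), "space": ["x", "y", "z"]},
)
data, used = sel_valid_sketch(da, {"space": "x", "individuals": "mouse1"})
```

Returning only the label-based dict mirrors the documented used_kwargs contract, so a caller can build a title from it without filtering out integer indices.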

ethograph.utils.xr_utils.get_time_coord(da)

Return the time coordinate of a DataArray, regardless of its name.

Every feature variable in an ethograph dataset must have at least one dimension whose name contains "time" (e.g. time, time_aux, time_labels). Different features can use different time dimensions at different sampling rates — this function finds the right one for a given DataArray. See Data Format Requirements for the full specification.

Lookup order: dimension coordinates (da.dims) are checked before non-dimension coordinates (da.coords), so the primary time axis that actually indexes the data is returned even when a shorter auxiliary coordinate is also attached.

Parameters:

da (xarray.DataArray) – Any DataArray from an ethograph dataset.

Returns:

The time coordinate values, or None if no coordinate name contains "time".

Return type:

xarray.DataArray or None

Examples

>>> import ethograph as eto
>>> dt = eto.open("experiment.nc")
>>> ds = dt.itrial(0)

Feature with the default time dimension:

>>> eto.get_time_coord(ds["speed"])
<xarray.DataArray 'time' (time: 9000)> ...

Audio stored on a higher-rate time_aux axis:

>>> eto.get_time_coord(ds["audio_waveform"])
<xarray.DataArray 'time_aux' (time_aux: 441000)> ...

Used internally by add_changepoints_to_ds to discover which time dimension to vectorise over.
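
The lookup order described for get_time_coord (dimension coordinates first, then non-dimension coordinates) can be sketched as follows. This is an illustration only: get_time_coord_sketch is a hypothetical name, the toy arrays are made up, and the real function may differ in its details.

```python
import numpy as np
import xarray as xr


def get_time_coord_sketch(da):
    """Hypothetical sketch of the documented lookup order (not ethograph's code)."""
    # 1. Dimension coordinates: the axis that actually indexes the data.
    for dim in da.dims:
        if "time" in str(dim) and dim in da.coords:
            return da.coords[dim]
    # 2. Non-dimension coordinates whose name contains "time".
    for name in da.coords:
        if "time" in str(name):
            return da.coords[name]
    # 3. Nothing time-like was found.
    return None


# Toy features on two different time axes (names and sizes are made up).
speed = xr.DataArray(
    np.zeros(90),
    dims=["time"],
    coords={"time": np.arange(90) / 30.0},
)
audio = xr.DataArray(
    np.zeros(441),
    dims=["time_aux"],
    # A scalar non-dimension coordinate is listed first on purpose:
    # the dimension coordinate should still win.
    coords={"start_time": 5.0, "time_aux": np.arange(441) / 44100.0},
)

speed_time = get_time_coord_sketch(speed)
audio_time = get_time_coord_sketch(audio)
```

Checking da.dims before da.coords is what keeps the scalar start_time coordinate from shadowing the time_aux axis in the second array.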