ethograph.labels.ml.dense_to_intervals

ethograph.labels.ml.dense_to_intervals(dense_array, individuals, *, sample_rate=None, time_coord=None)

Convert a dense label array to an intervals DataFrame.

Provide either sample_rate (uniform spacing starting at t = 0) or an explicit time_coord array.

Parameters:
  • dense_array (np.ndarray) – Shape (n_samples,) for a single individual, or (n_samples, n_individuals) for multiple.

  • individuals (list[str]) – Individual identifiers. The length must match the second axis of dense_array; pass a single-element list for a 1-D array.

  • sample_rate (float, optional) – Sampling rate in Hz. Timestamps are computed as np.arange(n_samples) / sample_rate.

  • time_coord (np.ndarray, optional) – Explicit time array of length n_samples. Use this when timestamps are non-uniform or do not start at zero.

Returns:

Intervals with columns onset_s, offset_s, labels, individual. offset_s is inclusive: it is the timestamp of the segment's last sample, not of the first sample after the segment.

Return type:

pd.DataFrame

Raises:

ValueError – If neither sample_rate nor time_coord is given, or if the number of individuals does not match the array width.

Examples

Convert a 1-D dense array at 10 Hz (in this example the runs of label 0 produce no rows):

>>> import numpy as np
>>> from ethograph.labels.ml import dense_to_intervals
>>> labels = np.array([0, 1, 1, 1, 0, 2, 2])
>>> df = dense_to_intervals(labels, ["crow_A"], sample_rate=10.0)
>>> df[["onset_s", "offset_s", "labels"]].values.tolist()
[[0.1, 0.3, 1], [0.5, 0.6, 2]]
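
Because offset_s is inclusive, offset_s - onset_s understates a segment's duration by one sample period. The following is a minimal sketch of the correction, assuming uniform sampling at the 10 Hz used above. It rebuilds the documented output by hand instead of calling ethograph, and the column names end_exclusive_s and duration_s are hypothetical additions, not part of the library's output.

```python
import pandas as pd

sample_rate = 10.0  # Hz, as in the example above

# Hand-built copy of the documented output, so this runs without ethograph.
df = pd.DataFrame(
    {
        "onset_s": [0.1, 0.5],
        "offset_s": [0.3, 0.6],
        "labels": [1, 2],
        "individual": ["crow_A", "crow_A"],
    }
)

# offset_s marks the last sample of each segment, so add one sample period
# to get an exclusive end time, then subtract the onset to get a duration.
period = 1.0 / sample_rate
df["end_exclusive_s"] = df["offset_s"] + period  # hypothetical column
df["duration_s"] = df["end_exclusive_s"] - df["onset_s"]  # hypothetical column

print(df[["labels", "duration_s"]].round(3))
```

Label 1 covers three samples and label 2 covers two, so the durations come out as roughly 0.3 s and 0.2 s. With non-uniform timestamps there is no single sample period, and this correction does not apply as written.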

With explicit timestamps:

>>> times = np.array([0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6])
>>> df = dense_to_intervals(labels, ["crow_A"], time_coord=times)
>>> df["onset_s"].tolist()
[0.1, 0.5]
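
To make the documented behaviour concrete, here is a rough sketch of how a dense-to-intervals conversion can be written with run-length encoding. It is an independent illustration for the 1-D case only, not ethograph's implementation. The function name dense_to_intervals_sketch is hypothetical, and three behaviours are assumptions: label 0 is treated as background and skipped (as the example output above suggests), time_coord takes precedence when both arguments are given, and a length mismatch raises ValueError.

```python
import numpy as np
import pandas as pd

COLUMNS = ["onset_s", "offset_s", "labels", "individual"]


def dense_to_intervals_sketch(dense, individual, *, sample_rate=None, time_coord=None):
    """Illustrative 1-D re-implementation; NOT the library code."""
    dense = np.asarray(dense)
    if time_coord is not None:  # ASSUMPTION: time_coord wins if both are given
        times = np.asarray(time_coord, dtype=float)
    elif sample_rate is not None:
        times = np.arange(dense.shape[0]) / sample_rate
    else:
        raise ValueError("Provide either sample_rate or time_coord.")
    if times.shape[0] != dense.shape[0]:  # ASSUMPTION: mismatch is an error
        raise ValueError("time_coord must have length n_samples.")
    if dense.size == 0:
        return pd.DataFrame(columns=COLUMNS)

    # Indices at which the label changes mark the start of a new run.
    change = np.flatnonzero(np.diff(dense) != 0) + 1
    starts = np.concatenate(([0], change))
    stops = np.concatenate((change - 1, [dense.shape[0] - 1]))  # inclusive

    rows = [
        {
            "onset_s": float(times[a]),
            "offset_s": float(times[b]),  # timestamp of the run's last sample
            "labels": int(dense[a]),
            "individual": individual,
        }
        for a, b in zip(starts, stops)
        if dense[a] != 0  # ASSUMPTION: 0 is background and yields no row
    ]
    return pd.DataFrame(rows, columns=COLUMNS)


labels = np.array([0, 1, 1, 1, 0, 2, 2])
# Non-uniform timestamps that do not start at zero.
times = np.array([5.0, 5.1, 5.3, 5.4, 6.0, 6.2, 6.25])
out = dense_to_intervals_sketch(labels, "crow_A", time_coord=times)
print(out)
```

Here the two segments span 5.1 to 5.4 s and 6.2 to 6.25 s. Each onset and offset is read directly from the time array, which is why non-uniform spacing needs no special handling.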