ethograph.labels.ml.intervals_to_dense

ethograph.labels.ml.intervals_to_dense(df, sample_rate, individuals, n_samples)

Convert an intervals DataFrame to a dense label array.

Each interval's onset and offset are mapped to the nearest sample index using round(time * sample_rate). Where intervals for the same individual overlap, the last one written wins.
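As a quick illustration of the index arithmetic, the snippet below applies round(time * sample_rate) to a few times. The helper time_to_index is hypothetical and not part of ethograph. It assumes Python's built-in round, which rounds exact halves to the nearest even integer; which rounding function the library actually calls is not stated here.

```python
def time_to_index(time_s: float, sample_rate: float) -> int:
    """Map a time in seconds to the nearest sample index (hypothetical helper)."""
    return round(time_s * sample_rate)


sample_rate = 10.0  # Hz

# Onsets and offsets from the example at the bottom of this page.
indices = [time_to_index(t, sample_rate) for t in (0.1, 0.3, 0.5, 0.6)]
print(indices)  # [1, 3, 5, 6]

# Exact halves go to the nearest even integer with built-in round.
print(time_to_index(0.25, sample_rate))  # 2.5 -> 2
print(time_to_index(0.75, sample_rate))  # 7.5 -> 8
```

If boundaries that fall exactly between two samples matter for an analysis, it may be worth checking the rounding behavior against the source directly.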

Parameters:
  • df (pd.DataFrame) – Intervals DataFrame with columns onset_s, offset_s, labels, individual.

  • sample_rate (float) – Sampling rate in Hz (e.g. 30.0 for 30 fps video features).

  • individuals (list[str]) – Individual identifiers. The output column order matches this list.

  • n_samples (int) – Number of output time steps. Typically available as per-trial n_samples metadata in the TSV file.

Returns:

Dense label array of shape (n_samples, len(individuals)), dtype int8. Background (unlabeled) time steps are 0.

Return type:

np.ndarray

Examples

>>> import pandas as pd
>>> from ethograph.labels.ml import intervals_to_dense
>>> df = pd.DataFrame({
...     "onset_s": [0.1, 0.5], "offset_s": [0.3, 0.6],
...     "labels": [1, 2], "individual": ["A", "A"],
... })
>>> dense = intervals_to_dense(df, sample_rate=10.0, individuals=["A"], n_samples=7)
>>> dense[:, 0].tolist()
[0, 1, 1, 1, 0, 2, 2]
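
The behavior described above can be approximated with a minimal sketch, shown here only to make the mapping concrete. It is not the library's implementation: intervals_to_dense_sketch is a hypothetical name, and the inclusive offset sample (inferred from the example output above), the clipping to the valid index range, and the skipping of individuals not in the list are all assumptions.

```python
import numpy as np
import pandas as pd


def intervals_to_dense_sketch(df, sample_rate, individuals, n_samples):
    """Hypothetical re-implementation of the documented behavior."""
    dense = np.zeros((n_samples, len(individuals)), dtype=np.int8)
    column = {name: i for i, name in enumerate(individuals)}
    for row in df.itertuples(index=False):
        if row.individual not in column:
            continue  # assumption: ignore individuals not requested
        start = max(int(round(row.onset_s * sample_rate)), 0)
        stop = min(int(round(row.offset_s * sample_rate)), n_samples - 1)
        # Assumption: the offset sample is included. Rows are written in
        # DataFrame order, so later rows overwrite earlier ones.
        dense[start:stop + 1, column[row.individual]] = row.labels
    return dense


# Two individuals; the two intervals for "A" overlap at samples 2 and 3.
df = pd.DataFrame({
    "onset_s": [0.1, 0.2, 0.0],
    "offset_s": [0.3, 0.4, 0.1],
    "labels": [1, 3, 2],
    "individual": ["A", "A", "B"],
})
dense = intervals_to_dense_sketch(df, 10.0, ["A", "B"], n_samples=6)
print(dense[:, 0].tolist())  # [0, 1, 3, 3, 3, 0]
print(dense[:, 1].tolist())  # [2, 2, 0, 0, 0, 0]
```

In this sketch the second interval for "A" overwrites label 1 at samples 2 and 3, which is one reading of last-write-wins; the column order follows the individuals list.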