Importing labels

In the Import labels tab, the Labels format combo offers:

| Option | Source | Converter |
| --- | --- | --- |
| .tsv | EthoGraph TSV (backup, colleague’s labels, manual edit) | (native) |
| pynapple (.npz) | Pynapple file with IntervalSet objects | PynappleLabelConverter |
| pynapple (.nwb) | NWB file loaded via pynapple, with IntervalSet objects | PynappleLabelConverter |
| BORIS (.boris) | BORIS project files | BorisLabelConverter (via the BORIS import wizard) |
| Crowsetta formats (aud-seq, simple-seq, textgrid, notmat, timit, yarden, …) | crowsetta-supported annotation tools (Audacity, Praat, Raven, …) | CrowsettaLabelConverter |


Pynapple / NWB IntervalSets

Selecting pynapple (.npz) or pynapple (.nwb) loads the file with pynapple.load_file() and extracts every IntervalSet in the data dict except those named "trials" or "epochs" (those are treated as trial boundaries, not labels). Each IntervalSet name becomes a label class.
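The name-based filtering described above can be sketched as follows. This is an illustration only, not the converter's actual code: `select_label_sets` and `TRIAL_KEYS` are hypothetical names, and plain `(start_s, end_s)` tuples stand in for pynapple IntervalSet objects so the snippet runs without pynapple installed.

```python
# Names treated as trial boundaries rather than label classes (per the text above).
TRIAL_KEYS = {"trials", "epochs"}


def select_label_sets(data):
    """Return only the entries that should become label classes.

    Hypothetical helper: the real converter presumably also checks that
    each value is an IntervalSet, which this sketch does not do.
    """
    return {name: ivs for name, ivs in data.items() if name not in TRIAL_KEYS}


# Plain (start_s, end_s) tuples stand in for IntervalSet objects.
data = {
    "trials": [(0.0, 10.0), (10.0, 20.0)],
    "grooming": [(1.0, 2.5)],
    "rearing": [(12.0, 13.0), (15.0, 15.5)],
}

label_sets = select_label_sets(data)
print(sorted(label_sets))  # each remaining name becomes a label class
```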

The GUI auto-generates a mapping_pynapple.txt file with integer IDs for each label name (see Label mapping (mapping.txt) for the file format and resolution order) and writes the result to the canonical _labels.tsv alongside the .nc.
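As a rough sketch of the ID assignment step only (not the mapping file format, which is documented separately), the snippet below gives each label name an integer ID. The function name `assign_label_ids`, the sorted ordering, and the starting ID of 1 are all assumptions made for illustration; the GUI may order or number labels differently.

```python
def assign_label_ids(names, start=1):
    """Map each unique label name to an integer ID.

    Assumption: names are sorted so the result is reproducible, and IDs
    start at 1. The actual GUI behaviour may differ.
    """
    return {name: i for i, name in enumerate(sorted(set(names)), start=start)}


name_to_id = assign_label_ids(["rearing", "grooming", "rearing"])
# Inverse lookup, useful when converting integer labels back to names.
id_to_name = {i: name for name, i in name_to_id.items()}
print(name_to_id)
```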

Global-time intervals are split across trials using the trials / epochs IntervalSet (or the session’s trial table). See PynappleLabelConverter for the conversion logic.
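To make the splitting step concrete, here is a minimal sketch that clips global-time intervals to each trial. It is not the PynappleLabelConverter implementation: `split_by_trials` is a hypothetical helper, tuples replace IntervalSet objects, and re-expressing times relative to the trial start is an assumption about the output convention.

```python
def split_by_trials(intervals, trials):
    """Clip global-time intervals to each trial.

    intervals: list of (onset_s, offset_s, label) in global time
    trials:    list of (trial_id, start_s, end_s) in global time
    Returns rows of (trial_id, onset_s, offset_s, label), with times
    measured from the trial start (an assumption of this sketch).
    """
    rows = []
    for trial_id, t_start, t_end in trials:
        for onset, offset, label in intervals:
            lo, hi = max(onset, t_start), min(offset, t_end)
            if lo < hi:  # keep only the part that overlaps this trial
                rows.append((trial_id, lo - t_start, hi - t_start, label))
    return rows


trials = [(1, 0.0, 10.0), (2, 10.0, 20.0)]
intervals = [(8.0, 12.0, "grooming"), (15.0, 16.0, "rearing")]

rows = split_by_trials(intervals, trials)
for row in rows:
    print(row)
```

In this example the "grooming" interval straddles the boundary at 10 s, so it contributes one row to each trial.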


BORIS

BORIS observations bind one or more media files (concatenated in Player 1) to a list of events coded in observation-global time. The wizard splits events across media boundaries, treating one media file as one trial.

The importer preserves BORIS’s two event kinds (see State vs point events):

  • State events become intervals (onset_s to offset_s). Events that span a file boundary are clipped at the boundary with a warning.

  • Point events become rows with event_type = "point" and offset_s = NaN.
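The handling of the two event kinds can be sketched as below. This is a simplified illustration rather than the wizard's code: `split_boris_events` and its tuple layout are hypothetical, the `"state"` value for `event_type` is assumed, and the sketch assigns each state event to the media file containing its onset and discards the part after the boundary, which is one possible reading of "clipped".

```python
import math
import warnings
from itertools import accumulate


def split_boris_events(events, media_durations):
    """Assign observation-global events to media files (one file = one trial).

    events: list of (behavior, kind, start_s, stop_s); stop_s is None for points
    media_durations: file durations in seconds, in playback order
    Returns rows of (trial, behavior, event_type, onset_s, offset_s).
    """
    ends = list(accumulate(media_durations))
    starts = [0.0] + ends[:-1]
    rows = []
    for trial, (m_start, m_end) in enumerate(zip(starts, ends), start=1):
        for behavior, kind, start, stop in events:
            if not (m_start <= start < m_end):
                continue  # event begins in a different media file
            if kind == "point":
                # Point events have no duration: offset_s is NaN.
                rows.append((trial, behavior, "point", start - m_start, math.nan))
            else:
                if stop > m_end:
                    warnings.warn(f"{behavior!r} clipped at media boundary {m_end} s")
                offset = min(stop, m_end)
                rows.append((trial, behavior, "state", start - m_start, offset - m_start))
    return rows


events = [("walk", "state", 55.0, 65.0), ("peck", "point", 70.0, None)]
rows = split_boris_events(events, media_durations=[60.0, 30.0])
for row in rows:
    print(row)
```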

The per-behavior type field from BORIS is written into the generated mapping.txt so the kind is preserved on round-trip and the labelling shortcut behaves correctly. The BORIS Image index column is ignored, because ethograph stores label times in seconds and does not round to the nearest frame. This becomes important when labelling multimodal data with multiple sampling rates (video, audio, accelerometer, …).

The .boris JSON is parsed via load_boris_project(); the import wizard lives at ethograph.gui.wizard_boris.


Crowsetta interop

EthoGraph registers an ethograph-seq crowsetta format for sharing labels with string names (resolved via mapping.txt):

from ethograph.labels.crowsetta_format import EthographSeq

# Export: int labels -> string labels via mapping
ethoseq = EthographSeq.from_intervals_df(df, id_to_name={1: "Head bob", 2: "Song"})
ethoseq.to_file("labels_for_sharing.tsv")

# Import via crowsetta
import crowsetta
scribe = crowsetta.Transcriber(format="ethograph-seq")
annot = scribe.from_file("labels_for_sharing.tsv").to_annot()

On import, the GUI checks the active mapping.txt against the labels found in the file. If some labels are missing from the mapping, it auto-generates a new mapping_{format}.txt alongside the data file and warns about unmatched labels. See resolve_crowsetta_mapping() for details.
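The check for unmatched labels amounts to a set difference. The sketch below is not resolve_crowsetta_mapping(); `find_unmatched` and `extend_mapping` are hypothetical helpers, and extending the active mapping with the next free integer IDs is an assumption about how the new mapping might be built.

```python
def find_unmatched(labels_in_file, name_to_id):
    """Return label names present in the file but absent from the mapping."""
    return sorted(set(labels_in_file) - set(name_to_id))


def extend_mapping(name_to_id, unmatched):
    """Copy the mapping and give each unmatched name the next free integer ID.

    Assumption: new IDs continue after the largest existing ID.
    """
    extended = dict(name_to_id)
    next_id = max(extended.values(), default=0) + 1
    for offset, name in enumerate(unmatched):
        extended[name] = next_id + offset
    return extended


active = {"Head bob": 1, "Song": 2}
found = ["Song", "Call", "Head bob", "Wing flap"]

unmatched = find_unmatched(found, active)
new_mapping = extend_mapping(active, unmatched)
print(unmatched)
print(new_mapping)
```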


Programmatic usage

All converters expose the same resolve_labels(...) contract, which falls back through three steps in order (existing TSV → extract from source → empty):

from pathlib import Path
from ethograph.labels.converters import PynappleLabelConverter
import pynapple as nap

data = nap.load_file("session.nwb")
trials_ep = data["trials"] if "trials" in data.keys() else None

converter = PynappleLabelConverter(data, trials_ep=trials_ep)
df = converter.resolve_labels(
    source_path=Path("session.nwb"),
    trial_ids=[1, 2, 3],
)
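The fallback order itself can be illustrated with a small standalone sketch. It does not reproduce the converters' real signature: `resolve_labels_sketch` and its `extract_from_source` callback are hypothetical, and rows are returned as a list of dicts rather than whatever table type resolve_labels(...) actually returns.

```python
import csv
from pathlib import Path


def resolve_labels_sketch(tsv_path, extract_from_source):
    """Resolve labels in order: existing TSV -> extract from source -> empty.

    tsv_path: where a previously saved labels TSV would live
    extract_from_source: zero-argument callable returning a list of row dicts
    """
    tsv_path = Path(tsv_path)
    if tsv_path.exists():
        # 1. An existing TSV wins, so manual edits are never overwritten.
        with tsv_path.open(newline="") as fh:
            return list(csv.DictReader(fh, delimiter="\t"))
    rows = extract_from_source()
    if rows:
        # 2. Otherwise convert from the source file.
        return rows
    # 3. Nothing found: return an empty label table.
    return []


extracted = [{"trial": "1", "onset_s": "0.5", "offset_s": "1.5", "label": "1"}]
from_source = resolve_labels_sketch("no_such_file_labels.tsv", lambda: extracted)
empty = resolve_labels_sketch("no_such_file_labels.tsv", lambda: [])
print(from_source, empty)
```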

See Exporting labels for the full TSV column reference.