Label mapping (mapping.txt)#
EthoGraph uses integer label IDs everywhere — in the TSV (labels
column), in predictions (argmax over softmax), and in on-screen number-key
shortcuts. String names are kept in a separate mapping.txt file and
resolved only at the edges (display, Crowsetta export, plotting).
This matches the convention used in the action-segmentation literature where models predict a per-frame class index and a dataset-level mapping.txt lists the
corresponding action names. Keeping string↔int separate means:
Models and EthoGraph agree on the same integer IDs without re-encoding.
Rename a behaviour once in mapping.txt and it propagates everywhere (old backups, predictions, exports).
The .tsv files stay compact and language-neutral.
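The separation described above can be illustrated with a small, library-free sketch. The dictionary and the helper `names_for` are hypothetical stand-ins for illustration, not part of the EthoGraph API; in practice the mapping would be read from mapping.txt.

```python
# Hypothetical in-memory mapping; in practice this would come from mapping.txt.
idx_to_class = {0: "background", 1: "pullOutStick", 2: "diagonalToBox"}

# Per-frame integer predictions, e.g. the argmax over a model's softmax output.
frame_ids = [0, 1, 1, 2, 0]


def names_for(ids, idx_to_class):
    """Resolve integer label IDs to display names (done only at the edges)."""
    return [idx_to_class[i] for i in ids]


before = names_for(frame_ids, idx_to_class)

# Renaming a behaviour touches only the mapping; the stored IDs are unchanged.
idx_to_class[1] = "extractStick"
after = names_for(frame_ids, idx_to_class)

print(before)
print(after)
```

Because the stored data holds only integers, the rename needs no rewrite of the labels or predictions themselves.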
Format#
Whitespace-delimited, one label per line. The full schema is
<id> <name> [<branch>] [<event_type>] — both trailing fields are
optional with defaults (branch=0, event_type=state):
0 background
1 pullOutStick
2 diagonalToBox
3 toss 0
4 nod 1
5 reachLeftCorner 0
11 peck 0 point
12 call 1 point
Rules:
ID 0 is always background. It is excluded from display, export, and model loss.
IDs don’t have to be contiguous, but they must be unique.
Names should be valid identifiers (no spaces) so they round-trip through Crowsetta text formats cleanly.
Trailing whitespace is ignored; blank lines are skipped.
branch is an optional integer grouping labels for independent labeling (see Label branches). Defaults to 0.
event_type is "state" (interval, default) or "point" (instantaneous marker; see State vs point events below). Lines without it are treated as state events, so existing mapping files keep working unchanged.
To declare a point class without a custom branch you must still write the
default branch explicitly, because the columns are positional:
11 peck 0 point — not 11 peck point.
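The format rules above can be sketched as a small standalone parser. This is an illustration of the schema only, not the implementation behind load_mapping(); the names `MappingEntry` and `parse_mapping_text` are hypothetical, and the error handling is an assumption.

```python
from typing import NamedTuple


class MappingEntry(NamedTuple):
    """One line of mapping.txt (hypothetical container for this sketch)."""
    label_id: int
    name: str
    branch: int = 0
    event_type: str = "state"


def parse_mapping_text(text: str) -> list[MappingEntry]:
    """Parse '<id> <name> [<branch>] [<event_type>]' lines."""
    entries = []
    seen_ids = set()
    for line_no, raw in enumerate(text.splitlines(), start=1):
        fields = raw.split()  # splits on any whitespace, drops trailing whitespace
        if not fields:
            continue  # blank lines are skipped
        if not 2 <= len(fields) <= 4:
            raise ValueError(f"line {line_no}: expected 2 to 4 fields, got {len(fields)}")
        label_id = int(fields[0])
        name = fields[1]
        # Columns are positional: the third field is always the branch.
        branch = int(fields[2]) if len(fields) >= 3 else 0
        event_type = fields[3] if len(fields) == 4 else "state"
        if event_type not in ("state", "point"):
            raise ValueError(f"line {line_no}: unknown event_type {event_type!r}")
        if label_id in seen_ids:
            raise ValueError(f"line {line_no}: duplicate ID {label_id}")
        seen_ids.add(label_id)
        entries.append(MappingEntry(label_id, name, branch, event_type))
    return entries


example = """\
0 background
3 toss 0
4 nod 1

11 peck 0 point
12 call 1 point
"""
for entry in parse_mapping_text(example):
    print(entry)
```

In this sketch, a line such as `11 peck point` fails because `point` is read as the branch and cannot be converted to an integer, which is why the default branch has to be written out.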
Read / write programmatically:
from ethograph.labels.intervals import load_mapping
from ethograph.labels.converters import write_mapping_file
class_to_idx, idx_to_class = load_mapping("mapping.txt")
class_to_idx["pullOutStick"] # 1
idx_to_class[1] # "pullOutStick"
write_mapping_file("mapping.txt", {"background": 0, "walk": 1, "run": 2})
See load_mapping() and
write_mapping_file().
Resolution order#
When the GUI needs a mapping, it searches with
find_mapping_file():
Walk up from the loaded data directory looking for .ethograph/mapping.txt in each ancestor. This lets a shared .ethograph/ in a parent folder serve many sessions, while a per-session override wins.
Fall back to ~/.ethograph/mapping.txt (global user default).
Typical layouts:
~/.ethograph/mapping.txt # global default
project/.ethograph/mapping.txt # project-wide (shared across sessions)
project/session_01/.ethograph/mapping.txt # per-session override
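The search order can be sketched with pathlib as follows. This is a minimal illustration of the behaviour described above, not the source of find_mapping_file(); the function name `find_mapping_sketch` and its `home` parameter are hypothetical, and the assumption that the walk starts at the data directory itself is inferred from the per-session layout.

```python
from pathlib import Path
from typing import Optional


def find_mapping_sketch(data_dir: Path, home: Path) -> Optional[Path]:
    """Return the nearest .ethograph/mapping.txt, or the global default."""
    data_dir = data_dir.resolve()
    # Check the data directory first, then each ancestor up to the root,
    # so the closest mapping file takes precedence.
    for folder in (data_dir, *data_dir.parents):
        candidate = folder / ".ethograph" / "mapping.txt"
        if candidate.is_file():
            return candidate
    # Nothing found on the way up: fall back to the global user default.
    fallback = home / ".ethograph" / "mapping.txt"
    return fallback if fallback.is_file() else None


if __name__ == "__main__":
    print(find_mapping_sketch(Path.cwd(), Path.home()))
```

Passing the home directory explicitly keeps the sketch easy to test against a temporary folder tree.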
Auto-generated mappings#
When you import labels from an external source whose classes don’t exist
in the active mapping.txt, the GUI auto-creates a new mapping file
alongside the data so no IDs clash:
| Import source | Generated file |
|---|---|
| Crowsetta formats (aud-seq, textgrid, …) | |
| Pynapple / NWB | |
| NWB epoch imports | |
The mapping path field in the Label controls updates to point at the new
file. You can rename it to mapping.txt to make it the default for this
project.
See resolve_crowsetta_mapping() and
build_mapping_from_labels() for the
auto-generation logic.
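As a rough illustration of what auto-generation involves, the sketch below builds a fresh mapping from a list of imported label names. It is not the logic of build_mapping_from_labels(); the function names and the ID assignment rule (alphabetical order, starting at 1, with 0 reserved for background) are assumptions made for this example.

```python
def sketch_mapping_from_labels(labels):
    """Assign a unique integer ID to each distinct label name (assumed scheme)."""
    names = sorted({name for name in labels if name != "background"})
    mapping = {"background": 0}  # ID 0 is always background
    for new_id, name in enumerate(names, start=1):
        mapping[name] = new_id
    return mapping


def format_mapping(mapping):
    """Render the mapping as '<id> <name>' lines, ordered by ID."""
    ordered = sorted(mapping.items(), key=lambda item: item[1])
    return "\n".join(f"{idx} {name}" for name, idx in ordered) + "\n"


imported = ["walk", "run", "walk", "groom"]
generated = sketch_mapping_from_labels(imported)
print(format_mapping(generated), end="")
```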
State vs point events#
Every label is one of two kinds:
| Kind | Stored as | Use for |
|---|---|---|
| state | interval | Behaviours with a duration: walk, syllable |
| point | instant | Instantaneous events: peck, brief call |
The kind is declared per class in mapping.txt (4th column, see above) and
stored per row in the TSV (event_type column). When a class is declared
point in mapping.txt, the labelling shortcut drops a marker at the
playhead instead of starting an interval drag.
Point events pass through every interval operation (purge, stitch, snap,
changepoint correction) untouched — they have no duration, so concepts like
“too short” or “stitch the gap” don’t apply to them. Internally this is
enforced by split_by_kind() in ethograph.labels.intervals.
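The idea of partitioning by kind before an interval operation can be sketched as below. This is not the implementation of split_by_kind(); the row layout (dicts with `onset_s`, `offset_s`, `labels`, and `event_type` keys) and the function names are hypothetical, chosen only to make the example self-contained.

```python
def split_rows_by_kind(rows):
    """Partition label rows into (state rows, point rows)."""
    # Rows without an event_type are treated as state events.
    states = [r for r in rows if r.get("event_type", "state") == "state"]
    points = [r for r in rows if r.get("event_type", "state") == "point"]
    return states, points


def purge_short(rows, min_duration_s):
    """Drop state intervals shorter than min_duration_s; keep all point events."""
    states, points = split_rows_by_kind(rows)
    kept = [r for r in states if r["offset_s"] - r["onset_s"] >= min_duration_s]
    # Point events bypass the duration check and are merged back in time order.
    return sorted(kept + points, key=lambda r: r["onset_s"])


rows = [
    {"onset_s": 0.0, "offset_s": 2.0, "labels": 1, "event_type": "state"},
    {"onset_s": 2.5, "offset_s": 2.6, "labels": 3, "event_type": "state"},
    {"onset_s": 3.0, "offset_s": 3.0, "labels": 11, "event_type": "point"},
]
for row in purge_short(rows, min_duration_s=0.5):
    print(row)
```

Without the split, the zero-duration point row would be removed by the same "too short" test that removes the brief state interval.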