{
"cells": [
{
"cell_type": "markdown",
"id": "1cb4bc77",
"metadata": {},
"source": [
"# Tool-using crows - Moll et al., 2025\n",
"\n",
"The dataset includes behavioural data from two trials:\n",
"\n",
"- Videos published in Moll et al., 2025¹, available online [here](https://www.sciencedirect.com/science/article/pii/S0960982225011005?via%3Dihub#mmc1). We recorded with two cameras (`cam-1`, `cam-2`), but only the video from the left camera (`cam-2`) is shared.\n",
"- DeepLabCut² pose files generated for each camera's video, plus a 3D pose file (`*_DLC_3D.csv`) generated using [3D triangulation](https://deeplabcut.github.io/DeepLabCut/docs/Overviewof3D.html).\n",
"- Video feature files (`*_s3d.npy`) generated using the Video Features repository³.\n",
"- A `Trial_data.nc` file with behavioural features (kinematics, video features), changepoints, custom colours, and trial metadata.\n",
"\n",
"Below is an example script showing how to generate the `Trial_data.nc` file from the raw data.\n",
"\n",
"---\n",
"\n",
"¹ Moll, F. W., Würzler, J., & Nieder, A. (2025). Learned precision tool use in carrion crows. Current Biology, 35(19), 4845-4852.e3. https://doi.org/10.1016/j.cub.2025.08.033\n",
"\n",
"² Nath, T., Mathis, A., Chen, A. C., Patel, A., Bethge, M., & Mathis, M. W. (2019). Using DeepLabCut for 3D markerless pose estimation across species and behaviors. Nature Protocols, 14(7), 2152–2176. https://doi.org/10.1038/s41596-019-0176-0\n",
"\n",
"³ Iashin, V. (2020). Video Features [Computer software]. https://github.com/v-iashin/video_features\n",
"\n",
"\n",
"Left: Figure 1C from Moll et al., 2025¹\n",
"\n",
"Right: Screenshot from the GUI. The bottom line plot shows the speed of the beak tip."
]
},
{
"cell_type": "markdown",
"id": "e3dd0b07",
"metadata": {},
"source": [
"### File structure\n",
"```\n",
"Moll2025/\n",
"├── labels/ # GUI saved label files\n",
"├── 2024-12-17_115_Crow1-cam-1.mp4 # video (trial 115)\n",
"├── 2024-12-17_115_Crow1-cam-1DLC.csv # 2D pose cam-1\n",
"├── 2024-12-17_115_Crow1-cam-2DLC.csv # 2D pose cam-2\n",
"├── 2024-12-17_115_Crow1_DLC_3D.csv # 3D triangulated pose\n",
"├── 2024-12-17_115_Crow1-cam-1_s3d.npy # Video features\n",
"│\n",
"├── 2024-12-18_041_Crow1-cam-1.mp4 # video (trial 41)\n",
"├── 2024-12-18_041_Crow1-cam-1DLC.csv # 2D pose cam-1\n",
"├── 2024-12-18_041_Crow1-cam-2DLC.csv # 2D pose cam-2\n",
"├── 2024-12-18_041_Crow1_DLC_3D.csv # 3D triangulated pose\n",
"├── 2024-12-18_041_Crow1-cam-1_s3d.npy # Video features\n",
"└── Trial_data.nc # all behavioural and meta data in one place\n",
"```\n",
"\n",
"### Filename convention\n",
"\n",
"```\n",
"2024-12-17_115_Crow1-cam-1DLC.csv\n",
"2024-12-17_115_Crow1-cam-1.mp4\n",
"│          │   │\n",
"│          │   └── bird ID\n",
"│          └────── trial number\n",
"└────────────────── session date\n",
"```\n",
"Using a similar filename convention across related files (video, 2D pose, 3D pose) makes it easier to match files across trials with a regular expression."
]
},
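{
"cell_type": "markdown",
"id": "filename_regex_example_md",
"metadata": {},
"source": [
"As a minimal sketch of this idea (hypothetical filenames, assuming the `date_trial_birdID` pattern shown above), a single regular expression can capture the shared prefix fields and group related files by trial:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "filename_regex_example",
"metadata": {},
"outputs": [],
"source": [
"import re\n",
"\n",
"# Hypothetical filenames following the convention above.\n",
"names = [\n",
"    '2024-12-17_115_Crow1-cam-1.mp4',\n",
"    '2024-12-17_115_Crow1-cam-1DLC.csv',\n",
"    '2024-12-17_115_Crow1_DLC_3D.csv',\n",
"]\n",
"\n",
"# One pattern captures the shared prefix fields of every related file.\n",
"pattern = re.compile(r'^(?P<date>[0-9]{4}-[0-9]{2}-[0-9]{2})_(?P<trial>[0-9]+)_(?P<bird>Crow[0-9]+)')\n",
"\n",
"# Group all related files (video, 2D pose, 3D pose) by trial number.\n",
"by_trial = {}\n",
"for name in names:\n",
"    m = pattern.match(name)\n",
"    if m:\n",
"        by_trial.setdefault(int(m['trial']), []).append(name)\n",
"\n",
"print(by_trial)"
]
},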
{
"cell_type": "markdown",
"id": "download_header",
"metadata": {},
"source": [
"### Download example data"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "download",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Downloading Trial_data.nc... (0/11)\n",
"Downloading Trial_data.nc... (1/11)\n",
"Downloading 2024-12-17_115_Crow1-cam-1.mp4... (1/11)\n",
"Downloading 2024-12-17_115_Crow1-cam-1.mp4... (2/11)\n",
"Downloading 2024-12-17_115_Crow1-cam-1DLC.csv... (2/11)\n",
"Downloading 2024-12-17_115_Crow1-cam-1DLC.csv... (3/11)\n",
"Downloading 2024-12-17_115_Crow1-cam-2DLC.csv... (3/11)\n",
"Downloading 2024-12-17_115_Crow1-cam-2DLC.csv... (4/11)\n",
"Downloading 2024-12-17_115_Crow1_DLC_3D.csv... (4/11)\n",
"Downloading 2024-12-17_115_Crow1_DLC_3D.csv... (5/11)\n",
"Downloading 2024-12-17_115_Crow1-cam-1_s3d.npy... (5/11)\n",
"Downloading 2024-12-17_115_Crow1-cam-1_s3d.npy... (6/11)\n",
"Downloading 2024-12-18_041_Crow1-cam-1.mp4... (6/11)\n",
"Downloading 2024-12-18_041_Crow1-cam-1.mp4... (7/11)\n",
"Downloading 2024-12-18_041_Crow1-cam-1DLC.csv... (7/11)\n",
"Downloading 2024-12-18_041_Crow1-cam-1DLC.csv... (8/11)\n",
"Downloading 2024-12-18_041_Crow1-cam-2DLC.csv... (8/11)\n",
"Downloading 2024-12-18_041_Crow1-cam-2DLC.csv... (9/11)\n",
"Downloading 2024-12-18_041_Crow1_DLC_3D.csv... (9/11)\n",
"Downloading 2024-12-18_041_Crow1_DLC_3D.csv... (10/11)\n",
"Downloading 2024-12-18_041_Crow1-cam-1_s3d.npy... (10/11)\n",
" 2024-12-18_041_Crow1-cam-1_s3d.npy (11/11)\n",
" mapping: d:\\Akseli\\Code\\ethograph\\data\\Moll2025\\.ethograph\\mapping.txt\n",
"\n",
"data_folder: d:\\Akseli\\Code\\ethograph\\data\\Moll2025\n"
]
}
],
"source": [
"from pathlib import Path\n",
"from ethograph.utils.download import download_example_dataset\n",
"\n",
"try:\n",
" _here = Path(__vsc_ipynb_file__).parent # VS Code\n",
"except NameError:\n",
" _here = Path().resolve() # Jupyter Lab / Notebook (CWD = notebook dir)\n",
"\n",
"data_folder = _here.parent / \"data\" / \"Moll2025\"\n",
"download_example_dataset(\"moll2025\", data_folder)\n",
"\n",
"print(f\"\\ndata_folder: {data_folder}\")"
]
},
{
"cell_type": "markdown",
"id": "5d1df33b",
"metadata": {},
"source": [
"### Build NWB alignment"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "build_alignment",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" trial video_cam-1 pose_2d pose_3d\n",
"0 41 d:\\Akseli\\Code\\ethograph\\data\\Moll2025\\2024-12-18_041_Crow1-cam-1.mp4 d:\\Akseli\\Code\\ethograph\\data\\Moll2025\\2024-12-18_041_Crow1-cam-1DLC.csv d:\\Akseli\\Code\\ethograph\\data\\Moll2025\\2024-12-18_041_Crow1_DLC_3D.csv\n",
"1 115 d:\\Akseli\\Code\\ethograph\\data\\Moll2025\\2024-12-17_115_Crow1-cam-1.mp4 d:\\Akseli\\Code\\ethograph\\data\\Moll2025\\2024-12-17_115_Crow1-cam-1DLC.csv d:\\Akseli\\Code\\ethograph\\data\\Moll2025\\2024-12-17_115_Crow1_DLC_3D.csv\n"
]
},
{
"data": {
"text/plain": [
"root pynwb.file.NWBFile at 0x2692928229392\n",
"Fields:\n",
" acquisition: {\n",
" pose_2d ,\n",
" video_cam-1 \n",
" }\n",
" devices: {\n",
" 2d ,\n",
" cam-1 \n",
" }\n",
" file_create_date: [datetime.datetime(2026, 4, 14, 18, 35, 23, 440187, tzinfo=tzlocal())]\n",
" identifier: dfeda7ff-30d5-4fdb-97ec-cd2b080a4f81\n",
" session_description: NWB file for media alignment (ethograph generated).\n",
" session_start_time: 2026-04-14 18:35:23.440187+02:00\n",
" timestamps_reference_time: 2026-04-14 18:35:23.440187+02:00\n",
" trials: trials "
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import re\n",
"import pandas as pd\n",
"import natsort\n",
"from pathlib import Path\n",
"from ethograph.io.nwb_alignment import align_media_per_trial\n",
"\n",
"try:\n",
" _here = Path(__vsc_ipynb_file__).parent\n",
"except NameError:\n",
" _here = Path().resolve()\n",
"\n",
"data_folder = _here.parent / \"data\" / \"Moll2025\"\n",
"fps = 200\n",
"\n",
"# ─── 1. Discover media files ───\n",
"# Filename examples:\n",
"# 2024-12-17_115_Crow1-cam-1.mp4\n",
"# 2024-12-17_115_Crow1-cam-1DLC.csv\n",
"# 2024-12-17_115_Crow1_DLC_3D.csv\n",
"video_pattern = re.compile(r\"_(?P<trial>\\d+)_Crow1-cam-1\\.mp4$\")\n",
"pose_2d_pattern = re.compile(r\"_(?P<trial>\\d+)_Crow1-cam-1DLC\\.csv$\")\n",
"pose_3d_pattern = re.compile(r\"_(?P<trial>\\d+)_Crow1_DLC_3D\\.csv$\")\n",
"\n",
"# cam-1 video — for visualization in the GUI\n",
"video_by_trial: dict[int, Path] = {}\n",
"for f in natsort.natsorted(data_folder.glob(\"*-cam-1.mp4\")):\n",
" m = video_pattern.search(f.name)\n",
" if m:\n",
" video_by_trial[int(m[\"trial\"])] = f\n",
"\n",
"# 2D DLC cam-1 — for GUI pose overlay\n",
"pose_2d_by_trial: dict[int, Path] = {}\n",
"for f in natsort.natsorted(data_folder.glob(\"*-cam-1DLC.csv\")):\n",
" m = pose_2d_pattern.search(f.name)\n",
" if m:\n",
" pose_2d_by_trial[int(m[\"trial\"])] = f\n",
"\n",
"# 3D DLC — used as pose source for kinematics\n",
"pose_3d_by_trial: dict[int, Path] = {}\n",
"for f in natsort.natsorted(data_folder.glob(\"*_DLC_3D.csv\")):\n",
" m = pose_3d_pattern.search(f.name)\n",
" if m:\n",
" pose_3d_by_trial[int(m[\"trial\"])] = f\n",
"\n",
"# ─── 2. Build session table ───\n",
"_all_trials = sorted(set(video_by_trial) | set(pose_2d_by_trial) | set(pose_3d_by_trial))\n",
"\n",
"session_table = pd.DataFrame({\n",
" \"trial\": _all_trials,\n",
" \"video_cam-1\": [str(video_by_trial.get(t) or \"\") for t in _all_trials],\n",
" \"pose_2d\": [str(pose_2d_by_trial.get(t) or \"\") for t in _all_trials],\n",
" \"pose_3d\": [str(pose_3d_by_trial.get(t) or \"\") for t in _all_trials], # Only needed for kinematics, not marker overlay.\n",
"})\n",
"session_table = session_table.loc[:, (session_table != \"\").any()]\n",
"\n",
"print(session_table.to_string())\n",
"\n",
"session_table_filt = session_table[[\"trial\", \"video_cam-1\", \"pose_2d\"]]\n",
"\n",
"# ─── 3. Build NWB alignment ───\n",
"nwb_path = data_folder / \".ethograph\" / \"alignment.nwb\"\n",
"align_media_per_trial(\n",
" trial_table=session_table_filt,\n",
" stream_rates={\"video\": float(fps), \"pose\": float(fps)},\n",
" output_path=nwb_path,\n",
" pose_fps=float(fps),\n",
")"
]
},
{
"cell_type": "markdown",
"id": "create_ds_header",
"metadata": {},
"source": [
"### Create dataset"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "create_ds",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Saved to d:\\Akseli\\Code\\ethograph\\data\\Moll2025\\Trial_data.nc\n"
]
}
],
"source": [
"import warnings\n",
"import numpy as np\n",
"import xarray as xr\n",
"\n",
"from movement.io import load_poses\n",
"from movement.kinematics import compute_pairwise_distances, compute_velocity, compute_acceleration\n",
"from movement.utils.vector import compute_norm\n",
"\n",
"import ethograph as eto\n",
"from ethograph.features.movement import compute_distance_to_constant, Position3DCalibration\n",
"from ethograph.features.changepoints import find_troughs_binary, find_nearest_turning_points_binary\n",
"from ethograph.features.preprocessing import gaussian_smoothing\n",
"\n",
"warnings.filterwarnings(\n",
" \"ignore\",\n",
" message=\"Confidence array was not provided.Setting to an array of NaNs\",\n",
" module=\"movement.validators.datasets\",\n",
")\n",
"\n",
"clip_distance = 50 # exclude unrealistic distances (> 50 cm)\n",
"smoothing_params = {\"sigma\": 1.5, \"axis\": 0, \"mode\": \"constant\", \"cval\": np.nan}\n",
"\n",
"# Subset of S3D video features with high Cohen's D for a label (Crow 1)\n",
"good_s3d_feats = [326, 327, 292, 363, 219, 192, 260, 66, 332, 199,\n",
" 288, 763, 837, 182, 24, 218, 213, 21, 733, 242]\n",
"\n",
"# Stationary reference location (x, y, z) for the disp_* distance features\n",
"disp_xyz = [-10.23, -5.907, -1.395]\n",
"\n",
"ds_list = []\n",
"for _, row in session_table.iterrows():\n",
" trial = row[\"trial\"]\n",
" dlc_3d_path = row[\"pose_3d\"]\n",
"\n",
" ds = load_poses.from_dlc_file(dlc_3d_path, fps=fps)\n",
" ds = ds.assign_coords(individuals=[\"Crow1\"])\n",
" ds.attrs[\"trial\"] = trial\n",
"\n",
" # 3D calibration\n",
" calibration = Position3DCalibration()\n",
" ds = calibration.transform(ds)\n",
"\n",
" # Kinematics\n",
" ds[\"position\"] = gaussian_smoothing(ds.position, **smoothing_params)\n",
" ds[\"velocity\"] = compute_velocity(\n",
" ds.position.sel(keypoints=[\"stickTip\", \"beakTip\"])\n",
" ).clip(min=-150, max=150)\n",
" ds[\"speed\"] = compute_norm(ds.velocity.sel(keypoints=[\"stickTip\", \"beakTip\"]))\n",
" smooth_2x = {**smoothing_params, \"sigma\": smoothing_params[\"sigma\"] * 2}\n",
" ds[\"acceleration\"] = compute_acceleration(\n",
" gaussian_smoothing(ds.position, **smooth_2x).sel(keypoints=[\"stickTip\", \"beakTip\"])\n",
" ).clip(min=-1500, max=1500)\n",
"\n",
" # Distance features\n",
" ds[\"pellet_beakTip_dist\"] = compute_pairwise_distances(\n",
" ds.position, \"keypoints\", {\"pellet\": \"beakTip\"}\n",
" ).clip(0, clip_distance)\n",
" ds[\"pellet_stickTip_dist\"] = compute_pairwise_distances(\n",
" ds.position, \"keypoints\", {\"pellet\": \"stickTip\"}\n",
" ).clip(0, clip_distance)\n",
" ds[\"disp_beakTip_dist\"] = compute_distance_to_constant(\n",
" ds.position, reference_point=disp_xyz, keypoint=\"beakTip\"\n",
" ).clip(0, clip_distance)\n",
" ds[\"disp_stickTip_dist\"] = compute_distance_to_constant(\n",
" ds.position, reference_point=disp_xyz, keypoint=\"stickTip\"\n",
" ).clip(0, clip_distance)\n",
"\n",
" # Keep subset of keypoints\n",
" ds = ds.sel(keypoints=[\"beakTip\", \"stickTip\", \"pellet\"])\n",
"\n",
" # Video features (S3D)\n",
" s3d_path = data_folder / (Path(dlc_3d_path).name.replace(\"_DLC_3D.csv\", \"-cam-1_s3d.npy\"))\n",
" if s3d_path.exists():\n",
" s3d_data = np.load(s3d_path)\n",
" ds[\"s3d\"] = ((\"time\", \"s3d_dims\"), s3d_data[:, good_s3d_feats])\n",
"\n",
" # Changepoints\n",
" ds = eto.add_changepoints_to_ds(\n",
" ds=ds, target_feature=\"speed\", changepoint_name=\"troughs\",\n",
" changepoint_func=find_troughs_binary, prominence=0.5, distance=2,\n",
" )\n",
" ds = eto.add_changepoints_to_ds(\n",
" ds=ds, target_feature=\"speed\", changepoint_name=\"turning_points\",\n",
" changepoint_func=find_nearest_turning_points_binary,\n",
" threshold=1.0, max_value=50, prominence=5, width=2,\n",
" )\n",
"\n",
" # Colour for line plots\n",
" ds = eto.add_angle_rgb_to_ds(ds, smoothing_params=smoothing_params)\n",
"\n",
" # Trial metadata\n",
" if int(trial) == 41:\n",
" ds.attrs[\"pellet_position\"] = \"right\"\n",
" elif int(trial) == 115:\n",
" ds.attrs[\"pellet_position\"] = \"left\"\n",
"\n",
" ds_list.append(ds)\n",
"\n",
"dt = eto.from_datasets(ds_list)\n",
"dt.save(data_folder / \"Trial_data.nc\")\n",
"print(f\"Saved to {data_folder / 'Trial_data.nc'}\")"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "ethograph",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}