Audio changepoints
Audio changepoint detection is intended for audio data or high-sample-rate periodic signals loaded from a .wav file.
Four methods are available, drawn from two libraries.
VocalPy methods
Reference: VocalPy documentation
Mean-squared energy (meansquared): Computes a smoothed energy envelope
via mean-squared amplitude, then thresholds to find vocal segments. Simple
and fast.
AVA (ava): The segmentation method from the Animal Vocalization
Analysis pipeline. Uses a spectrogram-based approach with multiple threshold
levels.
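The mean-squared energy idea described above can be sketched in plain NumPy. This is a minimal illustration of the general technique, not VocalPy's implementation: the function name `meansquared_segments`, the moving-average smoothing, and the `threshold` and `window_s` parameters are all assumptions made for the example.

```python
import numpy as np


def meansquared_segments(samples, sample_rate, threshold, window_s=0.02):
    """Return (onsets_s, offsets_s) where the smoothed energy exceeds threshold.

    Hypothetical helper for illustration; not part of VocalPy.
    """
    samples = np.asarray(samples, dtype=float)

    # Smooth the squared amplitude with a moving-average window.
    window = max(1, int(round(window_s * sample_rate)))
    kernel = np.ones(window) / window
    envelope = np.convolve(samples ** 2, kernel, mode="same")

    # Threshold the envelope, then locate rising and falling edges.
    above = envelope > threshold
    padded = np.concatenate(([False], above, [False]))
    edges = np.diff(padded.astype(int))
    onset_idx = np.flatnonzero(edges == 1)
    offset_idx = np.flatnonzero(edges == -1)
    return onset_idx / sample_rate, offset_idx / sample_rate


# Synthetic test signal: silence with a 100 Hz tone burst from 0.3 s to 0.6 s.
sample_rate = 1000
t = np.arange(sample_rate) / sample_rate
signal = np.where((t >= 0.3) & (t < 0.6), np.sin(2 * np.pi * 100 * t), 0.0)

onsets, offsets = meansquared_segments(signal, sample_rate, threshold=0.25)
```

On this synthetic signal the sketch finds a single segment whose boundaries fall within about half a smoothing window of the true burst edges. Real recordings usually also need minimum-duration and minimum-silence constraints, which are omitted here.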
VocalSeg methods
Reference: VocalSeg (Sainburg et al., 2020)
Dynamic thresholding (vocalseg): Adaptive threshold segmentation that
adjusts to local spectral energy. Good for signals with varying background
noise.
Continuity filtering (continuity): Extends dynamic thresholding with
temporal continuity constraints to merge fragmented detections.
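To make "merging fragmented detections" concrete, the sketch below joins consecutive segments whose silent gap is shorter than a minimum duration. It is a generic gap-merging step under stated assumptions, not VocalSeg's continuity filter: the function name `merge_close_segments` and the `min_gap_s` parameter are hypothetical.

```python
def merge_close_segments(onsets_s, offsets_s, min_gap_s=0.01):
    """Merge consecutive segments separated by a gap shorter than min_gap_s.

    Hypothetical helper for illustration; not part of VocalSeg.
    """
    merged = []
    for onset, offset in sorted(zip(onsets_s, offsets_s)):
        if merged and onset - merged[-1][1] < min_gap_s:
            # Gap too short to count as silence: extend the previous segment.
            merged[-1][1] = max(merged[-1][1], offset)
        else:
            merged.append([onset, offset])
    return [seg[0] for seg in merged], [seg[1] for seg in merged]


# The first two detections are only 5 ms apart, so they are joined.
merged_onsets, merged_offsets = merge_close_segments(
    [0.10, 0.205, 0.50],
    [0.20, 0.30, 0.60],
    min_gap_s=0.01,
)
```

Here three raw detections collapse into two segments, (0.10, 0.30) and (0.50, 0.60), because the 5 ms gap is below the 10 ms minimum.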
Usage
Open the Audio CPs panel.
Choose a method and click Configure… to adjust parameters.
Click Detect.
Detected onsets and offsets are drawn as vertical lines on the plot and
stored in the dataset as audio_cp_onsets / audio_cp_offsets.
Data format
Note
The audio changepoint format is still in development and subject to change.
Audio changepoints use a different storage format because dense binary arrays at audio sample rates (e.g. 44.1 kHz) would be prohibitively large. Instead, they are stored as onset/offset time pairs in seconds:
import xarray as xr

# ds is an existing xr.Dataset; onset_times_s and offset_times_s are
# equal-length 1-D arrays of times in seconds.
ds["audio_cp_onsets"] = xr.DataArray(
    onset_times_s,
    dims=["audio_cp"],
    attrs={"type": "audio_changepoints", "target_feature": "audio"},
)
ds["audio_cp_offsets"] = xr.DataArray(
    offset_times_s,
    dims=["audio_cp"],
    attrs={"type": "audio_changepoints", "target_feature": "audio"},
)
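The size argument can be checked with a small NumPy sketch that builds the dense per-sample mask for a few segments and compares it with the onset/offset pairs. The recording length, sample rate, and segment times below are made-up values for illustration only.

```python
import numpy as np

sample_rate = 44_100   # Hz, assumed for illustration
duration_s = 600       # a hypothetical 10-minute recording

onsets_s = np.array([1.25, 30.0, 412.5])
offsets_s = np.array([2.00, 31.5, 413.0])

# Dense representation: one boolean per audio sample.
mask = np.zeros(sample_rate * duration_s, dtype=bool)
for on, off in zip(onsets_s, offsets_s):
    mask[int(round(on * sample_rate)):int(round(off * sample_rate))] = True

dense_bytes = mask.nbytes                           # 26,460,000 bytes
pair_bytes = onsets_s.nbytes + offsets_s.nbytes     # 48 bytes

# The pairs can be recovered from the mask by locating its edges.
edges = np.diff(np.concatenate(([False], mask, [False])).astype(np.int8))
recovered_onsets_s = np.flatnonzero(edges == 1) / sample_rate
recovered_offsets_s = np.flatnonzero(edges == -1) / sample_rate
```

For this example the dense mask takes roughly 26 MB while the pairs take 48 bytes, and both carry the same segment boundaries up to one-sample resolution.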
References
Nicholson, D. (2023). vocalpy/vocalpy: 0.2.0. Zenodo. https://doi.org/10.5281/zenodo.7905426
Nicholson, D., & Cohen, Y. (2023). vak: A neural network framework for researchers studying animal acoustic communication. SciPy 2023. https://doi.org/10.25080/gerudo-f2bc6f59-008
Sainburg, T., Thielk, M., & Gentner, T. Q. (2020). Finding, visualizing, and quantifying latent structure across diverse animal vocal repertoires. PLOS Computational Biology, 16(10), e1008228. https://doi.org/10.1371/journal.pcbi.1008228