Audio changepoints

Audio changepoint detection applies to audio data or other high-sample-rate periodic signals, loaded from a .wav file.

Four methods are available, drawn from two libraries.

VocalPy methods

Mean-squared energy (meansquared): Computes a smoothed energy envelope via mean-squared amplitude, then thresholds to find vocal segments. Simple and fast.

AVA (ava): The segmentation method from the Autoencoded Vocal Analysis (AVA) pipeline. Uses a spectrogram-based approach with multiple threshold levels.
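The idea behind the mean-squared energy method can be sketched in a few lines of NumPy. This is an illustration of the general technique only, not VocalPy's implementation: the function name `meansquared_segments` and the parameters `window_s` and `threshold` are hypothetical, and the real method exposes its own parameters through the Configure… dialog.

```python
import numpy as np


def meansquared_segments(audio, sample_rate, window_s=0.002, threshold=0.01):
    """Return (onsets_s, offsets_s) where the smoothed energy exceeds threshold.

    Hypothetical helper for illustration; not part of VocalPy.
    """
    # Square the signal and smooth it with a moving-average window.
    window = max(1, int(round(window_s * sample_rate)))
    kernel = np.ones(window) / window
    energy = np.convolve(np.asarray(audio, dtype=float) ** 2, kernel, mode="same")

    # Mark samples above the threshold, then find rising and falling edges.
    above = energy > threshold
    padded = np.concatenate(([False], above, [False]))
    edges = np.diff(padded.astype(int))
    onsets = np.flatnonzero(edges == 1)
    offsets = np.flatnonzero(edges == -1)

    # Convert sample indices to seconds.
    return onsets / sample_rate, offsets / sample_rate


# Synthetic test signal: 1 s of silence with a 440 Hz tone from 0.25 s to 0.5 s.
sr = 8000
t = np.arange(sr) / sr
signal = np.zeros(sr)
signal[2000:4000] = np.sin(2 * np.pi * 440 * t[2000:4000])

onsets, offsets = meansquared_segments(signal, sr)
```

On the synthetic signal, the detected onset and offset fall within a few milliseconds of the true tone boundaries; the smoothing window sets how tightly the edges are localized.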

VocalSeg methods

Reference: VocalSeg (Sainburg et al., 2020)

Dynamic thresholding (vocalseg): Adaptive threshold segmentation that adjusts to local spectral energy. Good for signals with varying background noise.

Continuity filtering (continuity): Extends dynamic thresholding with temporal continuity constraints to merge fragmented detections.
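To make the notion of merging fragmented detections concrete, here is a minimal sketch that joins segments separated by a short gap. It is a generic gap-merging illustration under assumed names (`merge_close_segments`, `max_gap_s`), not VocalSeg's continuity filter, which may use different criteria.

```python
def merge_close_segments(onsets_s, offsets_s, max_gap_s=0.01):
    """Merge (onset, offset) pairs whose gap is at most max_gap_s seconds.

    Hypothetical helper for illustration; not part of VocalSeg.
    """
    merged = []
    for onset, offset in sorted(zip(onsets_s, offsets_s)):
        if merged and onset - merged[-1][1] <= max_gap_s:
            # Gap is short enough: extend the previous segment.
            merged[-1][1] = max(merged[-1][1], offset)
        else:
            # Gap is too long (or this is the first segment): start a new one.
            merged.append([onset, offset])
    return [tuple(segment) for segment in merged]


# The first two detections are 5 ms apart and get merged; the third stays separate.
segments = merge_close_segments([0.10, 0.205, 0.50], [0.20, 0.30, 0.60])
```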

Usage

1. Open the Audio CPs panel.
2. Choose a method and click Configure… to adjust parameters.
3. Click Detect.

Detected onsets and offsets are drawn as vertical lines on the plot and stored in the dataset as audio_cp_onsets / audio_cp_offsets.

Data format

Note

The audio changepoint format is still in development and subject to change.

Audio changepoints are not stored as dense binary arrays, which would be prohibitively large at audio sample rates (e.g. 44.1 kHz). Instead, they are stored as onset/offset time pairs in seconds:

ds["audio_cp_onsets"]  = xr.DataArray(
    onset_times_s,
    dims=["audio_cp"],
    attrs={"type": "audio_changepoints", "target_feature": "audio"},
)
ds["audio_cp_offsets"] = xr.DataArray(
    offset_times_s,
    dims=["audio_cp"],
    attrs={"type": "audio_changepoints", "target_feature": "audio"},
)
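A rough back-of-the-envelope comparison shows why the onset/offset representation is so much smaller. The recording length and segment count below are made-up values for illustration, and the byte sizes assume one byte per boolean sample and float64 times.

```python
sample_rate_hz = 44_100   # CD-quality audio
duration_s = 600          # hypothetical 10-minute recording
n_segments = 500          # hypothetical number of detected segments

# Dense representation: one boolean (1 byte) per audio sample.
dense_bytes = sample_rate_hz * duration_s * 1

# Sparse representation: one float64 onset and one float64 offset per segment.
pairs_bytes = n_segments * 2 * 8

ratio = dense_bytes / pairs_bytes
print(f"dense: {dense_bytes / 1e6:.2f} MB, pairs: {pairs_bytes / 1e3:.1f} kB")
```

Under these assumptions the dense array takes about 26 MB while the time pairs take 8 kB, a difference of more than three orders of magnitude.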

References

Nicholson, D. (2023). vocalpy/vocalpy: 0.2.0. Zenodo. https://doi.org/10.5281/zenodo.7905426
Nicholson, D., & Cohen, Y. (2023). vak: A neural network framework for researchers studying animal acoustic communication. SciPy 2023. https://doi.org/10.25080/gerudo-f2bc6f59-008
Sainburg, T., Thielk, M., & Gentner, T. Q. (2020). Finding, visualizing, and quantifying latent structure across diverse animal vocal repertoires. PLOS Computational Biology, 16(10), e1008228. https://doi.org/10.1371/journal.pcbi.1008228