(target-audio-changepoints)=
# Audio changepoints

For audio data or high-sample-rate periodic signals (loaded from a `.wav` file). Four methods are available, drawn from two libraries.

---

## VocalPy methods

Reference: [VocalPy documentation](https://vocalpy.readthedocs.io/)

**Mean-squared energy** (`meansquared`): Computes a smoothed energy envelope from the mean-squared amplitude, then thresholds it to find vocal segments. Simple and fast.

**AVA** (`ava`): The segmentation method from the Autoencoded Vocal Analysis pipeline. Uses a spectrogram-based approach with multiple threshold levels.

---

## VocalSeg methods

Reference: [VocalSeg (Sainburg et al., 2020)](https://github.com/timsainb/vocalization-segmentation)

**Dynamic thresholding** (`vocalseg`): Adaptive threshold segmentation that adjusts to local spectral energy. Good for signals with varying background noise.

**Continuity filtering** (`continuity`): Extends dynamic thresholding with temporal continuity constraints to merge fragmented detections.

---

## Usage

1. Open the **Audio CPs** panel.
2. Choose a method and click **Configure...** to adjust parameters.
3. Click **Detect**.

Detected onsets and offsets are drawn as vertical lines on the plot and stored in the dataset as `audio_cp_onsets` / `audio_cp_offsets`.

---

## Data format

```{note}
The audio changepoint format is still in development and subject to change.
```

Audio changepoints use a different storage format because dense binary arrays at audio sample rates (e.g. 44.1 kHz) would be prohibitively large. They are stored as onset/offset time pairs in seconds:

```python
import xarray as xr

# `ds` is the existing xarray.Dataset; `onset_times_s` and `offset_times_s`
# are 1-D arrays of equal length holding times in seconds.
ds["audio_cp_onsets"] = xr.DataArray(
    onset_times_s,
    dims=["audio_cp"],
    attrs={"type": "audio_changepoints", "target_feature": "audio"},
)
ds["audio_cp_offsets"] = xr.DataArray(
    offset_times_s,
    dims=["audio_cp"],
    attrs={"type": "audio_changepoints", "target_feature": "audio"},
)
```

---

## References

- Nicholson, D. (2023). vocalpy/vocalpy: 0.2.0. Zenodo.
- Nicholson, D., & Cohen, Y. (2023). vak: A neural network framework for researchers studying animal acoustic communication. SciPy 2023.
- Sainburg, T., Thielk, M., & Gentner, T. Q. (2020). Finding, visualizing, and quantifying latent structure across diverse animal vocal repertoires. PLOS Computational Biology, 16(10), e1008228.
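
---

## Example: from an energy threshold to onset/offset pairs

As a rough illustration of how a thresholding method in the spirit of `meansquared` can yield the onset/offset pairs described under Data format, the sketch below smooths the squared amplitude with a moving average and reports where it crosses a fixed threshold. It uses only NumPy and is not VocalPy's implementation; the function name `meansquared_segments` and the parameters `threshold` and `smooth_ms` are hypothetical, chosen for this example.

```python
import numpy as np


def meansquared_segments(audio, sample_rate, threshold, smooth_ms=5.0):
    """Return (onsets_s, offsets_s) where smoothed squared amplitude > threshold.

    Illustrative only: not VocalPy's implementation; names are hypothetical.
    """
    audio = np.asarray(audio, dtype=float)
    # Smooth the squared signal with a moving-average window.
    win = max(1, int(round(sample_rate * smooth_ms / 1000.0)))
    energy = np.convolve(audio ** 2, np.ones(win) / win, mode="same")

    above = energy > threshold
    # Pad with False so segments touching either end are still closed.
    padded = np.concatenate(([False], above, [False]))
    edges = np.diff(padded.astype(np.int8))
    onset_idx = np.flatnonzero(edges == 1)    # first sample above threshold
    offset_idx = np.flatnonzero(edges == -1)  # first sample back below it
    # Convert sample indices to seconds, the unit used by the stored format.
    return onset_idx / sample_rate, offset_idx / sample_rate


# Synthetic one-second signal: silence, a 0.2 s tone burst, silence.
sr = 8000
t = np.arange(sr) / sr
signal = np.where((t >= 0.4) & (t < 0.6), np.sin(2 * np.pi * 440 * t), 0.0)

onsets_s, offsets_s = meansquared_segments(signal, sr, threshold=0.1)
```

The two returned arrays have equal length and hold times in seconds, so they have the same shape as the values stored under `audio_cp_onsets` and `audio_cp_offsets`. Storing one pair per segment rather than one flag per sample is what keeps the format compact at audio sample rates.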