Run Local HawkEars Inference

This guide walks through running badc infer run --use-hawkears on the dev workstation (Quadro RTX 4000 GPUs). It supplements the CLI reference with a configuration schema, environment prep, and end-to-end command snippets so you can test the entire chunk → infer → aggregate cycle locally.

Prerequisites

  • git clone https://github.com/UBC-FRESH/badc.git && cd badc

  • python -m venv .venv && source .venv/bin/activate

  • pip install -e .[dev]

  • git submodule update --init --recursive

  • badc data connect bogus --pull

  • datalad get -r data/datalad/bogus (optional; downloads audio/chunks up-front)

  • Confirm GPU visibility: badc gpus
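
If you want a second opinion on GPU visibility outside badc, a minimal PyTorch probe works too (this sketch assumes torch is importable in the active venv, e.g. as a HawkEars dependency):

$ python - <<'PY'
# Minimal GPU visibility probe; assumes torch is installed in the active venv.
import torch

print("CUDA available:", torch.cuda.is_available())
for i in range(torch.cuda.device_count()):
    print(f"GPU {i}: {torch.cuda.get_device_name(i)}")
PY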

HawkEars configuration schema

All runtime knobs live behind badc infer run (Typer) options. For repeatability we recommend capturing them in a simple TOML/YAML file (example below), even though the CLI currently expects flags.

badc infer run key options:

  • --use-hawkears (default: False): switch from the stub runner to vendor/HawkEars/analyze.py.

  • --max-gpus (default: auto-detect): cap the number of GPUs used (one worker per GPU).

  • --cpu-workers (default: 0): additional CPU threads to run alongside GPUs (at least one CPU worker is added automatically when no GPUs are detected).

  • --hawkears-arg (default: []): extra args forwarded verbatim to analyze.py (repeat per flag, e.g. --hawkears-arg --min_score).

  • --runner-cmd (default: None): custom command executed per chunk (mutually exclusive with --use-hawkears).

  • --output-dir (default: artifacts/infer): output root for JSON/CSV; relocates automatically when the manifest lives inside a DataLad dataset.

  • --telemetry-log (default: auto-generated): JSONL log path; defaults to data/telemetry/infer/<manifest>_<timestamp>.jsonl or <dataset>/artifacts/telemetry/… inside DataLad datasets.

  • --max-retries (default: 2): per-chunk retry budget.

  • --print-datalad-run (default: False): print a ready-to-use datalad run command (no jobs run).

Sample configuration (configs/hawkears-local.toml):

[runner]
manifest = "data/datalad/bogus/manifests/XXXX-000_20251001_093000.csv"
use_hawkears = true
max_gpus = 1
cpu_workers = 1
output_dir = "data/datalad/bogus/artifacts/infer"
telemetry_log = "data/datalad/bogus/artifacts/telemetry/XXXX-000_20251001_093000_local.jsonl"

[hawkears]
extra_args = ["--min_score", "0.75", "--band", "1"]

[paths]
dataset_root = "data/datalad/bogus"

Invoke badc infer run by interpolating the config values manually, or use the helper command:

$ badc infer run-config configs/hawkears-local.toml

Each key mirrors the surface documented in notes/pipeline-plan.md so ops notes and user docs stay aligned. If you need further customization (e.g., injecting environment-specific defaults), use a short Python helper like the launcher shown in the next section.

Config-file driven runs

configs/hawkears-local.toml ships with the repo so you can reuse the same schema across environments. A tiny launcher script (or notebook cell) can read the file, translate it to CLI flags, and execute badc infer run when you need to patch values on the fly:

$ python - <<'PY'
# Read the TOML config and translate it into badc infer run CLI flags.
# Note: tomllib is in the standard library from Python 3.11 onward.
import tomllib, shlex, subprocess
from pathlib import Path

config = tomllib.loads(Path("configs/hawkears-local.toml").read_text())
runner = config["runner"]
hawkears_cfg = config.get("hawkears", {})

cmd = [
    "badc",
    "infer",
    "run",
    runner["manifest"],
    "--output-dir",
    runner["output_dir"],
    "--max-gpus",
    str(runner["max_gpus"]),
    "--cpu-workers",
    str(runner.get("cpu_workers", 0)),
]
if runner.get("use_hawkears", False):
    cmd.append("--use-hawkears")
for arg in hawkears_cfg.get("extra_args", []):
    cmd.extend(["--hawkears-arg", arg])
if telemetry := runner.get("telemetry_log"):
    cmd.extend(["--telemetry-log", telemetry])

print("Running:", " ".join(shlex.quote(part) for part in cmd))
subprocess.run(cmd, check=True)
PY

Tips:

  • Keep dataset-relative paths (data/datalad/…) so datalad run captures provenance.

  • extra_args map 1:1 to HawkEars’ analyze.py arguments (e.g., --min_score).

  • Store per-host overrides (GPU count, telemetry path) in separate TOML files, then pass the right file to the snippet above.
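
A per-host override can be merged onto the base config before launching. The sketch below is illustrative: the override file name is hypothetical, and only the [runner]/[hawkears]/[paths] tables from the sample schema are assumed.

$ python - <<'PY'
# Merge a hypothetical per-host override onto the base config. Section names
# follow the sample schema above; the override file name is illustrative.
import tomllib
from pathlib import Path

base = tomllib.loads(Path("configs/hawkears-local.toml").read_text())
override = tomllib.loads(Path("configs/hawkears-devbox.toml").read_text())  # hypothetical per-host file
for section, values in override.items():
    base.setdefault(section, {}).update(values)
print(base["runner"])
PY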

Step 1 — Chunk selection

Either reuse existing manifests (e.g., data/datalad/bogus/manifests/XXXX-000_20251001_093000.csv) or generate new ones:

$ badc chunk probe data/datalad/bogus/audio/XXXX-000_20251001_093000.wav \
    --initial-duration 60 --max-duration 600 --tolerance 5
$ badc chunk manifest data/datalad/bogus/audio/XXXX-000_20251001_093000.wav \
    --chunk-duration 60 --hash-chunks \
    --output data/datalad/bogus/manifests/XXXX-000_20251001_093000.csv
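
To sanity-check a manifest before launching, a short script can count its rows; this sketch assumes only that the CSV has a header row, and lists the columns rather than presuming a schema:

$ python - <<'PY'
# Peek at a chunk manifest; column names are pipeline-specific, so list them.
import csv
from pathlib import Path

manifest = Path("data/datalad/bogus/manifests/XXXX-000_20251001_093000.csv")
with manifest.open(newline="") as fh:
    rows = list(csv.DictReader(fh))
print(f"{len(rows)} chunks in manifest")
if rows:
    print("columns:", list(rows[0].keys()))
PY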

Each chunk run writes artifacts/chunks/<recording>/.chunk_status.json. Confirm the file exists and reports status="completed" before moving on; badc infer orchestrate enforces this by default so Sockeye scripts and local --apply runs never launch inference against partially chunked recordings. Use --allow-partial-chunks only when you intentionally want to process an incomplete manifest (e.g., debugging a failed chunk).
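
A quick pre-flight check along those lines, assuming the status file is JSON with a top-level status key as described above:

$ python - <<'PY'
# Pre-flight check: refuse to proceed unless chunking reported "completed".
# Assumes .chunk_status.json is JSON with a top-level "status" key, per the
# description above.
import json
from pathlib import Path

status_path = Path("artifacts/chunks/XXXX-000_20251001_093000/.chunk_status.json")
status = json.loads(status_path.read_text())
if status.get("status") != "completed":
    raise SystemExit(f"Chunking incomplete: {status}")
print("Chunking completed; safe to run inference.")
PY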

Step 2 — Run HawkEars locally

$ badc infer run data/datalad/bogus/manifests/XXXX-000_20251001_093000.csv \
    --use-hawkears \
    --max-gpus 1 \
    --hawkears-arg --min_score \
    --hawkears-arg 0.75

Expected outputs:

  • JSON detections under data/datalad/bogus/artifacts/infer/<recording>/

  • Telemetry log inside data/datalad/bogus/artifacts/telemetry/…

  • Console summary listing telemetry path + job counts, plus a per-worker success/failure/retry table so CPU vs. GPU bottlenecks and flaky chunks stand out. Each run now writes:

    • <telemetry>.summary.json — chunk-level metadata (status, retries, last error/backoff) for resumable runs.

    • <telemetry>.summary.json.workers.csv — per-worker (GPU/CPU) success/failure/retry counts for quick archival alongside SLURM logs.

    The CLI warns when any chunk retried or failed, echoing the chunk ID/worker/last error in the console. After an interruption, re-run the command with --resume-summary <telemetry.summary.json> (the CLI prints the exact path) to skip chunks that already finished successfully.
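
    For example, resuming an interrupted run (substitute the exact summary path the CLI printed):

    $ badc infer run data/datalad/bogus/manifests/XXXX-000_20251001_093000.csv \
        --use-hawkears \
        --max-gpus 1 \
        --resume-summary data/datalad/bogus/artifacts/telemetry/<telemetry>.summary.json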

Step 3 — Monitor telemetry

$ badc infer monitor --log data/datalad/bogus/artifacts/telemetry/XXXX-000_20251001_093000_local.jsonl --follow

The monitor shows per-GPU event counts, retry/failure-attempt totals, utilization stats, and rolling VRAM/retry trends. For quick inspection, use badc telemetry --log to print just the latest entries.
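
If you prefer raw access, the log is plain JSONL (one JSON object per line), so a few lines of Python can tail it directly; a minimal sketch:

$ python - <<'PY'
# Tail the telemetry log without the monitor UI; assumes JSONL format
# (one JSON object per line), as documented above.
import json
from pathlib import Path

log = Path("data/datalad/bogus/artifacts/telemetry/XXXX-000_20251001_093000_local.jsonl")
events = [json.loads(line) for line in log.read_text().splitlines() if line.strip()]
print(f"{len(events)} events logged")
for event in events[-5:]:
    print(event)
PY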

Step 4 — Aggregate results

$ badc infer aggregate data/datalad/bogus/artifacts/infer \
    --manifest data/datalad/bogus/manifests/XXXX-000_20251001_093000.csv \
    --output data/datalad/bogus/artifacts/aggregate/XXXX-000_20251001_093000_summary.csv \
    --parquet data/datalad/bogus/artifacts/aggregate/XXXX-000_20251001_093000_detections.parquet

Follow up with:

$ badc report quicklook --parquet data/datalad/bogus/artifacts/aggregate/XXXX-000_20251001_093000_detections.parquet \
    --output-dir data/datalad/bogus/artifacts/aggregate/XXXX-000_20251001_093000_quicklook
$ open docs/notebooks/aggregate_analysis.ipynb  # optional pandas plots
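
Beyond the quicklook report, the Parquet file loads straight into pandas. The sketch below assumes a Parquet engine (pyarrow or fastparquet) is installed, and inspects the columns rather than presuming a schema:

$ python - <<'PY'
# Load the aggregated detections; inspect columns instead of assuming a schema.
import pandas as pd

detections = pd.read_parquet(
    "data/datalad/bogus/artifacts/aggregate/XXXX-000_20251001_093000_detections.parquet"
)
print(detections.columns.tolist())
print(detections.head())
PY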

Step 5 — Save via DataLad

$ cd data/datalad/bogus
$ datalad save -m "Local HawkEars run on XXXX-000_20251001_093000"
$ datalad push --to origin
$ datalad push --to arbutus-s3 --data auto

Configuration tips

  • Keep manifests + outputs inside the same DataLad dataset so telemetry and JSON/CSV artifacts are annexed together.

  • Set CUDA_VISIBLE_DEVICES when testing multi-GPU scenarios.

  • Use .venv/bin/badc infer run … so telemetry references the correct interpreter in virtualenv setups.
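
For example, pinning a test run to the first GPU:

$ CUDA_VISIBLE_DEVICES=0 .venv/bin/badc infer run \
    data/datalad/bogus/manifests/XXXX-000_20251001_093000.csv \
    --use-hawkears --max-gpus 1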

Batch planning

When multiple manifests need to run back-to-back, use badc infer orchestrate to plan or execute them:

$ badc infer orchestrate data/datalad/bogus \
    --manifest-dir manifests \
    --output-dir artifacts/infer \
    --plan-csv plans/infer.csv \
    --plan-json plans/infer.json \
    --print-datalad-run \
    --apply

The command prints a Rich table, saves CSV/JSON plans for later, and (with --apply) runs badc infer run for every manifest automatically. When the dataset contains .datalad and the CLI is available, the applied runs are wrapped in datalad run by default (add --no-record-datalad to perform plain executions). Include --resume-completed when rerunning plans so the CLI looks for each manifest's prior telemetry *.summary.json and forwards it via --resume-summary; only unfinished chunks will be scheduled.
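
For example, rerunning the same plan while skipping chunks that already finished:

$ badc infer orchestrate data/datalad/bogus \
    --manifest-dir manifests \
    --output-dir artifacts/infer \
    --resume-completed \
    --apply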

Smoke tests

When you need extra assurance that HawkEars still runs end-to-end, enable the gated smoke test:

$ BADC_RUN_HAWKEARS_SMOKE=1 pytest tests/smoke/test_hawkears_smoke.py

The test trims the bogus manifest to a single chunk, runs badc infer run-config (real HawkEars), and checks that JSON/telemetry artifacts appear under a temporary path. By default the test is skipped so regular CI does not require GPUs or HawkEars assets.