Run Local HawkEars Inference¶
This guide walks through running badc infer run --use-hawkears on the dev workstation (Quadro RTX
4000 GPUs). It supplements the CLI reference with a configuration schema, environment prep, and
end-to-end command snippets so you can test the entire chunk → infer → aggregate cycle locally.
Prerequisites¶
```shell
$ git clone https://github.com/UBC-FRESH/badc.git && cd badc
$ python -m venv .venv && source .venv/bin/activate
$ pip install -e .[dev]
$ git submodule update --init --recursive
$ badc data connect bogus --pull
$ datalad get -r data/datalad/bogus   # optional; downloads audio/chunks up-front
```

Confirm GPU visibility: badc gpus
HawkEars configuration schema¶
All runtime knobs live behind badc infer run (Typer) options. For repeatability we recommend
capturing them in a simple toml/yaml file (example below) even though the CLI currently
expects flags.
badc infer run key options:

| Flag | Default | Purpose |
|---|---|---|
| --use-hawkears | | Switch from the stub runner to real HawkEars inference. |
| --max-gpus | auto-detect | Cap number of GPUs used (one worker per GPU). |
| --cpu-workers | | Additional CPU threads to run alongside GPUs (at least one CPU worker is added automatically when none are detected). |
| --hawkears-arg | | Extra args forwarded verbatim to HawkEars' analyze.py (repeatable). |
| | | Custom command executed per chunk (mutually exclusive with --use-hawkears). |
| --output-dir | | Output root for JSON/CSV; relocates automatically when the manifest lives inside a DataLad dataset. |
| --telemetry-log | auto-generated | JSONL log path; auto-generated when omitted. |
| | | Per-chunk retry budget. |
| | | Print a ready-to-use datalad run command. |
Sample configuration (configs/hawkears-local.toml):

```toml
[runner]
manifest = "data/datalad/bogus/manifests/XXXX-000_20251001_093000.csv"
use_hawkears = true
max_gpus = 1
cpu_workers = 1
output_dir = "data/datalad/bogus/artifacts/infer"
telemetry_log = "data/datalad/bogus/artifacts/telemetry/XXXX-000_20251001_093000_local.jsonl"

[hawkears]
extra_args = ["--min_score", "0.75", "--band", "1"]

[paths]
dataset_root = "data/datalad/bogus"
```
Invoke badc infer run by interpolating the config values manually, or call the helper command:

```shell
$ badc infer run-config configs/hawkears-local.toml
```

Each key mirrors the surface documented in notes/pipeline-plan.md so ops notes and user docs
stay aligned. If you need further customization (e.g., injecting environment-specific defaults),
the next section shows a short Python helper.
Config-file driven runs¶
configs/hawkears-local.toml ships with the repo so you can reuse the same schema across
environments. A tiny launcher script (or notebook cell) can read the file, translate it to CLI
flags, and execute badc infer run when you need to patch values on the fly:
```shell
$ python - <<'PY'
import tomllib, shlex, subprocess
from pathlib import Path

config = tomllib.loads(Path("configs/hawkears-local.toml").read_text())
runner = config["runner"]
hawkears_cfg = config.get("hawkears", {})
cmd = [
    "badc",
    "infer",
    "run",
    runner["manifest"],
    "--output-dir",
    runner["output_dir"],
    "--max-gpus",
    str(runner["max_gpus"]),
    "--cpu-workers",
    str(runner.get("cpu_workers", 0)),
]
if runner.get("use_hawkears", False):
    cmd.append("--use-hawkears")
for arg in hawkears_cfg.get("extra_args", []):
    cmd.extend(["--hawkears-arg", arg])
if telemetry := runner.get("telemetry_log"):
    cmd.extend(["--telemetry-log", telemetry])
print("Running:", " ".join(shlex.quote(part) for part in cmd))
subprocess.run(cmd, check=True)
PY
```
Tips:

- Keep dataset-relative paths (data/datalad/…) so datalad run captures provenance.
- extra_args map 1:1 to HawkEars' analyze.py arguments (e.g., --min_score).
- Store per-host overrides (GPU count, telemetry path) in separate TOML files, then pass the right file to the snippet above.
Step 1 — Chunk selection¶
Either reuse existing manifests (e.g., data/datalad/bogus/manifests/XXXX-000_20251001_093000.csv)
or generate new ones:
```shell
$ badc chunk probe data/datalad/bogus/audio/XXXX-000_20251001_093000.wav \
    --initial-duration 60 --max-duration 600 --tolerance 5
$ badc chunk manifest data/datalad/bogus/audio/XXXX-000_20251001_093000.wav \
    --chunk-duration 60 --hash-chunks \
    --output data/datalad/bogus/manifests/XXXX-000_20251001_093000.csv
```
Each chunk run writes artifacts/chunks/<recording>/.chunk_status.json. Confirm the file exists
and reports status="completed" before moving on; badc infer orchestrate enforces this
by default so Sockeye scripts and local --apply runs never launch inference against partially
chunked recordings. Use --allow-partial-chunks only when you intentionally want to process an
incomplete manifest (e.g., debugging a failed chunk).
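If you script around the CLI, the same completion gate is a few lines of Python. A sketch, assuming only that the status file is JSON with a top-level status field (as described above; the rest of the schema is not shown here):

```python
import json
from pathlib import Path

def chunking_completed(chunk_dir: str) -> bool:
    """Return True iff .chunk_status.json exists and reports status == 'completed'."""
    status_file = Path(chunk_dir) / ".chunk_status.json"
    if not status_file.exists():
        return False
    return json.loads(status_file.read_text()).get("status") == "completed"
```

Calling this on artifacts/chunks/&lt;recording&gt;/ before queuing inference mirrors what badc infer orchestrate enforces by default.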
Step 2 — Run HawkEars locally¶
```shell
$ badc infer run data/datalad/bogus/manifests/XXXX-000_20251001_093000.csv \
    --use-hawkears \
    --max-gpus 1 \
    --hawkears-arg --min_score \
    --hawkears-arg 0.75
```
Expected outputs:

- JSON detections under data/datalad/bogus/artifacts/infer/<recording>/
- Telemetry log inside data/datalad/bogus/artifacts/telemetry/…
- Console summary listing the telemetry path and job counts, plus a per-worker success/failure/retry table so CPU vs. GPU bottlenecks and flaky chunks stand out.

Each run now writes:

- <telemetry>.summary.json — chunk-level metadata (status, retries, last error/backoff) for resumable runs.
- <telemetry>.summary.json.workers.csv — per-worker (GPU/CPU) success/failure/retry counts for quick archival alongside SLURM logs.

The CLI warns when any chunk retried or failed, echoing the chunk ID, worker, and last error in the console. After an interruption, re-run the command with --resume-summary <telemetry.summary.json> (the CLI prints the exact path) to skip chunks that already finished successfully.
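To preview what a resume would skip, you can inspect the summary yourself. A sketch, assuming the summary holds a chunks mapping with a status field per chunk; verify the field names against a real summary file before relying on them:

```python
import json
from pathlib import Path

def unfinished_chunks(summary_path: str) -> list[str]:
    """List chunk IDs that did not finish successfully, per the run summary.

    Assumes a {"chunks": {chunk_id: {"status": ...}}} layout; check a real
    <telemetry>.summary.json for the actual schema.
    """
    summary = json.loads(Path(summary_path).read_text())
    return [
        chunk_id
        for chunk_id, record in summary.get("chunks", {}).items()
        if record.get("status") != "completed"
    ]
```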
Step 3 — Monitor telemetry¶
```shell
$ badc infer monitor --log data/datalad/bogus/artifacts/telemetry/XXXX-000_20251001_093000_local.jsonl --follow
```
The monitor shows per-GPU event counts, retry/failure-attempt totals, utilization stats, and rolling
VRAM/retry trends. For quick inspection, use badc telemetry --log … to just print the latest
entries.
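If you need the same tallies in a script (for example, to diff two runs), counting the JSONL yourself is straightforward. The worker field name is an assumption here; check a real log line first:

```python
import json
from collections import Counter
from pathlib import Path

def event_counts(log_path: str) -> Counter:
    """Count telemetry events per worker from a JSONL log, skipping blank or partial lines."""
    counts: Counter = Counter()
    for line in Path(log_path).read_text().splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # tolerate a line still being written by the runner
        counts[entry.get("worker", "unknown")] += 1
    return counts
```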
Step 4 — Aggregate results¶
```shell
$ badc infer aggregate data/datalad/bogus/artifacts/infer \
    --manifest data/datalad/bogus/manifests/XXXX-000_20251001_093000.csv \
    --output data/datalad/bogus/artifacts/aggregate/XXXX-000_20251001_093000_summary.csv \
    --parquet data/datalad/bogus/artifacts/aggregate/XXXX-000_20251001_093000_detections.parquet
```
Follow up with:

```shell
$ badc report quicklook --parquet data/datalad/bogus/artifacts/aggregate/XXXX-000_20251001_093000_detections.parquet \
    --output-dir data/datalad/bogus/artifacts/aggregate/XXXX-000_20251001_093000_quicklook
$ open docs/notebooks/aggregate_analysis.ipynb  # optional pandas plots
```
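If you only need quick counts rather than the full notebook, the aggregate summary CSV can be tallied with the standard library. The species column name below is a guess at the real header; check the file's first line before using it:

```python
import csv
from collections import Counter
from pathlib import Path

def detections_per_label(summary_csv: str, label_column: str = "species") -> Counter:
    """Tally rows per label in the aggregate summary CSV.

    label_column is a guess at the real header; pass the actual column name.
    """
    counts: Counter = Counter()
    with open(summary_csv, newline="") as handle:
        for row in csv.DictReader(handle):
            counts[row.get(label_column, "unknown")] += 1
    return counts
```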
Step 5 — Save via DataLad¶
```shell
$ cd data/datalad/bogus
$ datalad save -m "Local HawkEars run on XXXX-000_20251001_093000"
$ datalad push --to origin
$ datalad push --to arbutus-s3 --data auto
```
Configuration tips¶
- Keep manifests and outputs inside the same DataLad dataset so telemetry and JSON/CSV artifacts are annexed together.
- Set CUDA_VISIBLE_DEVICES when testing multi-GPU scenarios.
- Use .venv/bin/badc infer run … so telemetry references the correct interpreter in virtualenv setups.
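When scripting multi-GPU tests, setting CUDA_VISIBLE_DEVICES per subprocess keeps each run isolated without polluting your shell environment. A minimal sketch; the function name and the idea of one subprocess per GPU are illustrative, not part of badc:

```python
import os
import subprocess

def run_on_gpu(cmd: list[str], gpu_index: int) -> subprocess.CompletedProcess:
    """Run cmd with CUDA visibility restricted to a single GPU."""
    env = {**os.environ, "CUDA_VISIBLE_DEVICES": str(gpu_index)}
    return subprocess.run(cmd, env=env, check=True, capture_output=True, text=True)
```

For example, run_on_gpu(["badc", "infer", "run", ...], 0) pins one invocation to the first GPU while the parent shell's environment stays untouched.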
Batch planning¶
When multiple manifests need to run back-to-back, use badc infer orchestrate to plan or execute
them:
```shell
$ badc infer orchestrate data/datalad/bogus \
    --manifest-dir manifests \
    --output-dir artifacts/infer \
    --plan-csv plans/infer.csv \
    --plan-json plans/infer.json \
    --print-datalad-run \
    --apply
```
The command prints a Rich table, saves CSV/JSON plans for later, and (with --apply) runs
badc infer run for every manifest automatically. When the dataset contains .datalad and the
CLI is available, the applied runs are wrapped in datalad run by default (add
--no-record-datalad to perform plain executions). Include --resume-completed when rerunning
plans so the CLI looks for each manifest's prior telemetry *.summary.json and forwards it via
--resume-summary; only unfinished chunks will be scheduled.
Smoke tests¶
When you need extra assurance that HawkEars still runs end-to-end, enable the gated smoke test:
```shell
$ BADC_RUN_HAWKEARS_SMOKE=1 pytest tests/smoke/test_hawkears_smoke.py
```
The test trims the bogus manifest to a single chunk, runs badc infer run-config (real HawkEars),
and checks that JSON/telemetry artifacts appear under a temporary path. By default the test is
skipped so regular CI does not require GPUs or HawkEars assets.
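The trimming step the smoke test performs is easy to replicate for ad-hoc one-chunk runs: keep the header plus the first data row. A sketch, assuming only that the manifest is a plain CSV with a header row:

```python
import csv
from pathlib import Path

def trim_manifest(src: str, dst: str, rows: int = 1) -> int:
    """Copy the header and the first `rows` data rows of a manifest CSV; return rows written."""
    with open(src, newline="") as inp, open(dst, "w", newline="") as out:
        reader = csv.reader(inp)
        writer = csv.writer(out)
        writer.writerow(next(reader))  # header row
        written = 0
        for row in reader:
            if written >= rows:
                break
            writer.writerow(row)
            written += 1
    return written
```

Point badc infer run at the trimmed copy for a minutes-long end-to-end check before committing to a full manifest.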