# Fit HPS Inventories This guide walks through fitting the horizontal point sampling (HPS) workflow using the `nemora` CLI and Python API. It extends the weighted estimator described in the UBC FRESH Lab manuscript and relies on the probability distributions catalogued in the [distribution reference](../reference/distributions.md). ## Prerequisites - An HPS tally file with columns `dbh_cm` (bin midpoints) and `tally` (per-plot counts). - The basal area factor (`BAF`) used during cruise design. - Python environment with `nemora` installed. One approach: ```bash python -m venv .venv source .venv/bin/activate pip install -e ".[dev]" ``` ## CLI Workflow 1. Inspect available distributions: ```bash nemora registry ``` 2. Fit the HPS inventory with the default candidate set (`weibull`, `gamma`): ```bash nemora fit-hps data/hps_tally.csv --baf 2.0 ``` Pass additional `--distribution` options to try alternative PDFs as the CLI evolves. The command prints RSS, small-sample AIC (`AICc`), Pearson chi-square, Kolmogorov–Smirnov, Cramér–von Mises, and Anderson–Darling statistics (plus residual summaries) for each candidate distribution so you can compare fits at a glance. Generalised secant mixtures are exposed as `gsmN` (for any integer `N ≥ 2`), so a command such as `nemora fit-hps ... -d gsm3 -d gsm6` will evaluate both three- and six-component blends. Use `--distribution` (repeatable) to limit the candidate list and `--show-parameters` to include the fitted parameter estimates. For example: ```bash nemora fit-hps data/hps_tally.csv --baf 2.0 -d weibull -d gamma --show-parameters ``` The grouped Weibull solver defaults to `auto`, which refines the least-squares seed with a Newton update and quietly falls back to the stable least-squares solution if the grouped maximum-likelihood step drifts. Use `--grouped-weibull-mode ls` to pin the legacy behaviour or `--grouped-weibull-mode mle` to require the grouped likelihood path (the command will exit with an error if the refinement fails under the forced `mle` mode). To stay numerically stable, the solver applies the conditional offset from the manuscript: the Weibull support is shifted by `min(DBH) – 0.5 cm` before evaluating grouped probabilities. Johnson SB fits now run an EM pass that updates the Beta shape parameters in grouped space, while Birnbaum–Saunders grouped fits attempt a truncated-normal EM step before falling back to the legacy grouped MLE. Diagnostics include the solver mode (`grouped-em` or `grouped-mle`) so you can trace which path executed. To explore the public PSP example bundle prepared with `scripts/prepare_hps_dataset.py`, start with one of the manifests in `examples/hps_baf12`: ```bash nemora fit-hps examples/hps_baf12/4000002_PSP1_v1_p1.csv --baf 12 ``` ## Python API Example ```python import pandas as pd from nemora.workflows import fit_hps_inventory data = pd.read_csv("data/hps_tally.csv") results = fit_hps_inventory( dbh_cm=data["dbh_cm"].to_numpy(), tally=data["tally"].to_numpy(), baf=2.0, distributions=["weibull", "gamma", "gb2"], ) best = min(results, key=lambda r: r.gof["rss"]) print(best.distribution, best.parameters) for result in results: gof = result.gof summary = result.diagnostics.get("residual_summary", {}) print( f"{result.distribution:8s} RSS={gof.get('rss', float('nan')):,.2f} " f"AICc={gof.get('aicc', float('nan')):,.2f} Chi^2={gof.get('chisq', float('nan')):,.2f} " f"Max|res|={summary.get('max_abs', float('nan')):.3f}" ) ``` `fit_hps_inventory` expands tallies to stand tables, applies the HPS compression factors as weights, and auto-generates starting values. Override the defaults through `FitConfig` for specialised scenarios, and pass `grouped_weibull_mode="ls"` or `"mle"` to mirror the CLI toggle when you need to pin or force the grouped Weibull solver mode. ### Reference Fit (BC PSP 4000002-PSP1) Using the public bundle prepared in the [HPS dataset guide](hps_dataset.md), `fit_hps_inventory` identifies the Weibull distribution as the best fit for plot `4000002_PSP1_v1_p1` with BAF 12. This mirrors the methodology from the EarthArXiv preprint of the HPS manuscript (Paradis, 2025). The regression test in `tests/test_hps_parity.py` locks these targets: | Metric | Value | | --- | --- | | Distribution | `weibull` | | RSS | `4.184770291568501e+07` | | Parameters | `a=2.762844978640213`, `beta=13.778112123083137`, `s=69732.71124303175` | Re-run the check locally with: ```bash pytest tests/test_hps_parity.py ``` ### Censored Meta-Plot Demo For a pooled view, the notebook `examples/hps_bc_psp_demo.ipynb` aggregates the public BC PSP files, censors stems below 9 cm, and fits the censored workflow. This serves as a deployment example on new data rather than a reproduction of the manuscript figures. Run the notebook (or `pytest tests/test_censored_workflow.py`) to verify the gamma fit parameters recorded in the summary table. ### Parity Summary - Regression tests `tests/test_hps_parity.py` and `tests/test_censored_workflow.py` lock the manuscript-aligned Weibull and censored gamma fits. - `tests/test_cli.py::test_fit_hps_command_outputs_weibull_first` exercises the Typer CLI against the PSP tallies to ensure the command-line workflow mirrors the Python API. - `examples/hps_bc_psp_demo.ipynb` demonstrates the workflow on the public BC PSP dataset, while `examples/hps_parity_reference.ipynb` runs the parity analysis on the manuscript meta-plot dataset included under `examples/data/reference_hps/binned_meta_plots.csv` (see Figure 1). The notebook also exports `docs/_static/reference_hps_parity_table.csv` summarising RSS, AICc, chi-square, and parameter deltas for each meta-plot. - For a pure-Python walkthrough (no notebooks), see the [programmatic HPS guide](hps_api.md). ```{figure} ../_static/reference_hps_parity.png :name: fig-hps-parity :alt: Comparison of size-biased and weighted fits for the manuscript meta-plots. :width: 100% Figure 1 — Size-biased control vs. weighted `nemora` curves for the manuscript meta-plots. The dashed line shows residuals on the HPS tally scale. ``` ### Censored / Two-Stage Baseline The reference notebook also aggregates the manuscript dataset into a censored meta-plot (DBH ≥ 20 cm) and fits the two-stage workflow via `fit_censored_inventory`. The exported table (`docs/_static/reference_hps_censored_table.csv`) records RSS, AICc, chi-square, and parameter values for each candidate distribution. ```{figure} ../_static/reference_hps_censored.png :name: fig-hps-censored :alt: Censored meta-plot fits using the two-stage workflow. :width: 80% Figure 2 — Censored (DBH ≥ 20 cm) stand-table data with fitted Weibull and Gamma curves using the two-stage workflow. Residual lines highlight the agreement on the truncated support. ``` ## Diagnostics - Inspect `result.diagnostics["residuals"]` for shape or bias. - Compare `result.gof` metrics (RSS, AICc when available) across candidates. - Plot the empirical stand table alongside fitted curves to confirm agreement with the manuscript workflow. ## Next Steps - Expose distribution filters and parameter previews in the CLI. - Add worked examples for censored inventories and DataLad-backed datasets. - Integrate notebook tutorials mirroring the published reproducibility bundles. - Expand FAIR dataset coverage (see [HPS dataset guide](hps_dataset.md)) with additional PSP plots and censored variants to support end-to-end parity tests. .. todo:: Update this section once the nemora.ingest / sampling / synthesis modules land to reflect the broader workflow.