Usage - pySTAMPS Docs

Common workflows

Discover stage coverage:

uv run pystamps status --dataset PATH

Dry-run selected stages:

uv run pystamps run --dataset PATH --start-step 1 --end-step 4 --dry-run

Execute stages with tuned worker settings:

uv run pystamps run --dataset PATH --start-step 1 --end-step 8 --io-workers 12 --cpu-workers 4
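
When these commands are driven from a script, the argument list can be assembled in Python. The following is a minimal sketch that uses only the flags shown above; build_run_command is a hypothetical helper for illustration and is not part of pySTAMPS.

```python
def build_run_command(dataset, start_step, end_step, *, dry_run=False,
                      io_workers=None, cpu_workers=None):
    """Assemble a `uv run pystamps run` argument list (hypothetical helper)."""
    if start_step > end_step:
        raise ValueError("start_step must not exceed end_step")
    cmd = [
        "uv", "run", "pystamps", "run",
        "--dataset", str(dataset),
        "--start-step", str(start_step),
        "--end-step", str(end_step),
    ]
    if dry_run:
        cmd.append("--dry-run")
    # Worker flags are only added when the caller sets them explicitly.
    if io_workers is not None:
        cmd += ["--io-workers", str(io_workers)]
    if cpu_workers is not None:
        cmd += ["--cpu-workers", str(cpu_workers)]
    return cmd
```

The returned list can be passed to subprocess.run(cmd, check=True); running the dry-run variant first and the real one only on success mirrors the workflow above.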

Run the direct Rust CLI:

target/release/pystamps-native run \
  --native-only \
  --dataset PATH \
  --start-step 1 \
  --end-step 8 \
  --backend native \
  --stage2-kernel-backend native

Advanced usage

Strict reference replay

compat:
  strict_reference: true
  reference_root: /path/to/reference_dataset

With strict_reference enabled, stage execution copies the expected artifacts from the reference paths before running the stage logic.
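
The copy-before-run behaviour can be pictured with a short sketch. This is not the pySTAMPS implementation: replay_reference_artifacts is a hypothetical helper, and the caller supplies the artifact names explicitly because the source does not list which artifacts each stage expects.

```python
import shutil
from pathlib import Path


def replay_reference_artifacts(reference_root, dataset_root, artifact_names):
    """Copy each named artifact from the reference tree into the dataset.

    Hypothetical helper: the artifact names are supplied by the caller.
    """
    reference_root = Path(reference_root)
    dataset_root = Path(dataset_root)
    copied = []
    for name in artifact_names:
        src = reference_root / name
        if not src.is_file():
            raise FileNotFoundError(f"missing reference artifact: {src}")
        dst = dataset_root / name
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copy2 keeps file metadata along with content
        copied.append(dst)
    return copied
```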

Backend selection

runtime.backend=auto: choose the execution strategy stage by stage.
runtime.backend=threads: thread-based, IO-style execution path.
runtime.backend=processes: CPU process execution for CPU stages.
runtime.backend=native: delegate pipeline execution to the Rust native runner.
runtime.backend=gpu: request GPU-capable kernels where the Python runtime has support.

Optimized kernel config

runtime:
  backend: auto
  stage2_kernel_backend: native
  stage2_native_threads: 0
  kernel_backend_overrides:
    stage2_grid_accumulate: native
    stage2_histogram: native
    stage2_topofit: native
    stage2_topofit_row_invariant: native
    stage2_topofit_coh_row_invariant: native
    stage4_edge_stats: native
    stage7_scla: native
    stage8_edge_noise: native

Run with the config above saved as native-kernels.yaml:

uv run pystamps --config native-kernels.yaml run --dataset PATH --start-step 2 --end-step 8

Use python for the reference kernels and native for the Rust/CPU kernels. The stage-2 kernel backend accepts only auto, python, or native.
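
A config can be checked against these accepted values before a long run starts. The sketch below works on an already-parsed config dictionary and uses only the backend names listed on this page; check_runtime_config is a hypothetical helper, and pySTAMPS may perform its own validation differently.

```python
RUNTIME_BACKENDS = ("auto", "threads", "processes", "native", "gpu")
STAGE2_KERNEL_BACKENDS = ("auto", "python", "native")


def check_runtime_config(config):
    """Return a list of problems in the `runtime` section (empty if none).

    Missing keys are not reported, since the defaults are not documented here.
    """
    runtime = config.get("runtime", {})
    problems = []
    backend = runtime.get("backend")
    if backend is not None and backend not in RUNTIME_BACKENDS:
        problems.append(f"runtime.backend={backend!r} is not a known backend")
    stage2 = runtime.get("stage2_kernel_backend")
    if stage2 is not None and stage2 not in STAGE2_KERNEL_BACKENDS:
        problems.append(
            f"runtime.stage2_kernel_backend={stage2!r} must be one of "
            f"{', '.join(STAGE2_KERNEL_BACKENDS)}"
        )
    return problems
```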

Pipeline stages whose expected artifacts already exist are skipped. To exercise optimized kernels, run the pipeline on an incomplete copy of a run, or use the benchmark and direct-kernel examples.
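
The skip rule can be approximated in a script to predict which stages a run would actually execute. This is a sketch under stated assumptions: stages_needing_run is hypothetical, and the stage-to-artifact mapping is supplied by the caller rather than taken from pySTAMPS.

```python
from pathlib import Path


def stages_needing_run(dataset_root, expected_artifacts, start_step, end_step):
    """List stages in [start_step, end_step] missing an expected artifact.

    `expected_artifacts` maps stage number -> iterable of file names and is
    purely illustrative; the real mapping lives inside pySTAMPS.
    """
    root = Path(dataset_root)
    pending = []
    for stage in range(start_step, end_step + 1):
        names = list(expected_artifacts.get(stage, ()))
        # A stage with no listed artifacts can never look complete, so it runs.
        if not names or any(not (root / name).exists() for name in names):
            pending.append(stage)
    return pending
```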

Direct kernel API on repo data

uv run python - <<'PY'
from pathlib import Path
import numpy as np

from pystamps.io.mat import read_mat
from pystamps.kernels import run_stage8_edge_noise_kernel

root = Path("/path/to/reference_dataset")
uw_grid = read_mat(root / "uw_grid.mat")
uw_interp = read_mat(root / "uw_interp.mat")

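# Slice to the first 1000 rows and 8 columns to keep the example small.
uw_ph = np.asarray(uw_grid["ph"][:1000, :8], dtype=np.complex64)
edges = np.asarray(uw_interp["edgs"], dtype=np.int64)
# Columns 1 and 2 hold the edge endpoints; subtract 1 to get 0-based node indices.
node_a = edges[:, 1] - 1
node_b = edges[:, 2] - 1
# Keep only edges whose endpoints both fall inside the sliced rows of uw_ph.
valid = (node_a >= 0) & (node_b >= 0) & (node_a < uw_ph.shape[0]) & (node_b < uw_ph.shape[0])

# Run the native stage-8 edge-noise kernel on at most 2000 valid edges.
out = run_stage8_edge_noise_kernel(uw_ph, node_a[valid][:2000], node_b[valid][:2000], backend="native")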
print(out["dph_noise"].shape)
PY

Integration patterns

Use pystamps run in CI for deterministic execution and artifact replay checks.
Use the Verification guide for baseline comparison commands and parity contract checks.
Keep long-running runs in a persistent dataset directory and rerun only failed stage ranges.
For orchestration, pair stage-level artifact checks with discover_dataset-style logic in your scripts.
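
Rerunning only failed stage ranges means turning a set of failed stage numbers into --start-step/--end-step pairs. A minimal sketch follows; contiguous_ranges is a hypothetical helper, and how the failed stages are detected is left to the calling script.

```python
def contiguous_ranges(stages):
    """Collapse stage numbers into sorted, inclusive (start, end) ranges."""
    ranges = []
    for stage in sorted(set(stages)):
        if ranges and stage == ranges[-1][1] + 1:
            # Extend the current range when the stage directly follows it.
            ranges[-1] = (ranges[-1][0], stage)
        else:
            ranges.append((stage, stage))
    return ranges


# Failed stages 3, 4, 7 and 8 become two reruns: 3-4 and 7-8.
for start, end in contiguous_ranges([7, 3, 4, 8]):
    print(f"uv run pystamps run --dataset PATH --start-step {start} --end-step {end}")
```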

Performance considerations

Prefer stage ranges so that stages with existing artifacts are not re-run.
Use run --start-step/--end-step to isolate expensive stages.
enable_mat_stage_cache reduces re-read overhead in merged stages.
Stage-7/8 chunk sizes can be tuned in config for memory-pressure control.
Use make benchmark or scripts/benchmark_backends.py for repeatable speed evidence.
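
Chunk sizes trade speed for peak memory: a smaller chunk keeps less data resident at once. The generic sketch below shows that idea only; iter_chunks is a hypothetical helper, and the names of the actual stage-7/8 chunk-size config keys are not stated on this page.

```python
def iter_chunks(n_items, chunk_size):
    """Yield slice objects covering range(n_items) in blocks of chunk_size."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, n_items, chunk_size):
        # The final chunk is shorter when n_items is not a multiple of chunk_size.
        yield slice(start, min(start + chunk_size, n_items))
```

Each slice can index an array or list directly, for example data[rows] inside for rows in iter_chunks(len(data), 4096), so only one block is processed at a time.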
