Program and science

Learn what the pipeline computes and how to run it safely.

This guide joins the scientific concepts behind StaMPS-style persistent-scatterer processing with the pySTAMPS commands used to inspect, execute, accelerate, verify, and benchmark a dataset.

Stage And Artifact Map

The scientific workflow is implemented as an artifact-driven stage chain. Use this map with the detailed Stages and Code Paths page when reading code or debugging a run.

[Figure: pySTAMPS stage and artifact map]

What pySTAMPS is

pySTAMPS is a Python-first runtime for StaMPS-style InSAR processing. It works on a dataset directory, discovers patch folders and stage artifacts, runs selected stages from 1 to 8, and writes new .mat products back into that same tree.

Run on a copy of your dataset. pySTAMPS writes outputs in place.

Minimum science background

SAR observations

A radar satellite revisits the same area and records complex values whose phase is sensitive to geometry, atmosphere, motion, topography, and noise.

Interferogram

An interferogram compares two radar acquisitions. Single-master workflows compare many slaves to one master; small-baseline workflows organize pairs differently.
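As a minimal illustration of "comparing two acquisitions", the sketch below forms the interferometric phase for one pixel by multiplying one complex sample by the conjugate of the other. The function name and the synthetic values are assumptions made for this example; this is not pySTAMPS code.

```python
import cmath


def interferogram_phase(master: complex, slave: complex) -> float:
    """Return the wrapped phase difference (radians) between two complex samples."""
    # Multiplying by the conjugate subtracts the slave phase from the master phase.
    return cmath.phase(master * slave.conjugate())


# Two synthetic samples for the same pixel, with phases 2.0 rad and 0.5 rad.
master = cmath.rect(1.0, 2.0)
slave = cmath.rect(0.8, 0.5)
print(round(interferogram_phase(master, slave), 6))  # 1.5
```

The amplitudes (1.0 and 0.8) do not affect the result; only the phase difference survives.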

Wrapped phase

Radar phase is only observed modulo 2*pi, so the measured value repeats every cycle. Unwrapping estimates the missing whole cycles to recover a continuous phase.
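The idea can be shown on a toy 1-D series. The sketch below wraps a steadily increasing phase and then recovers it by accumulating wrapped differences. It assumes neighbouring samples differ by less than pi and is only an illustration; the helper names are hypothetical, and real unwrapping in this kind of workflow operates in space and time and is considerably more involved.

```python
import math


def wrap(phase: float) -> float:
    """Wrap a phase in radians into the interval [-pi, pi)."""
    return (phase + math.pi) % (2 * math.pi) - math.pi


def unwrap_1d(wrapped: list[float]) -> list[float]:
    """Unwrap a 1-D series, assuming true steps between samples are below pi."""
    out = [wrapped[0]]
    for prev, curr in zip(wrapped, wrapped[1:]):
        # The wrapped difference is taken as the true step and accumulated.
        out.append(out[-1] + wrap(curr - prev))
    return out


true_phase = [float(i) for i in range(10)]      # 0, 1, ..., 9 rad
observed = [wrap(p) for p in true_phase]        # values fold back into [-pi, pi)
recovered = unwrap_1d(observed)
print(all(abs(a - b) < 1e-9 for a, b in zip(recovered, true_phase)))  # True
```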

Persistent scatterer

A point whose radar phase response stays stable across acquisitions, which makes it useful for time-series analysis.

Coherence

A practical reliability signal. Higher coherence usually means a point is easier to trust later.
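One common way to turn this idea into a number is the magnitude of the averaged unit phasors of a point's residual phases: consistent residuals give a value near 1, scattered residuals a value near 0. The sketch below is a generic estimator with made-up inputs, offered as an assumption about what such a measure looks like, not as the exact quantity pySTAMPS computes.

```python
import cmath


def temporal_coherence(residual_phases: list[float]) -> float:
    """Magnitude of the mean unit phasor; 1.0 means perfectly consistent residuals."""
    total = sum(cmath.exp(1j * p) for p in residual_phases)
    return abs(total) / len(residual_phases)


stable = [0.05, -0.02, 0.01, 0.03, -0.04]   # small residuals (radians)
noisy = [2.9, -1.7, 0.4, -3.0, 1.2]         # residuals scattered over the circle
print(temporal_coherence(stable) > temporal_coherence(noisy))  # True
```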

SCLA

Spatially correlated look-angle error, a structured phase component estimated and corrected in late-stage processing.
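In StaMPS-style processing this term is commonly modelled as phase proportional to perpendicular baseline. Under that assumption, the sketch below fits the proportionality constant by least squares on synthetic unwrapped phases and subtracts the fitted term. The names and values are hypothetical, and this is not the stage 7 estimator.

```python
def fit_baseline_slope(bperp: list[float], phase: list[float]) -> float:
    """Least-squares slope k for the model phase = k * bperp (no intercept)."""
    num = sum(b * p for b, p in zip(bperp, phase))
    den = sum(b * b for b in bperp)
    return num / den


# Synthetic example: perpendicular baselines in metres, phase in radians.
bperp_m = [-120.0, -40.0, 35.0, 90.0, 150.0]
true_k = 0.01
phase_rad = [true_k * b for b in bperp_m]

k = fit_baseline_slope(bperp_m, phase_rad)
corrected = [p - k * b for p, b in zip(phase_rad, bperp_m)]
print(round(k, 6))  # 0.01
```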

Dataset mental model

A pySTAMPS run points at one dataset root. The root usually contains patch directories, optional patch.list, source folders, and merged stage artifacts.

DATASET/
  patch.list
  PATCH_1/
    ps1.mat
    ph1.mat
    pm1.mat
    select1.mat
    weed1.mat
  PATCH_2/
  diff0/
  geo/
  rslc/
  ps2.mat
  ph2.mat
  phuw2.mat
  scla2.mat
  uw_space_time.mat

Use status first:

uv run pystamps status --dataset DATASET
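For a quick look at the layout without the CLI, a few lines of pathlib are enough. The sketch below lists PATCH_* directories and reports which of the per-patch files from the example tree are present. It is an independent illustration and makes no claim about how pystamps status discovers patches; the function name is hypothetical.

```python
import tempfile
from pathlib import Path

# Per-patch file names taken from the example tree above.
PATCH_FILES = ["ps1.mat", "ph1.mat", "pm1.mat", "select1.mat", "weed1.mat"]


def summarize_patches(dataset_root: Path) -> dict[str, list[str]]:
    """Map each PATCH_* directory name to the listed files it already contains."""
    summary = {}
    for patch_dir in sorted(dataset_root.glob("PATCH_*")):
        if patch_dir.is_dir():
            summary[patch_dir.name] = [
                name for name in PATCH_FILES if (patch_dir / name).is_file()
            ]
    return summary


# Demo on a throwaway tree so the sketch runs without a real dataset.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "PATCH_1").mkdir()
    (root / "PATCH_1" / "ps1.mat").touch()
    (root / "PATCH_2").mkdir()
    (root / "geo").mkdir()
    demo_summary = summarize_patches(root)

print(demo_summary)  # {'PATCH_1': ['ps1.mat'], 'PATCH_2': []}
```

On a real dataset, call summarize_patches(Path("/path/to/run_dataset")) instead of the temporary tree.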

Artifact-driven execution

Each stage has expected output artifacts. If the artifact or merged-stage bundle already exists, the pipeline reports skipped_existing instead of recomputing it.

| Status | Meaning |
| --- | --- |
| planned | Dry-run selected the stage but did not execute it. |
| completed | Stage executed or strict reference replay copied the expected bundle. |
| skipped_existing | Expected artifacts were already present. |
| failed | Stage raised an execution error. |

A run over a dataset that already holds the target outputs only measures skipping. For speed tests, use make benchmark, the direct kernel API, or a dataset copy that actually needs the target outputs.
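The status values above can be read as the outcome of a small decision procedure. The sketch below is a hypothetical model of that logic, not the pySTAMPS implementation. In particular, it assumes that existing artifacts take precedence over dry-run, which the real pipeline may order differently.

```python
import tempfile
from pathlib import Path
from typing import Callable


def run_stage(expected: list[Path], runner: Callable[[], None], dry_run: bool = False) -> str:
    """Return one of the four status strings for a single stage."""
    # Assumption: existing artifacts win over dry-run.
    if expected and all(p.exists() for p in expected):
        return "skipped_existing"
    if dry_run:
        return "planned"
    try:
        runner()
    except Exception:
        return "failed"
    return "completed"


def failing_runner() -> None:
    raise RuntimeError("simulated stage error")


with tempfile.TemporaryDirectory() as tmp:
    artifact = Path(tmp) / "pm1.mat"
    statuses = [
        run_stage([artifact], artifact.touch, dry_run=True),  # nothing written
        run_stage([artifact], failing_runner),                # runner raises
        run_stage([artifact], artifact.touch),                # artifact created
        run_stage([artifact], artifact.touch),                # artifact found
    ]

print(statuses)  # ['planned', 'failed', 'completed', 'skipped_existing']
```

The last two calls show why a second run on the same tree is not a meaningful speed test: the work is skipped once the artifact exists.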

Install and first run

git clone git@github.com:sirbastiano/pystamps.git
cd pystamps
uv sync
uv run pystamps describe-backends

Editable installs compile the native Rust/CPU extension, so source builds require Rust and Cargo.

cargo --version
python -m pip install -e .
python -m pip install -e ".[dev]"

Run on a copy:

cp -a /path/to/source_dataset /path/to/run_dataset
uv run pystamps status --dataset /path/to/run_dataset
uv run pystamps run --dataset /path/to/run_dataset --start-step 1 --end-step 8 --dry-run
uv run pystamps run --dataset /path/to/run_dataset --start-step 1 --end-step 8
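The same copy-then-run sequence can be scripted. The sketch below copies the dataset with shutil.copytree and builds the three command lines shown above as argument lists. The helper names are hypothetical, and the sketch only constructs the commands; executing them is left to the caller.

```python
import shutil
from pathlib import Path


def copy_dataset(source: Path, target: Path) -> Path:
    """Copy the dataset tree so the original is never modified."""
    if target.exists():
        raise FileExistsError(f"{target} already exists; choose a fresh run directory")
    # symlinks=True keeps symlinks as links, similar in spirit to `cp -a`.
    shutil.copytree(source, target, symlinks=True)
    return target


def pystamps_commands(dataset: Path, start: int = 1, end: int = 8) -> list[list[str]]:
    """Build the status, dry-run, and real-run command lines in that order."""
    base = ["uv", "run", "pystamps"]
    run = base + [
        "run", "--dataset", str(dataset),
        "--start-step", str(start), "--end-step", str(end),
    ]
    return [base + ["status", "--dataset", str(dataset)], run + ["--dry-run"], run]


for cmd in pystamps_commands(Path("/path/to/run_dataset")):
    print(" ".join(cmd))
```

Each list can be passed to subprocess.run(cmd, check=True) so that a failing step stops the sequence before the real run starts.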

CLI command map

| Command | Purpose | Typical use |
| --- | --- | --- |
| status | Inspect dataset and inferred progress. | First command on any dataset. |
| run | Execute or dry-run stages. | Normal processing. |
| verify | Compare a run tree against a golden tree. | Trust but verify. |
| describe-inputs | Print logical input contracts. | Learning and debugging. |
| describe-backends | Print kernel/backend availability. | Backend setup and speed work. |
| list-legacy | List StaMPS legacy scripts. | Migration support. |

uv run pystamps describe-inputs --stage all
uv run pystamps describe-inputs --stage 1 --dataset DATASET --patch PATCH_1
uv run pystamps describe-backends

Stage-by-stage science and outputs

This table explains the scientific question at each stage. For implementation entrypoints, Rust readiness, and direct stage commands, open Stages and Code Paths.

| Stage | Scope | Science question | Main outputs |
| --- | --- | --- | --- |
| 1 | Patch | What candidate points and metadata are available? | ps1.mat, ph1.mat, bp1.mat |
| 2 | Patch | How well does each candidate fit the phase model? | pm1.mat |
| 3 | Patch | Which candidates are good persistent scatterers? | select1.mat |
| 4 | Patch | Which selected points are noisy or redundant? | weed1.mat |
| 5 | Patch and merged | How do patch results become one dataset view? | ph2.mat, ifgstd2.mat |
| 6 | Merged | What is the unwrapped phase estimate? | phuw2.mat, uw_grid.mat, uw_interp.mat |
| 7 | Merged | What slow correction terms should be estimated? | scla2.mat, scla_smooth2.mat |
| 8 | Merged | What are the final filtered space-time products? | mean_v.mat, uw_space_time.mat |

Switch kernel modality

The CLI command stays the same. Switch between reference Python, optimized native Rust/CPU, and optional CUDA providers through config.

runtime:
  backend: auto
  stage2_kernel_backend: native
  stage2_native_threads: 0
  kernel_backend_overrides:
    stage2_grid_accumulate: native
    stage2_histogram: native
    stage2_topofit: native
    stage2_topofit_row_invariant: native
    stage2_topofit_coh_row_invariant: native
    stage4_edge_stats: native
    stage7_scla: native
    stage8_edge_noise: native
  io_workers: 8
  cpu_workers: 0
  stage7_chunk_ps: 100000
  stage8_chunk_edges: 200000

Save the configuration, for example as native-kernels.yaml, and pass it with --config:

uv run pystamps --config native-kernels.yaml run \
  --dataset /path/to/run_dataset \
  --start-step 2 --end-step 8

Current optimized kernel names are stage2_grid_accumulate, stage2_histogram, stage2_topofit, stage2_topofit_row_invariant, stage2_topofit_coh_row_invariant, stage4_edge_stats, stage7_scla, and stage8_edge_noise.
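When comparing kernels one group at a time, it can help to generate the override block rather than edit it by hand. The sketch below renders a kernel_backend_overrides fragment for a chosen subset of the kernel names listed above. It assumes that a partial override map is accepted and that unlisted kernels fall back to the configured default, which this page does not confirm; the helper itself is hypothetical.

```python
# Kernel names as listed in this section.
KERNELS = [
    "stage2_grid_accumulate",
    "stage2_histogram",
    "stage2_topofit",
    "stage2_topofit_row_invariant",
    "stage2_topofit_coh_row_invariant",
    "stage4_edge_stats",
    "stage7_scla",
    "stage8_edge_noise",
]


def render_overrides(kernels: list[str], backend: str = "native") -> str:
    """Render a YAML fragment that sets one backend for the given kernels."""
    unknown = [name for name in kernels if name not in KERNELS]
    if unknown:
        raise ValueError(f"unknown kernel names: {unknown}")
    lines = ["runtime:", "  kernel_backend_overrides:"]
    lines += [f"    {name}: {backend}" for name in kernels]
    return "\n".join(lines) + "\n"


# Example: override only the stage 2 kernels.
print(render_overrides([name for name in KERNELS if name.startswith("stage2_")]))
```

Rejecting unknown names early avoids silently writing an override that no kernel would ever read.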

Python API examples

from pystamps.status import collect_status

status = collect_status("/path/to/run_dataset")
print(status.merged_stage)
for patch in status.patch_statuses:
    print(patch.patch, patch.stage)

Run stages 6 to 8 from Python by building a PipelineContext:

from pathlib import Path

from pystamps.config import RunConfig
from pystamps.pipeline.stages import run_pipeline
from pystamps.pipeline.types import PipelineContext

context = PipelineContext(
    dataset_root=Path("/path/to/run_dataset"),
    run_config=RunConfig(),
    start_step=6,
    end_step=8,
    dry_run=False,
)
report = run_pipeline(context)

Verify parity and benchmark speed

Use verification or audit evidence for parity claims.

uv run pystamps verify \
  --run /path/to/run_dataset \
  --golden /path/to/reference_dataset

make audit
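Before running a full verification, a cheap pre-check is to confirm that both trees contain the same set of .mat files. The sketch below compares relative file paths only; it does not open the files or compare numerical content, which remains the job of pystamps verify. The function names are hypothetical.

```python
import tempfile
from pathlib import Path


def mat_files(root: Path) -> set[str]:
    """Relative POSIX-style paths of all .mat files under root."""
    return {p.relative_to(root).as_posix() for p in root.rglob("*.mat")}


def compare_trees(run: Path, golden: Path) -> tuple[list[str], list[str]]:
    """Return (.mat files missing from run, .mat files present only in run)."""
    run_files, golden_files = mat_files(run), mat_files(golden)
    return sorted(golden_files - run_files), sorted(run_files - golden_files)


# Demo on throwaway trees so the sketch runs without real datasets.
with tempfile.TemporaryDirectory() as tmp:
    run, golden = Path(tmp) / "run", Path(tmp) / "golden"
    for root, names in ((run, ["ps2.mat"]), (golden, ["ps2.mat", "phuw2.mat"])):
        (root / "PATCH_1").mkdir(parents=True)
        for name in names:
            (root / name).touch()
    (run / "PATCH_1" / "pm1.mat").touch()
    missing, extra = compare_trees(run, golden)

print(missing, extra)  # ['phuw2.mat'] ['PATCH_1/pm1.mat']
```

A non-empty "missing" list usually means the two trees are at different processing states and are not yet comparable.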

Use repeatable benchmarks for speed claims.

make benchmark
uv run python scripts/benchmark_backends.py \
  --dataset /path/to/reference_dataset \
  --start-step 1 --end-step 8 \
  --repeat 3 --warmup 1
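The reasoning behind warm-up and repeat counts can be sketched in a few lines: warm-up runs are executed but discarded, and the timed repeats are summarized with a robust statistic. This is a generic timing helper, not the logic of scripts/benchmark_backends.py, and it assumes the script's --repeat and --warmup flags carry this usual meaning.

```python
import statistics
import time
from typing import Callable


def time_repeats(func: Callable[[], object], repeat: int = 3, warmup: int = 1) -> dict[str, float]:
    """Time func over `repeat` runs after `warmup` untimed runs."""
    for _ in range(warmup):
        func()  # warm-up runs are executed but not timed
    samples = []
    for _ in range(repeat):
        start = time.perf_counter()
        func()
        samples.append(time.perf_counter() - start)
    return {
        "median_s": statistics.median(samples),
        "min_s": min(samples),
        "max_s": max(samples),
    }


calls = []
result = time_repeats(lambda: calls.append(sum(range(10_000))), repeat=3, warmup=1)
print(len(calls))  # 4
```

Reporting the median alongside the minimum and maximum makes a single slow outlier visible without letting it dominate the headline number.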

Troubleshooting

  • Stage skipped: skipped_existing means the expected artifacts already exist. To force execution, run on a fresh dataset copy that still lacks that stage's outputs.
  • Native unavailable: run uv run pystamps describe-backends, install Rust, and rebuild the editable environment.
  • Unwrapping fails: check triangle and snaphu availability or configure their paths under tools.
  • Verification fails: ensure --run and --golden refer to comparable dataset states.
  • Audit is slow: full audit processes every dataset in the maintained manifest; use targeted tests during development and audit for release evidence.