Run flow

Inspect, rehearse, then run.

This page is the shortest correct path through pySTAMPS. It is intentionally practical: every step is something you can run on a dataset copy without needing to understand every scientific detail first.

Step 1: inspect the dataset

Start by asking pySTAMPS what it sees in the dataset root.

uv run pystamps status --dataset /path/to/run_dataset

This command does not process the data. It reports the merged stage and patch stages it can infer from the artifacts already present.

Step 2: dry-run the full pipeline

A dry-run is a safe rehearsal. It shows what pySTAMPS would do without executing the stages.

uv run pystamps run --dataset /path/to/run_dataset --start-step 1 --end-step 8 --dry-run
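
The rehearse-then-run habit can be scripted. The following is a minimal sketch, not part of pySTAMPS: the helper names build_run_command and rehearse_then_run are hypothetical, the 1 to 8 bounds are taken from the full-pipeline command above, and it assumes that a failed dry-run exits with a non-zero status.

```python
import subprocess


def build_run_command(dataset, start_step=1, end_step=8, dry_run=False):
    """Build the argv for `pystamps run` using only flags shown on this page."""
    # Steps 1-8 is the full range used on this page; treated here as the bounds.
    if not 1 <= start_step <= end_step <= 8:
        raise ValueError(f"invalid step range: {start_step}..{end_step}")
    cmd = [
        "uv", "run", "pystamps", "run",
        "--dataset", str(dataset),
        "--start-step", str(start_step),
        "--end-step", str(end_step),
    ]
    if dry_run:
        cmd.append("--dry-run")
    return cmd


def rehearse_then_run(dataset, start_step=1, end_step=8, runner=subprocess.run):
    """Dry-run first; start the real run only if the rehearsal succeeded."""
    # Assumption: pystamps exits non-zero when the dry-run finds a problem,
    # so check=True raises CalledProcessError and the real run never starts.
    runner(build_run_command(dataset, start_step, end_step, dry_run=True), check=True)
    runner(build_run_command(dataset, start_step, end_step), check=True)
```

Calling rehearse_then_run("/path/to/run_dataset") on a dataset copy would issue the dry-run command followed by the same command without --dry-run. The runner parameter exists only so the sequence can be exercised without launching pySTAMPS.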

Step 3: run on a copy, not on your only original

pySTAMPS writes artifacts into the dataset tree. Make a run copy first.

cp -a /path/to/run_source_dataset /path/to/run_dataset
uv run pystamps run --dataset /path/to/run_dataset --start-step 6 --end-step 8

To debug a single stage, narrow --start-step and --end-step to that stage. If the copied dataset already contains a stage's outputs, that stage can report skipped_existing instead of running.
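
If the copy is made from a script rather than with cp -a, a guard against overwriting an existing run directory is useful. This is a sketch under assumptions: make_run_copy is a hypothetical helper, and shutil.copytree is only an approximation of cp -a (it keeps symlinks, timestamps, and permission bits, but not ownership).

```python
import shutil
from pathlib import Path


def make_run_copy(source, destination):
    """Copy a dataset tree to a new run directory, refusing to overwrite."""
    source, destination = Path(source), Path(destination)
    if not source.is_dir():
        raise NotADirectoryError(f"source dataset not found: {source}")
    if destination.exists():
        raise FileExistsError(f"run copy already exists: {destination}")
    # symlinks=True keeps symlinks as links; the default copy2 preserves
    # timestamps and permission bits, but not ownership as `cp -a` can.
    shutil.copytree(source, destination, symlinks=True)
    return destination
```

Refusing to reuse an existing destination means a new run copy never starts with outputs left behind by an earlier run.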

Step 4: run the optimized Rust/native kernels

Inspect available providers, then pin the optimized native kernels with a config file.

uv run pystamps describe-backends

Save the following as native-kernels.yaml:

runtime:
  backend: auto
  stage2_kernel_backend: native
  stage2_native_threads: 0
  kernel_backend_overrides:
    stage2_grid_accumulate: native
    stage2_histogram: native
    stage2_topofit: native
    stage2_topofit_row_invariant: native
    stage2_topofit_coh_row_invariant: native
    stage4_edge_stats: native
    stage7_scla: native
    stage8_edge_noise: native
  io_workers: 8
  cpu_workers: 0
  stage7_chunk_ps: 100000
  stage8_chunk_edges: 200000

Pass the file with --config, before the run subcommand:

uv run pystamps --config native-kernels.yaml run --dataset /path/to/run_dataset --start-step 2 --end-step 8

Set a kernel to python for the reference implementation or to native for the compiled Rust CPU kernel. Set it to cuda only when a CUDA version of that kernel and CuPy are both available.

If the copied dataset already contains the expected outputs, those stages report skipped_existing. To execute kernels through the pipeline, use a dataset copy that still needs those stage outputs. Use the benchmark or direct-kernel examples for profiling.
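
When comparing backends kernel by kernel, it can help to generate the override block instead of editing it by hand. The sketch below is illustrative only: render_kernel_overrides is a hypothetical helper, it emits just the runtime.kernel_backend_overrides fragment, and it accepts only the three backend names mentioned on this page. Whether pySTAMPS accepts a config containing only this fragment is an assumption to check.

```python
ALLOWED_BACKENDS = ("python", "native", "cuda")  # backend names used on this page


def render_kernel_overrides(overrides):
    """Render a runtime.kernel_backend_overrides YAML fragment as text."""
    if not overrides:
        raise ValueError("at least one kernel override is required")
    lines = ["runtime:", "  kernel_backend_overrides:"]
    for kernel, backend in overrides.items():
        if backend not in ALLOWED_BACKENDS:
            raise ValueError(f"unknown backend {backend!r} for {kernel}")
        # Four-space indent nests each kernel under kernel_backend_overrides.
        lines.append(f"    {kernel}: {backend}")
    return "\n".join(lines) + "\n"
```

For example, render_kernel_overrides({"stage2_histogram": "python", "stage7_scla": "native"}) returns a fragment that moves one kernel back to the reference implementation while keeping another native.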

Step 5: validate and benchmark

Verification compares your run copy to a reference dataset. Benchmarking records repeatable JSON/CSV timing evidence.

uv run pystamps verify --run /path/to/run_dataset --golden /path/to/reference_dataset
make benchmark
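
Verification can gate later steps in a script. This is a rough sketch with a hypothetical helper name, verify_run, and it assumes that pystamps verify exits with status 0 only when the run copy matches the reference; confirm that behavior before relying on it.

```python
import subprocess


def verify_run(run_dir, golden_dir, runner=subprocess.run):
    """Run `pystamps verify` and return True when it exits with status 0."""
    cmd = [
        "uv", "run", "pystamps", "verify",
        "--run", str(run_dir),
        "--golden", str(golden_dir),
    ]
    # Assumption: a zero exit status means the run copy matched the reference.
    result = runner(cmd)
    return result.returncode == 0
```

A script could call verify_run("/path/to/run_dataset", "/path/to/reference_dataset") and invoke make benchmark only when it returns True.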

Need worker tuning?

uv run pystamps run --dataset /path/to/run_dataset --start-step 1 --end-step 8 --io-workers 12 --cpu-workers 4
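
To try several worker settings, one option is to generate the command lines for a small grid and time each on its own fresh dataset copy, since stages with existing outputs can report skipped_existing. The helper name worker_sweep_commands and the idea of a grid sweep are assumptions for illustration, not a pySTAMPS feature.

```python
import shlex
from itertools import product


def worker_sweep_commands(dataset, io_choices, cpu_choices, start_step=1, end_step=8):
    """Return one shell command line per (io_workers, cpu_workers) pair."""
    commands = []
    for io_workers, cpu_workers in product(io_choices, cpu_choices):
        argv = [
            "uv", "run", "pystamps", "run",
            "--dataset", str(dataset),
            "--start-step", str(start_step),
            "--end-step", str(end_step),
            "--io-workers", str(io_workers),
            "--cpu-workers", str(cpu_workers),
        ]
        # shlex.join quotes any argument that needs it (e.g. paths with spaces).
        commands.append(shlex.join(argv))
    return commands
```

Calling worker_sweep_commands("/path/to/run_dataset", [8, 12], [0, 4]) yields four command lines, one per combination, in the order the choices were given.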

Need a config file?

runtime:
  backend: auto
  stage2_kernel_backend: auto
  stage2_native_threads: 0
  io_workers: 8
  cpu_workers: 0
  stage7_chunk_ps: 100000
  stage8_chunk_edges: 200000
  enable_mat_stage_cache: true
  stage2_checkpoint_mode: final
  stage2_checkpoint_interval: 1

Need the execution map?

Use the five-step flow above: inspect, dry-run, run on a copy, pin the native kernels, then verify and benchmark. Repeat with narrower stage ranges as needed.

CLI flags by command

  • status: --dataset
  • run: --dataset, --start-step, --end-step, --dry-run, --io-workers, --cpu-workers
  • verify: --run, --golden
  • describe-backends: no flags
  • list-legacy: --stamps-root (or the STAMPS_ROOT environment variable)
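
The flag list above can double as a quick sanity check in wrapper scripts. The sketch below simply transcribes that list into a lookup table; KNOWN_FLAGS and unknown_flags are hypothetical names, and the table covers only the flags listed on this page (it omits --config, which is passed before the subcommand).

```python
KNOWN_FLAGS = {
    "status": {"--dataset"},
    "run": {
        "--dataset", "--start-step", "--end-step",
        "--dry-run", "--io-workers", "--cpu-workers",
    },
    "verify": {"--run", "--golden"},
    "describe-backends": set(),
    "list-legacy": {"--stamps-root"},
}


def unknown_flags(command, args):
    """Return the flags in args that this page does not list for command."""
    if command not in KNOWN_FLAGS:
        raise KeyError(f"unknown command: {command}")
    # Keep only tokens that look like long flags; drop any "=value" suffix.
    flags = [arg.split("=", 1)[0] for arg in args if arg.startswith("--")]
    return [flag for flag in flags if flag not in KNOWN_FLAGS[command]]
```

For instance, unknown_flags("verify", ["--dataset", "/path/to/run_dataset"]) returns ["--dataset"], because verify takes --run and --golden rather than --dataset.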