Metadata-Version: 2.4
Name: terraflow-agro
Version: 0.3.0
Summary: TerraFlow: a reproducible workflow for geospatial agricultural modeling.
Author: Gnaneswara (Akhil) Marupilla
License: MIT
Project-URL: Homepage, https://github.com/gmarupilla/AgroTerraFlow
Project-URL: Bug Tracker, https://github.com/gmarupilla/AgroTerraFlow/issues
Project-URL: Documentation, https://terraflow.marupilla.dev
Keywords: geospatial,agriculture,raster,climate,workflow
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: GIS
Classifier: Topic :: Scientific/Engineering :: Information Analysis
Classifier: Operating System :: OS Independent
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: numpy>=1.21.0
Requires-Dist: pandas>=1.3.0
Requires-Dist: pyarrow>=14.0
Requires-Dist: rasterio>=1.2.0
Requires-Dist: pyyaml>=5.4.0
Requires-Dist: pydantic>=2.0
Requires-Dist: scipy>=1.9.0
Requires-Dist: pykrige>=1.7
Requires-Dist: shapely>=2.0.0
Requires-Dist: pyproj>=3.0
Requires-Dist: SALib>=1.5
Requires-Dist: scikit-learn>=1.0
Requires-Dist: typer>=0.12.5
Provides-Extra: viz
Requires-Dist: plotly>=5.0.0; extra == "viz"
Provides-Extra: h3
Requires-Dist: h3<5,>=4.0; extra == "h3"
Provides-Extra: dev
Requires-Dist: pytest>=7.0; extra == "dev"
Requires-Dist: pytest-cov>=3.0; extra == "dev"
Requires-Dist: mypy>=1.10; extra == "dev"
Requires-Dist: pip-licenses>=4.3; extra == "dev"
Requires-Dist: pre-commit>=3.7; extra == "dev"
Requires-Dist: types-PyYAML>=6.0; extra == "dev"
Requires-Dist: ruff; extra == "dev"
Requires-Dist: black; extra == "dev"
Requires-Dist: build; extra == "dev"
Requires-Dist: marimo>=0.19; extra == "dev"
Requires-Dist: matplotlib>=3.7; extra == "dev"
Requires-Dist: seaborn>=0.13; extra == "dev"
Requires-Dist: h3<5,>=4.0; extra == "dev"
Dynamic: license-file

# TerraFlow: Reproducible Geospatial Agricultural Modeling

[![CI](https://github.com/gmarupilla/AgroTerraFlow/actions/workflows/ci.yml/badge.svg)](https://github.com/gmarupilla/AgroTerraFlow/actions/workflows/ci.yml)
[![Deploy Docs](https://github.com/gmarupilla/AgroTerraFlow/actions/workflows/docs.yml/badge.svg)](https://github.com/gmarupilla/AgroTerraFlow/actions/workflows/docs.yml)
[![Publish to PyPI](https://github.com/gmarupilla/AgroTerraFlow/actions/workflows/publish-pypi.yml/badge.svg)](https://github.com/gmarupilla/AgroTerraFlow/actions/workflows/publish-pypi.yml)
[![Build JOSS Manuscript](https://github.com/gmarupilla/AgroTerraFlow/actions/workflows/manuscript.yml/badge.svg)](https://github.com/gmarupilla/AgroTerraFlow/actions/workflows/manuscript.yml)
[![PyPI](https://img.shields.io/pypi/v/terraflow-agro.svg)](https://pypi.org/project/terraflow-agro/)
[![Homebrew Tap](https://img.shields.io/badge/brew-gmarupilla%2Fterraflow-orange.svg)](https://github.com/gmarupilla/homebrew-terraflow)
[![Python Version](https://img.shields.io/badge/python-3.10%2B-blue.svg)](https://www.python.org/downloads/)
[![Quality Gate Status](https://sonarcloud.io/api/project_badges/measure?project=gmarupilla_AgroTerraFlow&metric=alert_status)](https://sonarcloud.io/summary/new_code?id=gmarupilla_AgroTerraFlow)
[![Codecov](https://codecov.io/gh/gmarupilla/AgroTerraFlow/branch/main/graph/badge.svg)](https://codecov.io/gh/gmarupilla/AgroTerraFlow)
[![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](LICENSE)

TerraFlow is a reproducible, config-driven geospatial workflow for agricultural suitability modeling. Give it a land-cover raster, a climate CSV, and a YAML config — it returns a scored, location-stamped results table with full provenance.

**Documentation:** [terraflow.marupilla.dev](https://terraflow.marupilla.dev) — see the [Reproducibility page](https://terraflow.marupilla.dev/reproducibility/) for what the run fingerprint covers and known sources of non-determinism.

---

## Installation

**macOS (Homebrew)** — handles GDAL and PROJ automatically:

```bash
brew tap gmarupilla/terraflow
brew install terraflow
```

**pip / uv:**

```bash
uv pip install terraflow-agro
# or
pip install terraflow-agro
```

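`pykrige` is declared as a core dependency, so a standard install already supports kriging-based interpolation. To install or upgrade it explicitly: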

```bash
pip install terraflow-agro pykrige
```

See [Homebrew install docs](https://terraflow.marupilla.dev/install/homebrew/) for update/uninstall instructions and troubleshooting.

## Quickstart

```bash
terraflow run -c config.yml
```

A minimal config:

```yaml
raster_path: "data/land_cover.tif"
climate_csv: "data/climate.csv"
output_dir: "outputs"
roi:
  type: bbox
  xmin: -120.5
  ymin: 34.0
  xmax: -118.0
  ymax: 35.5
model_params:
  v_min: 0.0
  v_max: 1.0
  t_min: 10.0
  t_max: 35.0
  r_min: 100.0
  r_max: 800.0
  w_v: 0.4
  w_t: 0.3
  w_r: 0.3
```
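
The parameter names suggest a weighted score built from three normalized inputs. The sketch below is one plausible reading and not TerraFlow's confirmed formula: it assumes `v`, `t`, and `r` stand for a vegetation (land-cover) value, temperature, and rainfall, scales each to [0, 1] with its min/max bounds, and combines them with the weights. The function names are hypothetical; consult the Config Schema for the actual definitions.

```python
def normalize(value, lo, hi):
    """Min-max scale value into [0, 1], clipping anything outside [lo, hi]."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    return min(1.0, max(0.0, (value - lo) / (hi - lo)))


def suitability_score(v, t, r, params):
    """Weighted sum of normalized inputs (hypothetical scoring rule)."""
    return (
        params["w_v"] * normalize(v, params["v_min"], params["v_max"])
        + params["w_t"] * normalize(t, params["t_min"], params["t_max"])
        + params["w_r"] * normalize(r, params["r_min"], params["r_max"])
    )


# Same values as the minimal config above.
MODEL_PARAMS = {
    "v_min": 0.0, "v_max": 1.0,
    "t_min": 10.0, "t_max": 35.0,
    "r_min": 100.0, "r_max": 800.0,
    "w_v": 0.4, "w_t": 0.3, "w_r": 0.3,
}

# Each input sits halfway between its bounds.
score = suitability_score(v=0.5, t=22.5, r=450.0, params=MODEL_PARAMS)
```

Under this reading, weights that sum to 1.0 (as in the example config) keep the score within [0, 1].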

Results are written to `outputs/runs/<fingerprint>/`:

```
features.parquet   — scored cells (lat, lon, score, label, …)
results.csv        — same data in CSV
manifest.json      — full provenance record
report.json        — QA stats and timings
```
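
A run directory can be inspected with the standard library alone. The sketch below assumes only the file names listed above and a `score` column in `results.csv`; the helper names `load_run` and `top_cells` are hypothetical and not part of TerraFlow.

```python
import csv
import json
from pathlib import Path


def load_run(run_dir):
    """Return (rows, manifest) for one run directory.

    rows is a list of dicts, one per scored cell, read from results.csv.
    manifest is the parsed manifest.json provenance record.
    """
    run_dir = Path(run_dir)
    with open(run_dir / "results.csv", newline="", encoding="utf-8") as f:
        rows = list(csv.DictReader(f))
    with open(run_dir / "manifest.json", encoding="utf-8") as f:
        manifest = json.load(f)
    return rows, manifest


def top_cells(rows, n=5):
    """Return the n rows with the highest score (CSV values are strings)."""
    return sorted(rows, key=lambda row: float(row["score"]), reverse=True)[:n]
```

For larger runs, `features.parquet` holds the same data and is usually faster to load, for example with `pandas.read_parquet`.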

## CLI subcommands

| Subcommand | Purpose |
|---|---|
| `terraflow run -c config.yml` | Run the full pipeline |
| `terraflow sensitivity -c config.yml` | Sobol' / Morris sensitivity indices for model weights |
| `terraflow validate -c config.yml` | Spatial block CV, Cohen's kappa, Moran's I on residuals |
| `terraflow export --format h3 -c config.yml` | H3-indexed export for interop with H3-native visualization tools (`pip install "terraflow-agro[h3]"`) |

See [CLI docs](https://terraflow.marupilla.dev/cli/usage/) for full reference.

## Climate interpolation

Three spatial algorithms are available via `interpolation_method`:

| Method | Notes |
|---|---|
| `linear` (default) | `scipy.interpolate.griddata`; fast, no extra deps |
| `kriging` | Ordinary Kriging via `pykrige`; adds `{var}_krig_std` uncertainty columns |
| `idw` | Inverse Distance Weighting (power=2) — faster than kriging, no uncertainty |
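
To make the `idw` row concrete, here is a minimal, dependency-free sketch of inverse distance weighting with power 2. It is illustrative only and not TerraFlow's implementation: the function name, the planar Euclidean distance, and the handling of a query point that coincides with a station are all assumptions.

```python
import math


def idw(stations, x, y, power=2.0):
    """Estimate a value at (x, y) from (sx, sy, value) station tuples.

    Each station is weighted by 1 / distance**power, so nearby
    observations dominate the estimate.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, value in stations:
        dist = math.hypot(x - sx, y - sy)
        if dist == 0.0:
            # Query point coincides with a station: return its value directly.
            return value
        weight = 1.0 / dist**power
        weighted_sum += weight * value
        weight_total += weight
    if weight_total == 0.0:
        raise ValueError("at least one station is required")
    return weighted_sum / weight_total


# Two stations on a line; the query point is three times closer to the first.
stations = [(0.0, 0.0, 10.0), (4.0, 0.0, 20.0)]
estimate = idw(stations, 1.0, 0.0)
```

Unlike kriging, this weighting yields no variance estimate, which is why the `idw` method produces no uncertainty columns.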

Combine `interpolation_method: kriging` with `uncertainty_samples: N` in `model_params` to get Monte Carlo score confidence intervals (`score_ci_low` / `score_ci_high`).
For kriging, `variogram_mode: extended` evaluates additional nested variogram candidates and records every candidate's leave-one-out cross-validation (LOOCV) score in `report.json`. For large station networks, keep the default `standard` mode unless nested structures are needed. See the extended variogram notebook in the docs for a worked synthetic example.
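
The idea behind Monte Carlo score intervals can be sketched in a few lines. This is a generic illustration under stated assumptions, not TerraFlow's code: it treats one kriged variable as normally distributed around its estimate with the kriging standard deviation, scores each random draw, and reports the 2.5th and 97.5th percentiles. The function names, the normal assumption, the 95% level, and the single-variable score are all hypothetical.

```python
import random
import statistics


def score_interval(mean, std, score_fn, n_samples=1000, seed=0):
    """Monte Carlo interval for score_fn applied to an uncertain input.

    Draws n_samples values from Normal(mean, std), scores each draw, and
    returns the empirical 2.5th and 97.5th percentiles of the scores.
    """
    rng = random.Random(seed)  # fixed seed keeps the result reproducible
    scores = [score_fn(rng.gauss(mean, std)) for _ in range(n_samples)]
    cuts = statistics.quantiles(scores, n=40)  # cut points every 2.5%
    return cuts[0], cuts[-1]


def temperature_score(t, t_min=10.0, t_max=35.0):
    """Min-max scale a temperature into [0, 1] (hypothetical score)."""
    return min(1.0, max(0.0, (t - t_min) / (t_max - t_min)))


# mean plays the role of the kriged estimate, std of a {var}_krig_std value.
ci_low, ci_high = score_interval(mean=22.5, std=1.5, score_fn=temperature_score)
```

A larger `n_samples` narrows the sampling noise in the interval bounds at the cost of more score evaluations per cell.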

See [Config Schema](https://terraflow.marupilla.dev/config/schema/) for the full reference.

## Python API

```python
from terraflow.pipeline import run_pipeline

results_df = run_pipeline("config.yml")
```

## Development

```bash
git clone https://github.com/gmarupilla/AgroTerraFlow.git
cd AgroTerraFlow
make dev       # create .venv and install dev deps
make test      # run test suite
make lint      # ruff + black
make docs-build
```

## Architecture

Core modules: `cli`, `config`, `climate`, `geo`, `ingest`, `model`, `pipeline`, `stats`, `viz`.

Key design decisions are documented in Architecture Decision Records under `docs/architecture/`.

## Project Scope

TerraFlow is a reproducible pipeline for geospatial agricultural modeling. It
handles raster ingestion, ROI clipping, climate interpolation, suitability
scoring, and deterministic artifact generation.

**In scope:**
- Configuration-driven pipeline execution (YAML → Parquet + provenance artifacts)
- Spatial interpolation of point climate observations (linear, kriging, IDW)
- Per-cell suitability scoring with uncertainty quantification (Monte Carlo)
- Deterministic run fingerprinting and artifact caching
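
As a rough illustration of deterministic run fingerprinting, the sketch below hashes a canonical JSON form of the config together with digests of the input files. It is a generic pattern, not TerraFlow's actual scheme: the function names, the choice of SHA-256, and the 16-character truncation are assumptions. The Reproducibility page describes what the real fingerprint covers.

```python
import hashlib
import json


def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of a file's bytes, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def run_fingerprint(config, input_digests):
    """Hash a config dict plus input-file digests into a short run ID.

    Keys are sorted and separators fixed so that logically equal
    configs always serialize to the same bytes.
    """
    payload = {"config": config, "inputs": input_digests}
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]
```

With a scheme like this, a run whose fingerprint matches an existing `outputs/runs/<fingerprint>/` directory can reuse the cached artifacts instead of recomputing them.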

**Out of scope:**
- Real-time data ingestion or streaming workflows
- General-purpose raster analysis (use `rioxarray` or `rasterstats` instead)
- Cloud-scale distributed processing (no Dask/Spark integration planned)
- Web application or GUI layer

## Maintenance & Support

TerraFlow is actively maintained. Bug fixes are prioritized; the test suite and
CI pipeline are kept green on every commit.

Feature requests are evaluated against project scope — open an issue to discuss
before building. Not all requests will be accepted.

Support is provided on a best-effort basis via [GitHub Issues](https://github.com/gmarupilla/AgroTerraFlow/issues).
Response time is typically within a week. There is no paid support tier.

## Contributing

See [CONTRIBUTING.md](CONTRIBUTING.md).

## Citation

If you use TerraFlow in your research, please cite our JOSS paper (manuscript in preparation).

## License

MIT License — free for academic, commercial, and open-source use.
