Metadata-Version: 2.4
Name: ies-ingest
Version: 0.1.0
Summary: Orchestration and contract layer for IES-shaped state-demand ingestion. Apache 2.0 donation to the IES Accelerator track.
Project-URL: Homepage, https://energymap.in
Project-URL: Repository, https://github.com/india-energy-atlas/ies-ingest
Project-URL: Issues, https://github.com/india-energy-atlas/ies-ingest/issues
Author-email: India Energy Atlas <hello@energymap.in>
License-Expression: Apache-2.0
License-File: LICENSE
License-File: NOTICE
Keywords: energy,freshness,grid,ies,india,india-energy-stack,observability,sldc
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering
Requires-Python: >=3.11
Requires-Dist: pydantic>=2.0
Provides-Extra: dev
Requires-Dist: pytest-asyncio>=0.21; extra == 'dev'
Requires-Dist: pytest>=7.0; extra == 'dev'
Requires-Dist: ruff>=0.1; extra == 'dev'
Description-Content-Type: text/markdown

# ies-ingest

> **A donation into the India Energy Stack (IES) Accelerator track.** Apache 2.0. Maintained by [India Energy Atlas](https://energymap.in) (energymap.in). Produced against IES Architecture v0.4 (Ministry of Power, 27 Mar 2026). *Not* certified IES-compliant — see "Positioning" below.

`ies-ingest` is the **orchestration and contract layer** that sits between raw State Load Dispatch Centre (SLDC) feeds and an IES-shaped consumer API. It is the same code that powers [energymap.in](https://energymap.in)'s public state-demand surface, extracted into a small, dependency-light Python package so other IES participants — pilot DISCOMs, SERCs, regulators, research bodies — can adopt it without inheriting the rest of energymap.

It does NOT include the per-state SLDC scrapers themselves; those live in a sibling collector that will be donated as a separate repo. See "Scope, honestly" below.

---

## Why this exists

IES Architecture v0.4 §"Adoption Strategy" calls for *"sandbox environments, tools, reference solutions, and other ecosystem enablers"* (p. 7). Pilots in Delhi, Gujarat, and Maharashtra need exactly this plumbing today. Rather than wait for an Accelerator-curated solution, we are donating the orchestration layer we already run in production.

The strategic frame, in plain English: REC Limited (the IES nodal authority) does not need another vendor selling APIs. It needs working public-goods code that pilots can pick up and run. This repository is that code.

---

## What's in the box

```
ies_ingest/
├── contract.py          # StateDemandRow, source_kind enum, finaliser
├── canonical_states.py  # Collapse 33+ state-name variants to one canonical key
├── clock.py             # IST-aligned canonical hour ("atlas_now")
├── envelope.py          # Lightweight {atlas_hour, source_health, data} wrapper
├── health.py            # Red / yellow / green from rolling status window
├── sources.py           # Source registry — pull / check / timeout per metric
└── examples/
    └── one_state.py     # End-to-end Delhi demo — fake SLDC row → contract → envelope
```

| Module | Lines | What it does | Why it matters for IES |
|---|---:|---|---|
| `contract.py` | ~200 | Public contract for a state-demand row. Distinguishes `observed` from `modeled` / `synthesized` / `derived` / `missing`. Refuses to serve null observed values. | IES Annexure 3 §"Provenance" — every reading must declare its provenance class. |
| `canonical_states.py` | ~130 | Maps every known SLDC variant ("J&K", "Jammu and Kashmir", "jammu-kashmir") to one canonical key. | Federation requires deterministic identifiers. This is the cheapest possible step toward IES Identity & Addressability (v0.4 pp. 15–16). |
| `envelope.py` + `clock.py` | ~125 | Wraps every API response with an IST `atlas_hour` stamp + source-health label. | Conformance artefact A (trace context) + D (boundary metrics) per cheat-sheet §5c. |
| `health.py` | ~35 | Rolling 3-window health: 0/3 fail = green, 1/3 = yellow, 2/3 or more = red. | Programme Observability — Friction surface (v0.4 pp. 26–28). |
| `sources.py` | ~110 | Registry of `Source(metric, pull, check, timeout, observed_only)`. Async runner with timeout. | IES Programme Observability — Adoption surface. |
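
The provenance rule described for `contract.py` can be illustrated with a small sketch. It is not the package's code: it uses a stdlib dataclass for self-containment (the package itself depends on pydantic), and the names `SourceKind` and `SketchDemandRow` are hypothetical. Only the five provenance classes and the "no null observed values" rule come from the table above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from enum import Enum
from typing import Optional


class SourceKind(str, Enum):
    """The five provenance classes named in the table above."""

    OBSERVED = "observed"
    MODELED = "modeled"
    SYNTHESIZED = "synthesized"
    DERIVED = "derived"
    MISSING = "missing"


@dataclass(frozen=True)
class SketchDemandRow:
    """Hypothetical stand-in for a state-demand row; field names follow the sample envelope."""

    state: str
    timestamp: datetime
    demand_mw: Optional[float]
    source: str
    source_kind: SourceKind

    def __post_init__(self) -> None:
        # An "observed" reading with no value is a contradiction, so reject it
        # at construction time instead of serving a null downstream.
        if self.source_kind is SourceKind.OBSERVED and self.demand_mw is None:
            raise ValueError("observed rows must carry a demand_mw value")


IST = timezone(timedelta(hours=5, minutes=30))
stamp = datetime(2026, 5, 6, 18, 0, tzinfo=IST)

# A valid observed row.
row = SketchDemandRow("delhi", stamp, 6420.5, "metered_sldc", SourceKind.OBSERVED)

# A gap is declared explicitly as "missing" rather than as a null observed value.
gap = SketchDemandRow("delhi", stamp, None, "metered_sldc", SourceKind.MISSING)
```

Constructing an `OBSERVED` row with `demand_mw=None` raises `ValueError` in this sketch, which is one way to make the "refuses to serve null observed values" rule impossible to bypass.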

Total surface: **~600 lines of Python, no Postgres or Prometheus required to import.**
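
The rolling-window rule described for `health.py` could look roughly like the sketch below. It is written under stated assumptions and is not the package's implementation: the class name `RollingHealth` is hypothetical, an empty window is treated as green, and three failures out of three are treated as red.

```python
from collections import deque
from typing import Deque


class RollingHealth:
    """Hypothetical rolling health tracker over the last `window` status checks."""

    def __init__(self, window: int = 3) -> None:
        # deque(maxlen=...) silently drops the oldest result once the window is full.
        self._results: Deque[bool] = deque(maxlen=window)

    def record(self, ok: bool) -> None:
        """Record the outcome of one status check (True = success)."""
        self._results.append(ok)

    def label(self) -> str:
        """Map the number of failures in the window to a colour."""
        failures = sum(1 for ok in self._results if not ok)
        if failures == 0:
            return "green"  # assumption: an empty window also counts as green
        if failures == 1:
            return "yellow"
        return "red"  # assumption: 3/3 failures is red, same as 2/3


health = RollingHealth()
for outcome in (True, False, False):
    health.record(outcome)
print(health.label())  # two failures in the last three checks
```

Because the window is bounded, a source recovers automatically: after enough successful checks the old failures fall out of the deque and the label returns to green.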

---

## Quick start

```bash
git clone https://github.com/india-energy-atlas/ies-ingest
cd ies-ingest
make verify
```

`make verify`:

1. Runs the test suite (`pytest`, no DB required).
2. Runs the Delhi end-to-end example (`python -m ies_ingest.examples.one_state`).
3. Prints a sample IES-shaped envelope so you can inspect the output shape.

Expected output:

```json
{
  "atlas_hour": "2026-05-06T18:00:00+05:30",
  "source_health": "green",
  "data": {
    "state": "delhi",
    "timestamp": "2026-05-06T18:00:00+05:30",
    "demand_mw": 6420.5,
    "source": "metered_sldc",
    "source_kind": "observed"
  }
}
```
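
For readers who want to see how an envelope of this shape could be assembled, here is a minimal sketch. It assumes that the canonical hour is the current time floored to the top of the hour in IST (UTC+05:30); the function names `floor_to_ist_hour` and `wrap` are hypothetical and are not taken from `clock.py` or `envelope.py`.

```python
import json
from datetime import datetime, timedelta, timezone

# IST is a fixed offset with no daylight saving, so a fixed-offset timezone is enough.
IST = timezone(timedelta(hours=5, minutes=30))


def floor_to_ist_hour(now: datetime) -> datetime:
    """Convert an aware datetime to IST and floor it to the top of the hour."""
    if now.tzinfo is None:
        raise ValueError("now must be timezone-aware")
    return now.astimezone(IST).replace(minute=0, second=0, microsecond=0)


def wrap(data: dict, source_health: str, now: datetime) -> dict:
    """Wrap a payload in an {atlas_hour, source_health, data} envelope."""
    return {
        "atlas_hour": floor_to_ist_hour(now).isoformat(),
        "source_health": source_health,
        "data": data,
    }


# 12:47:13 UTC is 18:17:13 IST, which floors to 18:00 IST.
now = datetime(2026, 5, 6, 12, 47, 13, tzinfo=timezone.utc)
hour = floor_to_ist_hour(now)
envelope = wrap(
    {
        "state": "delhi",
        "timestamp": hour.isoformat(),
        "demand_mw": 6420.5,
        "source": "metered_sldc",
        "source_kind": "observed",
    },
    source_health="green",
    now=now,
)
print(json.dumps(envelope, indent=2))
```

Passing `now` in explicitly, rather than calling `datetime.now()` inside the function, keeps the sketch deterministic and easy to test.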

---

## Scope, honestly

This is not the **whole** SLDC ingestion stack. It is the part that:

- ✅ Validates and normalises rows produced by upstream collectors
- ✅ Stamps every response with a canonical hour + source health
- ✅ Provides a registry pattern for plugging in new metrics
- ✅ Maps state-name variants to a canonical key
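
As an illustration of the last bullet, state-name canonicalisation can be sketched as a normalise-then-look-up step. The variant table below is illustrative only, the canonical key `jammu_kashmir` is an assumption, and `canonical_state` is a hypothetical function name rather than the package's API.

```python
import re

# Illustrative variant table; the real module is described as covering every known SLDC variant.
_VARIANTS = {
    "jammu_kashmir": ("J&K", "Jammu and Kashmir", "jammu-kashmir"),
    "delhi": ("Delhi",),
}


def _normalise(name: str) -> str:
    """Lowercase, spell out '&', and collapse punctuation and whitespace to single spaces."""
    lowered = name.lower().replace("&", " and ")
    return re.sub(r"[^a-z0-9]+", " ", lowered).strip()


# Build the lookup once: normalised variant -> canonical key.
_LOOKUP = {
    _normalise(variant): key
    for key, variants in _VARIANTS.items()
    for variant in variants
}


def canonical_state(name: str) -> str:
    """Return the canonical key for a state name, or raise if it is unknown."""
    try:
        return _LOOKUP[_normalise(name)]
    except KeyError:
        # Fail loudly instead of guessing, so identifiers stay deterministic.
        raise ValueError(f"unknown state name: {name!r}") from None


for raw in ("J&K", "Jammu and Kashmir", "jammu-kashmir"):
    print(raw, "->", canonical_state(raw))
```

Raising on unknown names, rather than falling back to fuzzy matching, is a deliberate choice in this sketch: a new variant then shows up as an explicit error that can be reported as an issue.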

It does **not** include:

- ❌ The 33-state SLDC scrapers themselves (httpx / playwright / pdfplumber per-state fetchers). These live in a sibling collector codebase that will be donated separately. Pilots wanting a turn-key collector should watch for `india-energy-atlas/ies-collect` (forthcoming).
- ❌ Postgres / TimescaleDB schemas. The contract is DB-agnostic by design — bring your own store.
- ❌ A Prometheus exporter. The metrics surface is a thin convenience; production deployments wire their own.
- ❌ Modelled-demand fallback (LightGBM tier-2 fill). That requires historical training data that is not yet ours to release.

If you need the full pipeline including collectors, [open an issue](https://github.com/india-energy-atlas/ies-ingest/issues) describing the pilot and the state. We will prioritise extraction by demand.

---

## Positioning — what this repo is and is not

| | This repo IS | This repo is NOT |
|---|---|---|
| Identity | A reference contribution from India Energy Atlas (energymap.in) | India Energy Stack (the Government of India programme led by REC Limited) |
| Status | Apache 2.0, no certification | An IES-certified component |
| Authority | We donate the code; we do not speak for IES | The IES specification, signatory, or conformance authority |
| Use of the IES name | "Produced against IES v0.4 §X (p. NN)" | "IES-compliant" or "IES-certified" |

**We never claim to be IES, speak for IES, or certify anything as IES-compliant.** Every claim about conformance is scoped to a specific section of the published v0.4 architecture, with page references.

---

## Mapping to IES v0.4

| v0.4 building block | Section | This repo's contribution |
|---|---|---|
| Identity & addressability | pp. 15–16 | `canonical_states.py` — first-cut canonicalisation. ERA URN scheme is forthcoming under [IEA-136](https://linear.app/sayon/issue/IEA-136). |
| Registries & directories | p. 17 | `sources.py` — registry pattern; pluggable per-metric. |
| Energy credentials | pp. 18–19 | (Out of scope here — see verifier in IEA-135.) |
| Policies & data | pp. 20–24, 57–58 | `contract.py` enforces a deterministic provenance contract — one of the §"Consistent interpretation" requirements from Annexure 6. |
| Protocols & schemas | pp. 21–24 | `envelope.py` provides the lightweight Atlas freshness envelope. The cryptographically signed IES envelope (with receipts) lives upstream in the consumer API. |
| Programme Observability | pp. 26–28 | `health.py` is the Friction surface in nascent form. Adoption + Impact surfaces live upstream. |
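
The registry pattern listed for `sources.py` can be sketched as follows. The field names mirror the `Source(metric, pull, check, timeout, observed_only)` signature mentioned earlier in this README; everything else (the `REGISTRY` dict, `register`, `run`, and returning `None` on timeout or failed check) is an assumption made for illustration.

```python
import asyncio
from dataclasses import dataclass
from typing import Any, Awaitable, Callable, Dict, Optional


@dataclass(frozen=True)
class Source:
    metric: str
    pull: Callable[[], Awaitable[Any]]  # async fetcher for the metric
    check: Callable[[Any], bool]        # validates the pulled payload
    timeout: float                      # seconds allowed for pull()
    observed_only: bool = True          # carried through; not interpreted in this sketch


REGISTRY: Dict[str, Source] = {}


def register(source: Source) -> None:
    """Add a source to the registry, refusing duplicate metric names."""
    if source.metric in REGISTRY:
        raise ValueError(f"metric already registered: {source.metric}")
    REGISTRY[source.metric] = source


async def run(metric: str) -> Optional[Any]:
    """Pull one metric with a timeout; return None on timeout or failed check."""
    source = REGISTRY[metric]
    try:
        payload = await asyncio.wait_for(source.pull(), timeout=source.timeout)
    except asyncio.TimeoutError:
        return None
    return payload if source.check(payload) else None


async def fake_delhi_pull() -> dict:
    # Stand-in for an upstream collector; no network access.
    await asyncio.sleep(0)
    return {"state": "delhi", "demand_mw": 6420.5}


async def slow_pull() -> dict:
    await asyncio.sleep(1.0)
    return {}


register(Source("delhi_demand", fake_delhi_pull,
                lambda p: p.get("demand_mw") is not None, timeout=0.5))
register(Source("slow_metric", slow_pull, lambda p: True, timeout=0.01))

print(asyncio.run(run("delhi_demand")))
print(asyncio.run(run("slow_metric")))
```

Adding a new metric in this sketch means writing one `pull` coroutine and one `check` function and registering them; the runner and its timeout handling stay unchanged.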

---

## Contributing

See [CONTRIBUTING.md](CONTRIBUTING.md) for the full guide. In short:

- Open issues for bugs, gaps, or new state-name variants.
- PRs welcome. Run `make verify` before pushing.
- Keep the package dependency-light. New runtime deps need a strong justification.
- Be kind: see [CODE_OF_CONDUCT.md](CODE_OF_CONDUCT.md).

---

## Licence

- Code: **Apache License 2.0** — see [LICENSE](LICENSE).
- Documentation in this repo: **CC BY 4.0**.
- Original copyright: India Energy Atlas (energymap.in), 2026.

See [NOTICE](NOTICE) for attribution.

---

## Acknowledgements

- Ministry of Power, REC Limited, FSR Global, ISGF, and the IES Taskforce — for the protocol that makes federated energy data infrastructure possible.
- Saral Systems Council — the Section 8 non-profit context under which this contribution is offered.
- All the SLDC operators across India whose hourly publications are the substrate this layer rides on.
