Metadata-Version: 2.4
Name: forest-data
Version: 0.1.0a0
Summary: Public Python client for the Instituto Forest open-data API.
Author: Instituto Forest
License: MIT
Project-URL: Homepage, https://institutoforest.org
Project-URL: Documentation, https://institutoforest.org/docs/api/v1
Project-URL: Repository, https://github.com/julianopadua/forest-open-data-pipelines
Keywords: forest,open-data,brazil,environment,api-client
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering
Requires-Python: >=3.10
Description-Content-Type: text/markdown
Requires-Dist: httpx>=0.27.0
Provides-Extra: dev
Requires-Dist: pytest>=8.0; extra == "dev"
Requires-Dist: respx>=0.21.0; extra == "dev"

# forest-data

Public Python client for the Instituto Forest open-data API.

The package wraps `https://institutoforest.org/api/v1`. The API returns metadata, official source URLs, and profiling results. Dataset bytes are downloaded from the official source listed in each item's `source_url`; Forest does not mirror raw dataset files in Supabase.

## Install

```bash
pip install forest-data
```

## Basic Use

```python
import forest_data

client = forest_data.Client()

for dataset in client.list_datasets():
    print(dataset.id, dataset.title)

manifest = client.get_dataset("inpe_bdqueimadas_focos")
for item in manifest.items:
    print(item.period, item.source_url, item.profile_status, item.row_count)
```

## Download From Official Sources

```python
import forest_data

client = forest_data.Client()
paths = client.download("inpe_bdqueimadas_focos", path="./data")
```

`download()` follows each item's `source_url`. When an item has a `sha256` value, the SDK verifies the downloaded file against it by default. Items without `sha256` are downloaded without checksum verification.
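
If you want to re-check a file independently of the SDK, the sketch below shows one way to compare a local file against a published `sha256` value using only the standard library. It is not the SDK's internal implementation; `sha256_of_file` and `matches_checksum` are hypothetical helper names, and the sketch assumes `sha256` is published as a hex digest.

```python
import hashlib
from pathlib import Path
from typing import Optional


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large files never sit fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path: Path, expected_sha256: Optional[str]) -> Optional[bool]:
    """Compare a file against an expected hex digest.

    Returns None when no digest is available, so callers can tell
    "unverified" apart from "verified" and "mismatch".
    """
    if not expected_sha256:
        return None
    # Assumption: the published value is a hex string; compare case-insensitively.
    return sha256_of_file(path) == expected_sha256.strip().lower()
```

Returning `None` rather than `True` for items without a digest keeps unverified downloads distinguishable from verified ones in downstream logs.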

## URL Discovery

```python
import forest_data

client = forest_data.Client()
urls = client.get_source_urls("inpe_bdqueimadas_focos")
```

Use `get_dataset_items()` or `iter_items()` when your pipeline needs profile metadata before deciding what to download:

```python
import forest_data

client = forest_data.Client()
for item in client.iter_items("inpe_bdqueimadas_focos"):
    if item.profile_status == "ok" and item.row_count:
        print(item.filename, item.row_count, item.source_url)
```
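
The same filter can feed a download plan. The sketch below is a minimal illustration that runs on stand-in objects rather than real API responses: `plan_downloads` is a hypothetical helper, the `example.org` URLs are placeholders, and it assumes a missing `row_count` is represented as `None`.

```python
from types import SimpleNamespace


def plan_downloads(items, min_rows=1):
    """Return (urls, total_rows) for items profiled "ok" with at least min_rows rows."""
    urls, total_rows = [], 0
    for item in items:
        rows = item.row_count or 0  # assumption: a missing row_count is None
        if item.profile_status == "ok" and rows >= min_rows:
            urls.append(item.source_url)
            total_rows += rows
    return urls, total_rows


# Stand-in items; real ones would come from client.iter_items(...).
sample_items = [
    SimpleNamespace(profile_status="ok", row_count=120, source_url="https://example.org/a.csv"),
    SimpleNamespace(profile_status="failed", row_count=None, source_url="https://example.org/b.csv"),
    SimpleNamespace(profile_status="ok", row_count=0, source_url="https://example.org/c.csv"),
]
urls, total_rows = plan_downloads(sample_items)
print(urls, total_rows)
```

Only the first stand-in item passes the filter: the second failed profiling and the third reports zero rows.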

## Profile Fields

`profiled_at` is the UTC time when Forest finished profiling the item.

`profile_status` describes the profiling result:

- `ok`: the download succeeded and all expected profiling metrics were computed.
- `partial`: the resource was downloaded, but some metrics are incomplete.
- `failed`: the URL was discovered, but download or parsing failed.
- `skipped`: profiling was intentionally not attempted.

`profile_warnings` contains public-safe warning objects with `code` and `message`. Use these warnings to decide whether an item is fit for automated ingestion.
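
One possible ingestion gate built on these fields is sketched below. It is an illustrative policy, not one the SDK prescribes: `ItemProfile` is a hypothetical stand-in for a manifest item, warnings are assumed to be dict-like with a `code` key, and the warning code `row_count_estimated` is invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class ItemProfile:
    """Hypothetical stand-in carrying only the profile fields of an item."""
    profile_status: str
    profile_warnings: list = field(default_factory=list)


def is_fit_for_ingestion(item, tolerated_codes=frozenset()):
    """Accept "ok" or "partial" items whose warning codes are all tolerated."""
    # "failed" and "skipped" items have no usable profile, so reject them.
    if item.profile_status not in {"ok", "partial"}:
        return False
    # Assumption: each warning is a mapping with a "code" key.
    codes = {warning["code"] for warning in item.profile_warnings}
    return codes <= set(tolerated_codes)


partial_item = ItemProfile(
    "partial",
    [{"code": "row_count_estimated", "message": "Row count is an estimate."}],
)
strict = is_fit_for_ingestion(partial_item)
lenient = is_fit_for_ingestion(partial_item, {"row_count_estimated"})
```

With the default empty allow-list, any warning blocks ingestion; passing an explicit set of tolerated codes makes the relaxation visible at the call site.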

## Configuration

Default base URL: `https://institutoforest.org/api/v1`.

Override it with `Client(base_url=...)` or the `FOREST_API_BASE_URL` environment variable.
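
The sketch below illustrates one way such a lookup could resolve, assuming `FOREST_API_BASE_URL` is read from the process environment and that an explicit argument takes precedence over it. The precedence order and the trailing-slash handling are assumptions, not documented SDK behavior, and `resolve_base_url` is a hypothetical helper.

```python
import os

DEFAULT_BASE_URL = "https://institutoforest.org/api/v1"


def resolve_base_url(explicit=None, environ=None):
    """Pick a base URL: explicit argument, then environment, then default."""
    env = os.environ if environ is None else environ
    # Assumed precedence: explicit argument > FOREST_API_BASE_URL > default.
    chosen = explicit or env.get("FOREST_API_BASE_URL") or DEFAULT_BASE_URL
    # Strip trailing slashes so joining endpoint paths cannot yield "//".
    return chosen.rstrip("/")


local = resolve_base_url(environ={"FOREST_API_BASE_URL": "http://localhost:8000/api/v1/"})
print(local)
```

Passing `environ` explicitly keeps the helper easy to test without mutating the real process environment.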

Full API documentation: https://institutoforest.org/docs/api/v1
