Metadata-Version: 2.4
Name: petrinex
Version: 0.3.0
Summary: Load Alberta Petrinex data (Volumetrics, NGL) into Spark/pandas DataFrames
Home-page: https://github.com/guanjieshen/petrinex-python-api
Author: Guanjie Shen
License: MIT
Project-URL: Homepage, https://github.com/guanjieshen/petrinex-python-api
Project-URL: Documentation, https://github.com/guanjieshen/petrinex-python-api#readme
Project-URL: Repository, https://github.com/guanjieshen/petrinex-python-api
Project-URL: Bug Reports, https://github.com/guanjieshen/petrinex-python-api/issues
Keywords: petrinex,alberta,oil,gas,volumetrics,ngl,spark,databricks,unity-catalog,energy-data
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Topic :: Scientific/Engineering
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.7
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: beautifulsoup4>=4.9.0
Requires-Dist: lxml>=4.6.0
Requires-Dist: pandas>=1.2.0
Requires-Dist: requests>=2.25.0
Requires-Dist: pyspark>=3.0.0
Provides-Extra: dev
Requires-Dist: pytest>=6.0; extra == "dev"
Requires-Dist: pytest-cov>=2.0; extra == "dev"
Dynamic: home-page
Dynamic: license-file
Dynamic: requires-python

# Petrinex Python API

Load Alberta Petrinex data (Volumetrics, NGL) into Spark/pandas DataFrames.

> **Note:** Currently supports Alberta (AB) jurisdiction only.

[![Python 3.7+](https://img.shields.io/badge/python-3.7+-blue.svg)](https://www.python.org/downloads/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

## Features

- ✅ Unity Catalog & Databricks Serverless compatible
- ✅ Memory efficient - handles 60+ files without OOM
- ✅ Zero config - automatic ZIP extraction, encoding, error handling
- ✅ Multiple data types - Volumetrics and NGL support

## Quick Start

```bash
pip install petrinex
```

```python
from petrinex import PetrinexClient

# Volumetrics (Alberta only)
client = PetrinexClient(spark=spark, jurisdiction="AB", data_type="Vol")
df = client.read_spark_df(updated_after="2025-12-01")

# NGL and Marketable Gas
ngl_client = PetrinexClient(spark=spark, jurisdiction="AB", data_type="NGL")
ngl_df = ngl_client.read_spark_df(updated_after="2025-12-01")
```

## API

### Load Data

```python
# Spark DataFrame (recommended)
df = client.read_spark_df(updated_after="2025-12-01")

# pandas DataFrame
pdf = client.read_pandas_df(updated_after="2025-12-01")
```

**Date Options:**
- `updated_after="2025-12-01"` - Load only files modified after this date
- `from_date="2021-01-01"` - Load all data from this production month onwards
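Both options are shown with `YYYY-MM-DD` strings. For a rolling window, a string in that format can be computed rather than hard-coded. This is a minimal sketch; `days_ago_iso` is a hypothetical helper and not part of petrinex.

```python
from datetime import date, timedelta
from typing import Optional


def days_ago_iso(days: int, today: Optional[date] = None) -> str:
    """Return the date `days` before `today` as a YYYY-MM-DD string.

    Hypothetical helper for illustration; not part of the petrinex package.
    """
    if days < 0:
        raise ValueError("days must be non-negative")
    today = today or date.today()
    return (today - timedelta(days=days)).isoformat()


# A fixed `today` keeps the example reproducible.
cutoff = days_ago_iso(30, today=date(2025, 12, 31))
print(cutoff)  # 2025-12-01
```

The result can then be passed along, for example `client.read_spark_df(updated_after=days_ago_iso(30))`.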

### Supported Data Types

| Type | Description |
|------|-------------|
| `Vol` | Conventional Volumetrics |
| `NGL` | NGL and Marketable Gas Volumes |

## Databricks

```python
# Install from GitHub
%pip install git+https://github.com/guanjieshen/petrinex-python-api.git

from petrinex import PetrinexClient

client = PetrinexClient(spark=spark, data_type="Vol")
df = client.read_spark_df(updated_after="2025-12-01")
display(df)
```

See [databricks_example.ipynb](databricks_example.ipynb) for a complete example.

## Examples

### Incremental Updates

```python
last_update = spark.sql(
    "SELECT MAX(file_updated_ts) FROM main.petrinex.volumetrics"
).first()[0]

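# MAX() on a timestamp column returns a datetime, which has no .split();
# str() makes this work for both string and timestamp columns.
df = client.read_spark_df(updated_after=str(last_update).split()[0])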
```
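The type of `file_updated_ts` is not stated here, so the query result may be a string, a `datetime`, or `None` when the table is still empty. The sketch below normalises all three cases to a `YYYY-MM-DD` string. `watermark_to_date_str` and its `default` fallback date are assumptions for illustration, not part of petrinex.

```python
from datetime import date, datetime
from typing import Union


def watermark_to_date_str(
    value: Union[None, str, date, datetime],
    default: str = "2020-01-01",  # assumed fallback for an empty table
) -> str:
    """Normalise a MAX(file_updated_ts) result to a YYYY-MM-DD string.

    Hypothetical helper for illustration; not part of the petrinex package.
    """
    if value is None:
        return default
    # datetime is a subclass of date, so check it first.
    if isinstance(value, datetime):
        return value.date().isoformat()
    if isinstance(value, date):
        return value.isoformat()
    # Strings such as "2025-12-01 08:30:00" or "2025-12-01T08:30:00".
    text = str(value).strip().replace("T", " ")
    if not text:
        return default
    return text.split()[0]


print(watermark_to_date_str(datetime(2025, 12, 1, 8, 30)))  # 2025-12-01
```

The returned string can be passed as `updated_after`. Falling back to a fixed date on an empty table is one possible choice; raising an error instead may suit pipelines where an empty table signals a problem.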

### Historical Backfill

```python
df = client.read_spark_df(from_date="2020-01-01")
df.write.format("delta").mode("overwrite").saveAsTable("main.petrinex.volumetrics")
```

## Installation

```bash
# From PyPI
pip install petrinex

# From GitHub
pip install git+https://github.com/guanjieshen/petrinex-python-api.git

# Development
git clone https://github.com/guanjieshen/petrinex-python-api.git
cd petrinex-python-api
pip install -e ".[dev]"
```

## Testing

```bash
pytest tests/ -v                    # All collected tests
pytest tests/ -v -m integration     # Only tests marked "integration"
pytest tests/ --cov=petrinex        # With coverage
```

## Links

- 📦 [PyPI](https://pypi.org/project/petrinex/)
- 📓 [Databricks Example](databricks_example.ipynb)
- 🧪 [Tests](tests/)
- 📋 [Changelog](CHANGELOG.md)
- 🤝 [Contributing](CONTRIBUTING.md)

## License

MIT License - Copyright (c) 2026 Guanjie Shen

See [LICENSE](LICENSE) for full details.
