Metadata-Version: 2.4
Name: polnor
Version: 0.4.0
Summary: Polnor Python SDK — query Iceberg lakehouses, manage notebooks/jobs/models, MLflow-compatible tracking.
Author-email: Polnor <contact@polnor.net>
License: Apache-2.0
Project-URL: Homepage, https://polnor.net
Project-URL: Documentation, https://docs.polnor.net/sdk/python
Project-URL: Source, https://github.com/polnor/polnor
Keywords: polnor,lakehouse,iceberg,sql,mlflow,data-platform
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: License :: OSI Approved :: Apache Software License
Requires-Python: >=3.9
Description-Content-Type: text/markdown
Requires-Dist: requests>=2.28
Provides-Extra: test
Requires-Dist: pytest>=7; extra == "test"
Requires-Dist: responses>=0.24; extra == "test"

# Polnor Python SDK

MLflow-compatible tracking for jobs running on the Polnor lakehouse.

## Install

```bash
pip install polnor
```

Or, from a local checkout (contributors / pre-release):

```bash
pip install -e ./sdk/python
```

## Configure

Set the following environment variables before importing the SDK (two are required, two are optional):

| Variable | Required | Notes |
|---|---|---|
| `POLNOR_API_URL` | yes | e.g. `https://api.polnor.net` |
| `POLNOR_TOKEN` | yes | Personal access token from `/settings/tokens` |
| `POLNOR_EXPERIMENT_ID` | optional | Default experiment for `start_run()` |
| `POLNOR_WORKSPACE_SLUG` | optional | When the token isn't workspace-scoped |

Inside a Polnor job these are injected automatically.
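
Outside a Polnor job, it can help to check the required variables up front so that a missing value fails early with a clear message. The following is a minimal sketch: the helper `missing_polnor_env` is hypothetical and not part of the SDK; only the variable names come from the table above.

```python
import os

# Variable names are taken from the table above; the helper itself is
# hypothetical and not part of the polnor package.
REQUIRED_VARS = ("POLNOR_API_URL", "POLNOR_TOKEN")


def missing_polnor_env(environ=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

Calling `missing_polnor_env()` before `from polnor import mlflow` and aborting when the returned list is non-empty is one way to use it.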

## Quickstart

```python
import os

from polnor import mlflow

with mlflow.start_run(run_name="train-resnet50") as run:
    mlflow.log_params({"lr": 0.01, "batch_size": 32, "optimizer": "adam"})

    for epoch in range(10):
        # train_one_epoch() stands in for your own training step.
        loss, acc = train_one_epoch()
        mlflow.log_metrics({"loss": loss, "accuracy": acc}, step=epoch)

    # Tag the run with the CI commit SHA, falling back to "dev" locally.
    mlflow.set_tag("git_commit", os.environ.get("CI_COMMIT_SHA", "dev"))
```

A run can also be started and ended explicitly, without the `with` block:

```python
run = mlflow.start_run()
try:
    mlflow.log_metric("score", 0.93)
finally:
    mlflow.end_run("completed")
```

## What's supported

| API | Status |
|---|---|
| `start_run` / `end_run` / `active_run` | ✅ |
| `log_metric` / `log_metrics` (latest value wins) | ✅ |
| `log_param` / `log_params` | ✅ |
| `set_tag` / `set_tags` | ✅ |
| `autolog()` context manager | ✅ |
| Per-step metric history (time series) | ⏳ planned |
| `log_artifact` (presigned-URL upload) | ⏳ planned |
| Sklearn / PyTorch / TF auto-instrumentation | ⏳ planned |
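
To illustrate the "latest value wins" row: while per-step history is still planned, logging the same key at several steps leaves only the final value on the run. The sketch below models that merge with a plain dictionary and keeps a local history alongside it. `LocalMetricBuffer` is a hypothetical helper, not an SDK class, and the dictionary merge only approximates the server behavior described here.

```python
from collections import defaultdict


class LocalMetricBuffer:
    """Hypothetical client-side helper; not part of the polnor SDK."""

    def __init__(self):
        self.latest = {}                  # models what the server keeps
        self.history = defaultdict(list)  # kept locally only

    def log_metrics(self, metrics, step):
        # Merging overwrites any earlier value stored under the same key.
        self.latest.update(metrics)
        for key, value in metrics.items():
            self.history[key].append((step, value))


buf = LocalMetricBuffer()
buf.log_metrics({"loss": 0.9}, step=0)
buf.log_metrics({"loss": 0.4, "accuracy": 0.8}, step=1)
```

After these two calls, `buf.latest` holds only the step-1 loss, while `buf.history["loss"]` still contains both points and could be written to a file if the time series matters.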

## How it differs from upstream MLflow

* Server-side, metrics merge into a single JSONB blob, so the latest
  value per key wins. Storing the full `(step, value, timestamp)`
  history is planned for a follow-up migration.
* There is no server-side parameter immutability check yet: calling
  `log_param("lr", 0.02)` after `log_param("lr", 0.01)` silently
  overwrites the first value. Upstream MLflow rejects the second call
  with HTTP 400.
* There is no `mlflow.search_runs()` yet. Use the REST API
  (`GET /api/v1/experiments/:id/runs`) directly until the SDK wraps it.
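
Until `search_runs()` is wrapped, the REST endpoint named above can be called by hand. The sketch below only builds the request and does not send it. The bearer-token `Authorization` header is an assumption, as is the example experiment ID; neither the auth scheme nor the response shape is documented in this README, so check the API documentation before relying on them. `build_runs_request` is a hypothetical helper name.

```python
import urllib.request


def build_runs_request(api_url, token, experiment_id):
    """Build (but do not send) a GET request for an experiment's runs."""
    url = f"{api_url.rstrip('/')}/api/v1/experiments/{experiment_id}/runs"
    # ASSUMPTION: the API accepts the token as a bearer token.
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


# Placeholder values; in practice read POLNOR_API_URL and POLNOR_TOKEN.
req = build_runs_request("https://api.polnor.net", "example-token", "42")
```

Passing `req` to `urllib.request.urlopen` would perform the call; the same request can be expressed with `requests.get`, which the SDK already depends on.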

## Development

```bash
pip install -e ".[test]"
pytest
```

Tests use [responses](https://github.com/getsentry/responses) to mock
HTTP — no live API needed.
