Metadata-Version: 2.4
Name: rosenbound
Version: 0.1.1
Summary: Official Python SDK for the Rosenbound clinical-AI platform
Project-URL: Homepage, https://rosenbound.com
Project-URL: Documentation, https://rosenbound.com/docs/sdk/python
Author-email: Harsh Singh <harsh@rosenbound.com>
License: Apache-2.0
License-File: LICENSE
Keywords: causal-inference,clinical-ai,pharmacovigilance,rosenbound,sdk
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Healthcare Industry
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Typing :: Typed
Requires-Python: >=3.10
Requires-Dist: httpx>=0.26
Requires-Dist: pydantic>=2.5
Requires-Dist: typing-extensions>=4.8
Provides-Extra: dev
Requires-Dist: mypy>=1.9; extra == 'dev'
Requires-Dist: pytest-cov>=5.0; extra == 'dev'
Requires-Dist: pytest>=8.0; extra == 'dev'
Requires-Dist: ruff>=0.6; extra == 'dev'
Description-Content-Type: text/markdown

# rosenbound

Official Python SDK for the **Rosenbound** clinical-AI platform — a causal,
auditable clinical decision-support platform for structured medical and
pharmacological data.

The SDK is a typed, Pydantic-based wrapper over the platform's REST API
(platform API **v1**). It is sync-first; an asynchronous client is reserved for
a future release.

> **Status: v0.1.1 (alpha).** This release implements the **Cohorts**
> (list / get / create), **Studies** (list / get / create / run / results), and
> **Certificates** (reproducibility certificate + methodology PDF) surfaces —
> the full cohort → study → run → result → certificate loop. An asynchronous
> client, SSO auth, certificate-chain `verify()`, and per-feature resources
> (MedDRA, E2B, drug-coder, …) are reserved for a future release.

---

## Installation

```bash
pip install rosenbound
```

Requires Python ≥ 3.10. Runtime dependencies: `httpx`, `pydantic` (v2),
`typing-extensions`.

During pre-release the package installs from a local checkout:

```bash
cd code/sdk/python
pip install -e ".[dev]"
```

## Authentication

The SDK authenticates with a platform **API token**, sent on every request as
`Authorization: Bearer <token>`. Tokens are issued from the in-product API-key
management surface (which requires an admin role). The token is opaque to the
SDK; the platform validates it and resolves your tenant + role from it.

```python
from rosenbound import Client

client = Client(api_key="rk_your_token_here")
```

By default the client talks to `https://api.rosenbound.com`. Point it elsewhere
(e.g. a staging host) with `base_url`:

```python
client = Client(api_key="rk_...", base_url="https://staging.api.rosenbound.com")
```
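
To keep the token out of source control, one option is to read it from the environment before constructing the client. The sketch below is a minimal example under stated assumptions: the variable name `ROSENBOUND_API_KEY` and the helper `load_api_key` are hypothetical and not part of the SDK.

```python
import os
from collections.abc import Mapping

# Hypothetical variable name; the SDK does not define one.
API_KEY_ENV_VAR = "ROSENBOUND_API_KEY"


def load_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Return the API token from the environment, failing loudly if it is absent."""
    token = env.get(API_KEY_ENV_VAR, "").strip()
    if not token:
        raise RuntimeError(f"{API_KEY_ENV_VAR} is not set")
    return token


# Usage (requires the rosenbound package):
#   from rosenbound import Client
#   client = Client(api_key=load_api_key())
```

Failing at startup when the variable is missing tends to be easier to diagnose than a later `AuthenticationError` on the first request.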

## Security & access control

The SDK is intentionally open-source (Apache 2.0). Platform access is not.
Every protected endpoint the SDK calls enforces the following stack
server-side. The SDK is a thin typed wrapper; the audit + isolation
guarantees live in the platform, not in the client.

1. **Bearer-token verification** via FastAPI's `get_current_user` middleware.
   Tokens are JWTs minted from the in-product API-key management surface
   (admin role required). Tokens are never embedded in the SDK source or the
   PyPI distribution.

2. **RBAC permission check** via the `require_permission(...)` dependency on
   every protected route. Endpoints declare their required permission inline;
   missing permissions return HTTP 403 with no data echo.

3. **Tenant scoping** via `TenantContextMiddleware` + Postgres Row-Level
   Security policies. Cross-tenant reads and writes are server-rejected even
   with a valid token; the platform reads nothing outside the caller's
   organization.

4. **Append-only audit log** on every regulated write. SHA-256-chained
   `AppendOnlyAuditLog` entries + a mirrored `audit_logs` row + 21 CFR Part 11
   ALCOA+ attestation fields. Ledger writes are fail-loud: a write that
   cannot anchor to the SHA chain rolls back the entire request rather than
   persisting unanchored.

5. **Transport security.** HSTS preload, HTTPS-only on production surfaces,
   TLS 1.2+ enforced, secure + httpOnly + SameSite cookies. The pre-launch
   staging surface carries an additional HTTP Basic Auth gate for
   partner-only isolation.

6. **SOC 2 Type I in flight.** Auditor engagement active; CC6.6 MFA gate
   closed; encryption-at-rest via `pgcrypto`; disaster-recovery runbook
   tested. BAA + DPA + MSA templates ready for partner signing.

7. **No LLM in the prediction path.** Pharmacovigilance triage, causal
   estimation, and audit-anchored writes never touch a large language model.
   Classical ML (LightGBM, causal-forest), calibrated ensembles, and
   symbolic rule engines only: the substrate an FDA reviewer can audit
   without trusting a foundation-model black box.

`ApiError` (raised when an SDK call fails) carries the platform-side `request_id`
on its `.request_id` attribute. Include this in support correspondence to
correlate against the platform's audit log.
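
For readers unfamiliar with hash-chained logs (item 4 above), the general technique can be sketched as follows. This illustrates the idea only: the entry layout, the genesis value, and the function names are assumptions, and the platform's actual `AppendOnlyAuditLog` format is not documented here.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed starting value for an empty chain


def entry_hash(prev_hash: str, payload: dict) -> str:
    """SHA-256 over the previous hash plus a canonical JSON encoding of the payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append(chain: list[dict], payload: dict) -> None:
    """Append an entry whose hash commits to every entry before it."""
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({"payload": payload, "prev": prev, "hash": entry_hash(prev, payload)})


def verify(chain: list[dict]) -> bool:
    """Recompute every link; an edited or reordered entry breaks the chain."""
    prev = GENESIS
    for entry in chain:
        if entry["prev"] != prev or entry["hash"] != entry_hash(prev, entry["payload"]):
            return False
        prev = entry["hash"]
    return True
```

Because each hash covers the previous one, changing any earlier entry invalidates every later link, which is what makes such a log tamper-evident.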

## Quick start — cohorts

```python
from pathlib import Path

from rosenbound import Client

with Client(api_key="rk_...") as client:
    # Create a cohort from a CSV file plus a cohort-definition DSL.
    cohort = client.cohorts.create(
        name="metformin_t2dm_2026",
        archetype="treatment_policy",
        dsl_yaml=Path("path/to/definition.yaml"),   # YAML text, or a Path to a YAML file
        csv_path="my_cohort.csv",
    )
    print(cohort.id, cohort.status)

    # List cohorts (paginated, filterable by archetype + name search).
    page = client.cohorts.list(archetype="treatment_policy", page=1, page_size=50)
    print(f"{len(page.items)} of {page.total} cohorts")
    for c in page.items:
        print(c.id, c.name, c.archetype)

    # Fetch one cohort by id.
    detail = client.cohorts.get(cohort.id)
    print(detail.cohort_def_hash, detail.data_hash)
```
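
The list call above returns one page at a time. A small generator can walk every page; the sketch below assumes pages are 1-indexed (as in `page=1` above) and that `total` counts items across all pages. The helper `iter_all` is hypothetical and not part of the SDK.

```python
from collections.abc import Callable, Iterator
from typing import Any


def iter_all(list_fn: Callable[..., Any], page_size: int = 50, **filters: Any) -> Iterator[Any]:
    """Yield every item from a paginated list call, one page at a time."""
    page_number = 1  # assumed 1-indexed, matching page=1 in the quick start
    seen = 0
    while True:
        page = list_fn(page=page_number, page_size=page_size, **filters)
        if not page.items:
            return  # an empty page ends the walk even if total was overstated
        yield from page.items
        seen += len(page.items)
        if seen >= page.total:
            return
        page_number += 1


# Usage (requires the rosenbound package and an open client):
#   for c in iter_all(client.cohorts.list, archetype="treatment_policy"):
#       print(c.id, c.name)
```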

### Cohort archetypes

`archetype` is one of:

| value | meaning |
|---|---|
| `pv` | pharmacovigilance pentagon cohort |
| `longitudinal` | repeated-measures / longitudinal cohort |
| `treatment_policy` | treatment-policy (A-vs-B) cohort |
| `time_to_event` | survival / time-to-event cohort |

The `dsl_yaml` cohort definition must validate against the chosen archetype,
and every column it references must exist in the uploaded CSV header — the
platform rejects a mismatch with a `ValidationError` at upload time.
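
A local pre-flight check can catch a missing column before the upload round trip. The sketch below compares a caller-supplied list of column names against the CSV header. It does not parse the DSL, because the DSL schema is not described here; `missing_columns` and the column names in the usage comment are hypothetical.

```python
import csv
from pathlib import Path


def missing_columns(csv_path: Path, required: list[str]) -> list[str]:
    """Return the required column names that are absent from the CSV header row."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        header = next(csv.reader(handle), [])  # an empty file yields an empty header
    present = {name.strip() for name in header}
    return [name for name in required if name not in present]


# Usage (column names are illustrative only):
#   gaps = missing_columns(Path("my_cohort.csv"), ["patient_id", "exposure", "outcome"])
#   if gaps:
#       raise SystemExit(f"CSV is missing columns: {gaps}")
```

The platform remains the authority on validation; this check only shortens the feedback loop.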

## Studies

A study runs a causal protocol against a cohort and produces a 5-method
(Pentagon) estimate set. The workflow is **create → run → fetch results**:

```python
from rosenbound import Client

# `cohort` is the cohort created in the quick start above.
with Client(api_key="rk_...") as client:
    study = client.studies.create(
        cohort_id=cohort.id,
        protocol_id="pv_pentagon",
        estimator_selection={},          # per-protocol config; {} = protocol defaults
    )
    print(study.id, study.status)        # -> "draft"

    # run() executes synchronously: it returns once the Pentagon run reaches a
    # terminal state, so the returned study is normally status="complete".
    completed = client.studies.run(study.id, seed=17)   # seed pins reproducibility
    print(completed.status)              # -> "complete"

    result = client.studies.get_results(study.id)
    print(result.pentagon_result_payload)        # the full 5-method estimate set

    # List + filter (single status value, optional cohort filter, paginated).
    page = client.studies.list(cohort_id=cohort.id, status="complete")
    print(f"{len(page.items)} of {page.total} studies")
```

`protocol_id` is one of:

| value | meaning |
|---|---|
| `pv_pentagon` | pharmacovigilance 5-method Pentagon |
| `cr_longitudinal` | clinical-research longitudinal protocol |
| `cr_treatment_policy` | treatment-policy (A-vs-B) protocol |
| `cr_tte` | time-to-event / survival protocol |

`run()` is synchronous today. If a run is already in progress it raises
`ConflictError` (409); a server-side run failure raises `ServerError` (500).
When the platform later adds an asynchronous job model, a blocking
`wait_for_completion` poller will land alongside the async client (v0.2).

## Certificates

Every completed run mints a **reproducibility certificate** — the hashes and
environment pins an external auditor re-derives the result from. The
certificate is embedded in the study response; `certificates.get()` surfaces it:

```python
from rosenbound import Client

# `study` is the completed study from the Studies section above.
with Client(api_key="rk_...") as client:
    cert = client.certificates.get(study.id)
    print(cert.code_version)        # the code version that produced the result
    print(cert.cohort_def_yaml_hash, cert.cohort_data_hash)
    print(cert.library_versions)    # {package: version} pin map
    print(cert.generated_at)

    # Download the human-readable methodology PDF.
    client.certificates.download_pdf(study.id, output_path="methodology.pdf")
```

| field | meaning |
|---|---|
| `cohort_def_yaml_hash` | SHA-256 of the cohort-definition YAML |
| `cohort_data_hash` | SHA-256 of the input cohort data |
| `code_version` | resolved code version (git SHA) that produced the result |
| `library_versions` | `{package: version}` environment pin map |
| `generated_at` | server-side timestamp the certificate was minted |

`certificates.get()` raises `NotFoundError` if the study has no completed run
yet. `download_pdf()` raises `ServerError` if the methodology PDF has not been
materialized for the latest run (the platform returns 503 until the generator
has produced the artifact). Re-validating the certificate chain against the
platform's authoritative ledger — `certificates.verify()` — is a v0.2 stub that
currently raises `NotImplementedError`.
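
Until `verify()` is available, a rough local spot check is to hash the input file yourself and compare it with `cohort_data_hash`. This sketch hashes the raw file bytes, which is an assumption: the text above does not say whether the platform hashes raw bytes or a canonicalized form, so a mismatch alone does not show the data changed. `sha256_file` is a hypothetical helper.

```python
import hashlib
from pathlib import Path


def sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file's raw bytes, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"" at EOF.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Usage (requires a certificate fetched with the SDK):
#   matches = sha256_file(Path("my_cohort.csv")) == cert.cohort_data_hash
```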

## End-to-end example

The full loop in one block — upload a cohort, define and run a study, read the
result, and fetch its reproducibility certificate:

```python
from pathlib import Path

from rosenbound import Client

with Client(api_key="rk_...") as client:
    cohort = client.cohorts.create(
        name="metformin_t2dm_2026",
        archetype="treatment_policy",
        dsl_yaml=Path("definition.yaml"),   # Path to the YAML file
        csv_path="my_cohort.csv",
    )
    study = client.studies.create(
        cohort_id=cohort.id,
        protocol_id="pv_pentagon",
    )
    client.studies.run(study.id, seed=17)
    result = client.studies.get_results(study.id)
    cert = client.certificates.get(study.id)

    print("estimates:", result.pentagon_result_payload)
    print("reproducible under code version:", cert.code_version)
```

## Error handling

Every SDK error derives from `RosenboundError`. The HTTP layer maps response
status codes to specific subclasses:

```python
from rosenbound import (
    Client,
    AuthenticationError,  # 401 — invalid / missing API key
    NotFoundError,        # 404 — absent, or not visible to your tenant
    ValidationError,      # 422 — payload rejected (carries .errors)
    ConflictError,        # 409 — state conflict (e.g. study already running)
    RateLimitError,       # 429 — rate limit (carries .retry_after_seconds)
    ServerError,          # 5xx — platform fault, after retries are exhausted
    RosenboundError,      # base class — catch-all
)

with Client(api_key="rk_...") as client:
    try:
        client.cohorts.create(
            name="bad",
            archetype="pv",
            dsl_yaml=open("definition.yaml").read(),
            csv_path="cohort.csv",
        )
    except ValidationError as exc:
        for err in exc.errors:        # the platform's structured error body
            print(err)
    except RateLimitError as exc:
        print("retry after", exc.retry_after_seconds, "s")
```

A 404 means "not found *for you*" — the platform returns 404 (never 403) for
cross-tenant reads so resource existence never leaks across tenants.

## Retries and timeouts

The client retries transient failures (HTTP **429** and **5xx**) up to
`max_retries` times with deterministic exponential backoff
(`0.5 * 2 ** attempt` seconds). A `429` carrying a numeric `Retry-After` header
honors that value instead. Non-retryable errors (4xx other than 429) raise
immediately.

```python
client = Client(
    api_key="rk_...",
    timeout=30.0,      # per-request timeout, seconds (default 30)
    max_retries=3,     # additional attempts on 429 / 5xx (default 3)
)
```
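
The backoff rule described above can be written out as a small function. This is a sketch of the stated formula, not the SDK's internal code; it assumes attempts are numbered from 0, and `backoff_delay` is a hypothetical name.

```python
from typing import Optional


def backoff_delay(attempt: int, retry_after: Optional[str] = None, base: float = 0.5) -> float:
    """Seconds to wait before the next retry.

    A numeric Retry-After value wins; otherwise the delay is base * 2 ** attempt.
    """
    if retry_after is not None:
        try:
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # non-numeric (e.g. an HTTP date): fall back to exponential backoff
    return base * 2 ** attempt


# With attempts numbered from 0, three retries wait 0.5 s, 1.0 s, then 2.0 s.
schedule = [backoff_delay(attempt) for attempt in range(3)]
```

Under that numbering, the default `max_retries=3` adds at most 3.5 seconds of waiting on top of the request time itself.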

Network-layer failures (connection refused, DNS, read timeout) propagate as the
underlying `httpx` transport exceptions and are not wrapped.

## Resource management

`Client` owns a connection pool. Use it as a context manager, or call
`close()` explicitly:

```python
with Client(api_key="rk_...") as client:
    ...  # pool is released on exit

# or
client = Client(api_key="rk_...")
try:
    ...
finally:
    client.close()
```

A single client is safe for serial use from one thread; construct one client
per thread for concurrent use.
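
One way to follow the one-client-per-thread rule is a thread-local holder that builds an object lazily on first use in each thread. The sketch below is generic and takes any zero-argument factory; the class `PerThread` is hypothetical and not part of the SDK.

```python
import threading
from typing import Callable, Generic, TypeVar

T = TypeVar("T")


class PerThread(Generic[T]):
    """Lazily build one object per thread from a zero-argument factory."""

    def __init__(self, factory: Callable[[], T]) -> None:
        self._factory = factory
        self._local = threading.local()

    def get(self) -> T:
        try:
            return self._local.instance
        except AttributeError:
            # First call on this thread: build and remember its own instance.
            self._local.instance = self._factory()
            return self._local.instance


# Usage (requires the rosenbound package):
#   clients = PerThread(lambda: Client(api_key="rk_..."))
#   clients.get().cohorts.list()   # each thread sees its own Client
```

Each thread's client still needs `close()` when that thread finishes; the holder does not release connection pools on its own.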

## Compatibility

- Python: ≥ 3.10
- Platform API: v1
- Concurrency: synchronous client only (async reserved for v0.2)
- SDK version: see `rosenbound.__version__`

## License

Apache-2.0. See [LICENSE](./LICENSE).

## Contact

Issues and questions: the Rosenbound platform team.
