Metadata-Version: 2.4
Name: cerebral-sdk
Version: 0.1.0
Summary: Python SDK for the Cerebral data versioning API
Project-URL: Homepage, https://github.com/cerebral-storage/cerebral-python-sdk
Project-URL: Documentation, https://cerebral-storage.github.io/cerebral-python-sdk
Author: Cerebral
License-Expression: Apache-2.0
License-File: LICENSE
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Typing :: Typed
Requires-Python: >=3.11
Requires-Dist: httpx<1.0,>=0.25.0
Provides-Extra: dev
Requires-Dist: mypy>=1.10; extra == 'dev'
Requires-Dist: pre-commit>=3.0; extra == 'dev'
Requires-Dist: pytest-cov>=5.0; extra == 'dev'
Requires-Dist: pytest>=7.0; extra == 'dev'
Requires-Dist: respx>=0.21.0; extra == 'dev'
Requires-Dist: ruff>=0.4.0; extra == 'dev'
Provides-Extra: docs
Requires-Dist: mkdocs-material>=9.0; extra == 'docs'
Requires-Dist: mkdocstrings[python]>=0.24; extra == 'docs'
Description-Content-Type: text/markdown

# Cerebral Python SDK

Python SDK for the [Cerebral](https://cerebral.storage) data versioning API.

## Installation

```bash
pip install cerebral-sdk
```

Or with [uv](https://github.com/astral-sh/uv):

```bash
uv add cerebral-sdk
```

## Quick Start

### Using environment variables (simplest)

```bash
export CEREBRAL_API_KEY="your-api-key"
```

```python
import cerebral

repo = cerebral.repository("my-org", "my-repo")
print(repo.description)  # lazy-loaded on first access

branch = repo.branches.get("main")
branch.objects.put("data/example.csv", b"col1,col2\na,b\n")
branch.commit(message="added example csv")
```

### Explicit configuration

```python
import cerebral

cerebral.configure(api_key="your-api-key", endpoint_url="https://custom.endpoint")
repo = cerebral.repository("my-org", "my-repo")
```

### Explicit client (most flexible)

```python
from cerebral import Client

with Client(api_key="your-api-key") as client:
    repo = client.repository("my-org", "my-repo")
    branch = repo.branches.get("main")
```

## Configuration

| Option | Environment Variable | Default |
|--------|---------------------|---------|
| `api_key` | `CEREBRAL_API_KEY` | *required* |
| `endpoint_url` | `CEREBRAL_ENDPOINT_URL` | `https://api.cerebral.storage` |

Resolution order: explicit parameter > environment variable > default.

A missing API key is not an error at construction time; a `ConfigurationError` is raised
only when the first request is made.

## Usage

### Repositories

```python
repo = cerebral.repository("my-org", "my-repo")

# Lazy-loaded properties
print(repo.id, repo.description, repo.visibility)

# Update
repo.update(description="New description", visibility="public")

# Delete
repo.delete()
```

### Branches

```python
branches = repo.branches

# Create
branch = branches.create("feature-branch", source="main")

# Get
branch = branches.get("main")
print(branch.commit_id)

# List (auto-paginating)
for record in branches.list():
    print(record.branch_id)

# Delete
branches.delete("feature-branch")
```

### Objects

```python
branch = repo.branches.get("main")

# Upload
branch.objects.put("data/file.csv", b"content")

# Download (streaming)
with branch.objects.get("data/file.csv") as f:
    data = f.read()

# Stream large objects without caching
with branch.objects.get("data/large.bin", cache=False) as f:
    for chunk in f.iter_bytes(chunk_size=8192):
        process(chunk)

# List (auto-paginating, with directory grouping)
for entry in branch.objects.list(prefix="data/", delimiter="/"):
    print(entry.path, entry.type)  # type is "object" or "prefix"

# Check metadata
meta = branch.objects.head("data/file.csv")
print(meta.etag, meta.content_type, meta.content_length)

# Delete
branch.objects.delete("data/file.csv")
```

### Commits

```python
# Create a commit
commit_id = branch.commit(message="update data", metadata={"source": "etl"})

# Browse commit log
for entry in branch.log():
    print(entry.id, entry.message)

# Access a specific commit
commit = repo.commits.get("abc123")
print(commit.committer, commit.message, commit.creation_date)

# Read objects at a specific commit
for entry in commit.objects.list():
    print(entry.path)
```

### Tags

```python
# Create
repo.tags.create("v1.0", commit_id="abc123")

# Get (read-only object access)
tag = repo.tags.get("v1.0")
print(tag.commit_id)
with tag.objects.get("data.csv") as f:
    data = f.read()

# List
for record in repo.tags.list():
    print(record.tag_id)

# Delete
repo.tags.delete("v1.0")
```

### Diff and Merge

```python
# Diff two refs
for entry in repo.diff.diff("main", "feature", delimiter="/"):
    print(entry.path, entry.status)

# Uncommitted changes
for entry in repo.diff.uncommitted("main"):
    print(entry.path)

# Merge
commit_id = repo.merge.merge("feature", "main", message="merge feature")

# Or from a branch object
commit_id = branch.merge_into("main", message="merge feature")
```

### Organizations

```python
client = cerebral.Client(api_key="your-api-key")
orgs = client.organizations

# Create
org = orgs.create("my-org", "My Organization")

# List
for org in orgs.list():
    print(org.name)

# Members
for member in orgs.members("my-org").list():
    print(member.username, member.role)

orgs.members("my-org").add(user_id="user-uuid", role="member")
```

### Groups

```python
groups = client.organizations.groups("my-org")

group = groups.create("engineers", description="Engineering team")
detail = groups.get(group.id)  # includes members and attachments

groups.add_member(group.id, "user", "user-uuid")
groups.remove_member(group.id, "user", "user-uuid")
```

### Policies

```python
policies = client.organizations.policies("my-org")

# Create and validate
result = policies.validate("package cerebral.authz\ndefault allow = true")
policy = policies.create("allow-all", rego="...", description="Allow everything")

# Attach/detach
policies.attach(policy.id, "group", "group-uuid")
policies.detach(policy.id, "group", "group-uuid")

# Effective policies for a user
for ep in policies.effective_policies("user-uuid"):
    print(ep.policy_name, ep.source)
```

### Connectors and Imports

```python
# Org-level connectors
connectors = client.organizations.connectors("my-org")
conn = connectors.create("my-s3", "s3", {"bucket": "my-bucket", "region": "us-east-1"})

# Attach to repo
repo.connectors.attach(conn.id)

# Import
job_id = repo.imports.start(
    branch="main",
    connector_id=conn.id,
    source_path="s3://my-bucket/data/",
    destination_path="imported/",
)
status = repo.imports.status(job_id)
print(status.status, status.objects_imported)
```

## Error Handling

All SDK exceptions inherit from `CerebralError`:

```
CerebralError                        # base for all SDK errors
+-- ConfigurationError               # missing API key, bad endpoint
+-- TransportError                   # network failures, DNS, timeouts
+-- SerializationError               # invalid JSON in response
+-- APIError                         # base for HTTP API errors
    +-- BadRequestError              # 400
    +-- AuthenticationError          # 401
    +-- ForbiddenError               # 403
    +-- NotFoundError                # 404
    +-- ConflictError                # 409
    +-- GoneError                    # 410
    +-- PreconditionFailedError      # 412
    +-- LockedError                  # 423
    +-- ServerError                  # 5xx
```

`APIError` includes `status_code`, `message`, `code`, `request_id`, `method`, `url`, and
`response_text` for debugging.

```python
from cerebral import NotFoundError, CerebralError

try:
    repo.branches.get("nonexistent").commit_id
except NotFoundError as e:
    print(f"Not found: {e.message} (request_id={e.request_id})")
except CerebralError as e:
    print(f"SDK error: {e}")
```

## Large Object Handling

By default, `objects.get()` caches the full object in memory on `.read()`. For large
objects, disable caching and stream:

```python
with branch.objects.get("large-file.bin", cache=False) as f:
    for chunk in f.iter_bytes(chunk_size=1024 * 1024):
        output.write(chunk)
```

## Development

```bash
# Install dev dependencies
uv sync --all-extras

# Run tests
uv run pytest

# Lint and format
uv run ruff check src/ tests/
uv run ruff format src/ tests/

# Type check
uv run mypy src/cerebral/

# Build
uv build
```

## License

Apache 2.0
