Metadata-Version: 2.4
Name: bizone-cloud-helpers
Version: 0.1.5
Summary: Shared helpers library for Bizone projects.
Author: Bizone
License-Expression: MIT
Project-URL: Homepage, https://github.com/Bizone-ai/bizone-cloud-helpers
Requires-Python: >=3.11
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: boto3
Requires-Dist: azure-storage-blob
Requires-Dist: azure-identity
Requires-Dist: google-cloud-storage
Requires-Dist: pydantic
Requires-Dist: filelock
Requires-Dist: fastapi
Requires-Dist: starlette
Requires-Dist: httpx
Provides-Extra: dev
Requires-Dist: build; extra == "dev"
Requires-Dist: pytest; extra == "dev"
Requires-Dist: pytest-asyncio; extra == "dev"
Requires-Dist: respx; extra == "dev"
Requires-Dist: twine; extra == "dev"
Dynamic: license-file

# bizone-cloud-helpers

Reusable helpers and decorators shared across Bizone Python projects. Designed for FastAPI applications running in cloud environments (AWS, Azure, GCP).

## Features

- **Asynchronous Task Processing**: `@large_task` decorator to turn heavy endpoints into background tasks with callbacks and cross-process idempotency.
- **Concurrency Control**: `@limit_concurrency` decorator for sync and async functions to prevent resource exhaustion and enforce execution timeouts.
- **Cloud File Management**: Unified interface for S3, Azure Blob Storage, Google Cloud Storage, and local files with built-in caching (`FileCacheManager`) and uploading (`RemoteUploader`).
- **Idempotency**: Cross-process file-based locking to ensure tasks with the same `X-call-id` aren't executed multiple times concurrently.
- **Orchestrator Tracing**: JSON stdout logs enriched with `X-caller-id`, `X-call-id`, and pod machine name for CloudWatch filtering.

## Installation

### From PyPI

```bash
pip install bizone-cloud-helpers==0.1.5
```

The source repository can remain private. Client projects should consume
published release artifacts from PyPI instead of installing from Git.

## Usage

### 1. Asynchronous Tasks with Callbacks

Use `@large_task` to handle long-running requests. It automatically responds with `202 Accepted` and runs the function in the background, POSTing the result back to a callback URL.

```python
from fastapi import FastAPI
from bizone_cloud_helpers.async_task import large_task

app = FastAPI()

@app.post("/process")
@large_task(callback_field="callback_url")
async def heavy_computation(data: dict):
    # Runs in the background when the request provides callback_url
    result = perform_work(data)  # perform_work is a placeholder for your own function
    return {"status": "done", "result": result}
```

It supports:
- Idempotency via `X-call-id` header.
- Automatic result/error reporting via HTTP POST.
- Context propagation for FastAPI `Request`.
- **Heartbeat timer**: Sends PUT pings to the caller during long tasks. Can be overridden via `X-heartbeat-interval` header (minimum 1s).
- **Concurrency control**: Limit how many background instances of a task run at once.
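
The idempotency guarantee above relies on cross-process, file-based locking. As a rough illustration of that idea only (not the library's actual implementation), the sketch below uses an exclusively created lock file per call ID. The names `call_id_lock` and `LOCK_DIR` are hypothetical, and the temporary directory stands in for whatever `IDEMPOTENCY_LOCK_DIR` points to.

```python
import hashlib
import os
import tempfile
from contextlib import contextmanager
from pathlib import Path

# Hypothetical lock directory for this sketch only.
LOCK_DIR = Path(tempfile.mkdtemp(prefix="idemp_locks_"))


@contextmanager
def call_id_lock(call_id: str):
    """Yield True if this caller acquired the lock for call_id, else False."""
    # Hash the call ID so arbitrary header values map to safe file names.
    name = hashlib.sha256(call_id.encode("utf-8")).hexdigest()
    lock_path = LOCK_DIR / f"{name}.lock"
    try:
        # O_CREAT | O_EXCL fails atomically if the file already exists,
        # so only one process can create the lock file.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        acquired = False
    else:
        os.close(fd)
        acquired = True
    try:
        yield acquired
    finally:
        if acquired:
            lock_path.unlink(missing_ok=True)
```

A caller that receives `False` would typically skip the work, since another process is already handling the same `X-call-id`. A production implementation also has to deal with stale locks left behind by crashed processes, which this sketch ignores.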

### 2. Orchestrator Request Tracing

Install the request tracing middleware once on each FastAPI app. It enriches
Python logs with the orchestrator headers, emits JSON logs to stdout for
CloudWatch, and adds `X-machine-name` to in-band HTTP responses.

```python
from fastapi import FastAPI
from bizone_cloud_helpers import install_orchestrator_context

app = FastAPI()
install_orchestrator_context(app)
```

CloudWatch Logs Insights example:

```sql
fields @timestamp, level, logger, message, x_caller_id, x_call_id, machine_name
| filter x_call_id = "..." and x_caller_id = "..."
| sort @timestamp asc
```
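
To make the field names in the query above concrete, here is a minimal sketch of a JSON log formatter that would emit them. It is an illustration under stated assumptions, not the middleware's real code: the context variables `caller_id_var` and `call_id_var`, the class `JsonTraceFormatter`, and the hostname fallback when `POD_NAME` is unset are all hypothetical.

```python
import contextvars
import json
import logging
import os
import socket

# Hypothetical per-request storage; the real middleware may store these differently.
caller_id_var = contextvars.ContextVar("x_caller_id", default=None)
call_id_var = contextvars.ContextVar("x_call_id", default=None)


class JsonTraceFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            "x_caller_id": caller_id_var.get(),
            "x_call_id": call_id_var.get(),
            # Assumption: fall back to the hostname when POD_NAME is not set.
            "machine_name": os.environ.get("POD_NAME") or socket.gethostname(),
        }
        return json.dumps(payload)
```

Attaching such a formatter to a `logging.StreamHandler` on stdout gives one JSON object per line, which CloudWatch Logs Insights can filter by field.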

For worker code that launches child processes, use `popen_logged` or
`run_logged` so subprocess output is drained through Python logging and receives
the active request metadata.

```python
import logging
from bizone_cloud_helpers import run_logged

logger = logging.getLogger(__name__)

process = run_logged(
    ["python", "-m", "cli.main"],
    logger=logger,
    log_prefix="pengine",
    check=True,
)
```

Subprocess limitation: only child processes launched through these helpers are
trace-enriched. A subprocess that inherits raw stdout/stderr still bypasses
Python logging and cannot be tagged by this library.
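
The difference between drained and inherited output can be shown with the standard library alone. The sketch below is a simplified stand-in for the idea behind these helpers, not their implementation; `run_and_log` is a hypothetical name. Because the child writes to a pipe that the parent reads line by line, every line passes through a `logging.Logger` and can pick up whatever formatter or filters are attached to it.

```python
import logging
import subprocess


def run_and_log(cmd: list[str], logger: logging.Logger, prefix: str) -> int:
    """Run cmd, forward each output line to logger, and return the exit code."""
    with subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr so a single reader drains both
        text=True,
    ) as proc:
        assert proc.stdout is not None
        for line in proc.stdout:
            logger.info("[%s] %s", prefix, line.rstrip("\n"))
    # Leaving the with-block waits for the child, so returncode is set here.
    return proc.returncode
```

If the child were started without `stdout=subprocess.PIPE`, its output would go straight to the parent's stdout and never reach the logger, which is the limitation described above.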

### 3. Concurrency Limiting

Limit how many instances of a sync or async function can run at once.

```python
from bizone_cloud_helpers.concurrency import limit_concurrency

@limit_concurrency(group="cpu_intensive", max_concurrency=2, max_runtime=60)
def compute_heavy_stuff():
    # At most 2 calls run at a time within this process
    ...

@limit_concurrency(group="io_intensive", max_concurrency=5)
async def async_io_task():
    # Also works for async functions
    await do_something()
```
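
For intuition about what a per-group limit means, the following sketch implements the sync case with a shared semaphore per group name. It is an assumption-laden illustration, not how `limit_concurrency` is actually written: `limit_sync` is a hypothetical name, and it omits `max_runtime` and async support entirely.

```python
import functools
import threading

# One semaphore per group name, shared by every function in that group.
_semaphores: dict[str, threading.BoundedSemaphore] = {}
_registry_lock = threading.Lock()


def limit_sync(group: str, max_concurrency: int):
    """Hypothetical sync-only limiter: at most max_concurrency calls per group."""
    with _registry_lock:
        sem = _semaphores.setdefault(
            group, threading.BoundedSemaphore(max_concurrency)
        )

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            with sem:  # blocks while max_concurrency calls are in flight
                return func(*args, **kwargs)

        return wrapper

    return decorator
```

Two functions decorated with the same group name share one semaphore, so the limit applies to the group as a whole rather than to each function separately.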

### 4. Remote Files and Caching

Unified access to cloud storage with local caching.

```python
from bizone_cloud_helpers.remote_files import RemoteFileInfo

# Download and cache a file from S3
info = RemoteFileInfo(
    provider="s3",
    bucket="my-bucket",
    key="path/to/file.txt",
    access_key="...",
    secret_key="..."
)

local_path = info.local_file  # Downloads if not cached or expired
```
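
The "download if not cached or expired" behavior can be pictured as a freshness check on the local copy. The sketch below is a generic illustration with hypothetical names (`cached_path`, `fetch`, `ttl_seconds`); it assumes expiry is judged by file modification time, which may not match how `FileCacheManager` decides.

```python
import time
from pathlib import Path
from typing import Callable


def cached_path(
    path: Path, fetch: Callable[[Path], None], ttl_seconds: float
) -> Path:
    """Return path, calling fetch(path) first if the file is missing or stale."""
    fresh = path.exists() and (time.time() - path.stat().st_mtime) < ttl_seconds
    if not fresh:
        path.parent.mkdir(parents=True, exist_ok=True)
        fetch(path)  # the caller supplies the actual download logic
    return path
```

Here `fetch` would wrap the provider-specific download (S3, Azure Blob Storage, Google Cloud Storage), which keeps the caching decision independent of the storage backend.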

## Configuration

Environment variables:
- `IDEMPOTENCY_LOCK_DIR`: Directory for idempotency locks (default: `/tmp/myapp_idemp_locks`).
- `BIZONE_CALLBACK`: Default base URL for flow callbacks.
- `MAX_CONCURRENCY`: Global default concurrency limit.
- `MAX_CONCURRENCY_<GROUP>`: Group-specific concurrency limit.
- `FILE_CACHE_DIR`: Directory for local file cache.
- `POD_NAME`: Preferred value for `X-machine-name` and log `machine_name`.
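
As an example of how the two concurrency variables might interact, the sketch below resolves a limit for a group. The precedence (group-specific value first, then the global default), the upper-casing of the group name, and the `fallback` parameter are assumptions made for illustration; `resolve_max_concurrency` is a hypothetical helper, so check the library source for the actual rules.

```python
import os


def resolve_max_concurrency(group: str, fallback: int = 1) -> int:
    """Pick a limit for group from the environment (assumed precedence)."""
    # Assumption: the group name is upper-cased to form the variable suffix.
    group_var = f"MAX_CONCURRENCY_{group.upper()}"
    raw = os.environ.get(group_var) or os.environ.get("MAX_CONCURRENCY")
    return int(raw) if raw else fallback
```

Under these assumptions, setting `MAX_CONCURRENCY=4` and `MAX_CONCURRENCY_CPU_INTENSIVE=2` would give the `cpu_intensive` group a limit of 2 and every other group a limit of 4.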

## Development

### Running Tests

The project uses `pytest` for testing.

1. Install development dependencies:
   ```bash
   pip install -e ".[dev]"
   ```

2. Run all tests:
   ```bash
   python -m pytest
   ```

3. Run a specific test file:
   ```bash
   python -m pytest tests/test_async_task_new.py
   ```

### Building a Release Locally

```bash
python -m pip install -e ".[dev]"
python -m pytest
python -m build
```

The build creates a wheel and source distribution under `dist/`.

### Publishing Releases

Releases are published to public PyPI from GitHub Actions using PyPI Trusted
Publishing. No PyPI API token is required in GitHub secrets.

One-time PyPI setup:

1. Create or claim the `bizone-cloud-helpers` project on PyPI.
2. In PyPI project settings, add a Trusted Publisher for GitHub Actions:
   - owner: `Bizone-ai`
   - repository: `bizone-cloud-helpers`
   - workflow: `release.yml`
   - environment: `pypi`

Release steps:

```bash
git checkout main
git pull
python -m pytest
git tag v0.1.5
git push origin v0.1.5
```

After PyPI publishes the release, client projects can depend on:

```txt
bizone-cloud-helpers==0.1.5
```

## License

MIT
