Metadata-Version: 2.4
Name: remote-upload
Version: 0.1.0
Summary: Stream local content INTO remote storage (S3/MinIO, Azure Blob, GCS, SFTP, HTTP) through one tiny, framework-agnostic API. The write-side twin of remote-download.
Project-URL: Homepage, https://github.com/calcifux/remote-upload-python
Project-URL: Repository, https://github.com/calcifux/remote-upload-python
Project-URL: Changelog, https://github.com/calcifux/remote-upload-python/blob/main/CHANGELOG.md
Project-URL: Issues, https://github.com/calcifux/remote-upload-python/issues
Author: Carlos Guillermo Reyes Ramiro (@Calcifux)
License-Expression: MIT
License-File: LICENSE
Keywords: azure-blob,gcs,http,minio,s3,sftp,storage,streaming,upload
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.14
Classifier: Topic :: Internet :: WWW/HTTP
Classifier: Topic :: Software Development :: Libraries
Classifier: Topic :: System :: Archiving
Classifier: Typing :: Typed
Requires-Python: >=3.14
Provides-Extra: all
Requires-Dist: azure-storage-blob>=12.19; extra == 'all'
Requires-Dist: boto3>=1.34; extra == 'all'
Requires-Dist: google-cloud-storage>=2.14; extra == 'all'
Requires-Dist: httpx>=0.27; extra == 'all'
Requires-Dist: paramiko>=3.4; extra == 'all'
Provides-Extra: azure
Requires-Dist: azure-storage-blob>=12.19; extra == 'azure'
Provides-Extra: gcs
Requires-Dist: google-cloud-storage>=2.14; extra == 'gcs'
Provides-Extra: httpx
Requires-Dist: httpx>=0.27; extra == 'httpx'
Provides-Extra: s3
Requires-Dist: boto3>=1.34; extra == 's3'
Provides-Extra: sftp
Requires-Dist: paramiko>=3.4; extra == 'sftp'
Description-Content-Type: text/markdown

# remote-upload

[![CI](https://github.com/calcifux/remote-upload-python/actions/workflows/ci.yml/badge.svg)](https://github.com/calcifux/remote-upload-python/actions/workflows/ci.yml)
[![PyPI](https://img.shields.io/pypi/v/remote-upload.svg)](https://pypi.org/project/remote-upload/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![Python 3.14+](https://img.shields.io/badge/Python-3.14%2B-blue.svg)](https://www.python.org/)
[![Typed](https://img.shields.io/badge/typing-strict-brightgreen.svg)](https://peps.python.org/pep-0561/)

> **Stream local content INTO remote storage** (S3 / MinIO, Azure Blob, GCS, SFTP, authenticated HTTP) through one tiny, framework-agnostic API — the **write-side twin** of `remote-download`.

```python
from remote_upload import RemoteUpload

result = (
    RemoteUpload.to(target)
    .body(stream, length)
    .content_type("image/jpeg")
    .upload()
)
```

`remote-download` pipes bytes *out* of a remote origin to your client. `remote-upload` is the other half: it pushes bytes *into* a remote destination. The shape is the same, only mirrored, and it works in any web framework (Django, FastAPI, Flask), task queue, AWS Lambda function, or plain CLI script: anywhere you can read bytes from a stream.

| | remote-download | remote-upload |
|---|---|---|
| Port | `DownloadOrigin.open() -> RemoteContent` | `UploadTarget.upload(UploadContent) -> UploadResult` |
| Facade | `RemoteDownload.from_(src).write_to(out)` | `RemoteUpload.to(target).body(stream, length).upload()` |
| Direction | remote -> your backend -> client | client -> your backend -> remote |

## Install

The core (`HttpTarget`) is **pure standard library** with no third-party dependencies. Every other backend pulls in exactly one SDK, gated behind an extra so you install only what you use. The base install is:

```bash
pip install remote-upload
```

| Extra | Install | Backend | Brings in |
|---|---|---|---|
| *(none)* | `pip install remote-upload` | `HttpTarget` | stdlib only — always available |
| `s3` | `pip install "remote-upload[s3]"` | `S3Target` (S3 / MinIO / Ceph / LocalStack) | `boto3` |
| `azure` | `pip install "remote-upload[azure]"` | `AzureBlobTarget` | `azure-storage-blob` |
| `gcs` | `pip install "remote-upload[gcs]"` | `GcsTarget` | `google-cloud-storage` |
| `sftp` | `pip install "remote-upload[sftp]"` | `SftpTarget` | `paramiko` |
| `httpx` | `pip install "remote-upload[httpx]"` | `HttpxTarget` (retries / auth / proxy) | `httpx` |
| `all` | `pip install "remote-upload[all]"` | everything above | all of the above |

Targets are importable straight from the package root. The extra-gated ones are loaded lazily, so importing one without its SDK installed raises a clear `ImportError` telling you exactly which extra to install:

```python
from remote_upload import RemoteUpload, S3Target, AzureBlobTarget  # lazily resolved
```
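
Before constructing an extra-gated target, an application may want to know which SDKs are importable. The sketch below is not part of `remote-upload`: `EXTRA_MODULES` and `installed_extras` are hypothetical names, and the module names are the usual import names of the distributions listed in the table above.

```python
from importlib.util import find_spec

# Extra name -> import name of the SDK that the extra brings in.
EXTRA_MODULES = {
    "s3": "boto3",
    "azure": "azure.storage.blob",
    "gcs": "google.cloud.storage",
    "sftp": "paramiko",
    "httpx": "httpx",
}


def installed_extras() -> list[str]:
    """Return the extras whose SDK can be imported in this environment."""
    found = []
    for extra, module in EXTRA_MODULES.items():
        try:
            if find_spec(module) is not None:
                found.append(extra)
        except (ImportError, ValueError):
            # A dotted name raises when its parent package is missing.
            continue
    return found


print(installed_extras())
```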

Requires **Python 3.14+**.

## Quick start

Every upload follows the same fluent shape: pick a target (or a URL), attach a body, optionally decorate it, then call `.upload()`.

### Plain HTTP PUT to a URL

A bare string is treated as an absolute HTTP/HTTPS URL and wrapped in a default `HttpTarget` (PUT, no auth):

```python
from remote_upload import RemoteUpload

result = (
    RemoteUpload.to("https://api.example.com/files/report.pdf")
    .body(b"%PDF-1.7 ...")
    .content_type("application/pdf")
    .upload()
)
print(result.key, result.bytes_transferred, "bytes")
```

### The three ways to supply a body

```python
# 1. Raw bytes — content length is exact and inferred for you.
RemoteUpload.to(target).body(b"hello world").upload()

# 2. An open binary stream — pass length when you know it (cloud targets like
#    S3 need it); omit it for chunked / unknown-length uploads.
with open("photo.jpg", "rb") as fh:
    RemoteUpload.to(target).body(fh, length=204_800).upload()

# 3. A file on disk — body_file() infers length, and (unless already set) the
#    filename and content type from the path.
RemoteUpload.to(target).body_file("/tmp/photo.jpg").upload()
```

> `upload()` consumes and **closes** the body stream. Build a fresh request per upload — instances are not reusable.

### Everything together

```python
def on_progress(sent: int, total: int | None) -> None:
    if total:
        print(f"uploaded {sent} / {total} bytes ({sent * 100 // total}%)")
    else:
        print(f"uploaded {sent} bytes (total unknown)")

result = (
    RemoteUpload.to(target)
    .body(stream, length=204_800)
    .content_type("image/jpeg")
    .metadata({"captured_by": "user-1", "album": "summer"})
    .checksum("sha256")            # also accepts Java-style "SHA-256"
    .on_progress(on_progress)
    .upload()
)

print(result.key)
print(result.etag)
print(result.checksum_hex)
print(f"{result.bytes_per_second / 1_048_576:.1f} MiB/s")
```

## Backends

Every target is constructed with **keyword arguments** (the two HTTP targets also take the URL as their first positional argument) and then handed to `RemoteUpload.to(...)`. The keyword names below match each target's constructor exactly.

### HttpTarget — authenticated HTTP PUT (stdlib, always available)

```python
from remote_upload import RemoteUpload, HttpTarget

target = HttpTarget(
    "https://api.example.com/files/report.pdf",
    method="PUT",
    headers={"X-Api-Key": "secret"},
    bearer="<token>",            # adds "Authorization: Bearer <token>"
    connect_timeout=10.0,
    request_timeout=60.0,
)
RemoteUpload.to(target).body_file("/tmp/report.pdf").upload()
```
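
For orientation, the kind of request such a target describes can be built by hand with `urllib.request`. The sketch shows only the HTTP shape (method, headers, bearer token) and assumes nothing about how `HttpTarget` works internally; `build_put_request` is a hypothetical helper, and no network call is made.

```python
import urllib.request


def build_put_request(url: str, body: bytes, token: str, content_type: str) -> urllib.request.Request:
    """Build (but do not send) an authenticated PUT request."""
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": content_type,
        },
    )


request = build_put_request(
    "https://api.example.com/files/report.pdf", b"%PDF-1.7 ...", "<token>", "application/pdf"
)
print(request.get_method(), request.full_url)
```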

### S3Target — S3 / MinIO / Ceph / LocalStack (`[s3]`)

```python
from remote_upload import RemoteUpload, S3Target

target = S3Target(
    bucket="my-bucket",
    key="tenant-1/uploads/abc/photo.jpg",
    endpoint="http://localhost:9000",   # MinIO; omit for real AWS
    access_key="minioadmin",
    secret_key="minioadmin",
    region="us-east-1",
)

result = (
    RemoteUpload.to(target)
    .body(stream, length=204_800)
    .content_type("image/jpeg")
    .metadata({"captured_by": "user-1"})
    .checksum("sha256")
    .upload()
)
print(result.key, "etag=", result.etag, result.bytes_transferred, "bytes")
```

Setting `endpoint` enables path-style addressing automatically (needed by most S3-compatible services); override with `path_style=...` when a service requires otherwise. Omit `access_key` / `secret_key` to fall back to the default boto3 credential chain (env vars, `~/.aws/credentials`, IAM roles). For high throughput, inject a shared boto3 client with `client=...`; an injected client is reused and never closed by the target.

### AzureBlobTarget — Azure Blob Storage (`[azure]`)

```python
from remote_upload import RemoteUpload, AzureBlobTarget

target = AzureBlobTarget(
    container="uploads",
    blob="tenant-1/photo.jpg",
    connection_string="DefaultEndpointsProtocol=https;AccountName=...;AccountKey=...",
)
RemoteUpload.to(target).body_file("/tmp/photo.jpg").content_type("image/jpeg").upload()
```

Authenticate one of three ways: a full `connection_string`, an `endpoint` URL plus an optional `sas_token`, or a pre-built `BlobClient` passed as `client=...`. Uploads overwrite an existing blob.

### GcsTarget — Google Cloud Storage (`[gcs]`)

```python
from remote_upload import RemoteUpload, GcsTarget

target = GcsTarget(
    bucket="my-bucket",
    object_name="tenant-1/photo.jpg",
    project_id="my-gcp-project",
    credentials_path="/etc/secrets/service-account.json",
)
RemoteUpload.to(target).body_file("/tmp/photo.jpg").content_type("image/jpeg").upload()
```

Credentials resolve in order: an explicit `credentials` object, then a service-account JSON file via `credentials_path` (an optional `file:` prefix is stripped), then Application Default Credentials. Pass a pre-built `storage.Client` via `client=...` for reuse / tests — an injected client is never closed by the target.

### SftpTarget — SFTP over SSH (`[sftp]`)

```python
from remote_upload import RemoteUpload, SftpTarget

target = SftpTarget(
    host="sftp.example.com",
    user="deploy",
    path="/uploads/photo.jpg",
    port=22,
    password="s3cr3t",                       # or use private_key_path=...
)
RemoteUpload.to(target).body_file("/tmp/photo.jpg").upload()
```

Authenticate with a `password` or a `private_key_path` (the key is tried as Ed25519, ECDSA, RSA, then DSA, in that order). Authentication failures surface as `TerminalUploadError`; connection and I/O failures as `RetryableUploadError`. Tune `connect_timeout` / `auth_timeout` as needed.

### HttpxTarget — HTTP PUT with retries / auth / proxy (`[httpx]`)

```python
from remote_upload import RemoteUpload, HttpxTarget

target = HttpxTarget(
    "https://api.example.com/files/report.pdf",
    method="PUT",
    bearer="<token>",                        # or basic_auth=("user", "pass")
    retries=3,
    connect_timeout=30.0,
    response_timeout=300.0,
    proxy="http://corp-proxy:3128",
)
RemoteUpload.to(target).body_file("/tmp/report.pdf").upload()
```

The richer twin of the stdlib `HttpTarget`: transport-level retries, `Bearer` / `Basic` auth, granular timeouts and an optional forward `proxy`. Pass an existing `httpx.Client` via `client=...` to reuse a connection pool.

## Concepts

The library is built around a single port (`UploadTarget`), two plain data types (`UploadContent` and `UploadResult`), and a progress callback type (`ProgressListener`). Implement the port and you can push bytes to anything.

### `UploadTarget` — the port

A `Protocol` with one method:

```python
def upload(self, content: UploadContent) -> UploadResult: ...
```

Each backend supplies its own implementation; consumers push bytes through the same API regardless of where they land. Custom destinations only need this single method. Implementations **read** `content.body` but do **not** own its lifecycle — the request opens and closes the stream for them.

### `UploadContent` — the payload

A frozen dataclass the facade builds and hands to the target:

| Field | Meaning |
|---|---|
| `body` | live binary stream to read from (already wrapped for progress metering and checksumming) |
| `content_length` | size in bytes, or `None` when unknown |
| `content_type` | MIME type to store, or `None` |
| `filename` | suggested filename / key tail, or `None` |
| `metadata` | user metadata mapping (never `None`; empty when unset) |

### `UploadResult` — the outcome

A frozen dataclass combining the target's provider identifiers with the transfer stats the request measures:

| Field / property | Meaning |
|---|---|
| `key` | object key / remote path the bytes were written to |
| `location` | fully-qualified URL / URI, when the provider exposes one |
| `etag` | provider ETag (S3 / Azure), when available |
| `version_id` | provider version id, when versioning is enabled |
| `bytes_transferred` | total bytes streamed to the destination |
| `duration` | wall-clock `timedelta` of the upload |
| `content_type` | content type stored with the object |
| `checksum_algorithm` | algorithm requested via `.checksum(...)`, or `None` |
| `checksum_hex` | lower-case hex digest, or `None` if none requested |
| `bytes_per_second` | computed throughput (`0` when duration is zero/`None`) |

### `ProgressListener` — progress callback

A callable `(bytes_transferred: int, total_bytes: int | None) -> None`, fired as the destination reads the body. `total_bytes` is `None` for chunked / unknown-length uploads. Register it with `.on_progress(...)`:

```python
RemoteUpload.to(target).body(data).on_progress(
    lambda sent, total: print(f"{sent}/{total}")
).upload()
```

## Error handling — retryable vs terminal

Targets translate provider failures into one of two exceptions so callers can branch on retry semantics **without parsing messages**. Both subclass `RemoteUploadError`:

- **`RetryableUploadError`** — transient: a network blip, a 5xx response, a timeout. Callers with a retry budget (an offline outbox, a sync coordinator) should re-enqueue with backoff.
- **`TerminalUploadError`** — permanent: invalid credentials, a 4xx, quota exceeded, validation. Retrying the same request will fail again; change something (re-auth, fix the payload, escalate) instead.

```python
from remote_upload import (
    RemoteUpload,
    RetryableUploadError,
    TerminalUploadError,
)

try:
    RemoteUpload.to(target).body_file("/tmp/photo.jpg").upload()
except RetryableUploadError:
    enqueue_for_retry(...)        # backoff and try again later
except TerminalUploadError:
    mark_failed_and_alert(...)    # do not retry; surface to the user
```
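
For callers without a queue, the same branching can drive a simple in-process retry loop with exponential backoff. The sketch defines stand-in exception classes so it runs on its own; real code would import them from `remote_upload`. `upload_with_retry` and its parameters are hypothetical.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


# Stand-ins so the sketch runs without the package installed.
class RemoteUploadError(Exception): ...
class RetryableUploadError(RemoteUploadError): ...
class TerminalUploadError(RemoteUploadError): ...


def upload_with_retry(
    do_upload: Callable[[], T],
    attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call do_upload until it succeeds, retrying only retryable failures."""
    for attempt in range(attempts):
        try:
            return do_upload()
        except RetryableUploadError:
            if attempt == attempts - 1:
                raise                         # retry budget exhausted
            sleep(base_delay * 2 ** attempt)  # 0.5s, 1s, 2s, ...
        # TerminalUploadError is not caught: it propagates immediately.
    raise ValueError("attempts must be at least 1")
```

Since request instances are not reusable, `do_upload` should build a fresh request on every call, for example `lambda: RemoteUpload.to(target).body_file("/tmp/photo.jpg").upload()`.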

This retryable/terminal split is a deliberate improvement over a single exception type: it lets a sync coordinator decide between "keep retrying" and "mark failed, surface to user".

## Java -> Python mapping

This package is a faithful port of the Java library [`remote-upload-java`](https://github.com/calcifux/remote-upload-java). If you know one, you know the other:

| Java | Python |
|---|---|
| `RemoteUpload.to(target)` | `RemoteUpload.to(target)` (same) |
| `.body(in, len).contentType(...).metadata(k, v)` | `.body(stream, len).content_type(...).metadata({...})` |
| `.onProgress(...)` / `.checksum("SHA-256")` | `.on_progress(...)` / `.checksum("sha256")` (or `"SHA-256"`) |
| `S3Target.builder().bucket(...).key(...).credentials(ak, sk).build()` | `S3Target(bucket=..., key=..., access_key=..., secret_key=...)` |
| `RetryableUploadException` / `TerminalUploadException` | `RetryableUploadError` / `TerminalUploadError` |
| `UploadResult.getKey()` / `.etag()` | `UploadResult.key` / `.etag` (plain attributes) |

In short: `*Exception` becomes `*Error`, fluent builders become keyword arguments, and getters become attributes.

## License

MIT (c) Carlos Guillermo Reyes Ramiro. See [LICENSE](https://github.com/calcifux/remote-upload-python/blob/main/LICENSE).
