Metadata-Version: 2.4
Name: fast-s3-ingest
Version: 0.4.3
Summary: A controlled public S3 ingestion endpoint for users, applications, vendors, and agentic workflows
Author-email: Jason Oliveira <fasts3ingest@gmail.com>
Maintainer-email: Jason Oliveira <fasts3ingest@gmail.com>
License-Expression: MIT
Project-URL: Repository, https://github.com/jasonpoliveira/fast-s3-ingest
Project-URL: Issues, https://github.com/jasonpoliveira/fast-s3-ingest/issues
Project-URL: Changelog, https://github.com/jasonpoliveira/fast-s3-ingest/blob/main/CHANGELOG.md
Classifier: Development Status :: 4 - Beta
Classifier: Framework :: FastAPI
Classifier: Intended Audience :: Developers
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Classifier: Topic :: Internet :: WWW/HTTP :: HTTP Servers
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: fastapi>=0.100.0
Requires-Dist: starlette>=0.27
Requires-Dist: uvicorn[standard]>=0.23.0
Requires-Dist: python-multipart>=0.0.6
Requires-Dist: boto3>=1.28.0
Provides-Extra: redis
Requires-Dist: redis>=5.0; extra == "redis"
Provides-Extra: images
Requires-Dist: Pillow>=10.0; extra == "images"
Provides-Extra: observability
Requires-Dist: prometheus-client>=0.19; extra == "observability"
Requires-Dist: opentelemetry-sdk>=1.20; extra == "observability"
Requires-Dist: opentelemetry-exporter-otlp-proto-http>=1.20; extra == "observability"
Requires-Dist: opentelemetry-instrumentation-fastapi>=0.41b0; extra == "observability"
Provides-Extra: av
Requires-Dist: clamd>=1.0; extra == "av"
Provides-Extra: mcp
Requires-Dist: mcp>=1.2; extra == "mcp"
Requires-Dist: httpx>=0.27; extra == "mcp"
Provides-Extra: all
Requires-Dist: fast-s3-ingest[av,images,mcp,observability,redis]; extra == "all"
Provides-Extra: dev
Requires-Dist: pytest>=7.0; extra == "dev"
Requires-Dist: pytest-cov>=4.0; extra == "dev"
Requires-Dist: pytest-asyncio>=0.21; extra == "dev"
Requires-Dist: ruff>=0.1; extra == "dev"
Requires-Dist: pyright>=1.1; extra == "dev"
Requires-Dist: moto[s3]>=4.0; extra == "dev"
Requires-Dist: httpx>=0.24; extra == "dev"
Requires-Dist: requests>=2.28; extra == "dev"
Requires-Dist: Pillow>=10.0; extra == "dev"
Requires-Dist: fakeredis[lua]>=2.20; extra == "dev"
Requires-Dist: hypothesis>=6.0; extra == "dev"
Requires-Dist: clamd>=1.0; extra == "dev"
Requires-Dist: prometheus-client>=0.19; extra == "dev"
Requires-Dist: opentelemetry-sdk>=1.20; extra == "dev"
Requires-Dist: opentelemetry-exporter-otlp-proto-http>=1.20; extra == "dev"
Requires-Dist: opentelemetry-instrumentation-fastapi>=0.41b0; extra == "dev"
Dynamic: license-file

<p align="center">
  <picture>
    <source media="(prefers-color-scheme: dark)" srcset="https://raw.githubusercontent.com/jasonpoliveira/fast-s3-ingest/main/docs/assets/fast-s3-ingest-logo-horizontal-dark.svg">
    <img alt="fast-s3-ingest" src="https://raw.githubusercontent.com/jasonpoliveira/fast-s3-ingest/main/docs/assets/fast-s3-ingest-logo-horizontal.svg" width="560">
  </picture>
</p>

[![CI](https://github.com/jasonpoliveira/fast-s3-ingest/actions/workflows/ci.yml/badge.svg)](https://github.com/jasonpoliveira/fast-s3-ingest/actions/workflows/ci.yml)
[![CodeQL](https://github.com/jasonpoliveira/fast-s3-ingest/actions/workflows/codeql.yml/badge.svg)](https://github.com/jasonpoliveira/fast-s3-ingest/actions/workflows/codeql.yml)
[![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](LICENSE)
![Python](https://img.shields.io/badge/python-3.10%20%7C%203.11%20%7C%203.12%20%7C%203.13%20%7C%203.14-blue.svg)
[![Ruff](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/ruff/main/assets/badge/v2.json)](https://github.com/astral-sh/ruff)
![Types: pyright](https://img.shields.io/badge/types-pyright-blue.svg)

**Safe, validated S3 ingestion for untrusted upload traffic.**

Any endpoint that accepts files from the public is an infrastructure boundary,
not a web-form convenience. The callers are increasingly varied: users,
applications, vendors, and autonomous agents alike.

fast-s3-ingest is a controlled S3 ingestion endpoint built around validation
before storage, constant-memory streaming, traceability, and real-world
deployment constraints. The interface is plain HTTP + JSON with stable error
codes and a self-describing schema, so an agent can drive it as readily as a
browser can.

---

## Why this exists

Public uploads are an infrastructure boundary, not a web-form convenience. The
moment an endpoint accepts files from users, applications, vendors, or autonomous
agents, every byte in the stream is a potential weapon: path traversal, null-byte
injection, magic-byte spoofing, zip bombs, pixel bombs. A naive
`multipart/form-data` handler that writes to disk or buffers into memory can
fall over under real traffic and hand an adversary a foothold.

Before files can be processed, routed, reviewed, or acted on by downstream
systems, they need a safer way to enter cloud infrastructure. fast-s3-ingest is
the controlled point where that intake happens: it validates what it sees and
streams it to S3 through a constant-memory pipeline that never buffers whole
payloads in application memory.

---

## What it provides

- A controlled S3 ingestion endpoint
- Validation before storage
- File size constraints
- Filename and key safety checks
- MIME/content handling
- Predictable validation errors
- Traceable upload behavior
- Deployment-aware configuration
- S3-backed storage without direct public bucket writes

Three ingestion modes cover the trade-offs: server-proxied (full inline
validation), presigned (direct-to-S3 throughput), and resumable multipart (large
files, parallel parts).

---

## Quickstart

One command brings up the service plus a local S3 (MinIO) with a ready bucket -
no AWS account required:

```bash
docker compose up --build
```

Then upload a file via the API:

```bash
curl -F "file=@photo.jpg" http://localhost:8000/v1/upload
# -> 201 {"key": "...", "bucket": "uploads", "size": ..., "etag": "..."}
```

...or open **http://localhost:8000** in a browser for the upload demo UI: drag a
file in and watch it get accepted, then try a `.exe` or a zip bomb and watch it
get rejected at the boundary with a stable error code. The MinIO console is at
http://localhost:9001 (`minioadmin` / `minioadmin`).

---

## See it in action

Five short clips cover the full trust stack: it works, it rejects bad input, it
catches malware, it is auditable, and it is architecturally sound under load.

**It works.** A valid file streams through the proxied endpoint and lands in S3.

<details>
<summary>▶ Watch: a valid file stream through to S3</summary>

![Successful upload](https://raw.githubusercontent.com/jasonpoliveira/fast-s3-ingest/main/docs/assets/demos/01_successful_upload.gif)

</details>

**It rejects bad input.** A hostile or malformed file is refused at the boundary
with a stable, machine-branchable error code, never reaching the bucket.

<details>
<summary>▶ Watch: a hostile file refused at the boundary</summary>

![Rejected bad file](https://raw.githubusercontent.com/jasonpoliveira/fast-s3-ingest/main/docs/assets/demos/02_rejected_bad_file.gif)

</details>

**It catches malware.** Structural validation checks that a file is well-formed;
optional malware scanning checks it against known malware signatures. With
`ENABLE_MALWARE_SCAN` on, accepted bytes are streamed to a clamd daemon: the
EICAR test file (a harmless string that antivirus engines flag by convention) is
refused with `422 malware_detected` before it ever reaches S3, while a clean
file passes and is stored.

<details>
<summary>▶ Watch: the EICAR test file caught by the scanner (EICAR → 422 malware_detected)</summary>

![Malware scanning catches the EICAR test virus](https://raw.githubusercontent.com/jasonpoliveira/fast-s3-ingest/main/docs/assets/demos/13_malware_scan.gif)

</details>

**It is traceable.** Every request emits structured logs with secrets redacted
and user-controlled values escaped, so uploads are auditable after the fact.

<details>
<summary>▶ Watch: structured logs with redaction and trace IDs</summary>

![Structured logs](https://raw.githubusercontent.com/jasonpoliveira/fast-s3-ingest/main/docs/assets/demos/04_structured_logs.gif)

</details>

**It is architecturally sound.** The upload pipeline avoids full-body buffering
and holds constant memory under backpressure regardless of file size;
framework/runtime spooling may still occur before the pipeline. Measured numbers
(flat memory, end-to-end throughput) are in [Performance](#performance).

<details>
<summary>▶ Watch: constant memory under backpressure (1 to 8 GB)</summary>

![Backpressure streaming](https://raw.githubusercontent.com/jasonpoliveira/fast-s3-ingest/main/docs/assets/demos/06_backpressure_streaming.gif)

</details>

Runnable clients for all three ingestion modes (Python, JavaScript, curl) are in
**[examples/](examples/README.md)**.

---

## What it is not

- Not a dashboard
- Not a document management system
- Not an AI agent framework
- Not an agent runtime
- Not a replacement for broader application security
- Not direct public access to an S3 bucket

---

## Agentic context

fast-s3-ingest does not implement agents.

The agentic framing refers to the infrastructure problem around safe file intake.
As more workflows involve autonomous systems reading, submitting, routing, or
processing files, the boundary where files enter cloud infrastructure becomes more
important. This project focuses on that boundary: controlled ingestion before
downstream processing.

The endpoint speaks plain HTTP + JSON, error codes are stable and
machine-branchable, idempotency keys make sequential retries safe, and the full
API schema is self-describing at `/openapi.json`. An autonomous agent can discover and drive
the ingestion boundary at runtime with zero prior knowledge, the same way any
application or vendor would. For more on that workflow, see
[docs/agents.md](docs/agents.md).
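
To illustrate what "machine-branchable" means for a client, here is a minimal sketch of a retry policy. The codes `malware_detected` and `scan_unavailable` and the `Retry-After` header come from this README; the helper names, the UUID format for `Idempotency-Key`, and the one-second default delay are assumptions made for the example, not part of the service.

```python
import uuid
from typing import Optional

# Error codes taken from this README; treat the sets as illustrative, not exhaustive.
NON_RETRYABLE = {"malware_detected"}
RETRYABLE = {"scan_unavailable"}
DEFAULT_DELAY_S = 1.0  # hypothetical fallback delay


def new_idempotency_key() -> str:
    """One key per logical upload; send the same key on every retry of it."""
    return str(uuid.uuid4())  # assumption: any unique opaque string is accepted


def _parse_delay(retry_after: Optional[str]) -> float:
    """Read Retry-After as seconds, falling back when it is absent or not numeric."""
    try:
        return max(0.0, float(retry_after))
    except (TypeError, ValueError):
        return DEFAULT_DELAY_S


def next_action(status: int, code: Optional[str],
                retry_after: Optional[str] = None) -> tuple[str, float]:
    """Return (action, delay_seconds) where action is 'done', 'retry', or 'give_up'."""
    if 200 <= status < 300:
        return ("done", 0.0)
    if status == 429:
        return ("retry", _parse_delay(retry_after))
    if code in NON_RETRYABLE:
        return ("give_up", 0.0)
    if code in RETRYABLE or status >= 500:
        return ("retry", DEFAULT_DELAY_S)
    return ("give_up", 0.0)
```

The caller branches on the code rather than on message text, and reuses one idempotency key across sequential retries of the same upload.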

For agents that speak the Model Context Protocol, an optional MCP server is
included (`pip install 'fast-s3-ingest[mcp]'`, then run `fast-s3-ingest-mcp`). It
exposes the ingestion endpoints as MCP tools that map 1:1 to the REST API and
surface the same stable error codes. See
[docs/agents.md](docs/agents.md#mcp-server-model-context-protocol).

<details>
<summary>▶ Watch: an agent discover the API and retry safely (OpenAPI + Idempotency-Key)</summary>

![Agent self-discovery and idempotent retry](https://raw.githubusercontent.com/jasonpoliveira/fast-s3-ingest/main/docs/assets/demos/11_agent_discovery.gif)

</details>

---

## Features

- **Three ingestion modes** - server-proxied (full validation), presigned (direct-to-S3), and resumable multipart (large files, parallel parts)
- **Streaming pipeline** - the upload pipeline avoids full-body buffering; framework/runtime spooling may still occur before the pipeline
- **Content validation** - magic bytes, extension allow/deny, content-type allow/deny, optional image verification (Pillow; buffers each image up to `MAX_UPLOAD_SIZE` per concurrent image), archive bomb detection
- **Optional malware scanning** - stream uploads to a clamd daemon; a signature hit rejects inline or quarantines out of band. Fails closed by default
- **Ingest outcome events** - optional signed (HMAC-SHA256) webhook emitting `ingest.accepted` / `rejected` / `quarantined`, so direct-to-S3 verdicts aren't log-only
- **Quarantine management** - list, release, and delete quarantined objects via `/v1/quarantine`, scoped to the quarantine prefix
- **Agent-ready interface** - stable `snake_case` error codes, `Idempotency-Key` for safe retry, OpenAPI self-discovery at `/openapi.json`
- **MCP server** - optional Model Context Protocol server (`fast-s3-ingest-mcp`) exposing the ingestion endpoints as native agent tools
- **Rate limiting** - token bucket per client (API key or IP), with standard rate-limit headers
- **Multi-tenancy** - optional tenant header mapped to S3 key prefix for namespace isolation
- **Server-side encryption** - AES256 and `aws:kms` support
- **Health and readiness probes** - `/healthz` and `/readyz` for orchestrators
- **Optional observability** - Prometheus metrics and OpenTelemetry tracing
- **S3-compatible** - works with AWS S3, MinIO, Cloudflare R2, Ceph, LocalStack, and others

---

## Ingestion modes

| Mode | Server sees bytes? | Validation | Best for |
|---|---:|---|---|
| Proxied | Yes | Full inline validation | Strongest safety |
| Presigned | No | S3 policy-level controls | Maximum throughput |
| Multipart | No | Session-level controls; post-upload scanner logic included, event-source wiring required | Large / resumable uploads |

**Proxied** (`POST /v1/upload`): the server receives, validates, and streams the
file to S3. Full content validation (magic bytes, content-type, extension, size,
optional image/archive checks) runs inline before storage. The S3 streaming
pipeline is constant-memory (bounded and backpressured) and does not persist
payloads to application storage; for proxied HTTP uploads, FastAPI/Starlette may
use a bounded temporary multipart spool for large request bodies before the
pipeline streams them to S3. Note that optional image validation is not part of
the streaming path: when `ENABLE_IMAGE_VALIDATION` is on, it buffers each image
in memory up to `MAX_UPLOAD_SIZE` per concurrent image in order to verify it with
Pillow. The buffer is bounded by `MAX_UPLOAD_SIZE` (the same ceiling the pipeline
enforces) so any image the service would accept can be fully verified.
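
As a rough illustration of the magic-byte part of that inline validation, the sketch below compares a file's leading bytes with its extension. The signature values are the well-known prefixes for these formats; the table is a small illustrative subset and the function names are hypothetical, not the project's implementation.

```python
# Leading bytes for a few common formats (illustrative subset only).
SIGNATURES = {
    "jpg": (b"\xff\xd8\xff",),
    "jpeg": (b"\xff\xd8\xff",),
    "png": (b"\x89PNG\r\n\x1a\n",),
    "pdf": (b"%PDF-",),
    "zip": (b"PK\x03\x04",),
}


def extension_of(filename: str) -> str:
    """Lower-cased text after the last dot, or '' when there is no dot."""
    return filename.rsplit(".", 1)[-1].lower() if "." in filename else ""


def magic_matches(filename: str, head: bytes) -> bool:
    """True if the first bytes agree with the extension; unknown extensions fail."""
    signatures = SIGNATURES.get(extension_of(filename))
    return bool(signatures) and head.startswith(signatures)
```

Only the first few bytes are needed, so a check like this can run on the first chunk of a stream without buffering the body.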

**Presigned** (`POST /v1/upload-url`): the server mints a direct-to-S3 URL. The
client uploads straight to S3; the server never touches the file bytes.

<details>
<summary>▶ Watch: presigned direct-to-S3 (server never sees the bytes)</summary>

![Presigned direct-to-S3 upload](https://raw.githubusercontent.com/jasonpoliveira/fast-s3-ingest/main/docs/assets/demos/07_presigned_direct.gif)

</details>

**Multipart** (`POST /v1/multipart`): the server orchestrates a resumable S3
multipart upload, returning presigned PUT URLs for each part. The client uploads
parts in parallel and calls `/complete` to assemble them. A
`DELETE /v1/multipart/{upload_id}` aborts an in-progress session and discards
any uploaded parts.
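
On the client side, splitting a file into parts is plain arithmetic. The sketch below plans byte ranges for a given part size. The 5 MiB minimum (for every part except the last) and the 10,000-part maximum are S3 limits; the function name and the 8 MiB default (mirroring `UPLOAD_CHUNK_SIZE`) are assumptions for the example, and the actual request and response shapes of `/v1/multipart` are not shown here.

```python
from typing import Iterator

MIN_PART = 5 * 1024 * 1024   # S3 minimum for every part except the last
MAX_PARTS = 10_000           # S3 maximum number of parts per upload


def plan_parts(total_size: int,
               part_size: int = 8 * 1024 * 1024) -> Iterator[tuple[int, int, int]]:
    """Yield (part_number, offset, length); part numbers start at 1."""
    if total_size <= 0:
        raise ValueError("total_size must be positive")
    if part_size < MIN_PART:
        raise ValueError("part_size is below the S3 5 MiB minimum")
    count = -(-total_size // part_size)  # ceiling division
    if count > MAX_PARTS:
        raise ValueError("too many parts; use a larger part_size")
    for n in range(count):
        offset = n * part_size
        yield n + 1, offset, min(part_size, total_size - offset)
```

Each tuple corresponds to one presigned PUT, and because the ranges are independent, a failed part can be re-sent without touching the others.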

<details>
<summary>▶ Watch: resumable multipart (parts → /complete → abort)</summary>

![Resumable multipart upload](https://raw.githubusercontent.com/jasonpoliveira/fast-s3-ingest/main/docs/assets/demos/08_multipart_resumable.gif)

</details>

> **Important trade-off:** Direct-to-S3 uploads cannot be fully validated inline
> because the server never sees the file bytes. For those paths, validation must
> happen through an out-of-band scanner after upload. The proxied path is the
> only mode that provides complete inline validation before storage.

---

## Security posture

This project narrows the public ingestion boundary; it does not claim to be fully
secure. It reduces exposure by avoiding direct public bucket writes and enforcing
validation before storage, so a hostile file is rejected at the boundary rather
than landing in your bucket. The full threat model and operational guidance are
in **[docs/security.md](docs/security.md)**; a formal data-flow view of trust
boundaries and abuse cases is in **[docs/threat-model.md](docs/threat-model.md)**,
and the map of what each CI scanner does and does not prove is in
**[docs/security-ci.md](docs/security-ci.md)**. Key points:

- **Input sanitization** - path traversal prevention, null-byte rejection, NFKC
  normalization against homograph attacks (CVE-2019-19844 pattern)
- **Content validation** - magic byte signatures, extension and content-type
  allow/deny lists, optional image verification (Pillow `verify()` + `load()`),
  archive bomb detection (compression ratio, total size, entry count)
- **Authentication** - optional `X-API-Key` header with constant-time comparison.
  When `API_KEYS` is unset the service is open. Real auth (OAuth, JWT, API
  gateway) belongs in front of the service.
- **Rate limiting** - token bucket per client. Rate-limited responses carry
  `X-RateLimit-Limit`, `X-RateLimit-Remaining`, `X-RateLimit-Reset`; 429s add
  `Retry-After`.
- **Credential safety** - secret fields are never logged. URLs are sanitized
  before logging. User-controlled values in log messages are escaped against
  CRLF / ANSI injection.
- **Malware scanning & quarantine** - optional clamd scan (a signature hit
  rejects inline with `422 malware_detected`); presigned/multipart uploads are
  re-validated out of band and any reject is moved to a quarantine prefix, then
  reviewable via `GET|POST /v1/quarantine` (list, release a false positive,
  delete a confirmed-bad object).
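
The input sanitization bullet above can be sketched as follows. This is an illustration of the technique (null-byte rejection, NFKC normalization, then reduction to a single path component), not the project's code; the function name, the 255-character cap, and the choice to strip directories rather than reject them are assumptions.

```python
import unicodedata


def sanitize_filename(raw: str, max_len: int = 255) -> str:
    """Return a single safe path component, or raise ValueError."""
    if "\x00" in raw:
        raise ValueError("null byte in filename")
    # NFKC folds compatibility forms (fullwidth dots and slashes, ligatures)
    # into their plain equivalents, so the checks below see what a filesystem would.
    name = unicodedata.normalize("NFKC", raw)
    # Keep only the final component, treating both separators as boundaries.
    name = name.replace("\\", "/").rsplit("/", 1)[-1]
    if name in ("", ".", ".."):
        raise ValueError("empty or traversal-only filename")
    if any(ord(ch) < 32 or ord(ch) == 127 for ch in name):
        raise ValueError("control character in filename")
    return name[:max_len]
```

Normalizing before the separator check matters: a fullwidth `／` only becomes a real `/` after NFKC, so checking first would miss it.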

<details>
<summary>▶ Watch: a zip bomb rejected on compression ratio</summary>

![Zip bomb rejected](https://raw.githubusercontent.com/jasonpoliveira/fast-s3-ingest/main/docs/assets/demos/10_archive_bomb.gif)

</details>
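
For readers who want to see the three archive limits in code, here is a minimal sketch using the standard-library `zipfile` module. The thresholds and the function name are hypothetical; the project's actual limits and logic may differ.

```python
import io
import zipfile


def check_zip(data: bytes, max_ratio: float = 100.0,
              max_total: int = 1 << 30, max_entries: int = 10_000) -> None:
    """Raise ValueError if the archive's declared sizes exceed any limit."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        infos = zf.infolist()
        if len(infos) > max_entries:
            raise ValueError("too many entries")
        total = 0
        for info in infos:
            total += info.file_size
            if total > max_total:
                raise ValueError("uncompressed size too large")
            # compress_size is 0 for empty members; skip them to avoid dividing by zero.
            if info.compress_size and info.file_size / info.compress_size > max_ratio:
                raise ValueError("compression ratio too high")
```

This reads only the central directory, so it is cheap, but the sizes there are declared by the archive's author. A stricter validator would also cap the bytes actually produced while decompressing.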

<details>
<summary>▶ Watch: the auth boundary (missing/wrong key → 401, valid key → 201)</summary>

![Auth failure](https://raw.githubusercontent.com/jasonpoliveira/fast-s3-ingest/main/docs/assets/demos/03_auth_failure.gif)

</details>
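
The constant-time comparison mentioned under Authentication can be sketched with `hmac.compare_digest` from the standard library. The helper names are hypothetical; the behavior shown (open when `API_KEYS` is unset, comma-separated keys) follows this README.

```python
import hmac
from typing import Optional


def parse_api_keys(raw: str) -> list[str]:
    """Split a comma-separated API_KEYS value, dropping blanks."""
    return [key.strip() for key in raw.split(",") if key.strip()]


def is_authorized(presented: Optional[str], valid_keys: list[str]) -> bool:
    if not valid_keys:
        return True  # API_KEYS unset: the service is open
    if presented is None:
        return False
    matched = False
    # Compare against every key without an early exit, so timing does not
    # reveal which key (or how much of it) matched.
    for key in valid_keys:
        matched |= hmac.compare_digest(presented.encode(), key.encode())
    return matched
```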

<details>
<summary>▶ Watch: rate limiting (201 → 429 with Retry-After + X-RateLimit-* headers)</summary>

![Rate limiting in action](https://raw.githubusercontent.com/jasonpoliveira/fast-s3-ingest/main/docs/assets/demos/09_rate_limit.gif)

</details>
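
For readers unfamiliar with the token-bucket algorithm, a minimal single-client sketch follows. The defaults mirror `RATE_LIMIT_PER_SECOND=10` and `RATE_LIMIT_BURST=30`; the class itself is an illustration, not the project's limiter, and the injectable clock is there only to make it testable.

```python
import time
from typing import Callable


class TokenBucket:
    """Refills at `rate` tokens per second up to `burst`; each request costs one token."""

    def __init__(self, rate: float = 10.0, burst: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate
        self.burst = burst
        self.clock = clock
        self.tokens = burst          # start full so an initial burst is allowed
        self.updated = clock()

    def allow(self) -> bool:
        now = self.clock()
        # Credit tokens for the elapsed time, capped at the burst capacity.
        self.tokens = min(self.burst, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

A real deployment keeps one bucket per client identity (API key or IP) and, with several replicas, stores the state in a shared backend such as Redis.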

<details>
<summary>▶ Watch: quarantine lifecycle (scanner quarantines a reject → list → release)</summary>

![Quarantine management: list, release, delete](https://raw.githubusercontent.com/jasonpoliveira/fast-s3-ingest/main/docs/assets/demos/14_quarantine.gif)

</details>

What it constrains: upload size, allowed extensions and content types, archive
expansion, and the S3 key space. What it avoids: buffering whole payloads in
application memory and exposing the bucket to direct public writes. (For proxied
HTTP uploads, FastAPI/Starlette may spool a large multipart body to a bounded
temporary file on disk before the constant-memory pipeline streams it to S3.)

---

## Performance

Two properties matter for an upload boundary: it must not fall over on large
files, and it must not add meaningful overhead. Both are measured below, and
both are reproducible (`make bench`; and `scripts/benchmark_e2e.py` against the
`docker compose` stack).

**Memory stays flat regardless of file size.** The pipeline processes bounded
chunks under backpressure instead of buffering the body, so peak resident memory
(RSS) does not grow with the payload. This is what lets a single replica absorb
a multi-gigabyte upload without being knocked over.

| payload | peak RSS |
|--------:|---------:|
| 1 GB | 74 MB |
| 4 GB | 74 MB |
| 8 GB | 75 MB |

*(`make bench`; S3 mocked to isolate the pipeline's own memory profile. If the
body were buffered, RSS would track payload size toward OOM. Instead it holds
at ~75 MB from 1 GB to 8 GB.)*

**Throughput, measured end-to-end.** Client to gateway to S3 over a socket,
through the full validation path, against a local MinIO:

| payload | throughput |
|--------:|-----------:|
| 32 MB | 134 MB/s |
| 64 MB | 129 MB/s |
| 96 MB | 134 MB/s |

*(`scripts/benchmark_e2e.py` vs MinIO on loopback, single stream. Real-world
throughput is bound by the network to your S3 endpoint; this isolates the
gateway's own contribution. For reference, the pipeline's mocked ceiling is
~8 GB/s, so validation plus streaming is not the bottleneck.)*

**Memory under concurrency is what you size a replica on.** Backpressure caps
each in-flight upload at ~2x the part size, so a replica's working set grows
with the number of *simultaneous* uploads, not with file size:

| simultaneous uploads | peak RSS |
|---------------------:|---------:|
| 1 | 74 MB |
| 8 | 267 MB |
| 16 | 451 MB |
| 32 | 827 MB |
| 64 | 1.6 GB |

*(`make bench-concurrency`; 512 MiB each, 8 MiB parts, S3 mocked. Aggregate
throughput holds at ~7.9 GB/s across all rows - the bound is memory, not CPU.)*
As a sizing rule of thumb, `replica RSS ~= base + concurrency x (2 x
UPLOAD_CHUNK_SIZE)`. With 8 MiB parts that is 16 MiB of part buffers per upload;
the table above shows a measured increment of ~24 MB per concurrent upload, so
size on the measured figure. Cap real concurrency at the edge or with replica
count to keep a replica inside its memory limit; image validation, when enabled,
adds up to `MAX_UPLOAD_SIZE` per concurrent image on top of this.
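
To turn the concurrency table into a replica limit, the arithmetic can be written out as below. The measurements are copied from the table above (with 1.6 GB taken as 1600 MB); the linear fit between the first and last rows and the function names are assumptions made for this sketch, and the numbers apply only to the benchmarked configuration.

```python
import math

# Peak RSS in MB by number of simultaneous uploads, from the table above.
MEASURED_MB = {1: 74, 8: 267, 16: 451, 32: 827, 64: 1600}


def per_upload_mb(measured: dict[int, float] = MEASURED_MB) -> float:
    """Average RSS added by each extra concurrent upload (first row to last row)."""
    low, high = min(measured), max(measured)
    return (measured[high] - measured[low]) / (high - low)


def estimate_rss_mb(concurrency: int, measured: dict[int, float] = MEASURED_MB) -> float:
    """Linear estimate anchored at the single-upload measurement."""
    return measured[1] + (concurrency - 1) * per_upload_mb(measured)


def max_concurrency(limit_mb: float, measured: dict[int, float] = MEASURED_MB) -> int:
    """Largest concurrency whose estimated RSS stays within limit_mb."""
    if limit_mb < measured[1]:
        return 0
    return 1 + math.floor((limit_mb - measured[1]) / per_upload_mb(measured))
```

For example, `max_concurrency(1024)` gives the number of simultaneous uploads this estimate allows inside a 1 GiB memory limit; leave headroom in practice, since the fit underestimates the mid-range rows slightly.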

Numbers are from one dev machine; reproduce them with the commands above.

**Under a concurrency spike, it degrades gracefully, not catastrophically.** A
live load test ramps offered concurrency past the single-worker knee against the
real validate → stream → S3 path; latency rises and the queue absorbs the spike
rather than the process falling over.

<details>
<summary>▶ Watch: a live load test (ramp → spike → drain, real requests)</summary>

![Pressure test: live concurrency ramp against the real stack](https://raw.githubusercontent.com/jasonpoliveira/fast-s3-ingest/main/docs/assets/demos/12_pressure_test.gif)

</details>

---

## Deployment notes

fast-s3-ingest is built for real-world deployment constraints: it sits behind your
edge layer (reverse proxy, load balancer, CDN) and writes into S3 with
least-privilege credentials. All settings are environment variables. Copy
`.env.example` and adjust. For a pre-launch checklist see
**[docs/production-checklist.md](docs/production-checklist.md)**, and to verify a
published image's signature, SBOM, and provenance see
**[docs/release-verification.md](docs/release-verification.md)**.

One command brings up the endpoint and a local S3 (MinIO) with the bucket created
for you:

```bash
docker compose up --build
curl -F "file=@photo.jpg" http://localhost:8000/v1/upload
```

The file streams through the proxied endpoint and lands in MinIO. Console at
http://localhost:9001 (`minioadmin` / `minioadmin`).

To run just the endpoint against your own S3:

```bash
docker build -t fast-s3-ingest .
docker run -e S3_BUCKET=my-bucket -p 8000:8000 fast-s3-ingest
```

To run from source (the `src/` layout requires an editable install so the
`fast_s3_ingest` package resolves):

```bash
pip install -e .          # add '.[images]' / '.[all]' for optional features
S3_BUCKET=my-bucket uvicorn fast_s3_ingest.app:app --port 8000
```

**Edge layer.** By default, browser-facing security headers and CORS are expected
at the reverse proxy, load balancer, or CDN. Application-level CORS
(`CORS_ENABLED`) and security headers (`SECURITY_HEADERS_ENABLED`) are available
for deployments that terminate browser traffic directly at the app. Do not enable
HSTS unless the public hostname is served exclusively over HTTPS.

**Request body size limits.** The application enforces `MAX_UPLOAD_SIZE` by
counting bytes as it streams, but that check runs *after* the ASGI server and any
reverse proxy have already begun accepting (and possibly spooling) the request
body. Configure a body-size limit at the edge as an operational guardrail so
oversized requests are rejected before they reach the app. This complements, and
does not replace, the application-level validation. For example, with nginx:

```nginx
# Reject bodies larger than your MAX_UPLOAD_SIZE before they reach the app.
client_max_body_size 100m;
```
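
To see why the application check alone is not enough, consider a minimal sketch of enforcing a limit by counting bytes while streaming. The names are hypothetical and this is not the project's pipeline; the default mirrors `MAX_UPLOAD_SIZE`.

```python
from typing import Iterable, Iterator


class UploadTooLarge(Exception):
    """Raised once the running byte count passes the limit."""


def limit_stream(chunks: Iterable[bytes], max_bytes: int = 104_857_600) -> Iterator[bytes]:
    """Pass chunks through unchanged, failing as soon as the total exceeds max_bytes."""
    seen = 0
    for chunk in chunks:
        seen += len(chunk)
        if seen > max_bytes:
            raise UploadTooLarge(f"body exceeds {max_bytes} bytes")
        yield chunk
```

The error is raised only after the offending bytes have already arrived, which is why a limit at the edge is the cheaper place to turn away oversized requests.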

**S3 permissions.** Grant the service least-privilege access to the target bucket
only. Prefer IAM roles over static keys in production-like environments. The
minimal policy below covers the core ingestion paths (proxied, presigned, and
multipart). Replace `YOUR_BUCKET` with your `S3_BUCKET` and `YOUR_PREFIX` with
your `KEY_PREFIX` (drop the prefix segment if you do not set one), and tighten the
resources to your own deployment.

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ObjectWrite",
      "Effect": "Allow",
      "Action": [
        "s3:PutObject",
        "s3:AbortMultipartUpload"
      ],
      "Resource": "arn:aws:s3:::YOUR_BUCKET/YOUR_PREFIX/*"
    },
    {
      "Sid": "BucketProbeAndMultipartList",
      "Effect": "Allow",
      "Action": [
        "s3:ListBucket",
        "s3:ListBucketMultipartUploads"
      ],
      "Resource": "arn:aws:s3:::YOUR_BUCKET"
    }
  ]
}
```

`s3:ListBucket` backs the `/readyz` `head_bucket` probe and
`s3:ListBucketMultipartUploads` backs the background cleanup of stale multipart
uploads. If you enable optional post-upload scanning, the scanner additionally
needs `s3:GetObject`, `s3:PutObjectTagging`, and `s3:DeleteObject` on
`arn:aws:s3:::YOUR_BUCKET/YOUR_PREFIX/*`.

### Configuration

| Variable | Default | Description |
|----------|---------|-------------|
| `S3_BUCKET` | *(required)* | Target S3 bucket |
| `S3_ENDPOINT_URL` | *(none)* | S3-compatible endpoint (MinIO, R2, LocalStack, etc.) |
| `AWS_REGION` | `us-east-1` | AWS region |
| `AWS_ACCESS_KEY_ID` | SDK chain | Access key (prefer IAM roles in production) |
| `AWS_SECRET_ACCESS_KEY` | SDK chain | Secret key (prefer IAM roles in production) |
| `AWS_SESSION_TOKEN` | SDK chain | Session token for temporary STS credentials |
| `MAX_UPLOAD_SIZE` | `104857600` | Maximum upload size in bytes (100 MiB) |
| `UPLOAD_CHUNK_SIZE` | `8388608` | Multipart part size in bytes (8 MiB) |
| `ALLOWED_EXTENSIONS` | *(none = all)* | Comma-separated allow-list of file extensions |
| `DENIED_EXTENSIONS` | `exe,sh,php,…` | Comma-separated deny-list of file extensions |
| `ALLOWED_CONTENT_TYPES` | *(none = all)* | Comma-separated allow-list of Content-Type values |
| `DENIED_CONTENT_TYPES` | *(none)* | Comma-separated deny-list of Content-Type values. Deny beats allow: a type here is rejected even if also allow-listed |
| `ENABLE_IMAGE_VALIDATION` | `false` | Validate images with Pillow (`verify()` + `load()`). Requires the `images` extra. |
| `ENABLE_ARCHIVE_VALIDATION` | `false` | Detect zip bombs (compression ratio, size, entry count) |
| `ENABLE_NESTED_ARCHIVE_VALIDATION` | `false` | Recurse into nested ZIP archives (archive-in-archive) up to a bounded depth of 3. Recursion follows ZIP archives, not nested TAR archives (still vetted as bounded members). Defense-in-depth on top of the per-entry ratio/size/count limits; only consulted when `ENABLE_ARCHIVE_VALIDATION` is on, and adds validation cost for archive-heavy uploads |
| `ENABLE_MALWARE_SCAN` | `false` | Stream uploads to a clamd daemon (INSTREAM); a signature hit rejects (proxied: `422 malware_detected`; scanner: quarantine). Requires the `av` extra and a reachable clamd. |
| `CLAMAV_TCP_HOST` | *(none)* | clamd TCP host (used when `CLAMAV_UNIX_SOCKET` is unset) |
| `CLAMAV_TCP_PORT` | `3310` | clamd TCP port |
| `CLAMAV_UNIX_SOCKET` | *(none)* | Path to a clamd unix socket; takes precedence over the TCP host |
| `CLAMAV_TIMEOUT` | `30` | clamd connection/scan timeout in seconds |
| `CLAMAV_FAIL_OPEN` | `false` | Fail **closed** by default: if clamd is unreachable, refuse uploads (proxied: `503 scan_unavailable`) and leave scanned objects in place rather than accept/quarantine on no verdict. `true` accepts unscanned. |
| `S3_SSE_ALGORITHM` | *(none)* | Server-side encryption: `AES256` or `aws:kms` |
| `S3_KMS_KEY_ID` | *(none)* | KMS key ID when `S3_SSE_ALGORITHM=aws:kms` |
| `S3_CHECKSUM_ALGORITHM` | *(none)* | S3 additional checksum on upload (e.g. `SHA256`) |
| `S3_RETRY_MAX_ATTEMPTS` | `10` | Max boto3 retry attempts for S3 API calls |
| `S3_QUARANTINE_PREFIX` | `quarantine/` | Key prefix where an out-of-band scanner moves rejected objects |
| `RATE_LIMIT_PER_SECOND` | `10` | Token-bucket refill rate per client |
| `RATE_LIMIT_BURST` | `30` | Token-bucket burst capacity |
| `REDIS_URL` | *(none)* | Redis URL for shared session / rate-limit state across replicas. Requires the `redis` extra. On a Redis backend error the rate limiter fails open (logs a warning and allows the request with full burst); the in-memory limiter always enforces. |
| `INGEST_WEBHOOK_URL` | *(none)* | When set, emit a signed JSON event (`ingest.accepted`/`rejected`/`quarantined`) to this URL after each decision. Delivery never blocks ingestion. |
| `INGEST_WEBHOOK_SECRET` | *(none)* | HMAC-SHA256 signing secret for the `X-Ingest-Signature` (`t=<ts>,v1=<sig>`) header. Required when `INGEST_WEBHOOK_URL` is set; the app refuses to start otherwise. |
| `INGEST_WEBHOOK_TIMEOUT` | `5` | Per-attempt webhook delivery timeout in seconds |
| `INGEST_WEBHOOK_MAX_ATTEMPTS` | `3` | Total webhook delivery attempts (initial + retries) per event |
| `API_KEYS` | *(none = open)* | Comma-separated list of valid `X-API-Key` values |
| `TENANT_HEADER` | *(none)* | Optional header name used to resolve tenant prefixes for multi-tenant key-space isolation (e.g. `X-Tenant-ID`) |
| `TENANT_REQUIRED` | `false` | When `true`, requests must include a valid tenant header whenever `TENANT_HEADER` is configured; missing or blank headers are rejected with 400 |
| `KEY_PREFIX` | *(none)* | Global static prefix prepended to all S3 keys |
| `PRESIGNED_EXPIRES_IN` | `900` | Presigned URL TTL in seconds |
| `MULTIPART_CLEANUP_HOURS` | `24` | Abort orphaned multipart uploads older than this |
| `SHUTDOWN_DRAIN_TIMEOUT` | `30.0` | On orderly shutdown, seconds to let in-flight *proxied* uploads finish streaming to S3 before their internal multipart sessions are force-aborted. Resumable `/v1/multipart` sessions are client-driven and not tracked here; they are reclaimed by `MULTIPART_CLEANUP_HOURS`. (A direct SIGTERM/SIGINT is a hard backstop and does not drain.) |
| `CORS_ENABLED` | `false` | Enable Starlette's CORSMiddleware. Off by default - expected at reverse proxy/CDN |
| `CORS_ALLOW_ORIGINS` | *(none)* | Comma-separated allowed origins for CORS (e.g. `https://example.com,https://app.example.com`) |
| `CORS_ALLOW_CREDENTIALS` | `false` | Allow credentials (cookies, auth headers) in CORS responses |
| `SECURITY_HEADERS_ENABLED` | `false` | Enable browser security headers middleware. Off by default - expected at reverse proxy/CDN |
| `HSTS_ENABLED` | `false` | Emit `Strict-Transport-Security` header. Requires `SECURITY_HEADERS_ENABLED=true`. Only enable if HTTPS-only |
| `HSTS_MAX_AGE` | `31536000` | HSTS max-age in seconds (1 year) |
| `HSTS_INCLUDE_SUBDOMAINS` | `true` | Include `includeSubDomains` directive in HSTS header |
| `HSTS_PRELOAD` | `false` | Opt into browser HSTS preload lists. Difficult to reverse - only enable if intentionally prepared |
| `ENABLE_PROMETHEUS` | `false` | Serve Prometheus metrics at `/metrics`. Requires the `observability` extra. |
| `OTEL_EXPORTER_OTLP_ENDPOINT` | *(none)* | OTLP/HTTP collector URL for OpenTelemetry tracing. Requires the `observability` extra. |
| `OTEL_SERVICE_NAME` | `fast-s3-ingest` | Logical service name reported in traces and metrics labels |
| `ENABLE_WEB_UI` | `false` | Serve the static upload demo page at `/`. Demo/manual-testing convenience, off by default; not a production feature |
| `DEBUG` | `false` | Verbose errors and `/docs` endpoint; never enable in production |

`TENANT_HEADER` enables tenant-prefix resolution. By default, tenant prefixing
is optional: requests without the configured tenant header write to the
unprefixed namespace. Set `TENANT_REQUIRED=true` when the bucket is shared
across tenants and unprefixed writes should be rejected.

`TENANT_REQUIRED` enforces *presence*, not *authorization*. It guarantees every
request names a valid tenant; it does not verify the caller is entitled to the
tenant they name. A caller can still send any tenant ID and write to that
prefix. Binding a caller's identity to its permitted tenant is the job of the
auth layer in front of this service (JWT claim, API gateway). See
[docs/threat-model.md](docs/threat-model.md) (A7, A9).
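
The distinction between presence and authorization is easy to see in a sketch of tenant resolution. Everything here is illustrative: the function name, the character rule for tenant IDs, and the `tenant/` prefix shape are assumptions, not the project's implementation.

```python
import re
from typing import Mapping

# Hypothetical validity rule for tenant IDs.
_TENANT_RE = re.compile(r"[A-Za-z0-9_-]{1,64}")


class TenantError(ValueError):
    """Would map to an HTTP 400 response in this sketch."""


def resolve_tenant_prefix(headers: Mapping[str, str],
                          tenant_header: str = "X-Tenant-ID",
                          required: bool = False) -> str:
    """Return the key prefix for this request, or '' for the unprefixed namespace."""
    value = (headers.get(tenant_header) or "").strip()
    if not value:
        if required:
            raise TenantError("missing tenant header")
        return ""
    if not _TENANT_RE.fullmatch(value):
        raise TenantError("invalid tenant id")
    # Note what is absent: nothing here checks that the caller may use this tenant.
    return f"{value}/"
```

Any caller who sends a well-formed tenant ID gets that tenant's prefix, which is why the identity-to-tenant binding has to be enforced by the auth layer in front of the service.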

Optional extras: `pip install 'fast-s3-ingest[redis]'`, `[images]`,
`[observability]`, `[av]`, `[mcp]`, or `[all]`. The published Docker image
already bundles `[images]` and `[av]`, so `ENABLE_IMAGE_VALIDATION` and
`ENABLE_MALWARE_SCAN` (against a reachable clamd) work without a custom image.

---

## Running tests

```bash
make install       # Create venv + install dev dependencies
make check         # Run lint (ruff) + type check (pyright) + tests (pytest)
make test-cov      # Run tests with coverage report
make bench         # Streaming pipeline benchmark
```

CI runs the full matrix on Python 3.10-3.14: lint, type-check, unit tests,
property-based fuzzing (Hypothesis), an end-to-end suite against MinIO, and
security scans (bandit, pip-audit, CodeQL, Trivy). See
[docs/security-ci.md](docs/security-ci.md) for what each scanner does and does
not prove.

See [CONTRIBUTING.md](CONTRIBUTING.md) for the full development guide.

---

## Observability

Traceable upload handling is built in. When `ENABLE_PROMETHEUS=true` (and the
`observability` extra is installed), `/metrics` serves Prometheus exposition
format: request counts, bytes transferred, latency histogram, and error
breakdowns by code.

When `OTEL_EXPORTER_OTLP_ENDPOINT` is set, OpenTelemetry traces every request
through the full call chain (inbound -> validation -> S3).

`/healthz` and `/readyz` give orchestrators liveness and readiness signals, so
the service is operable behind a load balancer or scheduler.
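
As a rough illustration of how a deploy script or test harness might consume the readiness signal, the sketch below polls `/readyz` until it answers. It assumes, without confirmation from this document, that readiness is reported as HTTP 200 and anything else means "not ready"; `http_status` and `wait_until_ready` are hypothetical helpers, and the probe is injectable so the loop can be exercised without a running server.

```python
import time
import urllib.error
import urllib.request
from typing import Callable, Optional


def http_status(url: str, timeout: float = 2.0) -> Optional[int]:
    """Return the HTTP status for a GET, or None if the host is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        return exc.code          # non-2xx responses still carry a status
    except (urllib.error.URLError, OSError):
        return None              # connection refused, DNS failure, timeout


def wait_until_ready(base_url: str,
                     timeout: float = 30.0,
                     interval: float = 0.5,
                     probe: Callable[[str], Optional[int]] = http_status) -> bool:
    """Poll {base_url}/readyz until it returns 200 or the timeout elapses."""
    url = base_url.rstrip("/") + "/readyz"
    deadline = time.monotonic() + timeout
    while True:
        if probe(url) == 200:    # assumption: 200 means ready
            return True
        if time.monotonic() + interval > deadline:
            return False
        time.sleep(interval)
```

Calling `wait_until_ready(base_url)` with the address the service listens on returns `True` once the probe succeeds, or `False` after the timeout.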

<details>
<summary>▶ Watch: health and readiness probes (/healthz + /readyz)</summary>

![Health and readiness probes](https://raw.githubusercontent.com/jasonpoliveira/fast-s3-ingest/main/docs/assets/demos/05_health_readiness.gif)

</details>

Without `ENABLE_PROMETHEUS` and `OTEL_EXPORTER_OTLP_ENDPOINT` configured,
`/metrics` returns a JSON snapshot and tracing is off. Both features fail
closed: the app refuses to start if the feature is enabled but the
`observability` extra is missing.

---

## Limitations

This is a **pre-1.0 (0.x) beta release** - a serious starting point, not a finished product.

- **Download is out of scope.** This is an ingestion boundary. Serve or verify
  objects directly against S3.
- **In-memory state by default.** Sessions, idempotency keys, and rate-limit
  buckets are per-process unless `REDIS_URL` is set. With multiple workers
  the effective rate limit is N× the configured rate.
- **Idempotency keys cover sequential retries, not concurrent dedup.** A key
  makes a retried request safe after the first one settles; two requests with
  the same key in flight at the same time are not collapsed into one.
- **Inline validation is CPU-bound but offloaded.** For the proxied path, the
  synchronous content checks (magic bytes, Pillow image verify/load, archive
  decompression) run in a threadpool via `run_in_threadpool`, so they do not
  block the event loop or starve a worker's other in-flight requests. They still
  consume CPU and threadpool slots, so validation-heavy workloads are bounded by
  cores and the thread limit; add workers or replicas to scale beyond that.
- **Presigned paths skip inline content validation.** Bytes go direct
  client->S3, so the post-upload scanner (`scanner.py`) is the safety net. Its
  verdict is acted on, not just logged: a quarantine can emit a signed
  `ingest.quarantined` webhook event (`INGEST_WEBHOOK_URL`) and quarantined
  objects are listable/releasable/deletable via `GET|POST /v1/quarantine`. The
  scanner still needs an event source wired to it (see Roadmap).
- **Auth is API-key only.** Real authentication (OAuth, JWT, mTLS) belongs
  in a reverse proxy or API gateway in front of the service.
- **Scale testing is preliminary.** The streaming pipeline has been
  benchmarked to constant memory under multi-GB payloads, but the service has
  not been subjected to large-scale workloads.
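
The "N× the configured rate" effect of per-process state can be reproduced with a toy model. The sketch below is not the service's limiter: `FixedWindowLimiter` is a deliberately simple stand-in, and round-robin dispatch is an assumption about how requests spread across workers. It only shows why independent in-memory counters multiply the effective limit.

```python
class FixedWindowLimiter:
    """Toy in-memory limiter: at most `limit` requests in the current window."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.count = 0

    def allow(self) -> bool:
        if self.count >= self.limit:
            return False
        self.count += 1
        return True


def admitted_in_one_window(workers: int, limit: int, requests: int) -> int:
    """Count requests admitted when each worker keeps its own limiter state."""
    limiters = [FixedWindowLimiter(limit) for _ in range(workers)]
    admitted = 0
    for i in range(requests):
        # Round-robin stands in for the load balancer spreading connections.
        if limiters[i % workers].allow():
            admitted += 1
    return admitted


print(admitted_in_one_window(workers=1, limit=10, requests=100))  # 10
print(admitted_in_one_window(workers=4, limit=10, requests=100))  # 40
```

With a shared store (the `REDIS_URL` case), all workers would consult one counter and the second figure would stay at the configured limit.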

Deeper technical details live in [docs/](docs/): architecture, security model,
and agent integration guide.

---

## Roadmap

- **Post-upload scanner deployment** - the scanner itself ships in `scanner.py`
  (`handle_s3_event` already accepts the standard S3 notification payload); what's
  not yet packaged is a turnkey Lambda/worker deployment to wire it to an event
  source so presigned/multipart uploads are scanned after the fact
- **Distributed backend without Redis** - pluggable store backends (Postgres,
  DynamoDB) via the existing `SessionStore` protocol
- **S3 EventBridge / bucket notification support** - react to object-created
  events for out-of-band scanning pipelines
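
For readers wiring up their own event source in the meantime, the sketch below shows the general shape of a standard S3 notification payload and how a worker might pull out the objects to scan. `iter_s3_objects` and the sample bucket and key are hypothetical; how `handle_s3_event` is invoked is not shown here because its signature is not documented in this section.

```python
from typing import Iterator, Tuple
from urllib.parse import unquote_plus


def iter_s3_objects(event: dict) -> Iterator[Tuple[str, str]]:
    """Yield (bucket, key) for each record in an S3 notification payload."""
    for record in event.get("Records", []):
        s3 = record.get("s3", {})
        bucket = s3.get("bucket", {}).get("name")
        key = s3.get("object", {}).get("key")
        if bucket and key:
            # Keys arrive URL-encoded in notifications (spaces become '+').
            yield bucket, unquote_plus(key)


# Trimmed-down example payload; real notifications carry more fields.
sample_event = {
    "Records": [
        {
            "eventSource": "aws:s3",
            "eventName": "ObjectCreated:Put",
            "s3": {
                "bucket": {"name": "example-ingest-bucket"},
                "object": {"key": "tenant-a/report+2024.pdf", "size": 1024},
            },
        }
    ]
}

print(list(iter_s3_objects(sample_event)))
# [('example-ingest-bucket', 'tenant-a/report 2024.pdf')]
```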

---

## Author

Built and maintained by [Jason Oliveira](https://github.com/jasonpoliveira).

Questions, issues, and security reports: fasts3ingest@gmail.com

---

## License

[MIT](LICENSE)
