Metadata-Version: 2.4
Name: multi-model-image-gen
Version: 0.2.2
Summary: Unified Python library for image generation across OpenAI, fal.ai, Google (AI Studio + Vertex AI), and Higgsfield — aisuite-style addressing, retry + fallback, async-first, plugin-extensible.
Project-URL: Homepage, https://github.com/oj-rivas/multi-model-image-gen
Project-URL: Source, https://github.com/oj-rivas/multi-model-image-gen
Project-URL: Issues, https://github.com/oj-rivas/multi-model-image-gen/issues
Project-URL: Changelog, https://github.com/oj-rivas/multi-model-image-gen/releases
Author: Oscar Rivas
License: MIT License
        
        Copyright (c) 2026 Oscar Rivas
        
        Permission is hereby granted, free of charge, to any person obtaining a copy
        of this software and associated documentation files (the "Software"), to deal
        in the Software without restriction, including without limitation the rights
        to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
        copies of the Software, and to permit persons to whom the Software is
        furnished to do so, subject to the following conditions:
        
        The above copyright notice and this permission notice shall be included in all
        copies or substantial portions of the Software.
        
        THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
        IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
        FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
        AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
        LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
        OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
        SOFTWARE.
License-File: LICENSE
Keywords: ai,aisuite,async,fal,fallback,flux,gemini,google,higgsfield,ideogram,image-generation,imagen,litellm,llm-sdk,nano-banana,openai,recraft,retry,seedream,vertex-ai
Classifier: Development Status :: 4 - Beta
Classifier: Framework :: AsyncIO
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: Implementation :: CPython
Classifier: Topic :: Multimedia :: Graphics
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Typing :: Typed
Requires-Python: >=3.10
Requires-Dist: click>=8.1
Requires-Dist: fal-client>=0.5
Requires-Dist: google-genai>=1.0
Requires-Dist: httpx>=0.27
Requires-Dist: openai>=1.0
Requires-Dist: pyyaml>=6.0
Requires-Dist: tenacity>=8.0
Provides-Extra: dev
Requires-Dist: hatch>=1.12; extra == 'dev'
Requires-Dist: prometheus-client>=0.19; extra == 'dev'
Requires-Dist: pytest-asyncio>=0.23; extra == 'dev'
Requires-Dist: pytest>=8.0; extra == 'dev'
Requires-Dist: respx>=0.21; extra == 'dev'
Requires-Dist: ruff>=0.8; extra == 'dev'
Provides-Extra: metrics
Requires-Dist: prometheus-client>=0.19; extra == 'metrics'
Description-Content-Type: text/markdown

# multi-model-image-gen

Unified Python library for image generation across OpenAI, fal.ai, Google (AI Studio + Vertex AI), and Higgsfield. One API, aisuite-style addressing, retry + fallback, async-first, plugin-extensible.

```python
from image_gen import generate

result = generate(
    prompt="a neon-lit Tokyo alley at night",
    model="fal:flux/dev",
    fallbacks=["openai", "higgsfield"],
    output="out.png",
)
```

## Supported models

Catalog of models known to the library (current as of **2026-04-17**). Shorthands resolve to the provider-native model ID automatically. Any full model ID also works — unknown models just pass through without capability validation.

> Single source of truth: [`src/image_gen/providers/catalog.yaml`](src/image_gen/providers/catalog.yaml). To add or update a model, edit that file; no Python changes needed.

### OpenAI — GPT Image family

| Shorthand | Model ID | Seed | Neg. prompt | Notes |
|---|---|:---:|:---:|---|
| `gpt-image-1.5` | `gpt-image-1.5` | – | – | **Current SOTA** — natively multimodal (text + image I/O). |
| `gpt-image-mini` | `gpt-image-1-mini` | – | – | Cheapest tier when quality is secondary. |
| `gpt-image` | `gpt-image-1` | – | – | Legacy. Kept for back-compat. |

Pricing is **token-based** (text + image tokens) — see OpenAI's pricing page for current rates. The image API also supports `dall-e-2` / `dall-e-3` via full model IDs.

### fal.ai — curated subset of 1000+ models

Any fal model works by full ID (`generate(model="fal:fal-ai/sdxl/lightning", …)`). The shorthands below cover the most common ones:

| Shorthand | Model ID | Seed | Neg. prompt | Cost / image |
|---|---|:---:|:---:|---:|
| `flux2-pro` | `fal-ai/flux-2-pro` | ✓ | – | ~$0.03 / MP |
| `flux2-dev-turbo` | `fal-ai/flux-2-dev-turbo` | ✓ | – | ~$0.008 |
| `flux-pro` | `fal-ai/flux-pro` | ✓ | – | $0.050 |
| `flux` | `fal-ai/flux/dev` | ✓ | – | $0.025 |
| `flux-schnell` | `fal-ai/flux/schnell` | ✓ | – | $0.003 |
| `seedream` / `seedream-4.5` | `fal-ai/bytedance/seedream/v4.5/text-to-image` | ✓ | ✓ | $0.040 |
| `ideogram` | `fal-ai/ideogram/v3` | ✓ | ✓ | $0.030–0.090 |
| `recraft` | `fal-ai/recraft-v3` | ✓ | – | $0.040 raster / $0.080 vector |
| `nb2-fal` | `fal-ai/nano-banana-2` | – | – | $0.080 |
| `nb-pro-fal` | `fal-ai/nano-banana-pro` | – | – | $0.150 |
| `imagen4-fal` | `fal-ai/imagen4/preview` | – | – | variable |

fal acts as a unified gateway — you can use Google's Nano Banana and Imagen through fal's billing + API instead of Google's own credentials, which is occasionally useful for multi-tenant apps.

### Google — Nano Banana + Imagen (AI Studio **or** Vertex AI)

Two providers, each with independent auth:

| Provider | Env var | Auth method | Use case |
|---|---|---|---|
| `google` | `GOOGLE_VERTEX_API_KEY` | Vertex AI (API key) | Railway / production |
| `google` | `GOOGLE_CLOUD_PROJECT` | Vertex AI (ADC) | Local dev |
| `google-studio` | `GOOGLE_GEMINI_API_KEY` | AI Studio | Gemini-only, no GCP project |
| `google` | `GOOGLE_API_KEY` + `GOOGLE_GENAI_USE_VERTEXAI=True` | Vertex AI (API key) | Legacy / compat |
| `google` | `GOOGLE_API_KEY` alone | AI Studio | Legacy / compat |

`GOOGLE_VERTEX_API_KEY` takes priority for the `google` provider — set it to use a Vertex-bound API key instead of ADC. Imagen models require Vertex. `google-studio` shorthands (`nb2`, `nano-banana`, `nb-pro`) also work via `google-studio:model-id` if you want them on AI Studio billing.
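The precedence above can be sketched as a tiny resolver. This is illustrative only: the real logic lives in `google_images.py`, the function name and the returned mode strings are made up here, and the ordering between ADC and the legacy key paths is an assumption based on the table.

```python
import os

def google_auth_mode(env=None):
    """Illustrative credential resolution for the `google` provider,
    mirroring the table above. Names and ordering are hypothetical."""
    env = os.environ if env is None else env
    if env.get("GOOGLE_VERTEX_API_KEY"):
        return "vertex-api-key"            # takes priority, per the docs
    if env.get("GOOGLE_API_KEY") and env.get("GOOGLE_GENAI_USE_VERTEXAI") == "True":
        return "vertex-legacy-key"         # legacy / compat path
    if env.get("GOOGLE_CLOUD_PROJECT"):
        return "vertex-adc"                # local dev via gcloud ADC
    if env.get("GOOGLE_API_KEY"):
        return "ai-studio-legacy"          # legacy single key
    raise RuntimeError("no Google credentials configured")
```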

**Nano Banana (Gemini Image)** — three tiers, shared `generate_content` API surface:

| Shorthand | Model ID | Aliases | Description |
|---|---|---|---|
| `nb-pro` | `gemini-3-pro-image-preview` | Nano Banana **Pro** | Best reasoning ("Thinking"), high-fidelity text, most expensive. |
| `nb2` | `gemini-3.1-flash-image-preview` | Nano Banana **2** | Fast, high-volume. Trades some compositional depth for speed. |
| `nano-banana` | `gemini-2.5-flash-image` | original Nano Banana | Legacy speed/efficiency tier. |

**Imagen 4** — three tiers:

| Shorthand | Model ID | Notes |
|---|---|---|
| `imagen4-ultra` | `imagen-4.0-ultra-generate-001` | Highest fidelity; supports 1K + 2K + 4K. |
| `imagen4` | `imagen-4.0-generate-001` | Default Imagen 4. |
| `imagen4-fast` | `imagen-4.0-fast-generate-001` | Cheapest / fastest. |

All Google image models embed a **SynthID** watermark in their output.

### Higgsfield — async-task platform (submit → poll → download)

Polling is built in via `poll_until`. Higgsfield also gives day-0 access to third-party video models.

**Soul** (photorealistic image generation):

| Shorthand | Model ID |
|---|---|
| `higgsfield-soul` | `higgsfield-ai/soul/standard` |
| `higgsfield-soul-pro` | `higgsfield-ai/soul/pro` |

**DoP** (image-to-video, three quality/speed tiers):

| Shorthand | Model ID |
|---|---|
| `higgsfield-dop` | `higgsfield-ai/dop/lite` |
| `higgsfield-dop-preview` | `higgsfield-ai/dop/preview` |
| `higgsfield-dop-turbo` | `higgsfield-ai/dop/turbo` |

**Kling** (video via Higgsfield platform):

| Shorthand | Model ID | Notes |
|---|---|---|
| `higgsfield-kling` | `kling-video/v3.0/master/image-to-video` | Kling 3.0 — unified video + audio + images, multi-shot. |
| `higgsfield-kling-v2` | `kling-video/v2.1/pro/image-to-video` | Kling 2.1 Master. |

Other Higgsfield platform models (Sora 2, WAN 2.5, MiniMax Hailuo 02, Seedance Pro, etc.) work by passing their full ID: `generate(model="higgsfield:sora-2/preview", …)`.
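The submit → poll → download flow that `poll_until` handles boils down to a loop like the one below. This is an illustrative stdlib version of the pattern, not the library's actual signature (see `src/image_gen/providers/_poll.py` for the real thing).

```python
import time

def poll_until(fetch, is_done, interval=2.0, timeout=300.0):
    # Generic async-task polling: call fetch() until is_done(state)
    # says the task finished, or the deadline passes.
    deadline = time.monotonic() + timeout
    while True:
        state = fetch()
        if is_done(state):
            return state
        if time.monotonic() + interval > deadline:
            raise TimeoutError("task did not complete before timeout")
        time.sleep(interval)
```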

### All models support

Aspect ratios `9:16`, `16:9`, `1:1`. Resolutions `720p`, `1080p`. Capabilities beyond these are rejected pre-flight with `UnsupportedCapabilityError` — the SDK call is never made. Add a model, or update capabilities, by editing [`src/image_gen/providers/catalog.yaml`](src/image_gen/providers/catalog.yaml) — no Python changes needed.
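A sketch of what that pre-flight check amounts to. Illustrative only: the real validation reads per-model capabilities from `catalog.yaml` rather than hard-coding global sets, and `validate_request` is a made-up name.

```python
SUPPORTED_ASPECTS = {"9:16", "16:9", "1:1"}
SUPPORTED_RESOLUTIONS = {"720p", "1080p"}

class UnsupportedCapabilityError(ValueError):
    """Raised before any SDK call when a request exceeds what a model supports."""

def validate_request(aspect_ratio: str, resolution: str) -> None:
    # Reject unsupported combinations up front, so no provider call is made.
    if aspect_ratio not in SUPPORTED_ASPECTS:
        raise UnsupportedCapabilityError(f"aspect ratio {aspect_ratio!r} not supported")
    if resolution not in SUPPORTED_RESOLUTIONS:
        raise UnsupportedCapabilityError(f"resolution {resolution!r} not supported")
```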

## Install

```bash
pip install git+https://github.com/oj-rivas/multi-model-image-gen.git
# or pinned:
pip install git+https://github.com/oj-rivas/multi-model-image-gen.git@v0.2.2
# or editable from local checkout:
pip install -e /path/to/multi-model-image-gen
# with Prometheus metrics:
pip install "multi-model-image-gen[metrics] @ git+…"
```

## Credentials

Set only the keys for providers you'll actually call:

```bash
# OpenAI
OPENAI_API_KEY=sk-…

# fal.ai
FAL_KEY=…

# Google — Vertex AI (production / Railway)
GOOGLE_VERTEX_API_KEY=…                   # "google" provider → Vertex with API key
GOOGLE_GEMINI_API_KEY=…                   # "google-studio" provider → AI Studio
GOOGLE_CLOUD_PROJECT=your-gcp-project     # required for Vertex (Imagen 4)
GOOGLE_CLOUD_LOCATION=us-central1         # default
# Google — local dev (ADC)
# GOOGLE_CLOUD_PROJECT=your-gcp-project   # Vertex via gcloud auth application-default login
# Google — legacy single key
# GOOGLE_API_KEY=…                        # AI Studio, or Vertex with GOOGLE_GENAI_USE_VERTEXAI=True

# Higgsfield
HIGGSFIELD_API_KEY=…
HIGGSFIELD_API_SECRET=…
HIGGSFIELD_BASE_URL=https://platform.higgsfield.ai  # default
```

The library does **not** load `.env` for you. Use `python-dotenv` or your framework's loader.
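`python-dotenv` is the usual choice; if you would rather avoid the dependency, a minimal loader is a few lines. This is a sketch: it skips the quoting, interpolation, and multi-line values that `python-dotenv` handles.

```python
import os
from pathlib import Path

def load_env(path: str = ".env") -> None:
    # Parse KEY=VALUE lines; blank lines and '#' comments are skipped.
    # Existing environment variables win (setdefault never overwrites).
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        os.environ.setdefault(key.strip(), value.strip())
```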

## Addressing

Three ways to name a model:

| Form | Example | Provider chosen by |
|---|---|---|
| `provider:model` | `"fal:flux/dev"` | explicit prefix |
| Bare shorthand | `"flux"` | catalog |
| Bare full model ID | `"higgsfield-ai/soul/standard"` | catalog → `higgsfield` default |

Unknown provider prefix raises `ValueError`. Unknown bare models route to `higgsfield` (which uses full IDs rather than shorthands).
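The routing rules reduce to roughly this. Illustrative only: the real resolver in `catalog.py` also carries capability metadata, and the flat `catalog` dict here stands in for the YAML-backed model entries.

```python
KNOWN_PROVIDERS = {"openai", "fal", "google", "google-studio", "higgsfield"}

def resolve(name: str, catalog: dict) -> tuple:
    if ":" in name:                        # explicit "provider:model" prefix
        provider, model = name.split(":", 1)
        if provider not in KNOWN_PROVIDERS:
            raise ValueError(f"unknown provider: {provider!r}")
        return provider, model
    if name in catalog:                    # bare shorthand: catalog lookup
        return catalog[name]
    return "higgsfield", name              # bare full ID: higgsfield default
```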

## Library usage

### Sync

```python
from image_gen import generate

result = generate(
    prompt="a neon fox",
    model="flux",                 # catalog → fal-ai/flux/dev
    output="fox.png",             # written when status == "completed"
    aspect_ratio="9:16",
    resolution="720p",
)
assert result.status == "completed"
print(result.request_id, result.cost_usd)
```

### Async — for FastAPI / high-concurrency

```python
import asyncio
from image_gen import generate_async

async def batch():
    return await asyncio.gather(*[
        generate_async(prompt=f"scene {i}", model="flux")
        for i in range(10)
    ])
```
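Unbounded `gather` over many prompts can trip provider rate limits. A semaphore caps the number of in-flight calls; `bounded_gather` below is not part of the library, just a sketch you wrap around `generate_async` yourself.

```python
import asyncio

async def bounded_gather(fn, prompts, limit=4):
    # Run fn(prompt) for every prompt, with at most `limit` in flight at once.
    sem = asyncio.Semaphore(limit)

    async def one(p):
        async with sem:
            return await fn(p)

    return await asyncio.gather(*(one(p) for p in prompts))

# usage (assuming image_gen is installed):
# results = asyncio.run(bounded_gather(
#     lambda p: generate_async(prompt=p, model="flux"),
#     [f"scene {i}" for i in range(10)],
# ))
```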

### Fallback chains

```python
result = generate(
    prompt="…",
    model="fal:flux/dev",
    fallbacks=["openai", "higgsfield"],   # tried in order on retry-exhausted failure
    correlation_id="trace-42",
)
```

### Direct provider client (bypass routing)

```python
from image_gen import FalImageClient

with FalImageClient() as client:
    result = client.generate_image(
        prompt="cinematic forest",
        model="fal-ai/flux/dev",
        aspect_ratio="16:9",
        num_inference_steps=40,      # provider-specific kwarg
    )
    client.download(result.image_url, "forest.png")
```

## Result shape

```python
@dataclass
class GenerationResult:
    request_id: str
    status: str                # "completed" | "failed" | "blocked"
    image_bytes: bytes | None  # inline payload (OpenAI, Google)
    image_url: str | None      # remote URL (fal, Higgsfield)
    video_url: str | None      # Higgsfield video models
    cost_usd: float | None     # populated from catalog when available
    raw: dict | None           # full provider response + correlation_id if set
```

Callers that pass `output=` to `generate()` don't need to touch this — bytes/URL are handled automatically.
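If you skip `output=` and handle results yourself, the bytes-vs-URL split looks like this. A sketch: `save_image` is not a library function, and the `httpx` fetch only approximates what each client's `download()` does.

```python
from pathlib import Path

def save_image(result, path: str) -> None:
    # Inline payload (OpenAI, Google) vs. remote URL (fal, Higgsfield).
    if result.image_bytes is not None:
        Path(path).write_bytes(result.image_bytes)
    elif result.image_url is not None:
        import httpx  # already a dependency of the library
        resp = httpx.get(result.image_url, follow_redirects=True)
        resp.raise_for_status()
        Path(path).write_bytes(resp.content)
    else:
        raise ValueError(f"{result.request_id}: no image payload (status={result.status})")
```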

## CLI

```bash
image-gen -p "a neon fox" -m flux -o fox.png
image-gen -p "…" -m "openai:gpt-image-1" --aspect 16:9 --resolution 1080p -o wide.png
image-gen -p "…" -m "higgsfield:higgsfield-ai/soul/standard" -o h.png
```

## FastAPI integration

```python
from fastapi import FastAPI, HTTPException
from image_gen import generate_async, FallbackExhausted

app = FastAPI()

@app.post("/images")
async def gen(prompt: str, model: str = "flux", x_correlation_id: str | None = None):
    try:
        r = await generate_async(prompt=prompt, model=model, correlation_id=x_correlation_id)
    except FallbackExhausted as e:
        raise HTTPException(502, str(e))
    if r.status == "blocked":
        raise HTTPException(422, r.raw)
    return {"url": r.image_url, "cost": r.cost_usd, "id": r.request_id}
```

## Resilience

- **Retry**: tenacity-backed exponential backoff on `httpx.TimeoutException`, `ConnectError`, `ReadError`, and HTTP `429/500/502/503/504`. Configurable via `IMAGE_GEN_RETRY_ATTEMPTS` (default 3) and `IMAGE_GEN_RETRY_MAX_WAIT` (default 8s). Non-transient errors (`ValueError`, 4xx) are not retried.
- **Fallback**: `Router` iterates `[primary, *fallbacks]`; each provider's retries run first, then the router moves on. `FallbackExhausted` is raised if all fail, carrying the list of underlying exceptions.
- **Blocked content**: NSFW / policy refusals return `status="blocked"` instead of raising — never retried.
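The schedule those two env vars control is plain exponential growth with a cap. For intuition only (a sketch, not the tenacity internals; tenacity layers its own jitter and stop policies on top):

```python
def backoff_waits(attempts: int = 3, base: float = 1.0, max_wait: float = 8.0):
    # Wait before retry n doubles each time, capped at max_wait
    # (cf. IMAGE_GEN_RETRY_ATTEMPTS / IMAGE_GEN_RETRY_MAX_WAIT).
    return [min(max_wait, base * 2 ** n) for n in range(attempts)]
```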

## Observability

Every `generate()` / `generate_async()` call emits one structured log record `image_gen.request` with canonical fields:

```
request_id, provider, model, latency_ms, status, bytes_out,
cost_usd, retry_count, fallback_used, correlation_id
```

Set `IMAGE_GEN_LOG_FORMAT=json` for one-line JSON per request (Datadog / CloudWatch / Loki-friendly).
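The library ships its own `JSONFormatter` (see `observability.py`); a minimal equivalent is shown below if you want the rest of your app's logs in the same one-line shape. A sketch: the `fields` extra key is this example's convention, not the library's.

```python
import json
import logging

class JsonLineFormatter(logging.Formatter):
    # One JSON object per record, the shape log pipelines ingest directly.
    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "logger": record.name,
            "level": record.levelname,
            "message": record.getMessage(),
        }
        # Anything passed via extra={"fields": {...}} is merged in flat.
        payload.update(getattr(record, "fields", {}))
        return json.dumps(payload)
```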

Install the optional `[metrics]` extra to expose Prometheus counters: `image_gen_requests_total`, `image_gen_retries_total`, `image_gen_fallbacks_total`, `image_gen_cost_usd_total`, and histogram `image_gen_latency_seconds`.

## Extending — add a provider as a plugin

No fork needed. In your own package:

```python
# runway_image_gen/__init__.py
from image_gen.providers.base import ImageProvider

class RunwayClient:
    # Structurally satisfies the ImageProvider Protocol (no subclassing required).
    def generate_image(self, prompt, model, aspect_ratio, resolution, **kw): ...
    def download(self, url, output_path): ...
    def close(self): ...
```

```toml
# pyproject.toml
[project.entry-points."image_gen.providers"]
runway = "runway_image_gen:RunwayClient"
```

Then `pip install your-package` and `generate(model="runway:gen-3")` just works — entry-point discovery picks it up.

Also works at runtime:

```python
from image_gen import register_provider
register_provider("custom", MyClientFactory)
generate(model="custom:foo")
```

A complete example lives in [`examples/runway_plugin/`](examples/runway_plugin/).

## Layout

```
src/image_gen/
├── __init__.py                 # generate(), generate_async(), Router re-export
├── cli.py                      # image-gen CLI
├── config.py                   # lazy env readers
├── router.py                   # Router, FallbackExhausted
├── observability.py            # StructuredLogger, JSONFormatter, correlation_id
├── metrics.py                  # Prometheus (optional, gated on import)
├── py.typed                    # PEP 561 marker
└── providers/
    ├── __init__.py             # registry + plugin discovery (register_provider, get_provider)
    ├── base.py                 # ImageProvider Protocol (sync + async)
    ├── result.py               # GenerationResult
    ├── catalog.yaml            # SINGLE SOURCE OF TRUTH for models
    ├── catalog.py              # ModelEntry, resolve, validate
    ├── _retry.py               # @retry_policy, is_retryable
    ├── _poll.py                # poll_until, poll_until_async
    ├── README.md               # "how to add an async-task provider"
    ├── openai_images.py
    ├── fal_images.py
    ├── google_images.py        # AI Studio + Vertex AI
    └── higgsfield.py
examples/
├── runway_plugin/              # third-party plugin example
└── benchmark_concurrent.py     # async concurrency demo
```

## License

MIT
