Metadata-Version: 2.1
Name: bithuman
Version: 2.3.8
Summary: bitHuman Python SDK — libessence-backed avatar runtime. `from bithuman import AsyncBithuman`.
Keywords: bithuman,avatar,essence,lipsync,pybind11
Author-Email: bitHuman <hello@bithuman.ai>
License: Commercial — see LICENSE file
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Operating System :: MacOS
Classifier: Operating System :: MacOS :: MacOS X
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Classifier: Programming Language :: C++
Classifier: Topic :: Multimedia
Classifier: Topic :: Multimedia :: Graphics
Classifier: Topic :: Multimedia :: Sound/Audio
Classifier: Topic :: Multimedia :: Video
Project-URL: Homepage, https://bithuman.ai
Project-URL: Documentation, https://docs.bithuman.ai
Project-URL: Source, https://github.com/bithuman-product/bithuman-sdk-public
Requires-Python: >=3.10
Requires-Dist: numpy>=1.26.0
Requires-Dist: loguru~=0.7
Requires-Dist: soundfile~=0.13
Requires-Dist: pydantic~=2.10
Requires-Dist: pydantic-settings~=2.8
Requires-Dist: av>=12.0
Requires-Dist: opencv-python-headless>=4.8
Provides-Extra: test
Requires-Dist: pytest>=7; extra == "test"
Requires-Dist: psutil>=5.9; extra == "test"
Description-Content-Type: text/markdown

# bithuman

This is the **Python flavor of L2**: a platform-specific SDK for app developers. It wraps the L1 `libessence` engine. For the CLI tool — an L3 app — see the [CLI docs](https://docs.bithuman.ai/cli).

```
┌─────────────────────────────────────────────────────────────┐
│ L3: Apps (bithuman-apps — separate repo)                    │
│   - bithuman CLI       brew / pip install bithuman-cli      │
│   - Flutter plugin · Mac/iPad reference apps                │
└─────────────────────────────────────────────────────────────┘
                          ▼ consumes
┌─────────────────────────────────────────────────────────────┐
│ L2: Platform SDKs (app developers)                          │
│   - Python wheel       pip install bithuman    ◄──── you are here
│   - Swift package      SwiftPM Bithuman                     │
│   - Kotlin AAR         ai.bithuman:sdk                      │
│   - Rust crate (in-tree)                                    │
└─────────────────────────────────────────────────────────────┘
                          ▼ wraps
┌─────────────────────────────────────────────────────────────┐
│ L1: libessence engine (cross-platform C++ core)             │
│   - portable C ABI, same source on every target             │
│   - macOS · iOS · Android · Linux · Windows                 │
│   - never imported directly by app developers               │
└─────────────────────────────────────────────────────────────┘
```

## Library only — the CLI moved out

This wheel ships the **library only**. The `bithuman` command-line tool
(`bithuman run model.imx`, `bithuman render`, `bithuman doctor`) lives in
the sibling **`bithuman-cli`** PyPI wheel (an L3 app); `bithuman-cli`
depends on this `bithuman` wheel for the in-process avatar runtime, and
both share the same libessence native engine.

```python
# Library — embed the runtime in your own app
from bithuman import AsyncBithuman
# `await` needs a coroutine or an async REPL; the Quickstart has a full script.
avatar = await AsyncBithuman.create(model_path="model.imx", api_secret="bh-...")
```

```sh
# CLI — separate wheel
pip install bithuman-cli
bithuman run model.imx
```

----

Python bindings for the **bitHuman SDK** — the portable C++ avatar engine
(`libessence`) that powers our cross-platform lipsync pipeline. The wheel
ships a native pybind11 module that talks directly to `libessence`,
so you get the same per-frame cost as our Swift and Kotlin clients with
none of the GIL noise.

On an Apple M5 with 24 GB unified memory we measure **~640 FPS sustained
compose** (1.56 ms/frame mean, 2.03 ms p99) for a 1248×704 avatar, with
**~206 MB peak RSS** end-to-end. Cold load is ~14 ms for the fixture and
~400 ms for the first compose tick (lazy ONNX init).

This package is namespace-isolated from the v0 `bithuman` SDK; you can
install both side-by-side.

## Install

```sh
pip install bithuman                 # the library (slim wheel — no CLI, no brain)
```

> **Status.** As of 2.3, the PyPI `bithuman` wheel is a slim
> library-only wheel. The bundled Rust CLI and conversation brain that
> 2.0–2.2.x shipped have been **extracted into the sibling `bithuman-cli`
> wheel** (`pip install bithuman-cli`, with `bithuman-cli[local]` for the
> on-device brain). This wheel exports the avatar runtime only:
> `AsyncBithuman` (streaming) and `Bithuman` (sync single-shot).

## Compatibility

- **Platforms:** macOS arm64, Linux x86_64, Linux arm64 — all ship as wheels.
  Windows is tracked for a follow-up.
- **Python:** 3.10 – 3.14 (cp310 through cp314). CPython only.
- **ABI:** wraps the `libessence` engine core (2.3.6, **ABI v7**) via
  pybind11.
- **Auth:** a live heartbeat against `api.bithuman.ai` is built into
  `libessence`. Pass the secret via `Bithuman.load(api_secret=...)` or
  `AsyncBithuman.create(api_secret=...)`; the `BITHUMAN_API_SECRET`
  env var works too.
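
If you want to fail fast before loading a model, the secret lookup can be sketched in application code. This is a minimal sketch, not part of the SDK: `resolve_api_secret` is a hypothetical helper, and the precedence (explicit argument first, then the environment) is an assumption of the sketch rather than documented SDK behaviour.

```python
import os

ENV_VAR = "BITHUMAN_API_SECRET"  # env var name taken from this README


def resolve_api_secret(explicit=None, environ=None):
    """Hypothetical helper: return the explicit secret, else the env var."""
    environ = os.environ if environ is None else environ
    secret = explicit or environ.get(ENV_VAR)
    if not secret:
        # Raise before any model loading or network traffic happens.
        raise RuntimeError(f"No API secret: pass api_secret=... or set {ENV_VAR}")
    return secret
```

The returned value can then be passed as `api_secret=` to `Bithuman.load(...)` or `AsyncBithuman.create(...)`.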

## What you get

The package exposes three API tiers (all importable from `bithuman`):

| Tier        | Types                                                            | Use when…                                            |
| ----------- | ---------------------------------------------------------------- | ---------------------------------------------------- |
| Async       | `AsyncBithuman` (alias `AsyncAvatar`), `AudioChunk`, `VideoControl`, `VideoFrame` | Streaming a live conversation / hosting a service |
| Sync facade | `Bithuman` (alias `Avatar`), `ComposedFrame`, `EP`              | Offline / batch single-shot rendering               |
| Low-level   | `Fixture`, `Runtime`, `EP_CPU`/`EP_AUTO`/`EP_COREML`/`EP_NNAPI`/`EP_QNN` | Direct C ABI access, custom audio pipeline           |

Error types: `BithumanError` (base), `TokenError` /
`TokenExpiredError` / `TokenValidationError` / `TokenRequestError` /
`AccountStatusError` (auth), `ModelError` / `ModelNotFoundError` /
`ModelLoadError` / `ModelSecurityError` / `ExpressionModelNotSupported`
(fixture), `RuntimeNotReadyError`.

Version info: `bithuman.__version__` (Python package),
`bithuman.__core_version__` (linked libessence), `bithuman.__abi_version__`.

## Quickstart (async streaming — `AsyncBithuman`)

Build the runtime with `AsyncBithuman.create(...)`, feed PCM with
`push_audio`, mark end-of-speech with `flush()`, and drain composed
frames from the `run()` async iterator.

```python
import asyncio
from bithuman import AsyncBithuman

async def main():
    avatar = await AsyncBithuman.create(
        model_path="model.imx",
        api_secret="bh-...",  # or BITHUMAN_API_SECRET env var
    )

    # pcm_16k_mono_int16_bytes: your audio as int16 little-endian PCM bytes
    await avatar.push_audio(pcm_16k_mono_int16_bytes,
                            sample_rate=16000, last_chunk=True)
    await avatar.flush()      # end-of-speech marker

    async for frame in avatar.run():
        # frame.bgr_image is (H, W, 3) uint8 in BGR order
        ...

    await avatar.stop()

asyncio.run(main())
```

Accepted PCM is int16 little-endian bytes; audio at the given
`sample_rate` is resampled to 16 kHz mono internally. Decoding WAV / MP3 /
FLAC / OGG is the caller's responsibility (use `soundfile`).
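
To illustrate the PCM format above, here is a minimal standard-library sketch that converts float samples to int16 little-endian bytes and splits them into chunks for streaming. Both helpers are hypothetical and not part of the SDK, and the 40 ms chunk size is an assumption of the sketch; the README does not prescribe a chunk size.

```python
import struct


def float_to_int16_le(samples):
    """Convert floats in [-1.0, 1.0] to int16 little-endian PCM bytes."""
    clipped = (max(-1.0, min(1.0, s)) for s in samples)
    ints = [int(round(s * 32767)) for s in clipped]
    return struct.pack(f"<{len(ints)}h", *ints)


def iter_chunks(pcm, sample_rate=16000, chunk_ms=40):
    """Yield (chunk_bytes, is_last) pairs for mono int16 PCM."""
    step = sample_rate * chunk_ms // 1000 * 2  # 2 bytes per int16 sample
    for start in range(0, len(pcm), step):
        yield pcm[start:start + step], start + step >= len(pcm)
```

Each pair could then be fed to the runtime as `await avatar.push_audio(chunk, sample_rate=16000, last_chunk=is_last)`, using the keyword arguments shown in the quickstart.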

## Quickstart (sync single-shot — `Bithuman`)

For offline / batch rendering, call `Bithuman.load(...)`, then iterate
over `compose(audio)`, which yields one `ComposedFrame` per 40 ms of
input (25 fps):

```python
from bithuman import Bithuman

avatar = Bithuman.load("model.imx", api_secret="bh-...")
for frame in avatar.compose("speech.wav"):   # ndarray or WAV/MP3/FLAC path
    # frame.bgr is (H, W, 3) uint8 BGR; frame.frame_idx / frame.cluster_idx
    ...
```
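
The 40 ms per frame figure makes it easy to estimate how many frames a clip will produce. The sketch below is plain arithmetic with a hypothetical helper name; how the runtime treats a trailing partial frame is not stated here, so the leftover samples are returned separately rather than rounded either way.

```python
def frame_budget(num_samples, sample_rate=16000, fps=25):
    """Return (full_frames, leftover_samples) for audio at a fixed frame rate."""
    samples_per_frame = sample_rate // fps  # 640 samples at 16 kHz and 25 fps
    return divmod(num_samples, samples_per_frame)
```

For example, one second of 16 kHz audio (16000 samples) corresponds to 25 full frames with nothing left over.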

## Live avatar / CLI — `bithuman-cli` (separate wheel)

The `bithuman run` / `bithuman render` command-line tool and the
conversation brain (`bithuman-cli[local]` for the fully on-device
whisper.cpp + llama.cpp + Supertonic stack) are **no longer bundled in
this wheel**. They ship as the sibling **`bithuman-cli`** PyPI wheel,
which depends on this `bithuman` runtime:

```sh
pip install bithuman-cli            # cloud brain (OpenAI Realtime)
pip install 'bithuman-cli[local]'   # fully on-device brain

bithuman run avatar.imx             # live avatar (browser-to-talk)
```

See the [CLI docs](https://docs.bithuman.ai/cli) and the `bithuman-cli` README
for the full command surface and brain modes. Both wheels share the same
libessence native engine.

## Low-level API

If you need finer control or want to swap in a custom audio pipeline,
the C ABI is exposed directly:

```python
import numpy as np
from bithuman import Fixture, Runtime, EP_CPU

fx = Fixture("model.imx", preferred_ep=EP_CPU, intra_op_threads=1)
rt = Runtime(fx)
pcm = np.fromfile("speech.f32", dtype=np.float32)  # 16 kHz mono float32
cluster_idx, bgr = rt.tick_compose(pcm, frame_idx_hint=-1)
# bgr.shape == (fx.frame_height, fx.frame_width, 3), dtype uint8
```

Pass the entire `pcm` buffer to each `tick_compose` call; the runtime
maintains an internal cursor and advances one tick per call until the
audio is exhausted.

### Zero-alloc hot path (since 1.12.4)

For tight render loops, pre-allocate the BGR buffer once and pass it
via `out=`. The runtime writes into it in place and returns just the
`cluster_idx`. This drops wrapper overhead to within ~3 % of raw
libessence (vs ~8 % for the alloc-per-tick path):

```python
out = np.empty((fx.frame_height, fx.frame_width, 3), dtype=np.uint8)
for _ in range(num_ticks):
    cluster_idx = rt.tick_compose(pcm, -1, out=out)
    # `out` now holds this tick's frame; read it before the next call.
```

The same `out=` keyword works on `tick_compose_to_size`. See
`docs/ARCHITECTURE.md` §9 for the cross-wrapper perf table.

## Build from source

You need the prebuilt parent C++ archive at
`engine/essence/build/libessence.a` (run the parent CMake build first), plus
the runtime deps from Homebrew (`onnxruntime`, `webp`, `ffmpeg`,
`hdf5`, `jpeg-turbo`).

```sh
cd sdks/python
uv pip install -e '.[test]' --no-build-isolation   # only the `test` extra ships
```

The CMake glue links the prebuilt static archive directly and does NOT
re-run the parent build, so you can iterate on the bindings without
paying the C++ rebuild cost.

## Performance

Measured with `tests/bench.py` against the v1 compose path
(audio → composited BGR frame) on Apple M5 24 GB, libessence 1.16.0:

| Metric                       | Alloc per tick     | `out=` reuse buffer |
| ---------------------------- | ------------------ | ------------------- |
| Steady-state mean            | 1.53 ms / frame    | **1.45 ms / frame** |
| p99                          | 1.66 ms            | 1.53 ms             |
| Sustained throughput         | 655 FPS            | **692 FPS**         |
| Overhead vs raw libessence   | +8.3 %             | **+2.6 %**          |
| Peak RSS (proc)              | 192 MB             | 182 MB              |

Wrapper overhead on the `out=` path is +2.6 % over raw libessence
(+8.3 % when allocating per tick); see `docs/ARCHITECTURE.md` §9 for the
apples-to-apples methodology and the cross-wrapper comparison. Reproduce with:

```sh
scripts/bench-wrappers.sh
```

## Linux wheels

Pre-built `manylinux_2_28` wheels ship for x86_64 + aarch64 across cp310
through cp314 — 10 wheels in total, all auditwheel-repaired with the
full dep tree bundled (ORT, FFmpeg, HDF5, libjpeg-turbo, libwebp,
libcurl, OpenSSL).
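
The count of 10 follows from 2 architectures times 5 CPython tags. As a rough sketch, the matrix can be enumerated using the standard wheel filename layout (`name-version-pytag-abitag-platform.whl`). The helper name is hypothetical, and the exact filenames published on PyPI are an assumption; repaired wheels may carry additional platform tags.

```python
from itertools import product


def linux_wheel_names(version="2.3.8"):
    """Enumerate the expected Linux wheel filenames (illustrative only)."""
    py_tags = [f"cp3{minor}" for minor in range(10, 15)]  # cp310 .. cp314
    arches = ["x86_64", "aarch64"]
    return [
        f"bithuman-{version}-{tag}-{tag}-manylinux_2_28_{arch}.whl"
        for tag, arch in product(py_tags, arches)
    ]
```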

To rebuild them locally:

```sh
# One-time: build the dep-baked Docker images (~10 min each).
docker build --platform linux/amd64 -t libessence/manylinux-x86_64:0.1 \
    -f scripts/Dockerfile.manylinux-x86_64 scripts/
docker build --platform linux/arm64/v8 -t libessence/manylinux-aarch64:0.1 \
    -f scripts/Dockerfile.manylinux-aarch64 scripts/

# Per wheel build (~2 min):
docker run --rm --platform linux/amd64 -v "$REPO":/src \
    -e PYTAG=cp311 -e ARCH_INSIDE=x86_64 \
    libessence/manylinux-x86_64:0.1 \
    bash /src/sdks/python/scripts/build-wheel-in-container.sh
```

## Limitations

- Windows wheels are not yet built; tracked for a follow-up.
- `compose()` emits a fixed 25 fps frame stream to match the model's
  internal rate; resample downstream if you need a different cadence.
- `preferred_ep=EP_COREML` / `EP_NNAPI` / `EP_QNN` is accepted but
  falls back to CPU in the current build.
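
For the fixed 25 fps limitation above, one simple way to resample downstream is to hold or drop frames by index. This is a minimal sketch under that assumption (no blending or interpolation), and `resample_frame_indices` is a hypothetical helper rather than an SDK function.

```python
def resample_frame_indices(num_frames, src_fps=25, dst_fps=30):
    """For each output frame, return the index of the source frame to show."""
    num_out = num_frames * dst_fps // src_fps
    # Integer math picks the latest source frame at or before each output time.
    return [min(num_frames - 1, k * src_fps // dst_fps) for k in range(num_out)]
```

Going from 25 to 30 fps repeats roughly every fifth source frame; going from 25 to 5 fps keeps every fifth frame.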

## License

Commercial. Contact <hello@bithuman.ai>.

## See also

- [docs.bithuman.ai](https://docs.bithuman.ai) — full product + SDK docs
- [docs.bithuman.ai/cli](https://docs.bithuman.ai/cli) — `bithuman` CLI reference
- [bithuman-sdk-public](https://github.com/bithuman-product/bithuman-sdk-public) — examples (Python, Swift, CLI, REST) + Swift package
- [`bithuman-cli` on PyPI](https://pypi.org/project/bithuman-cli/) — the sibling CLI wheel
- In-repo (private monorepo): `engine/essence/README.md` (engine internals), `sdks/swift/README.md`, `sdks/kotlin/README.md`, `docs/BUILD_AND_RELEASE.md`
