Metadata-Version: 2.1
Name: bithuman
Version: 1.12.2
Summary: Portable C++ avatar runtime — Python bindings via pybind11. Powers the bitHuman Essence pipeline cross-platform.
Keywords: bithuman,avatar,essence,lipsync,pybind11
Author-Email: bitHuman <hello@bithuman.ai>
License: Commercial — see LICENSE file
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Operating System :: MacOS
Classifier: Operating System :: MacOS :: MacOS X
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: C++
Classifier: Topic :: Multimedia
Classifier: Topic :: Multimedia :: Graphics
Classifier: Topic :: Multimedia :: Sound/Audio
Classifier: Topic :: Multimedia :: Video
Project-URL: Homepage, https://bithuman.ai
Project-URL: Documentation, https://docs.bithuman.ai
Project-URL: Source, https://github.com/bithuman-product/bithuman-sdk
Requires-Python: >=3.9
Requires-Dist: numpy>=1.24
Provides-Extra: test
Requires-Dist: pytest>=7; extra == "test"
Requires-Dist: psutil>=5.9; extra == "test"
Provides-Extra: cli
Requires-Dist: soundfile>=0.12; extra == "cli"
Requires-Dist: imageio>=2.34; extra == "cli"
Requires-Dist: imageio-ffmpeg>=0.5; extra == "cli"
Description-Content-Type: text/markdown

# bithuman

This is the **Python flavor of Layer 3**: a platform-specific library for app developers. It wraps the Layer 1 [`libessence` engine](../../README.md). For the CLI tool see [`docs/CLI.md`](../../../docs/CLI.md).

```
┌─────────────────────────────────────────────────────────────┐
│ Layer 3: Platform-specific libraries (app developers)       │
│   - Python wheel       pip install bithuman    ◄──── you are here
│   - Swift package      SwiftPM Bithuman                     │
│   - Kotlin AAR         ai.bithuman:sdk                      │
│   - (future) Rust crate, JS/TS, Go, ...                     │
└─────────────────────────────────────────────────────────────┘
                          ▼ embeds
┌─────────────────────────────────────────────────────────────┐
│ Layer 2: bithuman CLI (end-user tool)                       │
│   - one cross-platform binary on macOS / Linux / Windows    │
│   - brew install bithuman · curl-pipe installer             │
└─────────────────────────────────────────────────────────────┘
                          ▼ links
┌─────────────────────────────────────────────────────────────┐
│ Layer 1: libessence engine (cross-platform C++ core)        │
│   - portable C ABI, same source on every target             │
│   - macOS · iOS · Android · Linux · Windows                 │
│   - never imported directly by app developers               │
└─────────────────────────────────────────────────────────────┘
```

Python bindings for the **bitHuman SDK** — the portable C++ avatar engine
(`libessence`) that powers our cross-platform lipsync pipeline. The wheel
ships a native pybind11 module that talks directly to `libessence`,
so you get the same per-frame cost as our Swift and Kotlin clients with
none of the GIL noise.

On an Apple M5 with 24 GB unified memory we measure **~640 FPS sustained
compose** (1.56 ms/frame mean, 2.03 ms p99) for a 1248×704 avatar, with
**~206 MB peak RSS** end-to-end. Cold load is ~14 ms for the fixture and
~400 ms for the first compose tick (lazy ONNX init).

This package is namespace-isolated from the v0 `bithuman` SDK; you can
install both side-by-side.

## Install

```sh
pip install bithuman
```

Current PyPI version: **0.1.1** (matches the `libessence-v0.1.1` tag).

## Compatibility

- **Platforms:** macOS arm64, Linux x86_64, Linux arm64 — all ship as wheels.
  Windows is tracked for a follow-up.
- **Python:** 3.10 – 3.13 (cp310, cp311, cp312, cp313). CPython only.
- **ABI:** wraps `libessence` ABI v4 (auth + auto-fit canvas).
- **Auth:** `libessence` performs a live heartbeat against
  `api.bithuman.ai`. Pass the secret via `Avatar.load(api_secret=...)` or
  the `BITHUMAN_API_SECRET` environment variable. Set
  `BITHUMAN_UNMETERED=1` for dev / parity-test runs.

## What you get

The package exposes three API tiers (all importable from `bithuman`):

| Tier        | Types                                                            | Use when…                                            |
| ----------- | ---------------------------------------------------------------- | ---------------------------------------------------- |
| Async       | `AsyncAvatar`, `AudioChunk`, `VideoControl`, `VideoFrame`        | Hosting a service / parity with legacy `AsyncBithuman` |
| Sync facade | `Avatar`, `ComposedFrame`, `EP`                                  | Offline / batch / CLI rendering                      |
| Low-level   | `Fixture`, `Runtime`, `EP_CPU`/`EP_AUTO`/`EP_COREML`/`EP_NNAPI`/`EP_QNN` | Direct C ABI access, custom audio pipeline           |

Error types:

- **Base:** `BithumanError`
- **Auth:** `TokenError`, `TokenExpiredError`, `TokenValidationError`,
  `TokenRequestError`, `AccountStatusError`
- **Fixture:** `ModelError`, `ModelNotFoundError`, `ModelLoadError`,
  `ModelSecurityError`, `ExpressionModelNotSupported`
- **Other:** `RuntimeNotReadyError`

Version info: `bithuman.__version__` (Python package),
`bithuman.__core_version__` (linked libessence), `bithuman.__abi_version__`.

## Quickstart

```python
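import cv2  # OpenCV is used only for display here; install it separately (opencv-python)
from bithuman import Avatar

with Avatar.load("model.imx") as avatar:
    for frame in avatar.compose("speech.wav"):
        # frame.bgr is a (H, W, 3) uint8 numpy array in BGR pixel order
        cv2.imshow("avatar", frame.bgr)
        cv2.waitKey(40)  # 40 ms per frame paces playback at roughly 25 fps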
```

`Avatar.compose` accepts a 16 kHz float32 mono numpy array OR a path to
any WAV / MP3 / FLAC / OGG file (decoded and resampled via
[`soundfile`](https://pysoundfile.readthedocs.io) when needed).
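
If your audio is already in memory in another format, it has to be converted to the array layout described above before calling `compose`. The sketch below is one way to do that with numpy alone. The helper name `to_mono_float32_16k` is hypothetical and not part of `bithuman`, the input is assumed to be signed-integer or float PCM shaped `(frames,)` or `(frames, channels)`, and the resampling is plain linear interpolation, which is adequate for quick tests but is not a band-limited resampler.

```python
import numpy as np

TARGET_RATE = 16_000  # sample rate this README states for Avatar.compose input


def to_mono_float32_16k(samples: np.ndarray, rate: int) -> np.ndarray:
    """Convert PCM samples to a 16 kHz float32 mono array (hypothetical helper)."""
    data = np.asarray(samples)
    if np.issubdtype(data.dtype, np.signedinteger):
        # Scale signed integer PCM (e.g. int16) into the [-1.0, 1.0) range.
        scale = float(np.iinfo(data.dtype).max) + 1.0
        data = data.astype(np.float32) / scale
    else:
        data = data.astype(np.float32)
    if data.ndim == 2:
        # Assumes shape (frames, channels); average the channels down to mono.
        data = data.mean(axis=1)
    if rate != TARGET_RATE:
        # Linear interpolation onto the 16 kHz time grid.
        n_out = int(round(len(data) * TARGET_RATE / rate))
        src_t = np.arange(len(data)) / rate
        dst_t = np.arange(n_out) / TARGET_RATE
        data = np.interp(dst_t, src_t, data)
    return np.ascontiguousarray(data, dtype=np.float32)
```

The returned array can then be passed where the Quickstart passes `"speech.wav"`.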

## CLI

An `essence-render` console script ships with the wheel:

```sh
pip install 'bithuman[cli]'

essence-render \
  --model ~/.cache/bithuman/models/sample-avatar.imx \
  --audio speech.wav \
  --output out.mp4
```

Pass `--output -` to stream raw BGR24 frames to stdout (handy for piping
into a separate ffmpeg pipeline or a custom encoder). Other flags:

| Flag | Default | Description |
| ---- | ------- | ----------- |
| `--fps` | 25 | Output FPS for the MP4 container. |
| `--quality` | 80 | libx264 quality 1..100 (higher = better). |
| `--ep` | `cpu` | Execution provider hint (`cpu`/`auto`/`coreml`/…). |
| `--threads` | 1 | ORT intra-op thread count. |
| `--no-audio` | – | Skip audio muxing; produce a silent video. |

Example end-to-end run (5 s sine sweep):

```
essence-render 0.1.0: model=sample-avatar.imx audio=sine_sweep_5s.wav ep=cpu threads=1
essence-render: loaded fixture in 14.9 ms — 1248x704 @ 25 fps, 183 clusters, 202 src frames
essence-render: composed 122 frames in 1.83s (14.96 ms/frame, 66.8 fps)
essence-render: wrote /tmp/sine_sweep_5s.mp4
```

(Throughput here is bounded by H.264 encode, not Essence inference. Use
`--output -` if you want to measure raw compose speed.)
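
When consuming the `--output -` stream from Python, the bytes have to be split back into frames. The sketch below assumes the stream is tightly packed BGR24 (3 bytes per pixel, row-major, no header or padding), which this README does not spell out, and that you already know the frame size. The function name `iter_bgr24_frames` is illustrative and not part of the package.

```python
import io
from typing import BinaryIO, Iterator

import numpy as np


def iter_bgr24_frames(stream: BinaryIO, width: int, height: int) -> Iterator[np.ndarray]:
    """Yield (height, width, 3) uint8 frames from a raw BGR24 byte stream."""
    frame_bytes = width * height * 3  # assumption: no header, no row padding
    while True:
        buf = stream.read(frame_bytes)
        if len(buf) < frame_bytes:
            break  # end of stream; a trailing partial frame is dropped
        yield np.frombuffer(buf, dtype=np.uint8).reshape(height, width, 3)


# Demo: a synthetic stream of three all-black 4x2 frames stands in for the CLI's stdout.
demo = io.BytesIO(bytes(4 * 2 * 3) * 3)
print(sum(1 for _ in iter_bgr24_frames(demo, width=4, height=2)))
```

For a real pipe, pass `sys.stdin.buffer` as the stream and use the dimensions reported when the fixture loads (1248 and 704 in the sample run above, assuming the log prints width by height).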

## Low-level API

If you need finer control or want to swap in a custom audio pipeline,
the C ABI is exposed directly:

```python
import numpy as np
from bithuman import Fixture, Runtime, EP_CPU

fx = Fixture("model.imx", preferred_ep=EP_CPU, intra_op_threads=1)
rt = Runtime(fx)
pcm = np.fromfile("speech.f32", dtype=np.float32)  # 16 kHz mono float32
cluster_idx, bgr = rt.tick_compose(pcm, frame_idx_hint=-1)
# bgr.shape == (fx.frame_height, fx.frame_width, 3), dtype uint8
```

Pass the entire pcm buffer to each `tick_compose` call; the runtime
maintains an internal cursor and advances one tick per call until the
audio is exhausted.
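
To size a loop or a progress bar around `tick_compose`, it helps to estimate how many ticks a buffer will produce. The sketch below derives that from the 16 kHz input rate and the 25 fps model rate quoted in this README. The function `expected_ticks` is hypothetical, and the result is only an estimate: the sample CLI run above composed 122 frames for a nominal 5 s clip, so the runtime's exact count can differ, and how it signals end of audio is not covered here.

```python
SAMPLE_RATE = 16_000  # Hz, input format stated for tick_compose
FRAME_RATE = 25       # fps, the model's internal rate per this README


def expected_ticks(num_samples: int,
                   sample_rate: int = SAMPLE_RATE,
                   fps: int = FRAME_RATE) -> int:
    """Estimate how many video frames a PCM buffer covers, rounding up."""
    # Integer ceiling division avoids floating-point rounding surprises.
    # At 16 kHz and 25 fps, one frame spans 640 samples.
    return -(-num_samples * fps // sample_rate)


print(expected_ticks(5 * SAMPLE_RATE))  # nominal frame count for 5 s of audio
```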

## Build from source

You need the prebuilt parent C++ archive at
`cpp/build/libessence.a` (run the parent CMake build first), plus
the runtime deps from Homebrew (`onnxruntime`, `webp`, `ffmpeg`,
`hdf5`, `jpeg-turbo`).

```sh
cd cpp/bindings/python
uv pip install -e '.[cli,test]' --no-build-isolation
```

The CMake glue links the prebuilt static archive directly and does NOT
re-run the parent build, so you can iterate on the bindings without
paying the C++ rebuild cost.

## Performance

Measured with `tests/bench.py` against the v1 compose path
(audio → composited BGR frame) on Apple M5 24 GB:

| Metric | Value |
| ------ | ----- |
| Fixture load | 14 ms |
| First compose (lazy v1 init) | 399 ms |
| Steady-state mean | **1.56 ms / frame** |
| p50 | 1.54 ms |
| p99 | 2.03 ms |
| Sustained throughput | **641 FPS** |
| Peak RSS (proc) | 206 MB |
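
The latency and throughput rows are two views of the same measurement. A small sketch of the arithmetic, assuming playback at the model's 25 fps rate; the function names are illustrative and not part of the package:

```python
def fps_from_frame_ms(frame_ms: float) -> float:
    """Sustained frames per second implied by a mean per-frame cost."""
    return 1000.0 / frame_ms


def realtime_headroom(frame_ms: float, playback_fps: float = 25.0) -> float:
    """How many times faster than real time the compose step runs."""
    budget_ms = 1000.0 / playback_fps  # 40 ms per frame at 25 fps
    return budget_ms / frame_ms


mean_ms = 1.56  # steady-state mean from the table above
print(f"{fps_from_frame_ms(mean_ms):.0f} FPS, "
      f"{realtime_headroom(mean_ms):.1f}x real time")
```

With the measured 1.56 ms mean this gives about 641 FPS, matching the throughput row, and roughly 25x headroom over 25 fps playback.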

## Linux wheels

Pre-built `manylinux_2_28` wheels ship for x86_64 + aarch64 across cp310
through cp313 — 8 wheels in total, all auditwheel-repaired with the
full dep tree bundled (ORT, FFmpeg, HDF5, libjpeg-turbo, libwebp,
libcurl, OpenSSL).

To rebuild them locally:

```sh
# One-time: build the dep-baked Docker images (~10 min each).
docker build --platform linux/amd64 -t libessence/manylinux-x86_64:0.1 \
    -f scripts/Dockerfile.manylinux-x86_64 scripts/
docker build --platform linux/arm64/v8 -t libessence/manylinux-aarch64:0.1 \
    -f scripts/Dockerfile.manylinux-aarch64 scripts/

# Per wheel build (~2 min). REPO must be set to the repository root,
# which is mounted at /src inside the container:
docker run --rm --platform linux/amd64 -v "$REPO":/src \
    -e PYTAG=cp311 -e ARCH_INSIDE=x86_64 \
    libessence/manylinux-x86_64:0.1 \
    bash /src/cpp/bindings/python/scripts/build-wheel-in-container.sh
```

## Limitations

- Windows wheels are not yet built; they are tracked for v0.2.
- The CLI's output framerate is fixed at 25 fps to match the model's
  internal rate. Pass `--output -` and pipe to your own encoder if you
  need temporal resampling.
- `preferred_ep=EP_COREML` / `EP_NNAPI` / `EP_QNN` is accepted but
  currently falls back to CPU in the v0.1 build.

## License

Commercial. Contact <hello@bithuman.ai>.

## See also

- [Root `README.md`](../../../README.md) — install matrix
- [`cpp/README.md`](../../README.md) — libessence engine internals + C ABI
- [`docs/CLI.md`](../../../docs/CLI.md) — `bithuman` CLI reference
- [`cpp/bindings/swift/README.md`](../swift/README.md) — Swift binding
- [`cpp/bindings/kotlin/README.md`](../kotlin/README.md) — Kotlin/Android binding
- [`docs/BUILD_AND_RELEASE.md`](../../../docs/BUILD_AND_RELEASE.md) — release flow
