Metadata-Version: 2.4
Name: diffpose-video
Version: 0.1.0
Summary: 3D human pose estimation from video using DiffPose + MixSTE
License: MIT
Keywords: pose estimation,3d pose,diffusion,human motion
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: numpy
Requires-Dist: scipy
Requires-Dist: einops
Requires-Dist: timm>=0.9.0
Requires-Dist: tqdm
Requires-Dist: pyyaml
Requires-Dist: matplotlib
Requires-Dist: opencv-python-headless
Requires-Dist: rtmlib
Requires-Dist: onnxruntime-gpu==1.20.1
Requires-Dist: dash
Requires-Dist: plotly
Requires-Dist: requests
Provides-Extra: torch-cuda
Requires-Dist: torch>=2.0; extra == "torch-cuda"
Requires-Dist: torchvision; extra == "torch-cuda"
Requires-Dist: torchaudio; extra == "torch-cuda"
Dynamic: license-file

# DiffPose-Video

3D human pose estimation from arbitrary video using **MixSTE** (2D→3D lifting) and **DiffPose** (diffusion-based refinement).

This package wraps the original [DiffPose](https://github.com/GONGJIA0208/Diffpose) research code with a clean inference pipeline, an interactive visualisation dashboard, and a video renderer, all available as CLI commands once the package is installed.

> **Paper:** [DiffPose: Toward More Reliable 3D Pose Estimation](https://arxiv.org/abs/2211.16940), CVPR 2023.

---

## Install

```bash
# 1. Install PyTorch for your CUDA version first (example: CUDA 12.8)
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu128

# 2. Install this package
pip install diffpose-video
```

> **Note:** `onnxruntime-gpu==1.20.1` is pinned because later versions have a broken CUDA provider on some systems.

---

## Download pretrained checkpoints

```bash
diffpose-download
```

Downloads all pretrained weights to `~/.cache/diffpose_video/checkpoints/`. Safe to re-run — skips files that already exist.
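
To confirm the download finished before running inference, a small check like the one below can help. It is a minimal sketch: `missing_checkpoints` and `EXPECTED_CHECKPOINTS` are hypothetical names, not part of this package, and the two filenames are simply the ones used in the usage example in this README. The download may fetch additional files that this check does not know about.

```python
from pathlib import Path

# Filenames taken from the usage example in this README (an assumption:
# the download may include other files that are not listed here).
EXPECTED_CHECKPOINTS = (
    "mixste_cpn_243f.bin",
    "diffpose_video_uvxyz_cpn.pth",
)


def missing_checkpoints(cache_dir=None):
    """Return the expected checkpoint filenames that are not in cache_dir."""
    if cache_dir is None:
        # Default location documented above.
        cache_dir = Path.home() / ".cache" / "diffpose_video" / "checkpoints"
    cache_dir = Path(cache_dir)
    return [name for name in EXPECTED_CHECKPOINTS
            if not (cache_dir / name).is_file()]


if __name__ == "__main__":
    missing = missing_checkpoints()
    if missing:
        print("Missing checkpoints:", ", ".join(missing))
    else:
        print("All expected checkpoints are present.")
```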

---

## Usage

### 1. Run inference on a video

```bash
diffpose-infer \
  --input        video.mp4 \
  --config       configs/human36m_diffpose_uvxyz_cpn.yml \
  --model_pose   ~/.cache/diffpose_video/checkpoints/mixste_cpn_243f.bin \
  --model_diff   ~/.cache/diffpose_video/checkpoints/diffpose_video_uvxyz_cpn.pth \
  --output_dir   results/
```

Output: `results/<video_name>.npz`, containing two arrays (`T` is the number of frames, 17 the number of joints):
- `poses_3d`: `(T, 17, 3)` root-relative 3D joint positions
- `keypoints_2d`: `(T, 17, 3)` pixel-space 2D detections plus a per-joint confidence score

Process a whole folder of videos by passing a directory to `--input`.
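
The `.npz` output described above can be read back with NumPy. The following is a minimal sketch rather than an API of this package: `load_pose_result` is a hypothetical helper, and treating the last channel of `keypoints_2d` as the confidence score is an assumption based on the array description.

```python
import numpy as np


def load_pose_result(npz_path):
    """Load poses_3d and keypoints_2d from a diffpose-infer output file."""
    with np.load(npz_path) as data:
        poses_3d = data["poses_3d"]          # expected shape (T, 17, 3)
        keypoints_2d = data["keypoints_2d"]  # expected shape (T, 17, 3)

    # Fail early if the file does not match the documented layout.
    if poses_3d.ndim != 3 or poses_3d.shape[1:] != (17, 3):
        raise ValueError(f"unexpected poses_3d shape: {poses_3d.shape}")
    if keypoints_2d.shape != poses_3d.shape:
        raise ValueError(f"unexpected keypoints_2d shape: {keypoints_2d.shape}")
    return poses_3d, keypoints_2d


def split_keypoints(keypoints_2d):
    """Split into pixel coordinates and confidence.

    Assumption: channels are ordered (x, y, confidence).
    """
    return keypoints_2d[..., :2], keypoints_2d[..., 2]
```

For example, `poses_3d, keypoints_2d = load_pose_result("results/video.npz")` followed by `xy, conf = split_keypoints(keypoints_2d)` gives arrays of shape `(T, 17, 2)` and `(T, 17)`.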

### 2. Interactive dashboard

```bash
diffpose-explore \
  --npz   results/video.npz \
  --video video.mp4 \
  --fps   30
```

Opens a Plotly Dash app at `http://localhost:8050` with:
- Synchronized video playback with 2D skeleton overlay
- Animated 3D skeleton
- X / Y / Z trajectory graphs per joint
- Play/pause + frame scrubber, all linked
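
The per-joint trajectory graphs can also be reproduced outside the dashboard from the `poses_3d` array. The sketch below is illustrative only and does not show how the dashboard itself is implemented: `joint_trajectory` is a hypothetical helper, and the frame timestamps assume a constant frame rate equal to the `--fps` value.

```python
import numpy as np


def joint_trajectory(poses_3d, joint_index, fps=30.0):
    """Return (times, xyz) for one joint.

    times: (T,) timestamps in seconds, assuming a constant frame rate.
    xyz:   (T, 3) root-relative X / Y / Z positions of the chosen joint.
    """
    poses_3d = np.asarray(poses_3d)
    num_frames = poses_3d.shape[0]
    times = np.arange(num_frames) / fps
    xyz = poses_3d[:, joint_index, :]
    return times, xyz
```

The returned arrays can be passed to any plotting library, for example `plt.plot(times, xyz[:, 0])` with Matplotlib for the X trajectory.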

### 3. Render a side-by-side MP4

```bash
diffpose-visualise \
  --npz    results/video.npz \
  --video  video.mp4 \
  --output results/video_vis.mp4
```

Produces a video with the original footage and its 2D skeleton overlay on the left, and the animated 3D skeleton on the right.
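
The side-by-side layout amounts to placing two frames next to each other. The sketch below shows one way to compose such a frame with NumPy; it is not the renderer used by `diffpose-visualise`. `side_by_side` is a hypothetical helper, and padding the shorter frame with black rows (rather than resizing it) is a simplifying assumption.

```python
import numpy as np


def side_by_side(left, right):
    """Join two H x W x 3 frames horizontally.

    The shorter frame is padded with black rows at the bottom so both
    frames have the same height before they are concatenated.
    """
    height = max(left.shape[0], right.shape[0])

    def pad_to_height(frame):
        extra_rows = height - frame.shape[0]
        return np.pad(frame, ((0, extra_rows), (0, 0), (0, 0)), mode="constant")

    return np.hstack([pad_to_height(left), pad_to_height(right)])
```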

---

## Docker

```bash
docker build -t diffpose-video .

# Quote the volume paths so directories containing spaces still work
docker run --gpus all \
  -v "$(pwd)/checkpoints:/workspace/checkpoints" \
  -v "$(pwd)/results:/workspace/results" \
  -v /path/to/videos:/videos \
  diffpose-video \
  diffpose-infer --input /videos/clip.mp4 \
    --config configs/human36m_diffpose_uvxyz_cpn.yml \
    --model_pose checkpoints/mixste_cpn_243f.bin \
    --model_diff checkpoints/diffpose_video_uvxyz_cpn.pth \
    --output_dir results/
```

---

## Citation

```bibtex
@InProceedings{gong2023diffpose,
    author    = {Gong, Jia and Foo, Lin Geng and Fan, Zhipeng and Ke, Qiuhong and Rahmani, Hossein and Liu, Jun},
    title     = {DiffPose: Toward More Reliable 3D Pose Estimation},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2023},
}
```
