Metadata-Version: 2.1
Name: agibot-videodataset
Version: 0.1.2
Summary: A GPU-accelerated library that enables random frame access and efficient video decoding for data loading.
Keywords: dataset,nvcodec,serious-scaffold,video
Author-Email: genie <genie@agibot.com>
License: MIT
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: POSIX :: Linux
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Classifier: Topic :: Scientific/Engineering
Classifier: Typing :: Typed
Project-URL: documentation, https://AgiBot-World.github.io/VideoDataset
Project-URL: issue, https://github.com/AgiBot-World/VideoDataset/issues
Project-URL: repository, https://github.com/AgiBot-World/VideoDataset
Requires-Python: >=3.10
Requires-Dist: numpy
Requires-Dist: torch>=2.2.1
Description-Content-Type: text/markdown

# VideoDataset

<!-- SPHINX-START -->

A GPU-accelerated library that enables random frame access and efficient video decoding for data loading.

[![Documentation](https://img.shields.io/badge/Documentation-sphinx-blue)](https://AgiBot-World.github.io/VideoDataset)
[![License](https://img.shields.io/github/license/AgiBot-World/VideoDataset)](https://github.com/AgiBot-World/VideoDataset/blob/master/LICENSE)
[![SS Badge](https://img.shields.io/badge/Serious%20Scaffold-pybind11-blue)](https://github.com/serious-scaffold/ss-pybind11)

[![CI](https://github.com/AgiBot-World/VideoDataset/actions/workflows/ci.yml/badge.svg)](https://github.com/AgiBot-World/VideoDataset/actions/workflows/ci.yml)
[![CD](https://github.com/AgiBot-World/VideoDataset/actions/workflows/cd.yml/badge.svg)](https://github.com/AgiBot-World/VideoDataset/actions/workflows/cd.yml)
[![CommitLint](https://github.com/AgiBot-World/VideoDataset/actions/workflows/commitlint.yml/badge.svg)](https://github.com/AgiBot-World/VideoDataset/actions/workflows/commitlint.yml)
[![Renovate](https://github.com/AgiBot-World/VideoDataset/actions/workflows/renovate.yml/badge.svg)](https://github.com/AgiBot-World/VideoDataset/actions/workflows/renovate.yml)
[![Semantic Release](https://github.com/AgiBot-World/VideoDataset/actions/workflows/semantic-release.yml/badge.svg)](https://github.com/AgiBot-World/VideoDataset/actions/workflows/semantic-release.yml)
[![Coverage](https://img.shields.io/endpoint?url=https://AgiBot-World.github.io/VideoDataset/_static/badges/coverage.json)](https://AgiBot-World.github.io/VideoDataset/reports/coverage)

[![Release](https://img.shields.io/github/v/release/AgiBot-World/VideoDataset)](https://github.com/AgiBot-World/VideoDataset/releases)
[![PyPI](https://img.shields.io/pypi/v/agibot-videodataset)](https://pypi.org/project/agibot-videodataset/)
[![PyPI - Python Version](https://img.shields.io/pypi/pyversions/agibot-videodataset)](https://pypi.org/project/agibot-videodataset/)
[![GitHub](https://img.shields.io/github/license/AgiBot-World/VideoDataset)](https://github.com/AgiBot-World/VideoDataset/blob/main/LICENSE)

[![pre-commit](https://img.shields.io/badge/pre--commit-enabled-brightgreen?logo=pre-commit)](https://github.com/pre-commit/pre-commit)
[![Checked with mypy](https://www.mypy-lang.org/static/mypy_badge.svg)](http://mypy-lang.org/)
[![Ruff](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/ruff/main/assets/badge/v2.json)](https://github.com/astral-sh/ruff)
[![Conventional Commits](https://img.shields.io/badge/Conventional%20Commits-1.0.0-%23FE5196?logo=conventionalcommits&logoColor=white)](https://conventionalcommits.org)
[![Copier](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/copier-org/copier/master/img/badge/badge-grayscale-inverted-border-orange.json)](https://github.com/copier-org/copier)
[![Serious Scaffold Python](https://img.shields.io/endpoint?url=https://serious-scaffold.github.io/ss-python/_static/badges/logo.json)](https://serious-scaffold.github.io/ss-python)

> [!WARNING]
> _VideoDataset_ is in the **Alpha** phase.
> Frequent changes and instability should be anticipated.
> Any feedback, comments, suggestions and contributions are welcome!

## Overview

VideoDataset is a high-performance video decoding library with support for multiple frameworks. It aims to provide framework-integrated solutions for video decoding workloads such as data loading.

Key Features:

- GPU-accelerated video decoding using the NvCodec library
- Support for common video formats (H.264, H.265, etc.)
- Easy integration with multiple frameworks and formats

## Installation

### Prerequisites

- NVIDIA GPU with CUDA support and CUDA Toolkit 12.0+ installed
- FFmpeg installed
- Python 3.10 or later
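
The prerequisites above can be sanity-checked before installing. The following is a minimal sketch, not part of VideoDataset: the function name `check_prerequisites` is hypothetical, it assumes that "FFmpeg installed" means an `ffmpeg` executable is on `PATH`, and it only checks whether PyTorch can see a CUDA device. It does not verify the CUDA Toolkit version.

```python
import shutil
import sys


def check_prerequisites() -> dict[str, bool]:
    """Report which prerequisites appear to be satisfied (hypothetical helper)."""
    status = {
        # Python 3.10 or later is required.
        "python>=3.10": sys.version_info >= (3, 10),
        # Assumption: FFmpeg is usable if the `ffmpeg` binary is on PATH.
        "ffmpeg on PATH": shutil.which("ffmpeg") is not None,
    }
    try:
        # torch is a declared dependency, but may not be installed yet.
        import torch

        status["CUDA visible to torch"] = torch.cuda.is_available()
    except ImportError:
        status["CUDA visible to torch"] = False
    return status


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'ok' if ok else 'missing'}")
```

A `missing` entry is only a hint. For example, FFmpeg may be installed somewhere that is not on `PATH`.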

### Install from PyPI

```bash
pip install agibot-videodataset
```

### Building from Source

```bash
pip install git+https://github.com/AgiBot-World/VideoDataset.git
```

## Quick Start

The complete example can be found in the [quickstart](https://AgiBot-World.github.io/VideoDataset/quickstart.html) documentation.

## Documentation

Please refer to the full [documentation](https://AgiBot-World.github.io/VideoDataset).

You can also build the Sphinx-based documentation locally by running the following command:

```bash
make dev-doc doc
```

This generates the documentation in the `docs/_build/html` directory and serves it at <http://localhost:8000>.

## Performance

VideoDataset is optimized for high-throughput video processing. Benchmark results show:

- **GPU Decoding:** A decoding throughput of 20,000 FPS is achieved in a multiprocessing scenario.
- **Random Access:** Minimal overhead for non-sequential frame access.
- **GPU Decoder Utilization:** Over 90% GPU decoder utilization is achieved in a multiprocessing scenario.

See the [benchmark documentation](https://AgiBot-World.github.io/VideoDataset/benchmark.html) for detailed performance analysis.
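
To get a rough throughput figure on your own hardware, a simple timing loop over random frame indices may be enough. The sketch below is an illustration only and is not the method used for the published benchmarks. `measure_fps` and the `decode_frame` callable are hypothetical names; substitute whatever frame-access call you actually use. For GPU decoding, the callable should block until the frame is ready, otherwise the timing will be too optimistic.

```python
import random
import time
from typing import Callable, Sequence


def measure_fps(decode_frame: Callable[[int], object], indices: Sequence[int]) -> float:
    """Return frames per second for calling decode_frame once per index."""
    start = time.perf_counter()
    for index in indices:
        decode_frame(index)
    elapsed = time.perf_counter() - start
    # Guard against a zero reading from a very fast stand-in decoder.
    return len(indices) / elapsed if elapsed > 0 else float("inf")


if __name__ == "__main__":
    # Hypothetical stand-in decoder: replace with a real frame-access call.
    def fake_decode(index: int) -> bytes:
        return bytes(16)

    # 100 distinct, non-sequential frame indices drawn from a 1000-frame video.
    rng = random.Random(0)
    frame_indices = rng.sample(range(1000), 100)
    print(f"{measure_fps(fake_decode, frame_indices):.0f} frames/s")
```

Running several such loops in separate processes and summing the results approximates the multiprocessing scenario mentioned above.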

### Comparison with CPU decoding solutions

In addition, we benchmarked VideoDataset against mainstream CPU software decoding solutions, including OpenCV, Torchvision (PyAV), Torchvision (VideoReader), and TorchCodec (CPU). The results show that VideoDataset achieves 3 to 4 times higher decoding throughput.

![CPU Throughput](https://AgiBot-World.github.io/VideoDataset/_static/images/cpu_throughput_benchmark.jpg)

VideoDataset also substantially reduces CPU utilization compared with these solutions.

![CPU Utilization](https://AgiBot-World.github.io/VideoDataset/_static/images/cpu_utilization_benchmark.jpg)

## Development Status

- [X] GPU acceleration via NvCodec
- [X] Random frame access
- [X] PyTorch integration
- [ ] Compatibility with LeRobot
- [ ] Asynchronous pipeline optimization

## License

Released under the MIT License. For more details, see the [LICENSE](https://github.com/AgiBot-World/VideoDataset/blob/master/LICENSE) file.
