Metadata-Version: 2.4
Name: anycv
Version: 0.2.4
Summary: A simple, unified computer vision inference toolkit. One-liner API for common CV tasks.
Project-URL: Homepage, https://www.nrl.ai
Project-URL: Repository, https://github.com/vietanhdev/anycv
Project-URL: Issues, https://github.com/vietanhdev/anycv/issues
Author-email: Viet-Anh Nguyen <vietanh.dev@gmail.com>
License-Expression: MIT
License-File: LICENSE
Keywords: classification,computer-vision,detection,inference,onnx,segmentation
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Scientific/Engineering :: Image Recognition
Requires-Python: >=3.8
Requires-Dist: click>=8.0
Requires-Dist: huggingface-hub
Requires-Dist: numpy
Requires-Dist: onnxruntime
Requires-Dist: pillow
Requires-Dist: requests
Provides-Extra: all
Requires-Dist: anyllm>=0.1.0; extra == 'all'
Requires-Dist: onnxruntime-gpu; extra == 'all'
Requires-Dist: opencv-python>=4.5.0; extra == 'all'
Provides-Extra: benchmark
Requires-Dist: numpy; extra == 'benchmark'
Provides-Extra: dev
Requires-Dist: pytest; extra == 'dev'
Requires-Dist: pytest-cov; extra == 'dev'
Provides-Extra: gpu
Requires-Dist: onnxruntime-gpu; extra == 'gpu'
Provides-Extra: llm
Requires-Dist: anyllm>=0.1.0; extra == 'llm'
Provides-Extra: progress
Requires-Dist: tqdm; extra == 'progress'
Provides-Extra: video
Requires-Dist: opencv-python>=4.5.0; extra == 'video'
Description-Content-Type: text/markdown

<h1 align="center">anycv</h1>
<p align="center"><em>One-liner computer vision — detect, classify, and segment in 3 lines of code.</em></p>

<p align="center">
<img src="https://img.shields.io/pypi/v/anycv.svg" alt="PyPI">
<img src="https://img.shields.io/pypi/pyversions/anycv.svg" alt="Python">
<img src="https://img.shields.io/pypi/l/anycv.svg" alt="License">
</p>

**anycv** is a dead-simple computer vision inference library. It wraps the most popular pretrained models (YOLOv8 for detection, MobileNetV2/ResNet50 for classification, DeepLabV3 for segmentation) behind a one-liner API. Models are executed via ONNX Runtime for fast CPU inference (or GPU if available), auto-downloaded from Hugging Face Hub on first use, and cached locally. No PyTorch required at inference time.

Built by [Viet-Anh Nguyen](https://github.com/vietanhdev) at [NRL.ai](https://www.nrl.ai).

## Why anycv?

- **One-liner API**: `anycv.detect("photo.jpg")` returns bounding boxes in a single call
- **Plugin architecture**: Register custom backends (TensorRT, OpenVINO, VLMs) via `@register_backend`
- **Local-first**: Models are cached in `~/.cache/anycv/`, so no network access is needed after the first download
- **Minimal core deps**: `onnxruntime`, `pillow`, `numpy`, plus `huggingface-hub`, `requests`, and `click`; PyTorch is not required
- **Integration-friendly**: Type hints, dataclass results, streaming, batch inference (the package is currently in alpha)

## Installation

```bash
pip install anycv
```

For optional features:

```bash
pip install "anycv[gpu]"       # onnxruntime-gpu for CUDA inference
pip install "anycv[llm]"       # Vision-LLM backend via anyllm (GPT-4o, Claude)
pip install "anycv[video]"     # opencv-python
pip install "anycv[progress]"  # tqdm progress bars
pip install "anycv[all]"       # gpu + llm + video
```

**Python 3.8+ supported** (tested on 3.8, 3.9, 3.10, 3.11, 3.12, 3.13)

## Quick Start

```python
import anycv

# 1. Object detection (YOLOv8n via ONNX Runtime — auto-downloaded from HF Hub)
detections = anycv.detect("street.jpg", model="yolov8n")
for d in detections:
    print(d.label, d.confidence, d.bbox)  # e.g. "person" 0.92 (x1,y1,x2,y2)

# 2. Image classification (MobileNetV2 ImageNet-1k via ONNX Runtime)
result = anycv.classify("cat.jpg", model="mobilenetv2")
print(result.top_k(5))                   # [(label, prob), ...]

# 3. Semantic segmentation (DeepLabV3 Pascal-VOC via ONNX Runtime)
mask = anycv.segment("scene.jpg", model="deeplabv3")
mask.save("mask.png")                    # per-pixel class map
```

## Models & Methods

All models are distributed as ONNX files, auto-downloaded from Hugging Face Hub on first use and cached in `~/.cache/anycv/`. Inference runs on **ONNX Runtime** (CPU by default; GPU via `onnxruntime-gpu`).
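
To see what has already been downloaded, a small standard-library sketch like the one below can help. It relies only on the cache location stated above; the cache's internal layout and the `.onnx` suffix filter are assumptions, and `list_cached_models` is a hypothetical helper, not part of anycv.

```python
from pathlib import Path
from typing import List, Optional, Tuple


def list_cached_models(cache_dir: Optional[Path] = None) -> List[Tuple[str, float]]:
    """Return (relative path, size in MB) for every .onnx file under cache_dir.

    Hypothetical helper: the cache's internal layout is an assumption,
    so the search is recursive rather than tied to a specific structure.
    """
    if cache_dir is None:
        cache_dir = Path.home() / ".cache" / "anycv"
    if not cache_dir.is_dir():
        return []  # nothing downloaded yet
    return sorted(
        (str(p.relative_to(cache_dir)), p.stat().st_size / 1e6)
        for p in cache_dir.rglob("*.onnx")
        if p.is_file()
    )


if __name__ == "__main__":
    for name, size_mb in list_cached_models():
        print(f"{name}: {size_mb:.1f} MB")
```

Deleting a file from the cache directory should simply trigger a fresh download on the next use, given the "auto-downloaded on first use" behaviour described above.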

### Object Detection

| Model | Dataset | Size | Notes |
|---|---|---|---|
| `yolov8n` (default) | COCO (80 classes) | ~6 MB | Fastest, ~10 ms per image on CPU (hardware-dependent) |
| `yolov8s` | COCO (80 classes) | ~22 MB | Balanced |
| `yolov8m` | COCO (80 classes) | ~50 MB | Higher accuracy |

Exported from [Ultralytics YOLOv8](https://github.com/ultralytics/ultralytics).

### Image Classification

| Model | Dataset | Size | Notes |
|---|---|---|---|
| `mobilenetv2` (default) | ImageNet-1k | ~14 MB | Fast, mobile-friendly |
| `resnet50` | ImageNet-1k | ~98 MB | Higher accuracy |
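
As a rough illustration of what a top-k ranking involves, the sketch below turns raw class scores into probabilities with a softmax and keeps the k best. It is not anycv's implementation: the `softmax` and `top_k` functions and the three toy labels are made up for the example.

```python
import math
from typing import List, Sequence, Tuple


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert raw scores to probabilities (max-subtracted for numerical stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(logits: Sequence[float], labels: Sequence[str], k: int = 5) -> List[Tuple[str, float]]:
    """Return the k most probable (label, probability) pairs, best first."""
    probs = softmax(logits)
    ranked = sorted(zip(labels, probs), key=lambda lp: lp[1], reverse=True)
    return ranked[:k]


# Toy three-class example; an ImageNet-1k model emits 1000 scores instead.
result = top_k([2.0, 0.5, 1.0], ["tabby cat", "dog", "fox"], k=2)
print(result)
```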

### Semantic Segmentation

| Model | Dataset | Classes | Notes |
|---|---|---|---|
| `deeplabv3` | Pascal VOC 2012 | 21 | Classic segmentation baseline |
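
A per-pixel class map can be summarised with a few lines of plain Python. The sketch below assumes the map is available as a nested list of integer class ids and that ids follow the standard Pascal VOC ordering (0 = background); how anycv's `SegmentationMask` exposes its data is not documented here, so `class_coverage` is a hypothetical helper.

```python
from collections import Counter
from typing import Dict, List

# The 21 Pascal VOC 2012 classes in their standard order (0 = background).
VOC_CLASSES = [
    "background", "aeroplane", "bicycle", "bird", "boat", "bottle", "bus",
    "car", "cat", "chair", "cow", "diningtable", "dog", "horse", "motorbike",
    "person", "pottedplant", "sheep", "sofa", "train", "tvmonitor",
]


def class_coverage(class_map: List[List[int]]) -> Dict[str, float]:
    """Return the fraction of pixels assigned to each class present in the map."""
    counts = Counter(cid for row in class_map for cid in row)
    total = sum(counts.values())
    return {VOC_CLASSES[cid]: n / total for cid, n in counts.items()}


# Toy 2x4 class map: background (0), cat (8), person (15).
toy = [[0, 0, 8, 8],
       [0, 15, 8, 8]]
print(class_coverage(toy))
```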

### Vision-LLM Backend (optional)

When installed with `anycv[llm]`, you can use multi-modal LLMs for detection/classification via natural-language prompts:

```python
import anycv

# Uses anyllm to call GPT-4o, Claude 3.5 Sonnet, Gemini, or local LLaVA
result = anycv.classify("x-ray.jpg", backend="vlm", model="gpt-4o",
                        prompt="Classify findings: normal, pneumonia, or other")
```

### Preprocessing pipeline

Every backend applies the same steps automatically: letterbox resize (an aspect-preserving resize plus padding), channel-order conversion (BGR/RGB), mean/std standardization, and a transpose to NCHW layout.
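
The letterbox step is the one that affects box coordinates, so its geometry is worth spelling out. The sketch below is a minimal pure-Python version under stated assumptions: a square 640-pixel target (a common YOLOv8 export size, not confirmed for anycv) and centered padding. `letterbox_params` and `unletterbox_box` are hypothetical names.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # x1, y1, x2, y2


def letterbox_params(src_w: int, src_h: int, dst: int = 640) -> Tuple[float, int, int, int, int]:
    """Compute scale, resized size, and left/top padding for a square letterbox."""
    scale = min(dst / src_w, dst / src_h)            # preserve aspect ratio
    new_w, new_h = round(src_w * scale), round(src_h * scale)
    pad_x, pad_y = (dst - new_w) // 2, (dst - new_h) // 2  # center the image
    return scale, new_w, new_h, pad_x, pad_y


def unletterbox_box(box: Box, scale: float, pad_x: int, pad_y: int) -> Box:
    """Map a box from letterboxed coordinates back to the source image."""
    x1, y1, x2, y2 = box
    return ((x1 - pad_x) / scale, (y1 - pad_y) / scale,
            (x2 - pad_x) / scale, (y2 - pad_y) / scale)


scale, new_w, new_h, pad_x, pad_y = letterbox_params(1280, 720)
print(scale, new_w, new_h, pad_x, pad_y)
```

Undoing the padding and scale in this way is what lets a detector report boxes in the original image's pixel coordinates.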

## API Reference

| Function | Purpose |
|---|---|
| `anycv.detect(image, model="yolov8n", conf=0.25)` | Returns `List[Detection]` |
| `anycv.classify(image, model="mobilenetv2")` | Returns `Classification` |
| `anycv.segment(image, model="deeplabv3")` | Returns `SegmentationMask` |
| `anycv.list_models(task="detect")` | List available models |
| `anycv.load_model(name)` | Preload a model (warm cache) |
| `anycv.register_backend(task, name)` | Decorator that registers a custom backend class |
| `anycv.draw(image, detections)` | Visualize results on the image |

### Result dataclasses

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass
class Detection:
    label: str
    confidence: float
    bbox: Tuple[float, float, float, float]  # x1, y1, x2, y2 (typing.Tuple works on Python 3.8)
    class_id: int
```
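
Because results are plain dataclasses, post-processing needs nothing beyond ordinary Python. The sketch below defines a local stand-in with the same four fields so that it runs without anycv installed; `box_area` and `keep` are hypothetical helpers, not part of the anycv API.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Detection:  # local stand-in mirroring the fields shown above
    label: str
    confidence: float
    bbox: Tuple[float, float, float, float]  # x1, y1, x2, y2
    class_id: int


def box_area(d: Detection) -> float:
    """Area of the box in square pixels (0 for degenerate boxes)."""
    x1, y1, x2, y2 = d.bbox
    return max(0.0, x2 - x1) * max(0.0, y2 - y1)


def keep(dets: List[Detection], label: str, min_conf: float = 0.5) -> List[Detection]:
    """Keep detections of one label at or above min_conf, most confident first."""
    hits = [d for d in dets if d.label == label and d.confidence >= min_conf]
    return sorted(hits, key=lambda d: d.confidence, reverse=True)


dets = [
    Detection("person", 0.92, (10.0, 20.0, 110.0, 220.0), 0),
    Detection("person", 0.41, (300.0, 40.0, 340.0, 120.0), 0),
    Detection("car", 0.88, (150.0, 90.0, 400.0, 260.0), 2),
]
people = keep(dets, "person")
print(len(people), box_area(people[0]))
```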

## CLI Usage

```bash
anycv detect street.jpg --model yolov8n --conf 0.3 --save annotated.jpg
anycv classify cat.jpg --top 5
anycv segment scene.jpg --out mask.png
anycv list-models
```

## Examples

### Batch inference over a folder

```python
import anycv
from pathlib import Path

# Load once, reuse across images (warm model, no reload)
model = anycv.load_model("yolov8s")
for img in Path("images/").glob("*.jpg"):
    dets = model.detect(img, conf=0.4)
    print(img.name, len(dets), "objects")
```

### Use a custom confidence + visualization

```python
import anycv

dets = anycv.detect("crowd.jpg", model="yolov8m", conf=0.5)
annotated = anycv.draw("crowd.jpg", dets)   # Pillow image with boxes
annotated.save("out.jpg")
```

### Register a custom backend

```python
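import anycv
from anycv import register_backend, BaseDetector

# Register the class as a detection backend under the name "my_tensorrt"
@register_backend("detect", "my_tensorrt")
class TensorRTDetector(BaseDetector):
    def predict(self, image): ...  # run inference and return detections

# The registered name is then selectable via `model=`
anycv.detect("photo.jpg", model="my_tensorrt")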
```
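
For readers unfamiliar with decorator-based registration, the pattern can be sketched independently of anycv. The code below is a generic illustration, not anycv's internals: the `register` and `lookup` functions, the `_BACKENDS` dictionary, and `DummyDetector` are all invented for the example.

```python
from typing import Callable, Dict, Tuple, Type

# (task, name) -> backend class. Illustrative only; anycv's internals may differ.
_BACKENDS: Dict[Tuple[str, str], Type] = {}


def register(task: str, name: str) -> Callable[[Type], Type]:
    """Class decorator that records a backend under (task, name)."""
    def decorator(cls: Type) -> Type:
        if (task, name) in _BACKENDS:
            raise ValueError(f"backend {name!r} already registered for {task!r}")
        _BACKENDS[(task, name)] = cls
        return cls  # return the class unchanged so it stays usable directly
    return decorator


def lookup(task: str, name: str) -> Type:
    """Look up a registered backend class, with a readable error if missing."""
    try:
        return _BACKENDS[(task, name)]
    except KeyError:
        raise KeyError(f"no {task!r} backend named {name!r}") from None


@register("detect", "dummy")
class DummyDetector:
    def predict(self, image):
        return []  # a real backend would run inference here


detector = lookup("detect", "dummy")()
print(detector.predict("photo.jpg"))
```

Keying the registry on both task and name lets the same backend name exist for detection and classification without clashing.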

## License

MIT (c) Viet-Anh Nguyen
