Metadata-Version: 2.4
Name: pyaccelerate
Version: 0.1.0
Summary: High-performance Python acceleration engine — CPU, threads, virtual threads, multi-GPU and virtualization.
Author-email: Guilherme Pinheiro <guilherme@example.com>
License: MIT License
        
        Copyright (c) 2026 Guilherme Pinheiro
        
        Permission is hereby granted, free of charge, to any person obtaining a copy
        of this software and associated documentation files (the "Software"), to deal
        in the Software without restriction, including without limitation the rights
        to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
        copies of the Software, and to permit persons to whom the Software is
        furnished to do so, subject to the following conditions:
        
        The above copyright notice and this permission notice shall be included in all
        copies or substantial portions of the Software.
        
        THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
        IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
        FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
        AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
        LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
        OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
        SOFTWARE.
        
Project-URL: Homepage, https://github.com/GuilhermeP96/pyaccelerate
Project-URL: Repository, https://github.com/GuilhermeP96/pyaccelerate
Project-URL: Issues, https://github.com/GuilhermeP96/pyaccelerate/issues
Keywords: acceleration,gpu,cuda,opencl,threads,performance,optimization
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Software Development :: Libraries
Classifier: Topic :: System :: Hardware
Classifier: Topic :: Scientific/Engineering
Classifier: Typing :: Typed
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: psutil>=5.9
Provides-Extra: cuda
Requires-Dist: cupy-cuda12x>=13.0; extra == "cuda"
Provides-Extra: opencl
Requires-Dist: pyopencl>=2024.1; extra == "opencl"
Provides-Extra: intel
Requires-Dist: dpctl>=0.16; extra == "intel"
Requires-Dist: dpnp>=0.14; extra == "intel"
Provides-Extra: all-gpu
Requires-Dist: cupy-cuda12x>=13.0; extra == "all-gpu"
Requires-Dist: pyopencl>=2024.1; extra == "all-gpu"
Provides-Extra: numpy
Requires-Dist: numpy>=1.26; extra == "numpy"
Provides-Extra: dev
Requires-Dist: pytest>=8.0; extra == "dev"
Requires-Dist: pytest-cov>=5.0; extra == "dev"
Requires-Dist: pytest-asyncio>=0.23; extra == "dev"
Requires-Dist: mypy>=1.10; extra == "dev"
Requires-Dist: ruff>=0.4; extra == "dev"
Requires-Dist: pre-commit>=3.7; extra == "dev"
Provides-Extra: docs
Requires-Dist: mkdocs>=1.6; extra == "docs"
Requires-Dist: mkdocs-material>=9.5; extra == "docs"
Requires-Dist: mkdocstrings[python]>=0.25; extra == "docs"
Dynamic: license-file

# PyAccelerate

**High-performance Python acceleration engine** — CPU, threads, virtual threads, multi-GPU and virtualization.

[![CI](https://github.com/GuilhermeP96/pyaccelerate/actions/workflows/ci.yml/badge.svg)](https://github.com/GuilhermeP96/pyaccelerate/actions)
[![Python 3.10+](https://img.shields.io/badge/python-3.10%2B-blue.svg)](https://www.python.org/)
[![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](LICENSE)

---

## Features

| Module | Description |
|---|---|
| **`cpu`** | CPU detection, topology, NUMA, affinity, ISA flags, dynamic worker recommendations |
| **`threads`** | Persistent virtual-thread pool, sliding-window executor, async bridge, process pool |
| **`gpu`** | Multi-vendor GPU detection (NVIDIA/CUDA, AMD/OpenCL, Intel oneAPI), ranking, multi-GPU dispatch |
| **`virt`** | Virtualization detection (Hyper-V, VT-x/AMD-V, KVM, WSL2, Docker and other containers) |
| **`memory`** | Memory pressure monitoring, automatic worker clamping, reusable buffer pool |
| **`profiler`** | `@timed`, `@profile_memory` decorators, `Timer` context manager, `Tracker` statistics |
| **`benchmark`** | Built-in micro-benchmarks (CPU, threads, memory bandwidth, GPU compute) |
| **`engine`** | Unified orchestrator — auto-detects everything and provides a single API |

## Quick Start

```bash
pip install pyaccelerate
```

```python
from pyaccelerate import Engine

engine = Engine()
print(engine.summary())

# Submit I/O-bound tasks to the virtual thread pool
future = engine.submit(my_io_func, arg1, arg2)

# Run many tasks with auto-tuned concurrency
engine.run_parallel(process_file, [(f,) for f in files])

# GPU dispatch (auto-fallback to CPU)
results = engine.gpu_dispatch(my_kernel, data_chunks)
```

## CLI

```bash
pyaccelerate info          # Full hardware report
pyaccelerate benchmark     # Run micro-benchmarks
pyaccelerate gpu           # GPU details
pyaccelerate cpu           # CPU details
pyaccelerate virt          # Virtualization info
pyaccelerate memory        # Memory stats
pyaccelerate status        # One-line status summary
```

## Modules in Depth

### Virtual Thread Pool

Inspired by Java's virtual threads: a persistent `ThreadPoolExecutor` sized for I/O-bound work (`cores × 3` workers, capped at 32). All I/O-bound tasks share this one pool instead of creating and destroying threads per operation.

```python
from pyaccelerate.threads import get_pool, run_parallel, submit

# Single task
fut = submit(download_file, url)

# Bounded concurrency (sliding window)
run_parallel(process, [(item,) for item in items], max_concurrent=8)
```
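
The sizing rule and the sliding window can be illustrated with the standard library alone. This is a minimal sketch under stated assumptions, not PyAccelerate's actual implementation: the names `io_pool_size`, `shared_pool` and `sliding_window` are hypothetical, and only the `cores × 3`, cap 32 rule comes from the description above.

```python
import os
import threading
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait

_POOL = None
_POOL_LOCK = threading.Lock()


def io_pool_size(cores=None, multiplier=3, cap=32):
    """Worker count for I/O-bound work: cores x multiplier, capped at `cap`."""
    cores = cores or os.cpu_count() or 1
    return max(1, min(cap, cores * multiplier))


def shared_pool():
    """Create the executor once and hand the same instance to every caller."""
    global _POOL
    with _POOL_LOCK:
        if _POOL is None:
            _POOL = ThreadPoolExecutor(
                max_workers=io_pool_size(), thread_name_prefix="vt"
            )
        return _POOL


def sliding_window(fn, arg_tuples, max_concurrent=8):
    """Run fn(*args) per tuple with at most `max_concurrent` tasks in flight.

    Results are returned in input order.
    """
    pool = shared_pool()
    results = {}
    pending = {}  # future -> index of the input it belongs to
    for index, args in enumerate(arg_tuples):
        if len(pending) >= max_concurrent:
            # Block until at least one slot frees up, then collect it.
            done, _ = wait(pending, return_when=FIRST_COMPLETED)
            for fut in done:
                results[pending.pop(fut)] = fut.result()
        pending[pool.submit(fn, *args)] = index
    # Drain whatever is still running.
    for fut, index in pending.items():
        results[index] = fut.result()
    return [results[i] for i in range(len(results))]
```

The window keeps memory bounded when the input is large: a new task is submitted only after an earlier one finishes, so the number of live futures does not grow with the input.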

### Multi-GPU Dispatch

Auto-detects GPUs across CUDA, OpenCL and Intel oneAPI. Distributes workloads with configurable strategies.

```python
from pyaccelerate.gpu import detect_all, dispatch

gpus = detect_all()
results = dispatch(my_kernel, data_chunks, strategy="score-weighted")
```
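
One plausible reading of a score-weighted strategy is that each device receives a share of the chunks proportional to its score. The sketch below illustrates that idea only; the function names and the largest-remainder rounding are assumptions, and the library's real `dispatch` may work differently.

```python
def score_weighted_counts(num_chunks, scores):
    """Split `num_chunks` across devices in proportion to `scores`.

    Uses largest-remainder rounding so the counts always sum to num_chunks.
    """
    total = sum(scores)
    if not scores or total <= 0 or num_chunks < 0:
        raise ValueError("need positive scores and a non-negative chunk count")
    exact = [num_chunks * s / total for s in scores]
    counts = [int(x) for x in exact]  # floor of each ideal share
    leftover = num_chunks - sum(counts)
    # Give the remaining chunks to the devices with the largest fractional part.
    by_remainder = sorted(
        range(len(scores)), key=lambda i: exact[i] - counts[i], reverse=True
    )
    for i in by_remainder[:leftover]:
        counts[i] += 1
    return counts


def split_chunks(chunks, scores):
    """Return one contiguous slice of `chunks` per device."""
    out, start = [], 0
    for count in score_weighted_counts(len(chunks), scores):
        out.append(chunks[start:start + count])
        start += count
    return out
```

For example, with hypothetical scores `[2, 1]` the first device would get roughly two thirds of the chunks.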

### Profiling

Zero-config decorators for timing and memory tracking:

```python
import logging

from pyaccelerate.profiler import timed, profile_memory, Tracker

@timed(level=logging.INFO)
def heavy_computation():
    ...

tracker = Tracker("db_queries")
for batch in batches:
    with tracker.measure():
        run_query(batch)
print(tracker.summary())
```
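
For readers curious how such helpers can be built, here is a minimal standard-library sketch of a timing decorator and a tracker with a `measure()` context manager. It mirrors the names used above for readability, but it is an illustration under assumptions, not the code in `pyaccelerate.profiler`; the logger name and the summary format are made up.

```python
import functools
import logging
import time
from contextlib import contextmanager

logger = logging.getLogger("profiler_sketch")  # hypothetical logger name


def timed(level=logging.DEBUG):
    """Decorator factory: log how long each call takes at the given level."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return fn(*args, **kwargs)
            finally:
                elapsed = time.perf_counter() - start
                logger.log(level, "%s took %.6fs", fn.__qualname__, elapsed)
        return wrapper
    return decorator


class Tracker:
    """Collect durations for repeated operations under one name."""

    def __init__(self, name):
        self.name = name
        self.samples = []

    @contextmanager
    def measure(self):
        start = time.perf_counter()
        try:
            yield
        finally:
            # Record the sample even if the measured block raised.
            self.samples.append(time.perf_counter() - start)

    def summary(self):
        n = len(self.samples)
        total = sum(self.samples)
        mean = total / n if n else 0.0
        return f"{self.name}: n={n} total={total:.6f}s mean={mean:.6f}s"
```

`time.perf_counter()` is used because it is monotonic and has the highest available resolution for short intervals.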

## Installation Options

```bash
# Core (CPU + threads + memory + virt)
pip install pyaccelerate

# With NVIDIA GPU support
pip install "pyaccelerate[cuda]"

# With OpenCL support (AMD/Intel/NVIDIA)
pip install "pyaccelerate[opencl]"

# With Intel oneAPI support
pip install "pyaccelerate[intel]"

# CUDA + OpenCL backends (the Intel oneAPI packages come from the intel extra)
pip install "pyaccelerate[all-gpu]"

# Development
pip install "pyaccelerate[dev]"
```
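
To check which optional backends from the extras above are importable in the current environment, a small probe like the following can help. It is a sketch, not part of PyAccelerate: `available_backends` and `pick_backend` are hypothetical helpers, and the import names `cupy`, `pyopencl` and `dpctl` are the modules installed by the `cuda`, `opencl` and `intel` extras. An importable package does not guarantee that a usable GPU is present.

```python
import importlib.util


def available_backends(candidates=("cupy", "pyopencl", "dpctl")):
    """Return the candidate modules that can be imported, in the given order."""
    return [name for name in candidates if importlib.util.find_spec(name) is not None]


def pick_backend(candidates=("cupy", "pyopencl", "dpctl")):
    """Return the first importable backend, or "cpu" when none is installed."""
    found = available_backends(candidates)
    return found[0] if found else "cpu"
```

`importlib.util.find_spec` only locates the module without importing it, so the probe stays cheap even when a GPU library is slow to initialise.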

## Docker

```bash
# CPU-only
docker build -t pyaccelerate .
docker run --rm pyaccelerate info

# With NVIDIA GPU
docker build -f Dockerfile.gpu -t pyaccelerate:gpu .
docker run --rm --gpus all pyaccelerate:gpu info

# Docker Compose
docker compose up pyaccelerate    # CPU
docker compose up gpu             # GPU
```

## Development

```bash
git clone https://github.com/GuilhermeP96/pyaccelerate.git
cd pyaccelerate
pip install -e ".[dev]"

# Run tests
pytest -v

# Lint + format
ruff check src/ tests/
ruff format src/ tests/

# Type check
mypy src/

# Build wheel
python -m build
```

## Architecture

```
pyaccelerate/
├── cpu.py          # CPU detection & topology
├── threads.py      # Virtual thread pool & executors
├── gpu/
│   ├── detector.py # Multi-vendor GPU enumeration
│   ├── cuda.py     # CUDA/CuPy helpers
│   ├── opencl.py   # PyOpenCL helpers
│   ├── intel.py    # Intel oneAPI helpers
│   └── dispatch.py # Multi-GPU load balancer
├── virt.py         # Virtualization detection
├── memory.py       # Memory monitoring & buffer pool
├── profiler.py     # Timing & profiling utilities
├── benchmark.py    # Built-in micro-benchmarks
├── engine.py       # Unified orchestrator
└── cli.py          # Command-line interface
```

## Roadmap

- [ ] npm package (Node.js bindings via pybind11/napi)
- [ ] gRPC server mode for multi-language integration
- [ ] Kubernetes operator for auto-scaling GPU workloads
- [ ] Prometheus metrics exporter
- [ ] Auto-tuning feedback loop (benchmark → config → re-tune)

## Origin

Evolved from the acceleration & virtual-thread systems built for:
- [adb-toolkit](https://github.com/GuilhermeP96/adb-toolkit) — multi-GPU acceleration, virtual thread pool
- [python-gpu-statistical-analysis](https://github.com/GuilhermeP96/python-gpu-statistical-analysis) — GPU compute foundations

## License

MIT — see [LICENSE](LICENSE).
