Metadata-Version: 2.4
Name: edgecompiler
Version: 0.1.0
Summary: Native edge AI compiler and deployment toolchain for Coral USB, Apple Silicon, and sloth/unsloth workflows
Project-URL: Homepage, https://github.com/rotsl/edgecompiler
Project-URL: Documentation, https://rotsl.github.io/edgecompiler/
Project-URL: Repository, https://github.com/rotsl/edgecompiler
Project-URL: Issues, https://github.com/rotsl/edgecompiler/issues
Project-URL: Changelog, https://github.com/rotsl/edgecompiler/releases
Project-URL: Benchmarks, https://rotsl.github.io/edgecompiler/benchmarks/
Author: Rohan R
License-Expression: Apache-2.0
License-File: LICENSE
Keywords: apple-silicon,compiler,coral,coreml,edge-ai,edgetpu,metal,mps,quantization,slm,sloth-integration,tflite,unsloth
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Operating System :: MacOS
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Software Development :: Compilers
Requires-Python: >=3.10
Requires-Dist: flatbuffers>=23.5
Requires-Dist: numpy>=1.24
Requires-Dist: tflite>=2.18.0
Provides-Extra: all
Requires-Dist: bitsandbytes>=0.45.0; extra == 'all'
Requires-Dist: coremltools>=7.0; extra == 'all'
Requires-Dist: datasets>=2.14; extra == 'all'
Requires-Dist: onnx-tf>=1.10; extra == 'all'
Requires-Dist: onnx>=1.14; extra == 'all'
Requires-Dist: onnxsim>=0.4; extra == 'all'
Requires-Dist: peft>=0.18.0; extra == 'all'
Requires-Dist: sentencepiece>=0.2.0; extra == 'all'
Requires-Dist: tensorflow>=2.16; extra == 'all'
Requires-Dist: tf-keras>=2.16; extra == 'all'
Requires-Dist: torch>=2.0; extra == 'all'
Requires-Dist: torchvision>=0.15; extra == 'all'
Requires-Dist: transformers>=4.36; extra == 'all'
Requires-Dist: trl>=0.18.0; extra == 'all'
Requires-Dist: unsloth>=2024.1; extra == 'all'
Provides-Extra: coral
Requires-Dist: tensorflow>=2.16; extra == 'coral'
Provides-Extra: coreml
Requires-Dist: coremltools>=7.0; extra == 'coreml'
Provides-Extra: dev
Requires-Dist: flatbuffers>=23.5; extra == 'dev'
Requires-Dist: mypy>=1.8; extra == 'dev'
Requires-Dist: pytest-cov>=4.0; extra == 'dev'
Requires-Dist: pytest-timeout>=2.1; extra == 'dev'
Requires-Dist: pytest>=7.0; extra == 'dev'
Requires-Dist: ruff>=0.4; extra == 'dev'
Provides-Extra: onnx
Requires-Dist: onnx>=1.14; extra == 'onnx'
Requires-Dist: onnxsim>=0.4; extra == 'onnx'
Provides-Extra: pytorch
Requires-Dist: torch>=2.0; extra == 'pytorch'
Requires-Dist: torchvision>=0.15; extra == 'pytorch'
Provides-Extra: sloth
Requires-Dist: datasets>=2.14; extra == 'sloth'
Requires-Dist: sentencepiece>=0.2.0; extra == 'sloth'
Requires-Dist: transformers>=4.36; extra == 'sloth'
Provides-Extra: sloth-onnx
Requires-Dist: onnx-tf>=1.10; extra == 'sloth-onnx'
Requires-Dist: onnx>=1.14; extra == 'sloth-onnx'
Requires-Dist: onnxsim>=0.4; extra == 'sloth-onnx'
Provides-Extra: sloth-unsloth
Requires-Dist: bitsandbytes>=0.45.0; extra == 'sloth-unsloth'
Requires-Dist: peft>=0.18.0; extra == 'sloth-unsloth'
Requires-Dist: trl>=0.18.0; extra == 'sloth-unsloth'
Requires-Dist: unsloth>=2024.1; extra == 'sloth-unsloth'
Provides-Extra: tensorflow
Requires-Dist: tensorflow>=2.16; extra == 'tensorflow'
Requires-Dist: tf-keras>=2.16; extra == 'tensorflow'
Description-Content-Type: text/markdown

# edgecompiler <img src="logo.jpg" align="right" height="139" alt="edgecompiler logo" />

[![PyPI version](https://img.shields.io/pypi/v/edgecompiler.svg)](https://pypi.org/project/edgecompiler/)
[![CI](https://github.com/rotsl/edgecompiler/actions/workflows/ci.yml/badge.svg?branch=main)](https://github.com/rotsl/edgecompiler/actions/workflows/ci.yml)
[![Python 3.10+](https://img.shields.io/badge/python-3.10%2B-blue.svg)](https://www.python.org/downloads/)
[![License: Apache 2.0](https://img.shields.io/badge/License-Apache%202.0-green.svg)](https://opensource.org/licenses/Apache-2.0)
[![macOS ARM64](https://img.shields.io/badge/platform-macOS%20ARM64-orange.svg)](https://support.apple.com/en-us/)

---
# EdgeCompiler – Universal Model Compiler for Apple Silicon & Google Coral USB

EdgeCompiler is a **native Apple Silicon compiler toolchain** that liberates the
Google Coral USB Accelerator from its x86‑64 Debian shackles.  It replaces the
official `edgetpu_compiler` and **runs entirely on a MacBook M1/M2/M3 Pro**,
accepting models from PyTorch (`.pt`), ONNX, TensorFlow, and TensorFlow Lite, then
compiling them into device‑ready `*_edgetpu.tflite` files **without Docker, virtual
machines, or Rosetta emulation**.

## Why This Exists

Google’s Coral USB is a tiny, low‑power inferencing beast, but its official
compiler has three deal‑breakers:

1.  **Locked to Debian Linux + x86‑64** – Mac users must spin up a VM or a
    separate Linux machine just to compile a model.
2.  **TFLite‑only input** – PyTorch, ONNX, and Keras models require clumsy
    conversions before the Coral compiler can even look at them.
3.  **Zero Apple Silicon support** – Even the runtime is tricky to install
    natively, and the compiler simply doesn’t exist for `arm64`.

EdgeCompiler fixes all of this.  It runs **natively on Apple Silicon**, gives you
a single‑command pipeline from **any major framework** to a compiled Edge TPU
model, and even generates **Metal‑optimised GPU code** when you want to run
inference directly on the M1 Pro chip instead.

With EdgeCompiler, you can prototype on your Mac, quantise and compile for the
Coral USB, and then **offload inference to the Edge TPU**, freeing your CPU and
GPU (ops the Edge TPU does not support fall back to the CPU).  The entire
toolchain is modular, test‑driven, and designed to be easily extended to other
accelerators (Hailo‑8, Intel VPU, etc.).

## Optional Unsloth Integration

Because the best Edge TPU models are often tiny, efficient architectures that
have been **fine‑tuned for a specific task**, EdgeCompiler ships with an optional
integration for **[Unsloth](https://github.com/unslothai/unsloth)**.  Unsloth is a
lightning‑fast fine‑tuning engine that makes training quantisation‑aware models
2–5× faster while using less memory.

With `edgecompiler + unsloth`, you can:

- Fine‑tune a MobileNet, EfficientNet, or even a small transformer **directly on
  your MacBook** with QAT (Quantisation‑Aware Training).
- Export the trained model through EdgeCompiler’s pipeline and deploy it to the
  Coral USB in **one command**.
- Iterate on your model without ever leaving the Apple Silicon ecosystem.

The integration is **optional** – EdgeCompiler works perfectly with pre‑trained
models – but it turns your Mac into an end‑to‑end edge‑AI development
workstation.

## Features

| Category | Details |
| --- | --- |
| **Model Formats** | PyTorch `.pt` / `.pth`, TFLite `.tflite`, ONNX `.onnx`, TF SavedModel, Keras `.h5` / `.keras` |
| **Quantisation** | Post-Training Quantisation (PTQ), Quantisation-Aware Training (QAT), Dynamic Range |
| **Backends** | Google Coral USB (Edge TPU), Apple Silicon GPU (Metal / MPS / Neural Engine) |
| **Platform** | macOS ARM64 (M1/M2/M3/M4) — no Docker, no Rosetta 2, no x86-64 emulation |
| **CLI** | Single `edgecompile` command with sensible defaults |
| **Python API** | Programmatic access via `edgecompiler.compile()` |
| **SLM Pipeline** | Integrated `sloth_integration` package for unsloth fine-tuning to Coral deployment |
| **Testing** | Full test suite with hardware-specific markers |

---

## Quick Start

### Install

```bash
# Basic install (TFLite support only)
pip install edgecompiler

# With PyTorch support
pip install "edgecompiler[pytorch]"

# With all frontends
pip install "edgecompiler[all]"

# Development install
pip install -e ".[dev]"
```

### Compile a Model

```bash
# PyTorch → Coral USB
edgecompile model.pt --target coral --output model_coral.tflite

# TFLite → Apple Silicon GPU
edgecompile model.tflite --target metal --output model.mlpackage

# ONNX → Coral USB with INT8 PTQ
edgecompile model.onnx --target coral --quantize ptq --output model_coral.tflite

# SavedModel → Metal with calibration data
edgecompile saved_model/ --target metal --quantize ptq --calibration-data calib.npy
```

---

## Installation

### pip (recommended)

```bash
pip install edgecompiler
```

### Optional dependencies

Install only the frontends you need:

```bash
pip install "edgecompiler[pytorch]"     # PyTorch + TorchScript support
pip install "edgecompiler[tensorflow]"  # TF SavedModel / Keras support
pip install "edgecompiler[onnx]"        # ONNX model support
pip install "edgecompiler[coreml]"      # Core ML / Metal backend support
pip install "edgecompiler[coral]"       # Coral runtime extras (TensorFlow Lite interpreter path)
pip install "edgecompiler[all]"         # Everything
```

For Coral USB acceleration, also install `libedgetpu` using
`./scripts/install_coral_runtime.sh`.

### From source

```bash
git clone https://github.com/rotsl/edgecompiler.git
cd edgecompiler
make install   # pip install -e ".[dev]"
```

### macOS M1 / M2 / M3 specific notes

`edgecompiler` runs **natively on ARM64** — no Docker, no Rosetta 2, and no x86-64
Python environment required. However, there are a few platform-specific considerations:

1. **Python**: Use an ARM64-native Python build (Homebrew or `python.org` installer).
   Verify with:

   ```bash
   python3 -c "import platform; print(platform.machine())"
   # Expected: arm64
   ```

2. **Coral USB runtime**: The `libedgetpu` library does not ship official ARM64 macOS
   builds. Use our helper script to install a compatible build:

   ```bash
   ./scripts/install_coral_runtime.sh
   ```

3. **Core ML / Metal**: `coremltools` installs natively via pip on ARM64 macOS.
   No additional setup is needed for the Metal backend.

4. **TensorFlow**: As of TF 2.16+, native ARM64 wheels are available on PyPI for
   macOS. If you encounter issues, use `tensorflow-macos` as a fallback.
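
The checks in the notes above can be gathered into one script. This is a minimal sketch using only the standard library; `check_environment` is a hypothetical helper, not part of `edgecompiler`, and the module names it probes are simply the optional frontends listed earlier.

```python
import importlib.util
import platform


def check_environment(optional_modules=("torch", "tensorflow", "onnx", "coremltools")):
    """Describe whether this interpreter looks like a native ARM64 macOS setup."""
    report = {
        "machine": platform.machine(),        # "arm64" on a native Apple Silicon Python
        "system": platform.system(),          # "Darwin" on macOS
        "python": platform.python_version(),
    }
    report["native_arm64_macos"] = (
        report["system"] == "Darwin" and report["machine"] == "arm64"
    )
    # find_spec only locates a top-level module; it does not import it.
    report["modules"] = {
        name: importlib.util.find_spec(name) is not None for name in optional_modules
    }
    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

If `machine` reports `x86_64` on an Apple Silicon Mac, the interpreter is most likely running under Rosetta 2 and should be replaced with an ARM64 build.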

---

## Compiling a PyTorch Model for Coral USB

```python
from edgecompiler import compile

result = compile(
    "mobilenet_v2.pt",
    target="coral",
    quantize="ptq",
    calibration_data="calibration_images.npy",  # N x C x H x W, float32
    output="mobilenet_v2_coral.tflite",
)

print(result)  # CompileResult(output_path=..., ops_on_target=147, ops_fallback=0)
```

**CLI equivalent:**

```bash
edgecompile mobilenet_v2.pt \
    --target coral \
    --quantize ptq \
    --calibration-data calibration_images.npy \
    --output mobilenet_v2_coral.tflite
```

### What happens under the hood (TFLite path)

1. The PyTorch model is traced via `torch.jit.trace` (or scripted if trace fails).
2. The TorchScript graph is converted to our unified IR.
3. INT8 post-training quantisation is applied using the provided calibration data.
4. The quantised IR is lowered to a TFLite FlatBuffer with Edge TPU custom ops.
5. The output `.tflite` file can be run directly with the Coral USB runtime.
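
To make step 3 more concrete, here is a minimal sketch of per-tensor affine INT8 quantisation in plain Python. It illustrates the general technique only: the function names are hypothetical, and EdgeCompiler's own quantiser may differ (for example by using per-channel scales or a different calibration statistic).

```python
def affine_int8_params(calibration_values):
    """Derive (scale, zero_point) mapping the observed float range onto [-128, 127]."""
    # The range is widened to include 0.0 so that zero is exactly representable.
    lo = min(0.0, min(calibration_values))
    hi = max(0.0, max(calibration_values))
    if hi == lo:
        return 1.0, 0
    scale = (hi - lo) / 255.0
    zero_point = round(-128 - lo / scale)
    return scale, max(-128, min(127, zero_point))


def quantize(values, scale, zero_point):
    """Map floats to int8 codes, clamping anything outside the calibrated range."""
    return [max(-128, min(127, round(v / scale) + zero_point)) for v in values]


def dequantize(codes, scale, zero_point):
    """Recover approximate floats from int8 codes."""
    return [(q - zero_point) * scale for q in codes]


calibration = [-1.0, 0.0, 0.5, 3.0]
scale, zero_point = affine_int8_params(calibration)
codes = quantize(calibration, scale, zero_point)
print(zero_point, codes)  # -64 [-128, -64, -32, 127]
```

The calibration data only has to cover the range of values the tensor sees at inference time; values outside that range are clamped, which is why representative calibration inputs matter.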

---

## Compiling a TFLite Model for Apple Silicon GPU

```python
from edgecompiler import compile

result = compile(
    "mobilenet_v2.tflite",
    target="metal",
    quantize="ptq",
    output="mobilenet_v2_ml.mlpackage",
)

print(result)  # CompileResult(output_path=..., backend="metal")
```

**CLI equivalent:**

```bash
edgecompile mobilenet_v2.tflite \
    --target metal \
    --quantize ptq \
    --output mobilenet_v2_ml.mlpackage
```

### What happens under the hood

1. The TFLite model is parsed into our unified IR.
2. INT8 quantisation is applied (Core ML supports both INT8 and FP16).
3. The IR is lowered to a Core ML model via `coremltools`.
4. Compute unit selection (Neural Engine > GPU > CPU) is configured for maximum throughput.
5. The output `.mlpackage` can be loaded with Core ML directly or via the `edgecompiler` runtime.

---

## Running Examples

The `examples/` directory contains ready-to-run scripts for common workflows:

```bash
# PyTorch model → Coral + Metal compilation pipeline
python examples/pytorch_mobilenet.py

# TFLite model compilation walkthrough
python examples/tflite_mobilenet.py

# ONNX ResNet compilation
python examples/onnx_resnet.py

# Coral USB benchmark (100 inferences, latency stats)
python examples/coral_usb_benchmark.py --num-runs 100
```

See [examples documentation](./docs/examples.md) for detailed walkthroughs.

### Sloth Integration Examples

The repository also includes an integrated Sloth pipeline under `sloth-integration/`
for unsloth fine-tuning and Coral deployment. This integration is based on
[Unsloth](https://github.com/unslothai/unsloth):

```bash
# Run sloth integration tests
pytest sloth-integration/tests -v

# Run sloth benchmark example
python sloth-integration/examples/benchmark_coral.py \
    --model sloth-integration/test_models/synthetic_text_classifier.tflite
```

See `sloth-integration/docs/benchmarks_sloth.md` for the latest measured values
and `docs/sloth_integration.md` for setup and workflow guidance.

---

## Coral USB Quick Start

Run inference on a Google Coral USB Accelerator connected to your MacBook M1 Pro in just a few steps:

```bash
# 1. Install edgecompiler with Coral support
pip install "edgecompiler[coral]"

# 2. Install the Edge TPU runtime (see docs/coral_macos_setup.md for details)
./scripts/install_coral_runtime.sh

# 3. Plug in your Coral USB Accelerator

# 4. Compile and run inference in one command
edgecompiler coral-usb model.tflite --image parrot.jpg --labels imagenet_labels.txt
```

**Python API:**

```python
from edgecompiler.runtime.coral_usb import CoralUSBRuntime
import numpy as np

with CoralUSBRuntime() as runtime:
    devices = runtime.detect_devices()
    if devices:
        runtime.load_model("model_edgetpu.tflite")
        result = runtime.infer(np.zeros((1, 224, 224, 3), dtype=np.uint8))
        for cls_id, score in result.top_classes:
            print(f"  Class {cls_id}: {score:.3f}")
```

For detailed setup instructions, see [docs/coral_macos_setup.md](docs/coral_macos_setup.md) and [instructions.md](instructions.md).

---

## Running Tests

```bash
# One command: auto-detect Coral and separate simulation/hardware suites
edge-test

# Make target wrapper for auto mode
make test-auto

# Run all unit tests (no hardware required)
make test

# Run with pytest directly
pytest tests/ -v

# Run hardware-specific tests (requires Coral USB or Apple Silicon)
make test-hardware

# Run only Coral tests
pytest tests/ -v -m coral

# Run only Metal tests
pytest tests/ -v -m metal
```

Useful options:

```bash
# Force simulation-only run
edge-test --mode simulation

# Force hardware-only run
edge-test --mode hardware

# Include slow tests
edge-test --include-slow
```

### Hardware test models

Download Coral hardware test assets with:

```bash
bash scripts/download_models.sh --output-dir tests/hardware/test_models
```

The downloader is compatible with macOS default Bash 3.x.

### Benchmark report

Current measured results (with and without hardware-focused runs) are tracked in:

- `benchmarks.md`

---

## Architecture Overview

```text
                           ┌──────────────────┐
                           │   edgecompiler   │
                           │       CLI        │
                           └────────┬─────────┘
                                    │
                           ┌────────▼─────────┐
                           │  Python API      │
                           │  compile()       │
                           └────────┬─────────┘
                                    │
        ┌───────────┬──────────────┼──────────────┬────────────┐
        │           │              │              │            │
  ┌─────▼────┐ ┌────▼─────┐ ┌──────▼─────┐ ┌──────▼────┐ ┌─────▼──────┐
  │ PyTorch  │ │  TFLite  │ │   ONNX     │ │ TF Saved  │ │   Keras    │
  │ Frontend │ │ Frontend │ │ Frontend   │ │  Model    │ │ Frontend   │
  └─────┬────┘ └────┬─────┘ └──────┬─────┘ └──────┬────┘ └─────┬──────┘
        │           │              │              │            │
        └───────────┴──────────────┼──────────────┴────────────┘
                                   │
                           ┌───────▼───────────┐
                           │   Unified IR      │
                           │  (Graph + Tensors)│
                           └────────┬──────────┘
                                    │
                           ┌────────▼──────────┐
                           │  Optimisation     │
                           │  Passes           │
                           │  ├─ Constant fold │
                           │  ├─ Op fusion     │
                           │  ├─ Dead code elim│
                           │  └─ Layout xform  │
                           └────────┬──────────┘
                                    │
                           ┌────────▼─────────┐
                           │  Quantisation    │
                           │  Pipeline        │
                           │  ├─ PTQ          │
                           │  ├─ QAT          │
                           │  └─ Dynamic range│
                           └────────┬─────────┘
                                    │
                    ┌───────────────┼───────────────┐
                    │                               │
           ┌────────▼────────┐             ┌────────▼────────┐
           │  Coral Backend  │             │  Metal Backend  │
           │  (Edge TPU)     │             │  (Apple Silicon)│
           │                 │             │                 │
           │  ├─ TFLite FB   │             │  ├─ Core ML     │
           │  ├─ Custom ops  │             │  ├─ MPSGraph    │
           │  └─ Partitioning│             │  └─ ANE routing │
           └────────┬────────┘             └────────┬────────┘
                    │                               │
           ┌────────▼────────┐             ┌────────▼────────┐
           │  .tflite        │             │  .mlpackage     │
           │  (Edge TPU)     │             │  (Core ML)      │
           └─────────────────┘             └─────────────────┘

                          Sloth Integration (in-repo)

     unsloth / HF checkpoints -> sloth_integration adapter/converter
                           -> edgecompiler compile()
                           -> SlothCoralRuntime
                           -> Coral USB inference
```

`sloth_integration` is packaged from `sloth-integration/src/sloth_integration`
and reuses the same frontend -> IR -> backend pipeline, adding text-specific
adapter/converter/runtime layers for SLM deployment workflows.

See [architecture documentation](./docs/architecture.md) for the full design.
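
As an illustration of what an optimisation pass over a graph IR can look like, the sketch below implements dead code elimination on a toy graph. The `Node` and `Graph` classes are hypothetical stand-ins invented for this example; they are not EdgeCompiler's actual IR types.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One operation in the toy graph: reads `inputs`, writes `outputs` (tensor names)."""
    op: str
    inputs: list
    outputs: list


@dataclass
class Graph:
    nodes: list = field(default_factory=list)    # assumed to be in topological order
    outputs: list = field(default_factory=list)  # tensor names the caller wants


def eliminate_dead_code(graph):
    """Return a new Graph keeping only nodes that contribute to graph.outputs."""
    live = set(graph.outputs)
    kept = []
    # Walk backwards so a consumer marks its inputs live before its producer is visited.
    for node in reversed(graph.nodes):
        if any(name in live for name in node.outputs):
            live.update(node.inputs)
            kept.append(node)
    kept.reverse()
    return Graph(nodes=kept, outputs=list(graph.outputs))


g = Graph(
    nodes=[
        Node("Conv2D", ["image", "w0"], ["conv"]),
        Node("ReLU", ["conv"], ["act"]),
        Node("Sigmoid", ["conv"], ["unused"]),  # result is never consumed
        Node("Softmax", ["act"], ["probs"]),
    ],
    outputs=["probs"],
)
print([n.op for n in eliminate_dead_code(g).nodes])  # ['Conv2D', 'ReLU', 'Softmax']
```

Returning a new graph rather than mutating the input keeps each pass easy to test in isolation.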

---

## Supported Operations

### Edge TPU (Coral USB)

| Operation | INT8 | Notes |
| --- | --- | --- |
| Conv2D | ✅ | Depthwise + standard |
| DepthwiseConv2D | ✅ | |
| FullyConnected | ✅ | |
| MaxPool2D | ✅ | |
| AveragePool2D | ✅ | |
| ReLU | ✅ | ReLU, ReLU6, ReLUN1To1 |
| Softmax | ✅ | |
| Sigmoid | ✅ | |
| Tanh | ✅ | |
| Add | ✅ | Element-wise |
| Sub | ✅ | Element-wise |
| Mul | ✅ | Element-wise |
| Concatenation | ✅ | |
| Reshape | ✅ | |
| Transpose | ✅ | |
| Pad | ✅ | |
| ReduceMin / ReduceMax | ✅ | |
| Mean | ✅ | |
| ExpandDims | ✅ | |
| Squeeze | ✅ | |
| Split | ✅ | |
| Slice | ✅ | |
| ResizeBilinear | ✅ | |
| ResizeNearestNeighbor | ✅ | |
| Logistic | ✅ | |
| L2Normalization | ✅ | |
| BatchToSpaceND | ✅ | |
| SpaceToBatchND | ✅ | |
| Gather | ⚠️ | Fallback on some dims |
| StridedSlice | ⚠️ | Limited mask support |
| LSTM | ❌ | Falls back to CPU |
| Einsum | ❌ | Not supported |
| ScatterND | ❌ | Not supported |
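
Before compiling, it can help to check a model's op list against this table. The sketch below is a hypothetical pre-flight helper, not an EdgeCompiler API: the support levels are transcribed from a subset of the table above, and ops missing from the mapping are treated as unsupported.

```python
# Support levels transcribed from the Edge TPU table (subset shown for brevity).
EDGE_TPU_SUPPORT = {
    "Conv2D": "full", "DepthwiseConv2D": "full", "FullyConnected": "full",
    "MaxPool2D": "full", "AveragePool2D": "full", "ReLU": "full",
    "Softmax": "full", "Add": "full", "Reshape": "full",
    "Gather": "partial", "StridedSlice": "partial",
    "LSTM": "none", "Einsum": "none", "ScatterND": "none",
}


def summarise_ops(op_names):
    """Count ops expected on the Edge TPU and list the ones that need attention."""
    summary = {"on_target": 0, "needs_review": [], "unsupported": []}
    for name in op_names:
        level = EDGE_TPU_SUPPORT.get(name, "none")  # unknown ops count as unsupported
        if level == "full":
            summary["on_target"] += 1
        elif level == "partial":
            summary["needs_review"].append(name)
        else:
            summary["unsupported"].append(name)
    return summary


model_ops = ["Conv2D", "ReLU", "DepthwiseConv2D", "Gather", "LSTM", "Softmax"]
print(summarise_ops(model_ops))
# {'on_target': 4, 'needs_review': ['Gather'], 'unsupported': ['LSTM']}
```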

### Apple Silicon (Metal / Neural Engine)

| Operation | INT8 | FP16 | Notes |
| --- | --- | --- | --- |
| Conv2D | ✅ | ✅ | ANE preferred |
| DepthwiseConv2D | ✅ | ✅ | |
| FullyConnected | ✅ | ✅ | |
| MaxPool2D | ✅ | ✅ | |
| AveragePool2D | ✅ | ✅ | |
| ReLU | ✅ | ✅ | |
| Softmax | ✅ | ✅ | |
| Sigmoid | ✅ | ✅ | |
| Tanh | ✅ | ✅ | |
| Add / Sub / Mul | ✅ | ✅ | |
| Concatenation | ✅ | ✅ | |
| Reshape | ✅ | ✅ | |
| Transpose | ✅ | ✅ | |
| BatchNorm | ✅ | ✅ | Fused into conv |
| LayerNorm | ❌ | ✅ | FP16 only on ANE |
| LSTM / GRU | ❌ | ✅ | FP16 on GPU |
| Attention | ❌ | ✅ | FP16, GPU preferred |
| Einsum | ❌ | ✅ | GPU only |
| ScatterND | ❌ | ✅ | GPU only |
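
The table implies a simple rule of thumb when picking a precision for the Metal target: a model containing any FP16-only op cannot be fully INT8. A minimal sketch of that rule follows; `choose_precision` is a hypothetical helper, and the op set is copied from the rows marked ❌ for INT8 above.

```python
# Ops the table lists as FP16-only on Apple Silicon.
FP16_ONLY_OPS = {"LayerNorm", "LSTM", "GRU", "Attention", "Einsum", "ScatterND"}


def choose_precision(op_names):
    """Return ("int8", []) when every op supports INT8, else ("fp16", blocking_ops)."""
    blockers = sorted(set(op_names) & FP16_ONLY_OPS)
    if blockers:
        return "fp16", blockers
    return "int8", []


print(choose_precision(["Conv2D", "ReLU", "Softmax"]))      # ('int8', [])
print(choose_precision(["Conv2D", "Attention", "LayerNorm"]))  # ('fp16', ['Attention', 'LayerNorm'])
```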

---

## Contributing

We welcome contributions! Please follow these guidelines:

### Getting Started

1. Fork the repository
2. Clone your fork: `git clone https://github.com/your-username/edgecompiler.git`
3. Install in development mode: `make install`
4. Create a feature branch: `git checkout -b feature/my-feature`

### Development Workflow

```bash
make lint      # Check code style
make format    # Auto-format code
make test      # Run core test suite
make test-auto # Auto split simulation/hardware where possible
make test-all  # Root + sloth integration tests
```

You can also run the same paths used in CI:

```bash
python -m edgecompiler.test_runner --mode simulation --path tests/unit --path tests/integration
pytest -q sloth-integration/tests
```

### Code Areas

- Frontends: `src/edgecompiler/frontend/`
- IR and passes: `src/edgecompiler/ir/`
- Quantisation: `src/edgecompiler/quantisation/`
- Backends: `src/edgecompiler/backend/`
- Runtime: `src/edgecompiler/runtime/`
- Sloth integration: `sloth-integration/src/sloth_integration/`
- Examples: `examples/` and `sloth-integration/examples/`

### Adding a New Frontend

1. Create `src/edgecompiler/frontend/my_frontend.py`
2. Expose a converter function in `src/edgecompiler/frontend/__init__.py`
3. Convert the model into the unified IR graph and tensor model
4. Add targeted unit tests in `tests/unit/`

See [architecture documentation](./docs/architecture.md#extension-points) for details.
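
One piece a new frontend typically has to hook into is the mapping from input file to frontend. The sketch below shows one way such dispatch could work, assuming selection by file suffix; the mapping and the `pick_frontend` function are illustrative and do not reflect the actual contents of `src/edgecompiler/frontend/__init__.py`.

```python
from pathlib import Path

# Suffixes taken from the Model Formats row of the Features table.
# The labels on the right are illustrative, not real module names.
FRONTEND_BY_SUFFIX = {
    ".pt": "pytorch", ".pth": "pytorch",
    ".tflite": "tflite",
    ".onnx": "onnx",
    ".h5": "keras", ".keras": "keras",
}


def pick_frontend(model_path):
    """Choose a frontend label for a model file or SavedModel directory."""
    path = Path(model_path)
    suffix = path.suffix.lower()
    if suffix in FRONTEND_BY_SUFFIX:
        return FRONTEND_BY_SUFFIX[suffix]
    # A TF SavedModel is a directory containing saved_model.pb, not a single file.
    if (path / "saved_model.pb").is_file():
        return "saved_model"
    raise ValueError(f"No frontend registered for {model_path!r}")


print(pick_frontend("mobilenet_v2.pt"))  # pytorch
```

Under this scheme, registering a new frontend amounts to adding its suffix to the mapping and pointing it at the new converter.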

### Adding a New Backend

1. Create `src/edgecompiler/backend/my_backend.py`
2. Expose compile helpers in `src/edgecompiler/backend/__init__.py`
3. Keep unsupported-op behavior explicit (clear fallback/error messages)
4. Add tests in `tests/unit/` and integration checks in `tests/integration/` as needed

### Pull Request Checklist

- [ ] Code passes `make lint`
- [ ] Code is formatted with `make format`
- [ ] Unit tests pass with `make test`
- [ ] Integration paths are validated (`make test-all` for cross-package changes)
- [ ] New features include documentation
- [ ] Breaking changes are documented in the PR description

### Reporting Issues

Please use [GitHub Issues](https://github.com/rotsl/edgecompiler/issues) and include:

- macOS version and hardware (e.g., macOS 14.2, M1 Pro)
- Python version (`python3 --version`)
- `edgecompiler` version (`edgecompile --version`)
- Full error output with `--verbose` flag

---

## License

Licensed under the [Apache License 2.0](./LICENSE).

```text
Copyright 2026 Rohan R

```

> Note: The sloth-integration in this repo is based on the **[unsloth](https://github.com/unslothai/unsloth)** repository. We would like to thank the unslothai team.
