Metadata-Version: 2.2
Name: macfleet
Version: 2.0.0
Summary: Pool Apple Silicon Macs into a distributed ML training cluster
Author: MacFleet Contributors
License: MIT
Project-URL: Homepage, https://github.com/vikranthreddimasu/MacFleet
Project-URL: Documentation, https://github.com/vikranthreddimasu/MacFleet#readme
Project-URL: Repository, https://github.com/vikranthreddimasu/MacFleet
Project-URL: Issues, https://github.com/vikranthreddimasu/MacFleet/issues
Keywords: distributed,machine-learning,apple-silicon,mps,mlx,pytorch,training,gpu-pooling,data-parallel
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: Operating System :: MacOS
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.11
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: zeroconf>=0.131.0
Requires-Dist: rich>=13.0.0
Requires-Dist: click>=8.1.0
Requires-Dist: numpy>=1.24.0
Requires-Dist: msgpack>=1.0.0
Provides-Extra: torch
Requires-Dist: torch>=2.1.0; extra == "torch"
Provides-Extra: mlx
Requires-Dist: mlx>=0.5.0; extra == "mlx"
Provides-Extra: yaml
Requires-Dist: pyyaml>=6.0; extra == "yaml"
Provides-Extra: all
Requires-Dist: torch>=2.1.0; extra == "all"
Requires-Dist: mlx>=0.5.0; extra == "all"
Requires-Dist: pyyaml>=6.0; extra == "all"
Provides-Extra: dev
Requires-Dist: pytest>=7.0.0; extra == "dev"
Requires-Dist: pytest-asyncio>=0.23.0; extra == "dev"
Requires-Dist: ruff>=0.3.0; extra == "dev"
Requires-Dist: mypy>=1.8.0; extra == "dev"
Requires-Dist: pytest-cov>=4.1.0; extra == "dev"

# MacFleet

**Pool Apple Silicon Macs into a distributed ML training cluster.**

Zero-config discovery. N-node scaling. WiFi, Ethernet, or Thunderbolt.

```
   macfleet join              macfleet join              macfleet join
 ┌───────────────┐          ┌───────────────┐          ┌───────────────┐
 │  MacBook Pro  │◄────────►│  MacBook Air  │◄────────►│  Mac Studio   │
 │  M4 Pro       │  WiFi /  │  M4           │  WiFi /  │  M4 Ultra     │
 │  16 GPU cores │  ETH /   │  10 GPU cores │  ETH /   │  60 GPU cores │
 │  48 GB RAM    │  TB4     │  16 GB RAM    │  TB4     │  192 GB RAM   │
 │  weight: 0.35 │          │  weight: 0.15 │          │  weight: 0.50 │
 └───────────────┘          └───────────────┘          └───────────────┘
         ▲                          ▲                          ▲
         └──────────────────────────┴──────────────────────────┘
                        Ring AllReduce (gradient sync)
```

## Features

- **Zero-Config Pooling**: `pip install macfleet && macfleet join` — auto-discovers peers via mDNS/Bonjour
- **N-Node Scaling**: Ring AllReduce for 2+ nodes (not limited to pairs); see the sketch after this list
- **Any Network**: WiFi, Ethernet, and Thunderbolt with adaptive buffer tuning
- **Dual Engine**: PyTorch+MPS and Apple MLX — pluggable via Engine protocol
- **Heterogeneous Scheduling**: Weighted batch allocation based on GPU cores + thermal state
- **Gossip Heartbeat**: Peer-to-peer failure detection, automatic coordinator election
- **Adaptive Compression**: Bandwidth-aware TopK+FP16 (auto-selects by link type: WiFi=200x, Ethernet=20x, TB4=off); sketched below
- **Framework-Agnostic Core**: Communication layer uses numpy — never imports torch/mlx
- **Health Monitoring**: Thermal, memory, battery, loss trend — composite health score per node
- **Rich TUI Dashboard**: Real-time cluster topology, training progress, and warnings
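
The gradient sync in the cover diagram is the classic two-phase ring: a reduce-scatter pass that leaves each node owning one fully summed chunk, then an all-gather pass that circulates the finished chunks. A minimal single-process numpy sketch of the math; this is not MacFleet's `Collectives` API, just the algorithm behind it:

```python
import numpy as np

def ring_allreduce(node_grads: list[np.ndarray]) -> list[np.ndarray]:
    """Single-process simulation of ring allreduce over n 'nodes'."""
    n = len(node_grads)
    # Each node's gradient is split into n chunks of near-equal size.
    chunks = [np.array_split(g, n) for g in node_grads]

    # Reduce-scatter: in step s, node i sends chunk (i - s) % n to its
    # right neighbor, which adds the data into its own copy of that chunk.
    for s in range(n - 1):
        sends = [(i, (i - s) % n, chunks[i][(i - s) % n].copy()) for i in range(n)]
        for i, c, data in sends:
            chunks[(i + 1) % n][c] += data

    # Node i now owns the fully summed chunk (i + 1) % n.
    # All-gather: circulate the finished chunks for another n - 1 steps.
    for s in range(n - 1):
        sends = [(i, (i + 1 - s) % n, chunks[i][(i + 1 - s) % n].copy()) for i in range(n)]
        for i, c, data in sends:
            chunks[(i + 1) % n][c] = data

    return [np.concatenate(ch) for ch in chunks]

grads = [np.random.randn(1_000) for _ in range(3)]
reduced = ring_allreduce(grads)
assert all(np.allclose(r, sum(grads)) for r in reduced)  # every node agrees
```

Each of the 2(n-1) steps moves roughly 1/n of the gradient, so per-node traffic stays near 2x the gradient size no matter how many Macs join the ring.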

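The adaptive compression bullet stacks two standard tricks: keep only the largest-magnitude 1/ratio of the gradient entries (TopK) and ship the survivors as float16. A hypothetical numpy sketch of the idea; the function names here are illustrative, and the real pipeline with its link-type detection lives in MacFleet's compression layer:

```python
import numpy as np

def topk_fp16_compress(grad: np.ndarray, ratio: int):
    """Keep the top 1/ratio entries by magnitude, stored as fp16."""
    flat = grad.ravel()
    k = max(1, flat.size // ratio)
    idx = np.argpartition(np.abs(flat), -k)[-k:]  # indices of the top-k
    return idx.astype(np.uint32), flat[idx].astype(np.float16), grad.shape

def topk_fp16_decompress(idx, vals, shape):
    """Scatter survivors back into a dense array; dropped entries become 0."""
    flat = np.zeros(int(np.prod(shape)), dtype=np.float32)
    flat[idx] = vals.astype(np.float32)
    return flat.reshape(shape)

# A WiFi link would select the aggressive 200x ratio from the list above.
grad = np.random.randn(4096, 256).astype(np.float32)
restored = topk_fp16_decompress(*topk_fp16_compress(grad, ratio=200))
```
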
## Quick Start

```bash
pip install macfleet
```

### Join the pool

```bash
# On each Mac:
macfleet join
```

### Train a model (Python SDK)

```python
import macfleet

# PyTorch
with macfleet.Pool() as pool:
    pool.train(
        model=MyModel(),
        dataset=my_dataset,
        epochs=10,
        batch_size=128,
    )

# MLX (Apple native)
with macfleet.Pool(engine="mlx") as pool:
    pool.train(
        model=mlx_model,
        dataset=(X, y),
        epochs=10,
        loss_fn=my_loss_fn,
    )

# One-liner
macfleet.train(model=MyModel(), dataset=my_dataset, epochs=10)

# Decorator
@macfleet.distributed(engine="torch")
def my_training():
    ...
```
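
The `batch_size` passed to `pool.train()` is the global batch; with heterogeneous scheduling, each node receives a slice proportional to its weight (the numbers on the cover diagram). A toy sketch of such a proportional split; `split_batch` and its inputs are illustrative, not MacFleet's API:

```python
def split_batch(global_batch: int, weights: dict[str, float]) -> dict[str, int]:
    """Split a global batch across nodes in proportion to their weights."""
    total = sum(weights.values())
    sizes = {node: int(global_batch * w / total) for node, w in weights.items()}
    # Hand the rounding remainder to the heaviest node so sizes sum exactly.
    heaviest = max(weights, key=weights.get)
    sizes[heaviest] += global_batch - sum(sizes.values())
    return sizes

# Weights from the cover diagram: MacBook Pro / MacBook Air / Mac Studio.
print(split_batch(128, {"pro": 0.35, "air": 0.15, "studio": 0.50}))
# {'pro': 44, 'air': 19, 'studio': 65}
```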

### CLI commands

```bash
macfleet info       # Local hardware profile
macfleet status     # Discover pool members on the network
macfleet diagnose   # System health check (MPS, thermal, network)
macfleet train      # Demo training on synthetic data
macfleet bench      # Benchmark compute, network, and allreduce
```

## Architecture

```
┌─────────────────────────────────────────────────────────────────┐
│  CLI: macfleet join | status | train | bench | info | diagnose  │
│  SDK: macfleet.Pool() | macfleet.train()                        │
├─────────────────────────────────────────────────────────────────┤
│  Training: DataParallel | TrainingLoop | WeightedSampler        │
├─────────────────────────────────────────────────────────────────┤
│  Engines: TorchEngine (PyTorch+MPS) | MLXEngine (Apple MLX)     │
├─────────────────────────────────────────────────────────────────┤
│  Compression: TopK + FP16 + Adaptive pipeline                   │
├─────────────────────────────────────────────────────────────────┤
│  Pool: Agent | Registry | Discovery | Scheduler | Heartbeat     │
├─────────────────────────────────────────────────────────────────┤
│  Communication: PeerTransport | WireProtocol | Collectives      │
├─────────────────────────────────────────────────────────────────┤
│  Monitoring: Thermal | Health | Throughput | Dashboard          │
└─────────────────────────────────────────────────────────────────┘
```

## Development

```bash
git clone https://github.com/vikranthreddimasu/MacFleet.git
cd MacFleet
pip install -e ".[dev]"
make test          # 268 tests
make bench         # compute + network + allreduce benchmarks
make lint          # ruff + mypy
```

## Requirements

- Python 3.11+
- macOS with Apple Silicon (M1/M2/M3/M4)
- PyTorch 2.1+ (optional, for the torch engine)
- MLX 0.5+ (optional, for the mlx engine)

## License

MIT
