Metadata-Version: 2.1
Name: ProFed
Version: 0.7.6
Summary: A benchmark for proximity-based non-IID Federated Learning
Home-page: https://github.com/davidedomini/ProFed
License: MIT
Author: Davide Domini
Author-email: davide.domini@unibo.it
Requires-Python: >=3.12,<4.0
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: System :: Benchmark
Requires-Dist: datasets (==3.6.0)
Requires-Dist: fsspec (==2025.3.0)
Requires-Dist: matplotlib (>=3.10.8,<4.0.0)
Requires-Dist: numpy (>=2.2.2,<3.0.0)
Requires-Dist: tensorflow-datasets (==4.9.9)
Requires-Dist: torch (>=2.7.0,<3.0.0)
Requires-Dist: torchvision (>=0.22.0,<0.23.0)
Project-URL: Changelog, https://github.com/davidedomini/ProFed/blob/main/CHANGELOG.md
Project-URL: Repository, https://github.com/davidedomini/ProFed
Description-Content-Type: text/markdown

# ProFed: A Benchmark for Proximity-based Federated Learning

**🔗 ProFed: A Benchmark for Proximity‑based Non‑IID Federated Learning**

ProFed is a framework for evaluating federated learning (FL) systems under *realistic*, geographically clustered, non‑IID data scenarios. It simulates clients grouped into regions such that data is IID *within* regions but non‑IID *across* regions.

---

## 🚀 Features

- **Built‑in datasets**: Support for MNIST, FashionMNIST, CIFAR‑10, CIFAR‑100 and UTKFace via PyTorch/TorchVision.
- **Flexible partitioning**: Implements Dirichlet-based splits, hard label skews, and can model arbitrary proximity-driven distribution skews.
- **Customizable proximity modeling**: Define how many geographic clusters (regions) to simulate and control skew intensity (e.g., Dirichlet α).

---

## 🔧 Getting Started

### Prerequisites

- Python ≥ 3.12

### Installation

ProFed is [publicly released on PyPI](https://pypi.org/project/ProFed/). To install it on your machine:

```bash
pip install ProFed
```

## API Explanation

### 1. Downloading and importing the dataset
```python
train_data, test_data = download_dataset('EMNIST')
```

### 2. Splitting into train & validation sets
```python
train_data, validation_data = split_train_validation(train_data, 0.8)
```
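To make the effect of the split ratio concrete, here is a minimal standard-library sketch of a shuffled train/validation split over sample indices. It is not ProFed's implementation: the helper `split_indices` is hypothetical, and it assumes the second argument (`0.8` above) is the fraction of samples kept for training.

```python
import random

def split_indices(num_samples, train_fraction, seed=42):
    """Shuffle sample indices and cut them into train and validation parts.

    Hypothetical helper for illustration; assumes train_fraction is the
    share of samples that stays in the training set.
    """
    indices = list(range(num_samples))
    random.Random(seed).shuffle(indices)  # seeded, so the split is reproducible
    cut = int(num_samples * train_fraction)
    return indices[:cut], indices[cut:]

train_idx, val_idx = split_indices(num_samples=1000, train_fraction=0.8)
print(len(train_idx), len(val_idx))  # 800 200
```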

### 3. Partitioning into geographic “regions” (IID internally)
```python
environment = partition_to_subregions(
    train_data,
    validation_data,
    dataset_name = 'EMNIST',
    partitioning_method = 'Hard',
    number_of_regions = 5,
    seed = 42,
)
```

- partitioning_method: partition strategy ('Hard', 'Dirichlet', or 'IID')

- number_of_regions: how many simulated geographic clusters

- dirichlet_alpha: optional, controls how concentrated the Dirichlet splits are

- min_region_size: optional, retries Dirichlet sampling until each region has at least this many samples

- Returns an Environment object. Data is IID within each region but non-IID across regions.
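As an illustration of what a hard label skew can look like, the sketch below gives every class to exactly one region, so no two regions share a label. This is one plausible reading of the 'Hard' strategy, not a description of ProFed's internals; `hard_partition` and the round-robin class assignment are assumptions made for the example.

```python
from collections import defaultdict

def hard_partition(labels, number_of_regions):
    """Assign every sample of a given class to exactly one region.

    Hypothetical helper: classes are dealt to regions round-robin,
    so class k (in sorted order) goes to region k % number_of_regions.
    """
    classes = sorted(set(labels))
    class_to_region = {c: i % number_of_regions for i, c in enumerate(classes)}
    regions = defaultdict(list)
    for sample_index, label in enumerate(labels):
        regions[class_to_region[label]].append(sample_index)
    return dict(regions)

# Toy dataset: 10 classes with 10 samples each.
labels = [i % 10 for i in range(100)]
regions = hard_partition(labels, number_of_regions=5)
for region_id, sample_indices in sorted(regions.items()):
    print(region_id, sorted({labels[i] for i in sample_indices}))
```

With 10 classes and 5 regions, each region ends up with two classes (region 0 holds classes 0 and 5, for example), which is the strongest form of cross-region skew.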

Example for Dirichlet:

```python
environment = partition_to_subregions(
    train_data,
    validation_data,
    dataset_name = 'CIFAR100',
    partitioning_method = 'Dirichlet',
    number_of_regions = 8,
    seed = 42,
    dirichlet_alpha = 0.3,
    min_region_size = 20,
)
```
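The roles of `dirichlet_alpha` and `min_region_size` can be sketched with the standard library alone. The code below is a generic Dirichlet label-skew partitioner, not ProFed's own code: for each class it draws region proportions from a symmetric Dirichlet distribution (via normalized Gamma draws) and retries until every region is large enough. The function name, the `max_retries` cap, and the toy labels are assumptions for illustration.

```python
import random

def dirichlet_partition(labels, number_of_regions, alpha,
                        min_region_size=0, seed=42, max_retries=100):
    """Split sample indices across regions with Dirichlet label skew.

    Hypothetical helper. Smaller alpha concentrates each class in fewer
    regions; larger alpha approaches a uniform (IID-like) split.
    """
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)

    for _ in range(max_retries):
        regions = [[] for _ in range(number_of_regions)]
        for label in sorted(by_class):
            indices = by_class[label][:]
            rng.shuffle(indices)
            # A Dirichlet(alpha) sample is a vector of Gamma(alpha, 1) draws
            # divided by their sum.
            draws = [rng.gammavariate(alpha, 1.0) for _ in range(number_of_regions)]
            total = sum(draws) or 1.0  # guard against an all-zero draw
            start, cumulative = 0, 0.0
            for region_id, draw in enumerate(draws):
                cumulative += draw / total
                if region_id == number_of_regions - 1:
                    end = len(indices)  # last region takes the remainder
                else:
                    end = min(round(cumulative * len(indices)), len(indices))
                regions[region_id].extend(indices[start:end])
                start = end
        # Accept the partition only if every region is large enough.
        if min(len(region) for region in regions) >= min_region_size:
            return regions
    raise RuntimeError("no partition satisfied min_region_size")

# Toy dataset: 10 classes with 100 samples each.
toy_labels = [i % 10 for i in range(1000)]
toy_regions = dirichlet_partition(toy_labels, number_of_regions=8,
                                  alpha=0.3, min_region_size=20)
print([len(region) for region in toy_regions])
```

Region sizes are typically uneven at low `alpha`, which is why a minimum-size retry is useful: without it, a region can end up with almost no data.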

### 4. Distributing region data across devices
```python
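# mapping_devices_area is assumed to be a dict defined by your simulation:
# region_id -> list of the device IDs located in that region.
mapping = {}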
for region_id, devices in mapping_devices_area.items():
    mapping_devices_data = environment.from_subregion_to_devices(
        region_id,
        len(devices)
    )
    for device_index, data in mapping_devices_data.items():
        device_id = devices[device_index]
        mapping[device_id] = data
```
This splits each region’s IID data equally among its devices, giving each device a local subset.

The result is a mapping:

```text
device_id → local_dataset
```
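For intuition about the equal split, the following self-contained sketch deals a region's sample indices to its devices so that device sizes differ by at most one. It does not call ProFed: `split_equally`, the device IDs, and the toy `region_samples` are hypothetical, and the strided split is only one way to divide data that is already IID within a region.

```python
def split_equally(sample_indices, number_of_devices):
    """Deal a region's samples to its devices; sizes differ by at most one."""
    return {
        device_index: sample_indices[device_index::number_of_devices]
        for device_index in range(number_of_devices)
    }

# Hypothetical layout: region_id -> device IDs located in that region.
mapping_devices_area = {0: ["d0", "d1", "d2"], 1: ["d3", "d4"]}
# Hypothetical region data: region_id -> sample indices owned by that region.
region_samples = {0: list(range(0, 10)), 1: list(range(10, 20))}

mapping = {}
for region_id, devices in mapping_devices_area.items():
    per_device = split_equally(region_samples[region_id], len(devices))
    for device_index, data in per_device.items():
        mapping[devices[device_index]] = data

print({device_id: len(data) for device_id, data in mapping.items()})
# {'d0': 4, 'd1': 3, 'd2': 3, 'd3': 5, 'd4': 5}
```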

## ✍🏻 Examples of usage

1. [Proximity-based Self-Federated Learning](https://github.com/domm99/PSFL)
2. [SParSeFuL](https://github.com/domm99/SParSeFuL)
3. [Baselines implementation](https://github.com/domm99/experiments-2026-pmc-baselines)
4. [ProFed Startup Experiments](https://github.com/domm99/experiments-2025-jors)

## 📄 License
MIT License: you are free to use, modify, and distribute this software.

## 📬 Contact
For questions, issues, or contributions, feel free to reach out:

- **Author**: Davide Domini  
- **Email**: davide.domini@unibo.it  
- **GitHub**: [domm99](https://github.com/domm99)

You can also open an [issue](https://github.com/davidedomini/ProFed/issues) or submit a pull request on GitHub!

