Metadata-Version: 2.4
Name: quinkgl
Version: 0.3.2
Summary: A decentralized gossip learning framework for P2P edge intelligence
Author-email: Ali Seyhan <aliseyhan@posta.mu.edu.tr>, Baki Turhan <bakiturhan@posta.mu.edu.tr>
License:                                  Apache License
                                   Version 2.0, January 2004
                                http://www.apache.org/licenses/
        
           TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
        
           1. Definitions.
        
              "License" shall mean the terms and conditions for use, reproduction,
              and distribution as defined by Sections 1 through 9 of this document.
        
              "Licensor" shall mean the copyright owner or entity authorized by
              the copyright owner that is granting the License.
        
              "Legal Entity" shall mean the union of the acting entity and all
              other entities that control, are controlled by, or are under common
              control with that entity. For the purposes of this definition,
              "control" means (i) the power, direct or indirect, to cause the
              direction or management of such entity, whether by contract or
              otherwise, or (ii) ownership of fifty percent (50%) or more of the
              outstanding shares, or (iii) beneficial ownership of such entity.
        
              "You" (or "Your") shall mean an individual or Legal Entity
              exercising permissions granted by this License.
        
              "Source" form shall mean the preferred form for making modifications,
              including but not limited to software source code, documentation
              source, and configuration files.
        
              "Object" form shall mean any form resulting from mechanical
              transformation or translation of a Source form, including but
              not limited to compiled object code, generated documentation,
              and conversions to other media types.
        
              "Work" shall mean the work of authorship, whether in Source or
              Object form, made available under the License, as indicated by a
              copyright notice that is included in or attached to the work
              (an example is provided in the Appendix below).
        
              "Derivative Works" shall mean any work, whether in Source or Object
              form, that is based on (or derived from) the Work and for which the
              editorial revisions, annotations, elaborations, or other modifications
              represent, as a whole, an original work of authorship. For the purposes
              of this License, Derivative Works shall not include works that remain
              separable from, or merely link (or bind by name) to the interfaces of,
              the Work and Derivative Works thereof.
        
              "Contribution" shall mean any work of authorship, including
              the original version of the Work and any modifications or additions
              to that Work or Derivative Works thereof, that is intentionally
              submitted to Licensor for inclusion in the Work by the copyright owner
              or by an individual or Legal Entity authorized to submit on behalf of
              the copyright owner. For the purposes of this definition, "submitted"
              means any form of electronic, verbal, or written communication sent
              to the Licensor or its representatives, including but not limited to
              communication on electronic mailing lists, source code control systems,
              and issue tracking systems that are managed by, or on behalf of, the
              Licensor for the purpose of discussing and improving the Work, but
              excluding communication that is conspicuously marked or otherwise
              designated in writing by the copyright owner as "Not a Contribution."
        
              "Contributor" shall mean Licensor and any individual or Legal Entity
              on behalf of whom a Contribution has been received by Licensor and
              subsequently incorporated within the Work.
        
           2. Grant of Copyright License. Subject to the terms and conditions of
              this License, each Contributor hereby grants to You a perpetual,
              worldwide, non-exclusive, no-charge, royalty-free, irrevocable
              copyright license to reproduce, prepare Derivative Works of,
              publicly display, publicly perform, sublicense, and distribute the
              Work and such Derivative Works in Source or Object form.
        
           3. Grant of Patent License. Subject to the terms and conditions of
              this License, each Contributor hereby grants to You a perpetual,
              worldwide, non-exclusive, no-charge, royalty-free, irrevocable
              (except as stated in this section) patent license to make, have made,
              use, offer to sell, sell, import, and otherwise transfer the Work,
              where such license applies only to those patent claims licensable
              by such Contributor that are necessarily infringed by their
              Contribution(s) alone or by combination of their Contribution(s)
              with the Work to which such Contribution(s) was submitted. If You
              institute patent litigation against any entity (including a
              cross-claim or counterclaim in a lawsuit) alleging that the Work
              or a Contribution incorporated within the Work constitutes direct
              or contributory patent infringement, then any patent licenses
              granted to You under this License for that Work shall terminate
              as of the date such litigation is filed.
        
           4. Redistribution. You may reproduce and distribute copies of the
              Work or Derivative Works thereof in any medium, with or without
              modifications, and in Source or Object form, provided that You
              meet the following conditions:
        
              (a) You must give any other recipients of the Work or
                  Derivative Works a copy of this License; and
        
              (b) You must cause any modified files to carry prominent notices
                  stating that You changed the files; and
        
              (c) You must retain, in the Source form of any Derivative Works
                  that You distribute, all copyright, patent, trademark, and
                  attribution notices from the Source form of the Work,
                  excluding those notices that do not pertain to any part of
                  the Derivative Works; and
        
              (d) If the Work includes a "NOTICE" text file as part of its
                  distribution, then any Derivative Works that You distribute must
                  include a readable copy of the attribution notices contained
                  within such NOTICE file, excluding those notices that do not
                  pertain to any part of the Derivative Works, in at least one
                  of the following places: within a NOTICE text file distributed
                  as part of the Derivative Works; within the Source form or
                  documentation, if provided along with the Derivative Works; or,
                  within a display generated by the Derivative Works, if and
                  wherever such third-party notices normally appear. The contents
                  of the NOTICE file are for informational purposes only and
                  do not modify the License. You may add Your own attribution
                  notices within Derivative Works that You distribute, alongside
                  or as an addendum to the NOTICE text from the Work, provided
                  that such additional attribution notices cannot be construed
                  as modifying the License.
        
              You may add Your own copyright statement to Your modifications and
              may provide additional or different license terms and conditions
              for use, reproduction, or distribution of Your modifications, or
              for any such Derivative Works as a whole, provided Your use,
              reproduction, and distribution of the Work otherwise complies with
              the conditions stated in this License.
        
           5. Submission of Contributions. Unless You explicitly state otherwise,
              any Contribution intentionally submitted for inclusion in the Work
              by You to the Licensor shall be under the terms and conditions of
              this License, without any additional terms or conditions.
              Notwithstanding the above, nothing herein shall supersede or modify
              the terms of any separate license agreement you may have executed
              with Licensor regarding such Contributions.
        
           6. Trademarks. This License does not grant permission to use the trade
              names, trademarks, service marks, or product names of the Licensor,
              except as required for reasonable and customary use in describing the
              origin of the Work and reproducing the content of the NOTICE file.
        
           7. Disclaimer of Warranty. Unless required by applicable law or
              agreed to in writing, Licensor provides the Work (and each
              Contributor provides its Contributions) on an "AS IS" BASIS,
              WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
              implied, including, without limitation, any warranties or conditions
              of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
              PARTICULAR PURPOSE. You are solely responsible for determining the
              appropriateness of using or redistributing the Work and assume any
              risks associated with Your exercise of permissions under this License.
        
           8. Limitation of Liability. In no event and under no legal theory,
              whether in tort (including negligence), contract, or otherwise,
              unless required by applicable law (such as deliberate and grossly
              negligent acts) or agreed to in writing, shall any Contributor be
              liable to You for damages, including any direct, indirect, special,
              incidental, or consequential damages of any character arising as a
              result of this License or out of the use or inability to use the
              Work (including but not limited to damages for loss of goodwill,
              work stoppage, computer failure or malfunction, or any and all
              other commercial damages or losses), even if such Contributor
              has been advised of the possibility of such damages.
        
           9. Accepting Warranty or Additional Liability. While redistributing
              the Work or Derivative Works thereof, You may choose to offer,
              and charge a fee for, acceptance of support, warranty, indemnity,
              or other liability obligations and/or rights consistent with this
              License. However, in accepting such obligations, You may act only
              on Your own behalf and on Your sole responsibility, not on behalf
              of any other Contributor, and only if You agree to indemnify,
              defend, and hold each Contributor harmless for any liability
              incurred by, or claims asserted against, such Contributor by reason
              of your accepting any such warranty or additional liability.
        
           END OF TERMS AND CONDITIONS
        
           Copyright 2026 Ali Seyhan, Baki Turhan
        
           Licensed under the Apache License, Version 2.0 (the "License");
           you may not use this file except in compliance with the License.
           You may obtain a copy of the License at
        
               http://www.apache.org/licenses/LICENSE-2.0
        
           Unless required by applicable law or agreed to in writing, software
           distributed under the License is distributed on an "AS IS" BASIS,
           WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
           See the License for the specific language governing permissions and
           limitations under the License.
        
Project-URL: Homepage, https://github.com/aliseyhann/QuinkGL-Gossip-Learning-Framework
Project-URL: Bug Tracker, https://github.com/aliseyhann/QuinkGL-Gossip-Learning-Framework/issues
Project-URL: Documentation, https://github.com/aliseyhann/QuinkGL-Gossip-Learning-Framework#readme
Keywords: gossip-learning,federated-learning,p2p,decentralized-ai,ipv8
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Operating System :: OS Independent
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: System :: Distributed Computing
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: numpy<3,>=1.26
Requires-Dist: torch<3,>=2.1
Requires-Dist: torchvision<1,>=0.16
Requires-Dist: scikit-learn<2,>=1.4
Requires-Dist: pandas<3,>=2.0
Requires-Dist: pyipv8<4,>=3.1
Requires-Dist: grpcio<2,>=1.62
Requires-Dist: protobuf<7,>=6.31
Requires-Dist: msgpack<2,>=1.1.2
Requires-Dist: fastapi<1,>=0.110
Requires-Dist: pydantic<3,>=2.0
Requires-Dist: httpx<1,>=0.27
Requires-Dist: uvicorn<1,>=0.29
Provides-Extra: tensorflow
Requires-Dist: tensorflow>=2.12.0; extra == "tensorflow"
Provides-Extra: dev
Requires-Dist: pytest<9,>=8; extra == "dev"
Requires-Dist: pytest-asyncio<1,>=0.23; extra == "dev"
Requires-Dist: black<27,>=24; extra == "dev"
Requires-Dist: flake8<8,>=7; extra == "dev"
Requires-Dist: mypy<2,>=1.8; extra == "dev"
Requires-Dist: grpcio-tools<2,>=1.62; extra == "dev"
Provides-Extra: docs
Requires-Dist: sphinx>=7.0; extra == "docs"
Requires-Dist: myst-parser>=2.0; extra == "docs"
Requires-Dist: sphinx-autodoc2>=0.5; extra == "docs"
Requires-Dist: lychee-bin>=0.14; sys_platform != "win32" and extra == "docs"
Provides-Extra: all
Requires-Dist: quinkgl[tensorflow]; extra == "all"
Dynamic: license-file

# QuinkGL: Decentralized Gossip Learning Framework

[![PyPI version](https://badge.fury.io/py/quinkgl.svg)](https://badge.fury.io/py/quinkgl)
[![Python 3.9+](https://img.shields.io/badge/python-3.9+-blue.svg)](https://www.python.org/downloads/)
[![License: Apache 2.0](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)

**QuinkGL** is a fully **decentralized, peer-to-peer (P2P) federated learning framework** that enables collaborative model training across distributed devices without relying on a central parameter server. Built on gossip-based protocols, QuinkGL addresses the core challenges of decentralized learning: **communication efficiency**, **non-IID data heterogeneity**, and **Byzantine fault tolerance**.

---

## Motivation

Centralized federated learning (FL) architectures such as FedAvg [[McMahan et al., 2017]](#references) depend on a parameter server for global aggregation, introducing a single point of failure and a communication bottleneck. As edge computing scales — driven by IoT proliferation and privacy-sensitive domains like healthcare — decentralized alternatives become essential.

QuinkGL draws from the gossip learning paradigm [[Ormándi et al., 2013]](#references), where nodes exchange model updates directly with randomly selected peers. This eliminates server dependency and enables organic convergence through repeated local interactions. The framework extends this foundation with:

- **Data-aware peer selection** via privacy-preserving fingerprints
- **Entropy-weighted aggregation** inspired by RNEP [[Kang & Lee, 2024]](#references)
- **Byzantine-resilient strategies** including Krum [[Blanchard et al., 2017]](#references) and TrimmedMean
- **Pluggable architecture** for topology, aggregation, and model strategies

---

## Key Features

| Feature | Description |
|---------|-------------|
| **Fully Decentralized** | No central server — pure P2P gossip protocol |
| **Non-IID Resilient** | AffinityTopology + EntropyWeightedAvg + FedProx + SCAFFOLD for heterogeneous data |
| **Privacy-Preserving Fingerprints** | Quantized, noised, schema-validated data summaries with per-round binding for peer matching |
| **Byzantine Fault Tolerance** | Krum, MultiKrum, TrimmedMean aggregation strategies |
| **NAT Traversal** | IPv8 with UDP hole punching + automatic tunnel fallback |
| **Framework Agnostic** | PyTorch, TensorFlow, or custom model wrappers |
| **Swarm Manifest** | Canonical SHA-256 commitment to training protocol and privacy policy |
| **Personalized FL** | APFL adaptive mixing, FedRep-style backbone/head split |
| **Staleness-Aware** | StalenessWeightedFedAvg for asynchronous environments |
| **Variance Reduction** | SCAFFOLD with gossip-adapted control variates (Karimireddy et al., 2020) |
| **Spectral Analysis** | Runtime algebraic connectivity (λ₂) and spectral gap measurement for topology evaluation |
| **Observability** | Event-driven telemetry with terminal rendering |

---

## Installation

```bash
pip install quinkgl
```

For development:

```bash
git clone https://github.com/aliseyhann/QuinkGL-Gossip-Learning-Framework.git
cd QuinkGL-Gossip-Learning-Framework
pip install -e ".[dev]"
```

---

## Quick Start

### CLI (New in Phase 1)

```bash
# Install
pip install quinkgl

# 1. Create a manifest (this is the swarm blueprint, not the swarm itself)
quinkgl manifest create \
  --name my-swarm \
  --task-type class \
  --input-shape 3,224,224 \
  --output-shape 10 \
  --label-type integer \
  --model-framework pytorch \
  --model-arch-hash sha256:7f2c1a9b3e4d0123456789abcdef0123456789abcdef0123456789abcdef0123 \
  --aggregation FedAvg \
  --topology Random \
  --output swarm.qgl

# 2. Verify the manifest
quinkgl manifest verify swarm.qgl

# 3. Get a shareable magnet URI
quinkgl manifest magnet swarm.qgl

# 4. Scaffold a custom peer project
quinkgl init --output-dir my-peer --template pytorch-vision --manifest swarm.qgl

# 5. Start a peer — the swarm is born when the first peer runs
quinkgl run --manifest swarm.qgl --script my-peer/peer_script.py --dry-run
```

> **Note:** Creating the manifest does **not** start a swarm. The manifest is
> only a static blueprint. A swarm comes into existence when the first peer
> calls `quinkgl run` with that manifest.

### Python API

```python
import asyncio
import torch.nn as nn
from quinkgl import GossipNode, PyTorchModel, AffinityTopology, EntropyWeightedAvg

# 1. Define your model
class SimpleNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(784, 128)
        self.fc2 = nn.Linear(128, 10)
        self.relu = nn.ReLU()

    def forward(self, x):
        x = x.view(x.size(0), -1)
        return self.fc2(self.relu(self.fc1(x)))

# 2. Wrap the model
model = PyTorchModel(SimpleNet(), device="cpu")

# 3. Create and run the node
async def main():
    node = GossipNode(
        node_id="alice",
        domain="mnist",
        model=model,
        port=7000,
        topology=AffinityTopology(min_affinity=0.3),
        aggregation=EntropyWeightedAvg(),
    )

    await node.start()
    await node.run_continuous(training_data)  # training_data: your local dataset / DataLoader
    await node.shutdown()

asyncio.run(main())
```

---

## Architecture

```
┌──────────────────────────────────────────────────────────────────┐
│                          GossipNode                              │
│    (Production-ready node with P2P networking + fallback)        │
├──────────────────────────────────────────────────────────────────┤
│  ┌──────────────┐  ┌────────────────┐  ┌──────────────────────┐ │
│  │ PyTorchModel │  │ RandomTopology │  │      FedAvg          │ │
│  │ TensorFlow   │  │ CyclonTopology │  │ FedProx │ FedAvgM    │ │
│  │ CustomModel  │  │ AffinityTopol. │  │ Krum │ TrimmedMean   │ │
│  │              │  │                │  │ EntropyWeightedAvg   │ │
│  │              │  │                │  │ StalenessWeighted    │ │
│  └──────────────┘  └────────────────┘  └──────────────────────┘ │
├──────────────────────────────────────────────────────────────────┤
│  ┌────────────────────────────────────────────────────────────┐  │
│  │    DataFingerprint ─► AffinityScore ─► Peer Selection     │  │
│  │    (Privacy-preserving data distribution summaries)       │  │
│  └────────────────────────────────────────────────────────────┘  │
├──────────────────────────────────────────────────────────────────┤
│  ┌────────────────────────────────────────────────────────────┐  │
│  │           ModelAggregator (Train → Gossip → Aggregate)    │  │
│  └────────────────────────────────────────────────────────────┘  │
├──────────────────────────────────────────────────────────────────┤
│  ┌────────────────────────────────────────────────────────────┐  │
│  │         IPv8 Network Layer + Tunnel Fallback              │  │
│  │      (P2P, NAT Traversal, UDP Hole Punching, Relay)      │  │
│  └────────────────────────────────────────────────────────────┘  │
├──────────────────────────────────────────────────────────────────┤
│  ┌────────────────────────────────────────────────────────────┐  │
│  │    Observability: EventEmitter → TelemetryClient          │  │
│  └────────────────────────────────────────────────────────────┘  │
└──────────────────────────────────────────────────────────────────┘
```

---

## Project Structure

```
QuinkGL/
├── src/quinkgl/
│   ├── core/                  # LearningNode (network-agnostic abstraction)
│   ├── models/                # PyTorch, TensorFlow, personalized model wrappers
│   ├── topology/              # RandomTopology, CyclonTopology, AffinityTopology, SpectralAnalyzer
│   ├── aggregation/           # FedAvg, FedProx, FedAvgM, Krum, TrimmedMean,
│   │                          # EntropyWeightedAvg, StalenessWeightedFedAvg, Scaffold
│   ├── fingerprint/           # DataFingerprint, AffinityWeights, FingerprintComputer
│   ├── manifest/              # SwarmManifest, DataPolicy, CollaborationPolicy
│   ├── gossip/                # Protocol primitives, ModelAggregator orchestration
│   ├── network/               # GossipNode, IPv8 manager, gossip community
│   ├── training/              # Convergence monitoring, prototype-based alignment
│   ├── serialization/         # Model weight serialization, compression pipeline, Error Feedback
│   ├── storage/               # Model checkpointing
│   ├── observability/         # EventEmitter, RuntimeEvent, TerminalObserver
│   ├── telemetry/             # TelemetryClient
│   └── utils/                 # Shared utilities
├── tests/                     # 364+ unit tests
└── docs/                      # Deployment guides, research notes
```

### Package Responsibilities

| Package | Responsibility |
|---------|---------------|
| `core` | Public node abstraction without transport concerns |
| `gossip` | Round orchestration and protocol primitives |
| `network` | IPv8 transport, NAT traversal, and wire delivery |
| `aggregation` | Model merge strategies (pluggable) |
| `topology` | Peer selection, partial-view management, spectral analysis |
| `fingerprint` | Privacy-preserving data distribution summaries |
| `manifest` | Cryptographic swarm identity and policy declaration |
| `training` | Convergence monitoring, prototype alignment (FedProto/FedPAC) |
| `serialization` | Model weight serialization, compression pipeline, error feedback |
| `observability` | Event-driven runtime telemetry |

---

## Topology Strategies

QuinkGL provides pluggable peer selection strategies that determine *which* peers to exchange models with each round.

| Strategy | Approach | Literature |
|----------|----------|-----------|
| `RandomTopology` | Uniform random peer selection | Ormándi et al., 2013 |
| `CyclonTopology` | Periodic shuffling for network exploration | Voulgaris et al., 2005 |
| `AffinityTopology` | **Data-aware** peer selection via fingerprint similarity with exploration–exploitation balancing | Domain-aware collaboration (this work) |

### Spectral Analysis

The `SpectralAnalyzer` provides **runtime measurement** of topology quality through algebraic connectivity and spectral gap — quantities that directly determine gossip convergence speed [[Koloskova et al., 2020]](#references).

```python
from quinkgl.topology import SpectralAnalyzer, build_ring_adjacency

analyzer = SpectralAnalyzer()
report = analyzer.analyze(build_ring_adjacency(10))
print(report.summary())
# n=10 e=10 λ₂=0.3820 gap=0.1315 connected=True mix_time≤17.5
```

| Metric | Meaning |
|--------|--------|
| `algebraic_connectivity` (λ₂) | Fiedler value — positive ↔ connected graph |
| `spectral_gap` (1−\|λ₂(W)\|) | Larger gap → faster gossip convergence |
| `mixing_time_upper` | Upper bound: `log(n) / spectral_gap` |
| `is_connected` | Whether the graph is fully connected |
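
The ring values reported above can be checked independently of the framework with plain NumPy (a minimal sketch; `ring_laplacian` is an illustrative helper, not a QuinkGL API):

```python
import numpy as np

def ring_laplacian(n: int) -> np.ndarray:
    """Graph Laplacian L = D - A for an n-node ring."""
    A = np.zeros((n, n))
    for i in range(n):
        A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1.0
    return np.diag(A.sum(axis=1)) - A

eigs = np.sort(np.linalg.eigvalsh(ring_laplacian(10)))
lambda2 = eigs[1]          # algebraic connectivity (Fiedler value)
print(round(lambda2, 4))   # 0.382 = 2*(1 - cos(2*pi/10)) for a 10-node ring
```

The smallest eigenvalue is always 0; a strictly positive second eigenvalue confirms the graph is connected.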

### AffinityTopology — Like-Attracts-Like

`AffinityTopology` selects peers based on **data distribution similarity** using privacy-preserving fingerprints. It incorporates:

- **Multi-signal affinity** — label buckets (40%), feature moments (30%), gradient similarity (15%), collaboration history (15%)
- **Cold-start resilience** — three phases (blind → learning → exploiting) with decaying exploration ratio
- **Adaptive collaboration graph** — EMA-blended edge weights with automatic decay and eviction of stale edges

---

## Communication Efficiency — Error Feedback

QuinkGL's compression pipeline (Delta → Sparsify → Quantize → Serialize → Zlib) uses **biased compressors** (Top-k, QSGD). Without correction, these break convergence guarantees. The `ErrorFeedbackState` module implements the **Error Feedback** mechanism [[Alistarh et al., 2018]](#references) that accumulates the compression residual and re-injects it in the next round:

```python
from quinkgl.serialization import CompressionConfig, SparsificationConfig

config = CompressionConfig(
    sparsification=SparsificationConfig(top_k_ratio=0.01),
    error_feedback=True,   # activate EF: compensates the compressor's bias across rounds
)
```

**Key property**: Over K rounds, `Σ compressed_outputs + final_residual = Σ raw_deltas` (information conservation, verified by unit tests). Supports EF21-style momentum blending and optional residual norm capping.
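
The conservation property can be demonstrated with a toy NumPy sketch (this is not the library's `ErrorFeedbackState`, just the bare mechanism):

```python
import numpy as np

def top_k(x: np.ndarray, k: int) -> np.ndarray:
    """Keep the k largest-magnitude entries, zero the rest (a biased compressor)."""
    out = np.zeros_like(x)
    idx = np.argsort(np.abs(x))[-k:]
    out[idx] = x[idx]
    return out

rng = np.random.default_rng(0)
deltas = [rng.normal(size=8) for _ in range(5)]

residual = np.zeros(8)
sent = []
for d in deltas:
    corrected = d + residual           # re-inject last round's residual
    compressed = top_k(corrected, k=2)
    residual = corrected - compressed  # accumulate what was dropped
    sent.append(compressed)

# Information conservation over K rounds:
assert np.allclose(sum(sent) + residual, sum(deltas))
```

Every coordinate that Top-k drops is eventually transmitted in a later round, which is why convergence guarantees survive the biased compressor.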

## Aggregation Strategies

All strategies implement the `AggregationStrategy` interface and are hot-swappable.

| Strategy | Type | Description | Reference |
|----------|------|-------------|-----------|
| `FedAvg` | Standard | Weighted averaging by sample count | McMahan et al., 2017 |
| `FedProx` | Non-IID | Proximal term to limit client drift | Li et al., 2020 |
| `FedAvgM` | Stability | Server momentum for smoother convergence | Hsu et al., 2019 |
| `EntropyWeightedAvg` | Non-IID | Shannon entropy–based weighting (RNEP-inspired) | Kang & Lee, 2024 |
| `StalenessWeightedFedAvg` | Async | Exponential penalty for stale updates | — |
| `Scaffold` | Non-IID | Control-variate drift correction (gossip variant) | Karimireddy et al., 2020 |
| `TrimmedMean` | Byzantine | Trim extreme values before averaging | Yin et al., 2018 |
| `Krum` / `MultiKrum` | Byzantine | Select most central update(s) | Blanchard et al., 2017 |
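
As a sketch of the staleness penalty idea (the exact decay schedule in `StalenessWeightedFedAvg` may differ; the decay rate `lam` here is illustrative), an update that is `s` rounds old could receive weight `exp(-lam * s)`:

```python
import math

def staleness_weight(staleness: int, lam: float = 0.5) -> float:
    """Exponentially down-weight updates by how many rounds old they are."""
    return math.exp(-lam * staleness)

for s in (0, 1, 4):
    print(s, round(staleness_weight(s), 3))
# 0 1.0
# 1 0.607
# 4 0.135
```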

### EntropyWeightedAvg — RNEP-Inspired Aggregation

Weights each peer's contribution by the **Shannon entropy** of its local label distribution. Peers with diverse (high-entropy) data exert more influence on the aggregated model, while skewed (low-entropy) peers contribute less — preventing overfitting to biased local distributions.

```python
from quinkgl import EntropyWeightedAvg

aggregation = EntropyWeightedAvg(
    entropy_floor=0.01,    # minimum weight for single-class peers
    fallback_weight=1.0,   # weight when no distribution metadata available
)
```
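
The weighting itself reduces to the Shannon entropy of each peer's label distribution, floored so single-class peers are never silenced entirely (a self-contained sketch of the idea, not the library's internals):

```python
import math

def entropy_weight(label_counts: list, entropy_floor: float = 0.01) -> float:
    """Shannon entropy (bits) of the label distribution, floored at entropy_floor."""
    total = sum(label_counts)
    probs = [c / total for c in label_counts if c > 0]
    h = -sum(p * math.log2(p) for p in probs)
    return max(h, entropy_floor)

print(entropy_weight([100, 100, 100, 100]))  # 2.0  (uniform over 4 classes)
print(entropy_weight([400, 0, 0, 0]))        # 0.01 (single class falls to the floor)
```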

### Scaffold — Variance Reduction via Control Variates

Implements the SCAFFOLD algorithm [[Karimireddy et al., 2020]](#references) adapted for gossip topology. Each node maintains a *control variate* that estimates its local gradient drift. The gossip variant replaces the central server's global control variate with a running EMA of peer control variates.

```python
from quinkgl import Scaffold

aggregation = Scaffold(
    learning_rate=0.01,       # local SGD learning rate
    global_learning_rate=1.0, # aggregation-side scaling
    control_momentum=0.0,     # 0.0 = classic SCAFFOLD; e.g. 0.9 blends peer control variates with momentum
)
```

**Key property**: SCAFFOLD provably reduces the gradient variance caused by non-IID data, whereas FedProx only adds a proximal penalty.

---

## Privacy-Preserving Data Fingerprints

Each node computes a lightweight, **privacy-preserving summary** of its local data distribution. Raw statistics are never shared — all fields are transformed before transmission.

| Raw Field | Privacy Transform | Output |
|-----------|-------------------|--------|
| Label distribution | Quantize into buckets (low/medium/high) | `label_buckets` |
| Feature moments (mean, var) | Add calibrated Gaussian noise | `noised_moments` |
| Sample count | Bucket into ranges (e.g., "1k–10k") | `sample_bucket` |
| Gradient moments | Noise + **disabled by default** (gradient inversion risk) | `gradient_moments` |

Fingerprints are exchanged during peer discovery and used by `AffinityTopology` to compute affinity scores.

Fingerprint payloads are schema-versioned, strictly validated on parse, and can be refreshed with a per-round nonce during long-running gossip sessions to reduce cross-round linkability.
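
The bucketing and noising transforms from the table can be sketched as follows (thresholds and the noise scale are illustrative assumptions, not the library's calibrated privacy parameters):

```python
import random

def bucket_label_share(share: float) -> str:
    """Quantize a class's share of local samples into a coarse bucket."""
    if share < 0.1:
        return "low"
    if share < 0.4:
        return "medium"
    return "high"

def noised_moment(value: float, sigma: float = 0.05,
                  rng: random.Random = random.Random(42)) -> float:
    """Add Gaussian noise to a feature moment before it leaves the node."""
    return value + rng.gauss(0.0, sigma)

counts = {0: 700, 1: 250, 2: 50}
total = sum(counts.values())
fingerprint = {label: bucket_label_share(c / total) for label, c in counts.items()}
print(fingerprint)  # {0: 'high', 1: 'medium', 2: 'low'}
```

Only the coarse buckets and noised moments travel over the wire; raw counts and exact moments stay on the node.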

---

## Swarm Manifest

The **Swarm Manifest** (`.qgl` file) is the canonical protocol-identity layer
that binds swarm compatibility to a description of the training protocol,
model architecture, aggregation strategy, topology, and trust boundary.

A manifest is **not** a running swarm — it is only a static blueprint.  The
swarm comes into existence when peers call `quinkgl run --manifest swarm.qgl`.

Manifests are:

- **Canonically hashed** (SHA-256 over deterministic JSON) so any change to
  policy or architecture produces a new swarm identity.
- **Schema-versioned** and strictly validated to avoid silent field drops or
  incompatible policy mixes.
- **Optionally signed** with Ed25519 so peers can verify creator identity
  before joining.

To create a manifest you need the architecture hash of your model, which is a
fingerprint of layer names, shapes, and dtypes (not weights).  Compute it with
`quinkgl.manifest.compute_arch_hash(model)` and pass it to
`quinkgl manifest create --model-arch-hash <hash>`.
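
A canonical hash of this kind can be sketched as SHA-256 over deterministic JSON of the layer specs (illustrative only; the real `compute_arch_hash` may choose different fields and encoding):

```python
import hashlib
import json

def arch_hash(layers: list) -> str:
    """SHA-256 over a canonical JSON encoding of layer names, shapes, and dtypes."""
    canonical = json.dumps(layers, sort_keys=True, separators=(",", ":"))
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()

spec = [
    {"name": "fc1", "shape": [784, 128], "dtype": "float32"},
    {"name": "fc2", "shape": [128, 10], "dtype": "float32"},
]
h1 = arch_hash(spec)
h2 = arch_hash([spec[0], {**spec[1], "shape": [128, 20]}])
assert h1 == arch_hash(spec)  # deterministic: same spec, same identity
assert h1 != h2               # any architectural change yields a new identity
```

Sorting keys and fixing separators makes the JSON byte-stable, so the hash depends only on the architecture, never on dict ordering or weights.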

---

## Personalized Federated Learning

QuinkGL supports personalization techniques to handle statistical heterogeneity:

| Technique | Description |
|-----------|-------------|
| **APFL** (Adaptive Personalized FL) | Adaptive mixing coefficient between local and global models |
| **FedRep-style split** | Shared backbone + personalized head via `ModelSplit` |
| **FedProto / FedPAC** | Prototype-based alignment and classifier collaboration |
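
At its core, APFL-style mixing interpolates between the local and global parameters with a mixing coefficient `alpha` (a minimal NumPy sketch; the `PersonalizedModelWrapper` API and the adaptive update of `alpha` are not shown):

```python
import numpy as np

def apfl_mix(local_w: np.ndarray, global_w: np.ndarray, alpha: float) -> np.ndarray:
    """Personalized parameters = alpha * local + (1 - alpha) * global."""
    return alpha * local_w + (1.0 - alpha) * global_w

local_w = np.array([1.0, 2.0])
global_w = np.array([3.0, 4.0])
print(apfl_mix(local_w, global_w, alpha=0.25))  # [2.5 3.5]
```

With `alpha = 0` the node follows the swarm consensus entirely; with `alpha = 1` it keeps a purely local model; APFL adapts `alpha` per node based on local loss.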

---

## Public API Overview

### Core

| Class | Description |
|-------|-------------|
| `LearningNode` | Framework node without networking (bring your own transport) |
| `GossipNode` | Production node with IPv8 P2P + automatic tunnel fallback |

### Models

| Class | Description |
|-------|-------------|
| `PyTorchModel` | Wrapper for PyTorch `nn.Module` with NaN validation, gradient clipping |
| `TensorFlowModel` | Wrapper for TensorFlow/Keras models |
| `ModelWrapper` | Base class for custom framework wrappers |
| `PersonalizedModelWrapper` | Base for APFL-style personalized models |
| `TrainingConfig` | Training configuration (epochs, batch_size, lr, grad_clip, optimizer) |

### Fingerprint

| Class | Description |
|-------|-------------|
| `DataFingerprint` | Privacy-preserving data distribution summary |
| `FingerprintComputer` | Computes fingerprints from raw data with configurable privacy |
| `AffinityWeights` | Weights for multi-signal affinity computation |
| `FingerprintPrivacyConfig` | ε-DP budget, noise levels, bucket granularity |
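
As a rough illustration of the ε-DP idea behind `FingerprintPrivacyConfig`, a label histogram can be released with Laplace noise added to each count. This is a textbook sketch under simplified assumptions (per-count sensitivity of 1), not QuinkGL's actual fingerprint format.

```python
import math
import random

def dp_label_histogram(labels, num_classes, epsilon):
    """Label counts with Laplace(1/epsilon) noise on each bucket."""
    counts = [0.0] * num_classes
    for y in labels:
        counts[y] += 1
    scale = 1.0 / epsilon  # larger epsilon -> less noise, weaker privacy
    noisy = []
    for c in counts:
        u = random.random() - 0.5  # uniform on [-0.5, 0.5)
        # Inverse-CDF sampling of a Laplace(0, scale) variate.
        noise = -scale * math.copysign(math.log(1.0 - 2.0 * abs(u)), u)
        noisy.append(c + noise)
    return noisy

random.seed(0)
fp = dp_label_histogram([0, 0, 1, 2, 2, 2], num_classes=3, epsilon=1.0)
```

The noisy counts still reveal the rough class balance (useful for affinity matching) while masking any single example's contribution.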

### Manifest & Policy

| Class | Description |
|-------|-------------|
| `DataPolicy` | Minimum affinity, privacy level, cold-start rounds |
| `CollaborationPolicy` | Aggregation and topology parameters |
| `PersonalizationPolicy` | APFL, FedRep configuration |
| `PrototypePolicy` | FedProto/FedPAC alignment settings |

### Observability

| Class | Description |
|-------|-------------|
| `EventEmitter` | Publish/subscribe runtime events |
| `RuntimeEvent` | Structured event payload |
| `TerminalObserver` | Human-readable terminal rendering |
| `TelemetryClient` | Telemetry data collection |
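
The publish/subscribe pattern behind `EventEmitter` can be shown generically; the class and method names below are illustrative only, not QuinkGL's actual API.

```python
from collections import defaultdict

class MiniEmitter:
    """Minimal pub/sub: handlers register per event type, emit fans out."""
    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, event_type, handler):
        self._handlers[event_type].append(handler)

    def emit(self, event_type, payload):
        for handler in self._handlers[event_type]:
            handler(payload)

seen = []
emitter = MiniEmitter()
emitter.subscribe("round_complete", seen.append)
emitter.emit("round_complete", {"round": 3, "loss": 0.42})
# seen now holds the structured event payload
```

An observer such as `TerminalObserver` is, in this picture, just another subscriber that renders payloads for humans instead of storing them.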

---

## Requirements

- Python 3.9+
- PyTorch 1.9+ (optional, for `PyTorchModel`)
- TensorFlow 2.x (optional, for `TensorFlowModel`)
- IPv8 2.0+ (for P2P networking)
- NumPy

---

## Documentation

The canonical documentation set lives under [`docs/`](docs/). Use **[`docs/index.md`](docs/index.md)** as the entry point: it has a short decision tree and a table of contents that mirrors the book layout (Sphinx toctree).

### Quick entry

| Document | Description |
|----------|-------------|
| [`docs/index.md`](docs/index.md) | Hub: decision tree and links into all sections |
| [`docs/quickstart.md`](docs/quickstart.md) | Minimal “get running” path |
| [`docs/getting-started.md`](docs/getting-started.md) | Full getting started (English) |
| [`docs/getting-started-tr.md`](docs/getting-started-tr.md) | Full getting started (Turkish) |
| [`docs/faq.md`](docs/faq.md) | Frequently asked questions |

### By section

| Section | Start here |
|---------|------------|
| User guide | [`docs/user-guide/index.md`](docs/user-guide/index.md) (manifest, peer script, trust, telemetry, troubleshooting) |
| CLI | [`docs/cli/index.md`](docs/cli/index.md) (`manifest`, `run`, `init`, `keygen`, …) |
| Tutorials | [`docs/tutorials/index.md`](docs/tutorials/index.md) (T1–T6) |
| Concepts | [`docs/concepts/index.md`](docs/concepts/index.md) (gossip, swarm, fingerprints) |
| Reference | [`docs/reference/index.md`](docs/reference/index.md) (API, manifest schema, error codes) |
| Security | [`docs/security/index.md`](docs/security/index.md) (threat model, signing, TOFU, rate limits) |
| Cookbook | [`docs/cookbook/index.md`](docs/cookbook/index.md) (local swarm, multi-peer testing, custom wrappers) |
| Migration | [`docs/migration/index.md`](docs/migration/index.md) |

---

## References

- **McMahan et al.** (2017). *Communication-Efficient Learning of Deep Networks from Decentralized Data.* AISTATS. (FedAvg)
- **Ormándi et al.** (2013). *Gossip Learning with Linear Models on Fully Distributed Data.* Concurrency and Computation.
- **Blanchard et al.** (2017). *Machine Learning with Adversaries: Byzantine Tolerant Gradient Descent.* NeurIPS. (Krum)
- **Yin et al.** (2018). *Byzantine-Robust Distributed Learning: Towards Optimal Statistical Rates.* ICML. (TrimmedMean)
- **Li et al.** (2020). *Federated Optimization in Heterogeneous Networks.* MLSys. (FedProx)
- **Hsu et al.** (2019). *Measuring the Effects of Non-Identical Data Distribution for Federated Visual Classification.* (FedAvgM)
- **Kang & Lee** (2024). *RNEP: Random Node Entropy Pairing for Efficient Decentralized Training with Non-IID Local Data.* Electronics, 13(21), 4193. (EntropyWeightedAvg)
- **Karimireddy et al.** (2020). *SCAFFOLD: Stochastic Controlled Averaging for Federated Learning.* ICML. (Scaffold)
- **Alistarh et al.** (2018). *The Convergence of Sparsified Gradient Methods.* NeurIPS. (Error Feedback)
- **Richtárik et al.** (2021). *EF21: A New, Simpler, Theoretically Better, and Practically Faster Error Feedback.* NeurIPS. (EF21 momentum)
- **Koloskova et al.** (2020). *A Unified Theory of Decentralized SGD with Changing Topology and Local Updates.* ICML. (Spectral Gap)
- **Boyd et al.** (2006). *Randomized Gossip Algorithms.* IEEE Trans. Inf. Theory. (Metropolis–Hastings mixing)
- **Voulgaris et al.** (2005). *Cyclon: Inexpensive Membership Management for Unstructured P2P Overlays.* JNSM. (CyclonTopology)
- **Deng et al.** (2020). *Adaptive Personalized Federated Learning.* arXiv. (APFL)

---

## License

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

Copyright 2026 Ali Seyhan, Baki Turhan

---

## Contributing

Contributions are welcome! Please read our contributing guidelines and submit pull requests to the main repository.
