Metadata-Version: 2.4
Name: PLD_accounting
Version: 0.3.0
Summary: Numerical privacy accounting for random allocation and subsampling using PLDs.
Author: Moshe Shenfeld
Maintainer: Moshe Shenfeld
License-Expression: MIT
Project-URL: Homepage, https://github.com/moshenfeld/PLD_accounting
Project-URL: Repository, https://github.com/moshenfeld/PLD_accounting
Project-URL: Issues, https://github.com/moshenfeld/PLD_accounting/issues
Keywords: differential privacy,privacy accounting,privacy loss distribution,pld
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Science/Research
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Security
Classifier: Topic :: Scientific/Engineering :: Mathematics
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: numpy>=1.23
Requires-Dist: scipy>=1.10
Requires-Dist: numba>=0.58
Requires-Dist: dp-accounting>=0.4.3
Requires-Dist: typing-extensions>=4.0
Provides-Extra: test
Requires-Dist: pytest>=8.0; extra == "test"
Requires-Dist: pytest-cov>=4.0; extra == "test"
Requires-Dist: mypy>=1.10; extra == "test"
Provides-Extra: dev
Requires-Dist: pytest>=8.0; extra == "dev"
Requires-Dist: pytest-cov>=4.0; extra == "dev"
Requires-Dist: mypy>=1.10; extra == "dev"
Requires-Dist: build>=1.2.0; extra == "dev"
Dynamic: license-file

# PLD_accounting

`PLD_accounting` is a Python package for tight differential privacy accounting of random allocation and subsampling using Privacy Loss Distributions (PLDs) as described in: [Efficient privacy loss accounting for subsampling and random allocation](https://arxiv.org/pdf/2602.17284)

## Purpose

- Compute tight upper/lower DP bounds for random allocation.
- Support both Gaussian mechanisms and explicit PLD realizations.
- Return `dp_accounting` PLDs for epsilon/delta queries and composition workflows.

## Random Allocation Model

The package accounts for the following sampling pattern:

- Per epoch: choose `k` steps out of `t` uniformly at random.
- Across training: repeat this for `num_epochs` epochs.

## Parameter Mapping

- `num_steps = t` (total candidate steps per epoch)
- `num_selected = k` (selected steps per epoch)
- `num_epochs` (number of repeated epochs)

Internal composition:

- `floor_steps = floor(num_steps / num_selected)`
- `remainder = num_steps - num_selected * floor_steps`
- `floor_epochs = (num_selected - remainder) * num_epochs`
- `ceil_steps = floor_steps + 1`
- `ceil_epochs = remainder * num_epochs`
- We compute 1-out-of-`floor_steps` then self compose it `floor_epochs` times, compute 1-out-of-`ceil_steps` then self compose it `ceil_epochs` times, and finally compose them with each other

## API Overview

### Random Allocation APIs

Gaussian path (most common):

- `gaussian_allocation_epsilon_range(delta, sigma, num_steps, num_selected=1, num_epochs=1, epsilon_accuracy=-1.0)`
  - Adaptive upper/lower bounds for epsilon.
- `gaussian_allocation_delta_range(epsilon, sigma, num_steps, num_selected=1, num_epochs=1, delta_accuracy=-1.0)`
  - Adaptive upper/lower bounds for delta.
- `gaussian_allocation_epsilon_extended(params, config, bound_type=BoundType.DOMINATES)`
  - Single epsilon query with explicit discretization/convolution config.
- `gaussian_allocation_delta_extended(params, config, bound_type=BoundType.DOMINATES)`
  - Single delta query with explicit discretization/convolution config.
- `gaussian_allocation_PLD(params, config, bound_type=BoundType.DOMINATES)`
  - Build a reusable `dp_accounting.PrivacyLossDistribution`.

Realization path (advanced):

- `general_allocation_PLD(num_steps, num_selected, num_epochs, remove_realization, add_realization, config, bound_type=BoundType.DOMINATES)`
  - Build PLD from explicit `PLDRealization` inputs.
- `general_allocation_epsilon(delta, num_steps, num_selected, num_epochs, remove_realization, add_realization, config, bound_type=BoundType.DOMINATES)`
  - Epsilon query from explicit realizations.
- `general_allocation_delta(epsilon, num_steps, num_selected, num_epochs, remove_realization, add_realization, config, bound_type=BoundType.DOMINATES)`
  - Delta query from explicit realizations.

Common notes:

- `BoundType.DOMINATES` gives an upper (pessimistic) bound.
- `BoundType.IS_DOMINATED` gives a lower (optimistic) bound.
- Builders do not accept `BoundType.BOTH`; build two PLDs if both bounds are needed.

### Subsampling APIs

PLD-based subsampling helpers:

- `subsample_PLD(pld, sampling_probability)`
  - Applies subsampling amplification to a `dp_accounting` PLD.
- `subsample_PMF(base_pld, sampling_prob, direction)`
  - Lower-level helper for `PLDRealization` inputs (REMOVE/ADD direction).

Subsampling helpers use DOMINATES semantics (upper-bound style).

## Install

```bash
pip install PLD_accounting
```

## Where To Start

- Usage examples: [usage_example.py](usage_example.py)
- Implementation details: [IMPLEMENTATION_OVERVIEW.md](IMPLEMENTATION_OVERVIEW.md)
