Metadata-Version: 2.4
Name: jsleekr-ratelimit
Version: 1.0.0
Summary: Multi-algorithm rate limiter with pluggable backends
Author-email: JSLEEKR <93jslee@gmail.com>
License: MIT
Keywords: rate-limit,throttle,token-bucket,sliding-window,api
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Classifier: Topic :: Software Development :: Libraries
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Provides-Extra: dev
Requires-Dist: pytest>=7.0; extra == "dev"
Requires-Dist: pytest-asyncio>=0.21; extra == "dev"
Requires-Dist: pytest-cov>=4.0; extra == "dev"
Dynamic: license-file

<div align="center">

# ⏱️ ratelimit

### Multi-algorithm rate limiter with pluggable backends

[![GitHub Stars](https://img.shields.io/github/stars/JSLEEKR/ratelimit?style=for-the-badge&logo=github&color=yellow)](https://github.com/JSLEEKR/ratelimit/stargazers)
[![License](https://img.shields.io/badge/license-MIT-blue?style=for-the-badge)](LICENSE)
[![Python](https://img.shields.io/badge/python-3.10+-3776AB?style=for-the-badge&logo=python&logoColor=white)](https://python.org)
[![Tests](https://img.shields.io/badge/tests-416%20passing-brightgreen?style=for-the-badge)](#testing)

<br/>

**Choose the right rate limiting algorithm for your use case -- 7 algorithms, one unified async API**

Token Bucket + Fixed Window + Sliding Window (Log & Counter) + Leaky Bucket + GCRA + Concurrency Limiter

[Quick Start](#-quick-start) | [Features](#features) | [Algorithms](#-algorithms) | [Architecture](#-architecture)

</div>

---

## Why This Exists

Every API needs rate limiting, but no single algorithm fits all cases. Token Bucket allows bursts, Leaky Bucket smooths traffic, Sliding Window avoids boundary issues, GCRA powers Stripe and Shopify at scale, and Concurrency Limiter caps parallelism. Most libraries force you into one algorithm. If your needs change, you rewrite.

`ratelimit` gives you seven algorithms behind a single `acquire/peek/reset` interface. Swap algorithms without touching your application code. Add multi-tier limits with groups and chains. Get production-ready presets for common scenarios like login protection, API tiers, and webhook delivery. All async-first, all zero dependencies.

- **7 algorithms** -- pick the right one for your use case, swap anytime without code changes
- **Async-first** -- native `async/await` API designed for modern Python applications
- **Production presets** -- one-line setup for login protection, API tiers, webhook delivery, and more
- **Zero dependencies** -- pure Python, no external packages required

Stop implementing rate limiting from scratch. Start choosing the right algorithm.

---

## Features

| Category | Feature | Description |
|----------|---------|-------------|
| **Algorithms** | Token Bucket | Smooth rate limiting with configurable burst |
| **Algorithms** | Fixed Window | Simple time-window counters |
| **Algorithms** | Sliding Window Log | Exact request counting with per-request timestamps |
| **Algorithms** | Sliding Window Counter | Balanced accuracy/memory with weighted window overlap |
| **Algorithms** | Leaky Bucket | Constant-rate output for traffic smoothing |
| **Algorithms** | GCRA | Generic Cell Rate Algorithm (used by Stripe, Shopify) |
| **Algorithms** | Concurrency Limiter | Cap parallel connections/operations |
| **API** | Factory Function | `create_limiter(100, 60)` one-line setup |
| **API** | Decorator | `@rate_limit(limiter)` for function-level limiting |
| **API** | Context Manager | `async with RateLimitContext(...)` for scoped limiting |
| **API** | Wait Mode | `wait_and_acquire()` with automatic backpressure |
| **API** | HTTP Headers | `result.to_headers()` for standard rate limit headers |
| **API** | Callbacks | `on_limited()` and `on_allowed()` event hooks |
| **Composition** | Groups | Multi-tier limits (10/sec AND 1000/hour) -- all must allow |
| **Composition** | Chains | Sequential rate limit evaluation |
| **Composition** | Weighted Limiter | Different costs per endpoint with priority reserves |
| **Protection** | Circuit Breaker | Automatic failure protection (closed/open/half-open) |
| **Protection** | Penalty Tracker | Progressive backoff for repeat offenders |
| **Analytics** | Stats Collector | Per-key metrics (allowed, denied, latency) |
| **Analytics** | Rate Estimator | Real-time request rate estimation and prediction |
| **Analytics** | Quota Manager | Hourly/daily/weekly/monthly usage quota tracking |
| **Utilities** | Key Extractors | IP, user, API key, and endpoint pattern extraction |
| **Utilities** | Retry Strategies | Fixed, exponential backoff, retry-after header parsing |
| **Utilities** | Snapshots | State serialization for debugging and persistence |
| **Utilities** | Algorithm Info | Introspection and recommendation engine |
| **Presets** | 10 Presets | api_standard, api_strict, api_generous, login_protection, webhook_delivery, search_api, upload_limit, free_tier, pro_tier, enterprise_tier |
| **Tooling** | CLI Benchmark | Compare algorithm performance from the command line |
| **Backends** | Memory Backend | In-memory storage with TTL support |

---

## 🚀 Quick Start

```bash
# 1. Install ratelimit (editable, from a source checkout)
pip install -e .

# 2. Use in your application
python -c "
import asyncio
from ratelimit import create_limiter

async def main():
    limiter = create_limiter(100, 60)  # 100 requests per minute
    result = await limiter.acquire('user:123')
    print(f'Allowed: {result.allowed}, Remaining: {result.remaining}')

asyncio.run(main())
"

# 3. Or use presets
python -c "
import asyncio
from ratelimit import get_preset

async def main():
    limiter = get_preset('login_protection')  # 5 attempts / 15 min
    result = await limiter.acquire('user:login')
    print(f'Allowed: {result.allowed}')

asyncio.run(main())
"
```

---

## 📊 Algorithms

| Algorithm | Best For | Burst | Memory | Boundary Issues |
|-----------|----------|-------|--------|-----------------|
| **Token Bucket** | General API limiting | Yes (configurable) | O(1) | None |
| **Fixed Window** | Simple counters, dashboards | Up to 2x at boundary | O(1) | Yes (2x at boundary) |
| **Sliding Window Log** | Exact counting, compliance | No | O(n) | None |
| **Sliding Window Counter** | Balanced accuracy/memory | Minimal | O(1) | Approximate |
| **Leaky Bucket** | Traffic smoothing, webhooks | Configurable | O(1) | None |
| **GCRA** | Production (Stripe/Shopify) | Yes | O(1) | None |
| **Concurrency Limiter** | Parallel connection caps | N/A | O(n) | N/A |

### Token Bucket

Tokens refill at a constant rate, and each request consumes one. If the bucket is empty, the request is denied. Bursts are supported up to the bucket capacity, since the bucket starts full.

```python
from ratelimit import create_limiter

limiter = create_limiter(100, 60, algorithm="token_bucket", burst_size=20)
```
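
To make the refill mechanics concrete, here is a standalone sketch of the token-bucket idea (illustrative only -- `TinyTokenBucket` is a hypothetical name, not the library's internal implementation):

```python
import time

class TinyTokenBucket:
    """Illustrative token bucket: tokens refill continuously, requests spend them."""

    def __init__(self, rate: float, capacity: float):
        self.rate = rate                  # tokens added per second
        self.capacity = capacity          # maximum burst size
        self.tokens = capacity            # starting full allows an initial burst
        self.updated = time.monotonic()

    def allow(self, cost: float = 1.0) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False

bucket = TinyTokenBucket(rate=1, capacity=3)
print([bucket.allow() for _ in range(5)])  # [True, True, True, False, False]
```

The initial burst of 3 drains the bucket; further requests are denied until the 1-token/sec refill catches up.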

### Fixed Window

A simple counter per time window that resets at each boundary. A burst straddling a boundary can see up to 2x the limit.

```python
limiter = create_limiter(100, 60, algorithm="fixed_window")
```
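
The boundary effect is easy to demonstrate with a self-contained sketch (illustrative, not the library's code): with a limit of 2 per minute, two requests at the end of one window plus two at the start of the next are all allowed.

```python
import math

class TinyFixedWindow:
    """Illustrative fixed-window counter keyed by window index."""

    def __init__(self, limit: int, window: float):
        self.limit, self.window = limit, window
        self.counts: dict[int, int] = {}

    def allow(self, now: float) -> bool:
        idx = math.floor(now / self.window)           # which window this request falls in
        self.counts[idx] = self.counts.get(idx, 0) + 1
        return self.counts[idx] <= self.limit

fw = TinyFixedWindow(limit=2, window=60)
# Four requests within ~1.5 seconds of real time, but split across a boundary:
print([fw.allow(t) for t in (59.0, 59.5, 60.1, 60.5)])  # [True, True, True, True]
```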

### Sliding Window Log

Stores the timestamp of every request. Most accurate, but uses O(n) memory. Best for compliance and exact counting.

```python
limiter = create_limiter(100, 60, algorithm="sliding_window_log")
```
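
A minimal sketch of the log approach (illustrative; the library's actual implementation may differ): keep one timestamp per request and evict anything that has slid out of the window.

```python
from collections import deque

class TinyWindowLog:
    """Illustrative sliding-window log: exact counting at O(n) memory."""

    def __init__(self, limit: int, window: float):
        self.limit, self.window = limit, window
        self.log: deque[float] = deque()

    def allow(self, now: float) -> bool:
        # Evict timestamps older than the window
        while self.log and self.log[0] <= now - self.window:
            self.log.popleft()
        if len(self.log) < self.limit:
            self.log.append(now)
            return True
        return False

log = TinyWindowLog(limit=2, window=60)
print([log.allow(t) for t in (0, 1, 2, 61)])  # [True, True, False, True]
```

Because eviction is exact, there is no boundary burst -- at the cost of one stored float per request.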

### Sliding Window Counter

Approximates sliding window using weighted overlap between current and previous fixed windows. O(1) memory with good accuracy.

```python
limiter = create_limiter(100, 60, algorithm="sliding_window_counter")
```
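
The weighted-overlap estimate can be written in one line (an illustrative formula following the standard technique): the previous window's count is scaled by how much of it still overlaps the sliding window.

```python
def sliding_estimate(prev_count: int, curr_count: int,
                     elapsed: float, window: float) -> float:
    """Estimate requests in the sliding window ending now.

    `elapsed` is how far we are into the current fixed window.
    """
    overlap = (window - elapsed) / window   # fraction of previous window still relevant
    return prev_count * overlap + curr_count

# 15s into a 60s window: 75% of the previous window still counts
print(sliding_estimate(prev_count=100, curr_count=20, elapsed=15, window=60))  # 95.0
```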

### Leaky Bucket

Requests enter a bucket that drains at a constant rate, producing smooth, steady output.

```python
limiter = create_limiter(10, 1, algorithm="leaky_bucket")
```
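
A sketch of the "leaky bucket as meter" variant (illustrative only, not the library's internals): the level drains continuously, and a request is denied when adding it would overflow.

```python
class TinyLeakyBucket:
    """Illustrative leaky bucket: level drains at a constant rate."""

    def __init__(self, drain_rate: float, capacity: float):
        self.drain_rate = drain_rate   # units drained per second
        self.capacity = capacity       # bucket size
        self.level = 0.0
        self.updated = 0.0

    def allow(self, now: float) -> bool:
        # Drain whatever leaked out since the last update
        self.level = max(0.0, self.level - (now - self.updated) * self.drain_rate)
        self.updated = now
        if self.level + 1 <= self.capacity:
            self.level += 1            # this request's contribution
            return True
        return False

lb = TinyLeakyBucket(drain_rate=1, capacity=2)
print([lb.allow(t) for t in (0.0, 0.0, 0.0, 1.0)])  # [True, True, False, True]
```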

### GCRA (Generic Cell Rate Algorithm)

Used by Stripe and Shopify. Elegant single-value algorithm that tracks the next allowed request time. Best all-rounder for production.

```python
limiter = create_limiter(100, 60, algorithm="gcra")
```
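
The elegance is that GCRA stores a single value per key: the theoretical arrival time (TAT) of the next conforming request. A sketch of the core update (illustrative; parameter names are ours, not the library's):

```python
class TinyGCRA:
    """Illustrative GCRA: one stored float (the TAT) per limiter."""

    def __init__(self, rate: float, burst: int):
        self.interval = 1.0 / rate                    # emission interval T
        self.tolerance = self.interval * (burst - 1)  # burst tolerance tau
        self.tat = 0.0                                # theoretical arrival time

    def allow(self, now: float) -> bool:
        tat = max(self.tat, now)
        if tat - now > self.tolerance:
            return False                  # arriving too early: non-conforming
        self.tat = tat + self.interval    # schedule the next conforming slot
        return True

g = TinyGCRA(rate=1, burst=2)  # 1 req/sec, bursts of 2
print([g.allow(0.0) for _ in range(3)])  # [True, True, False]
```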

### Concurrency Limiter

Caps the number of concurrent operations rather than request rate. Perfect for database connection pools or parallel API calls.

```python
from ratelimit import ConcurrencyLimiter, MemoryBackend, RateLimiter, RateLimitConfig

config = RateLimitConfig(max_requests=10, window_seconds=1)
limiter = RateLimiter(ConcurrencyLimiter(MemoryBackend(), config))
```

---

## 📋 Usage Patterns

### Decorator

```python
from ratelimit import rate_limit, create_limiter

limiter = create_limiter(100, 60)

@rate_limit(limiter, key=lambda user_id: f"user:{user_id}")
async def get_data(user_id: str):
    return await fetch_data(user_id)

# With wait mode (blocks instead of raising)
@rate_limit(limiter, wait=True, timeout=30.0)
async def get_data_wait(user_id: str):
    return await fetch_data(user_id)
```

### Context Manager

```python
from ratelimit import RateLimitContext, ConcurrencyContext

# Rate limiting context
async with RateLimitContext(limiter, "user:123") as result:
    if result.allowed:
        process_request()

# Concurrency context (auto-release on exit)
async with ConcurrencyContext(concurrency_limiter, "user:123"):
    await long_running_task()
```

### Multi-Tier Rate Limiting

```python
from ratelimit import create_limiter, RateLimitGroup

per_second = create_limiter(10, 1, key_prefix="sec")
per_minute = create_limiter(100, 60, key_prefix="min")
per_hour = create_limiter(1000, 3600, key_prefix="hour")

group = RateLimitGroup(per_second, per_minute, per_hour)
result = await group.acquire("user:123")  # All three must allow
```

### Presets

```python
from ratelimit import get_preset, list_presets

# See all available presets
print(list_presets())

# Use presets
limiter = get_preset("api_standard")       # 100 req/min, 20 burst
limiter = get_preset("api_strict")         # 30 req/min, no burst
limiter = get_preset("api_generous")       # 1000 req/min, 200 burst
limiter = get_preset("login_protection")   # 5 attempts / 15 min
limiter = get_preset("webhook_delivery")   # 10 req/sec, smoothed
limiter = get_preset("search_api")         # 10/sec AND 60/min (dual)
limiter = get_preset("upload_limit")       # 10 uploads / hour
limiter = get_preset("free_tier")          # 100 req/hour
limiter = get_preset("pro_tier")           # 5000 req/hour, 100 burst
limiter = get_preset("enterprise_tier")    # 50000 req/hour, 500 burst
```

### Circuit Breaker

```python
from ratelimit import CircuitBreaker

breaker = CircuitBreaker(
    failure_threshold=5,      # Open after 5 failures
    recovery_timeout=30.0,    # Try again after 30 seconds
    half_open_max_calls=3,    # Allow 3 test calls in half-open
)

if breaker.allow_request():
    try:
        result = await external_api_call()
        breaker.record_success()
    except Exception:
        breaker.record_failure()
```

### Penalty Tracker

```python
from ratelimit import PenaltyTracker

tracker = PenaltyTracker(
    base_penalty=60.0,        # 1 minute base penalty
    multiplier=2.0,           # Double each time
    max_penalty=3600.0,       # Cap at 1 hour
)

# Record violation
tracker.record_violation("abuser:ip")

# Check if penalized
penalty = tracker.get_penalty("abuser:ip")
if penalty > 0:
    print(f"Penalized for {penalty:.0f} more seconds")
```

### HTTP Headers

```python
result = await limiter.acquire("user:123")

# Standard rate limit headers
headers = result.to_headers()
# {
#   "X-RateLimit-Limit": "100",
#   "X-RateLimit-Remaining": "99",
#   "X-RateLimit-Reset": "1711468800",
#   "Retry-After": "60"  (only when denied)
# }
```

### Statistics

```python
from ratelimit import StatsCollector

stats = StatsCollector()
stats.record(key="user:123", allowed=True, latency_ms=1.2)
stats.record(key="user:123", allowed=False, latency_ms=0.8)

summary = stats.get_summary("user:123")
# {"total": 2, "allowed": 1, "denied": 1, "avg_latency_ms": 1.0}
```

### Quota Manager

```python
from ratelimit import QuotaManager

quota = QuotaManager()
quota.set_quota("user:123", hourly=1000, daily=10000, monthly=100000)

result = quota.check("user:123")
print(f"Hourly: {result.hourly_remaining}, Daily: {result.daily_remaining}")
```

### Algorithm Recommendation

```python
from ratelimit.info import recommend_algorithm, list_algorithms

# Get recommendation based on requirements
info = recommend_algorithm(needs_burst=True, memory_constrained=True)
# => GCRA - best all-rounder for production

# List all algorithms with descriptions
for algo in list_algorithms():
    print(f"{algo.algorithm.value}: {algo.name} - {algo.best_for}")
```

---

## 🏗️ Architecture

```
ratelimit/
├── core.py              # RateLimiter, RateLimitResult, RateLimitConfig, Backend ABC
├── algorithms/
│   ├── token_bucket.py  # Token Bucket algorithm
│   ├── fixed_window.py  # Fixed Window counter
│   ├── sliding_window.py # Sliding Window (Log + Counter)
│   ├── leaky_bucket.py  # Leaky Bucket algorithm
│   ├── gcra.py          # Generic Cell Rate Algorithm
│   └── concurrency.py   # Concurrency Limiter
├── backends/
│   └── memory.py        # In-memory storage backend with TTL
├── factory.py           # create_limiter() one-line factory
├── decorator.py         # @rate_limit decorator (sync + async)
├── context.py           # RateLimitContext, ConcurrencyContext
├── groups.py            # RateLimitGroup, RateLimitChain
├── presets.py           # 10 pre-configured policies
├── circuit.py           # CircuitBreaker (closed/open/half-open)
├── penalty.py           # PenaltyTracker with exponential backoff
├── stats.py             # StatsCollector for per-key metrics
├── estimator.py         # RateEstimator for traffic prediction
├── quota.py             # QuotaManager (hourly/daily/weekly/monthly)
├── events.py            # Event system with async-compatible emitter
├── keys.py              # Key extraction (IP, user, API key, endpoint)
├── retry.py             # Retry strategies (fixed, exponential, retry-after)
├── snapshot.py          # State serialization and debugging
├── weighted.py          # Weighted limiter with priority reserves
├── info.py              # Algorithm introspection and recommender
├── cli.py               # CLI benchmarking tool
└── middleware/           # Framework middleware (extensible)
```

### Request Flow

```
    Request
      │
      ▼
┌──────────────┐
│  Key Extract │  (IP, user, API key, endpoint)
└──────┬───────┘
       │
       ▼
┌──────────────┐    ┌──────────────┐
│   Penalty    │───▶│   Circuit    │
│   Tracker    │    │   Breaker    │
└──────┬───────┘    └──────┬───────┘
       │                   │
       ▼                   ▼
┌──────────────────────────────┐
│     RateLimiter.acquire()    │
│  ┌────────────────────────┐  │
│  │   Algorithm Engine     │  │
│  │  (Token Bucket, GCRA,  │  │
│  │   Sliding Window, etc) │  │
│  └───────────┬────────────┘  │
│              │               │
│  ┌───────────▼────────────┐  │
│  │   Memory Backend       │  │
│  │   (get/set/increment)  │  │
│  └────────────────────────┘  │
└──────────────┬───────────────┘
               │
      ┌────────┼────────┐
      ▼                 ▼
  Allowed            Denied
      │                 │
      ▼                 ▼
┌──────────┐    ┌──────────────┐
│  Stats   │    │  Retry-After │
│ Record   │    │  + Headers   │
└──────────┘    └──────────────┘
```

---

## 📡 API Reference

### Core

```python
from ratelimit import (
    # Algorithms
    TokenBucket, FixedWindow, SlidingWindowLog,
    SlidingWindowCounter, LeakyBucket, GCRA, ConcurrencyLimiter,
    # Backend
    MemoryBackend,
    # Core types
    RateLimiter, RateLimitResult, RateLimitConfig, Algorithm,
    # Decorator
    rate_limit, RateLimitExceeded,
    # Composition
    RateLimitGroup, RateLimitChain,
    # Context managers
    RateLimitContext, ConcurrencyContext,
    # Utilities
    StatsCollector, CircuitBreaker, PenaltyTracker,
    RateEstimator, QuotaManager,
    # Factory & Presets
    create_limiter, get_preset, list_presets,
)
```

### RateLimitResult

```python
result = await limiter.acquire("key")

result.allowed        # bool: was the request allowed?
result.remaining      # int: remaining requests in window
result.limit          # int: total limit
result.reset_at       # float: Unix timestamp when limit resets
result.retry_after    # float: seconds to wait before retrying
result.reset_in       # float: seconds until reset (computed property)
result.to_headers()   # dict: standard HTTP rate limit headers
```

### RateLimiter Methods

```python
# Try to acquire (non-blocking)
result = await limiter.acquire("key", cost=1)

# Check without consuming
result = await limiter.peek("key")

# Reset a key
await limiter.reset("key")

# Wait and acquire (blocking with timeout)
result = await limiter.wait_and_acquire("key", cost=1, timeout=30.0)

# Event callbacks
@limiter.on_limited
def handle_limited(key, result):
    log.warning(f"Rate limited: {key}")

@limiter.on_allowed
def handle_allowed(key, result):
    stats.record(key)
```

---

## 🔧 How It Works

1. **Key Extraction** -- Each request is identified by a key (user ID, IP, API key, or custom)
2. **Algorithm Selection** -- The configured algorithm determines how requests are counted/tracked
3. **Backend Query** -- The algorithm queries the storage backend for current state
4. **Decision** -- The algorithm decides allow/deny based on its specific logic
5. **State Update** -- On allow, the backend state is updated (decrement tokens, add timestamp, etc.)
6. **Result** -- A `RateLimitResult` is returned with allowed/denied, remaining count, reset time, and retry-after
7. **Headers** -- Results can be converted to standard HTTP headers for API responses
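
Those seven steps fit in a few dozen lines. Here is a self-contained walk-through using a fixed-window engine with a plain dict as the backend (illustrative names and logic, not the library's internals):

```python
def extract_key(request: dict) -> str:
    """Step 1: identify the caller -- here by client IP, one of several strategies."""
    return f"ip:{request['ip']}"

class MiniLimiter:
    """Steps 2-6: a fixed-window engine over a dict 'backend'."""

    def __init__(self, limit: int, window: float):
        self.limit, self.window = limit, window
        self.backend: dict[tuple[str, int], int] = {}

    def acquire(self, key: str, now: float) -> dict:
        idx = int(now // self.window)
        count = self.backend.get((key, idx), 0) + 1      # step 3: query backend
        self.backend[(key, idx)] = count                 # step 5: update state
        return {                                         # step 6: the result
            "allowed": count <= self.limit,              # step 4: the decision
            "limit": self.limit,
            "remaining": max(0, self.limit - count),
            "reset_at": (idx + 1) * self.window,
        }

def to_headers(result: dict, now: float) -> dict:
    """Step 7: convert the result into standard HTTP rate-limit headers."""
    headers = {
        "X-RateLimit-Limit": str(result["limit"]),
        "X-RateLimit-Remaining": str(result["remaining"]),
        "X-RateLimit-Reset": str(int(result["reset_at"])),
    }
    if not result["allowed"]:
        headers["Retry-After"] = str(int(result["reset_at"] - now))
    return headers

key = extract_key({"ip": "203.0.113.7"})
limiter = MiniLimiter(limit=2, window=60)
for t in (0.0, 1.0, 2.0):
    print(to_headers(limiter.acquire(key, t), t))
```

The third request is denied and picks up a `Retry-After` header pointing at the window reset.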

---

## 🛠️ CLI Benchmarking

```bash
# Benchmark an algorithm
python -m ratelimit.cli bench -a token_bucket -r 100 -w 1 -n 1000

# Output:
# Algorithm:         token_bucket
# Limit:             100 / 1.0s
# Total requests:    1000
# Allowed:           100
# Denied:            900
# Elapsed:           0.0123s
# Throughput:        81300.81 req/s

# List all algorithms
python -m ratelimit.cli list
```

---

## ❓ Troubleshooting

### Which Algorithm Should I Use?

| Use Case | Recommended | Why |
|----------|-------------|-----|
| General API | Token Bucket or GCRA | Both handle bursts well with O(1) memory |
| Login protection | Sliding Window Log | Exact counting prevents boundary attacks |
| Webhook delivery | Leaky Bucket | Smooth constant-rate output |
| Simple counters | Fixed Window | Simplest to understand and debug |
| Connection pooling | Concurrency Limiter | Caps parallelism, not rate |
| Production at scale | GCRA | Battle-tested at Stripe/Shopify |

### Memory Concerns

- Token Bucket, Fixed Window, GCRA, Leaky Bucket: O(1) memory per key
- Sliding Window Log: O(n), where n is the number of requests in the window -- use the Counter variant for high volumes
- Concurrency Limiter: O(n), where n is the number of concurrent operations

### Async vs Sync

All APIs are async-first. For sync code, use the decorator, which handles the event loop automatically:

```python
@rate_limit(limiter)
def sync_function():  # Works with sync functions too
    pass
```

---

## 🧪 Testing

```bash
# Install dev dependencies
pip install -e ".[dev]"

# Run all 416 tests
pytest tests/ -v

# Run with coverage
pytest tests/ --cov=ratelimit --cov-report=term-missing

# Run specific algorithm tests
pytest tests/test_algorithms/test_token_bucket.py -v
pytest tests/test_algorithms/test_gcra.py -v

# Run integration tests
pytest tests/test_integration/ -v
```

---

## License

MIT
