Metadata-Version: 2.4
Name: raceguard
Version: 0.2.1
Summary: A heuristic-based zero-overhead thread race condition detector for Python.
Author: Chukwunwike Obodo
License: MIT License
        
        Copyright (c) 2026 Chukwunwike Obodo
        
        Permission is hereby granted, free of charge, to any person obtaining a copy
        of this software and associated documentation files (the "Software"), to deal
        in the Software without restriction, including without limitation the rights
        to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
        copies of the Software, and to permit persons to whom the Software is
        furnished to do so, subject to the following conditions:
        
        The above copyright notice and this permission notice shall be included in all
        copies or substantial portions of the Software.
        
        THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
        IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
        FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
        AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
        LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
        OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
        SOFTWARE.
        
Project-URL: Homepage, https://github.com/raceguard/raceguard
Project-URL: Repository, https://github.com/raceguard/raceguard
Project-URL: Issues, https://github.com/raceguard/raceguard/issues
Project-URL: Changelog, https://github.com/raceguard/raceguard/blob/main/CHANGELOG.md
Keywords: debugging,threads,race-condition,concurrency,lock,threading,detector,heuristic
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Topic :: Software Development :: Debuggers
Classifier: Topic :: Software Development :: Testing
Classifier: Typing :: Typed
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Provides-Extra: dev
Requires-Dist: pytest>=7.0.0; extra == "dev"
Requires-Dist: pytest-asyncio>=0.23.0; extra == "dev"
Dynamic: license-file

# Raceguard

[![PyPI version](https://badge.fury.io/py/raceguard.svg)](https://badge.fury.io/py/raceguard)
[![Python Versions](https://img.shields.io/pypi/pyversions/raceguard.svg)](https://pypi.org/project/raceguard/)
[![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](https://opensource.org/licenses/MIT)
[![Typing: Strict](https://img.shields.io/badge/mypy-strict-success.svg)](https://mypy.readthedocs.io/en/stable/)

**Detect real data races in your code before they become production bugs.**

**[View Live Showcase & Docs ↗](https://chukwunwike.github.io/raceguard/)**

Raceguard is a runtime concurrency safety tool that observes your program execution and flags unsafe memory access patterns across threads and async tasks, without requiring compiler support or complex setup.

---

## The Problem

Concurrency bugs are some of the hardest issues to detect and fix.

They are:
*   **Non-deterministic**: Bugs appear randomly and are hard to pin down.
*   **Invisible**: They often hide until high-traffic production environments.
*   **Corrupting**: They cause silent data corruption that is painful to debug.

Most developers only discover race conditions after something breaks. Existing tools are often too complex, slow, or invasive for everyday workflows.

---

## What Raceguard Does

Raceguard watches your shared objects as they are accessed and detects:
*   **Concurrent writes** to the same memory space.
*   **Read/Write conflicts** across threads or async flows.
*   **Unsafe shared state access** without proper synchronization.

It surfaces these issues immediately with clear, actionable output.

---

## Quick Example

### Problematic code

```python
import threading

# A shared list that multiple threads will update
counter = []

def increment():
    for _ in range(1000):
        counter.append(1)

threads = [threading.Thread(target=increment) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

### Protected with Raceguard

```python
from raceguard import protect, locked

# Just wrap your shared object
counter = protect([])

def increment():
    for _ in range(1000):
        # Access safely via context manager
        with locked(counter):
            counter.append(1)

# ... rest of the code ...
```

If you omit the `with locked(counter):` block, Raceguard raises a `RaceConditionError` with a full report as soon as it detects a conflicting access.

---

## Why Raceguard Is Different

Raceguard is designed for **real developer workflows**, not just theory.

*   **High Performance**: Uses lazy frame capture, avoiding expensive stack inspection overhead until absolutely necessary.
*   **Flexible Detection**: Native support for `raise`, `warn`, and `log` modes to fit your testing strategy.
*   **Zero Production Overhead**: Set `RACEGUARD_ENABLED=0` to completely bypass the proxy in live environments.
*   **Async-Aware**: Seamlessly tracks races between mixed `asyncio` tasks and standard threads.
*   **Transactional Consistency**: Uses `AtomicGroup` to enforce logic invariants across multiple objects, preventing "Semantic Races."
*   **Deep Protection**: Automatically proxies nested mutable structures, including full interception of Python's dunder methods and context managers.
*   **Rich Reports**: Tells you exactly which threads accessed the object, at what time, and where to fix it.

---

## How It Works (Simple Mental Model)

Think of Raceguard as a **Synchronization Observer**.

1.  **Wrap**: You wrap a shared object with `protect()`.
2.  **Track**: It records the identity of every thread or task that touches the object.
3.  **Validate**: It checks if a lock is held when the same memory is accessed.
4.  **Report**: If two threads touch the same data within the race window (10 ms by default) without a lock, it flags the conflict.
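
The four steps above can be sketched with the standard library alone. This is a toy illustration of the idea, not Raceguard's implementation: `ObservedList`, `RaceDetected`, and the `locked()` method are hypothetical names, and only `append` is tracked.

```python
import threading
import time
from contextlib import contextmanager


class RaceDetected(Exception):
    """Hypothetical error type used only by this sketch."""


class ObservedList:
    """Toy observer: flags a lockless write that closely follows another thread's write."""

    def __init__(self, items=(), window=0.01):
        self._items = list(items)      # 1. Wrap: keep the real list private
        self._window = window          # seconds; mirrors the documented 10 ms default
        self._lock = threading.RLock()
        self._holder = None            # ident of the thread inside locked(), if any
        self._last_write = None        # (thread ident, monotonic timestamp)

    @contextmanager
    def locked(self):
        with self._lock:
            outer, self._holder = self._holder, threading.get_ident()
            try:
                yield self
            finally:
                self._holder = outer

    def append(self, item):
        me, now = threading.get_ident(), time.monotonic()
        synchronized = self._holder == me   # 3. Validate: does the caller hold the lock?
        if not synchronized and self._last_write is not None:
            writer, when = self._last_write
            if writer != me and now - when <= self._window:
                # 4. Report: another thread wrote recently and no lock is held
                raise RaceDetected(
                    f"thread {me} wrote {now - when:.4f}s after thread {writer}"
                )
        self._last_write = (me, now)        # 2. Track: remember who wrote, and when
        self._items.append(item)

    def snapshot(self):
        """Return a copy of the wrapped list for inspection."""
        return list(self._items)
```

A write made inside `with obs.locked():` passes the check, while the same write made outside it shortly after another thread's write raises `RaceDetected`. The bookkeeping in this sketch is not itself atomic, which a real detector would have to handle.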

---

## Installation

```bash
pip install raceguard
```

---

## Deployment & Usage

Typical usage patterns:

*   **Development** — Run with `configure(mode="raise")` (or `"warn"`, `"log"`) to catch the obvious cases fast with immediate feedback during local testing.
*   **Continuous Integration** — Use `configure(strict=True)` in CI for correctness assertions. Heuristic mode (`race_window`) depends on timing, which varies under CPU load. **Strict mode is the right tool for CI**: it flags any lockless write from a different thread, regardless of elapsed time.
*   **Production** — Set `RACEGUARD_ENABLED=0` for a true zero-cost passthrough of your original objects.

> **Heuristic vs. Strict — the key distinction**: The default `race_window` of 10ms catches overlapping accesses quickly, but in a highly loaded system two logically racy writes could be far apart in wall time and slip through. Strict mode removes this ambiguity entirely — if no lock was used, it's a race.
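
The distinction can be written as a small decision function. This is a sketch of the rule as described above, with hypothetical names (`is_flagged` and its parameters); it is not taken from Raceguard's source.

```python
def is_flagged(same_thread, lock_held, elapsed, *, strict=False, race_window=0.01):
    """Decide whether an access that follows another party's write counts as a race.

    elapsed is the time in seconds since the previous write.
    """
    if same_thread or lock_held:
        return False                 # same thread, or properly synchronized
    if strict:
        return True                  # strict: any lockless cross-thread access
    return elapsed <= race_window    # heuristic: only accesses inside the window


# Two lockless writes from different threads, 0.5 s apart:
print(is_flagged(False, False, 0.5))               # False: outside the 10 ms window
print(is_flagged(False, False, 0.5, strict=True))  # True: strict ignores elapsed time
# The same two writes, 2 ms apart:
print(is_flagged(False, False, 0.002))             # True: inside the window
```

The first call is the case that slips through under load: the writes are logically racy but far apart in wall time.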

---

## Usage Patterns

```python
import threading
from raceguard import protect, with_lock, locked

# 1. Protect a shared mutable object
shared_list = protect([])

# 2. Access unsafely (raises RaceConditionError if a race is detected)
def unsafe_worker():
    shared_list.append(1)

# 3. Access Safely via Context Manager
def safe_worker_ctx():
    with locked(shared_list):
        shared_list.append(1)

# 4. Access Safely via Decorator
@with_lock(shared_list)
def safe_worker_dec():
    shared_list.append(1)

# 5. Lock multiple proxies atomically (consistent ordering prevents deadlocks)
a = protect([])
b = protect({})
with locked(a, b):
    a.append(1)
    b["x"] = 1

# 6. Group objects for transactional safety (Automatic semantic race detection)
from raceguard import AtomicGroup
group = AtomicGroup(a, b)

# This is safe
with locked(group):
    a.append(2)

# This triggers a RaceConditionError if another thread holds the group lock
# even if the individual lock for 'a' is free!
_ = a[0]
```
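
Pattern 5 relies on consistent lock ordering. The sketch below shows one common way to get it with plain `threading` locks: always acquire in a single global order. Sorting by `id()` is an assumption made for illustration, and `acquire_all` is a hypothetical helper, not part of Raceguard.

```python
import threading
from contextlib import ExitStack, contextmanager


@contextmanager
def acquire_all(*locks):
    """Acquire every lock in one global order (by id) and release them all on exit."""
    unique = {id(lock): lock for lock in locks}   # drop duplicate locks
    with ExitStack() as stack:
        for key in sorted(unique):
            stack.enter_context(unique[key])      # same order in every thread
        yield


lock_a, lock_b = threading.Lock(), threading.Lock()

# Callers may list the locks in any order; the acquisition order is identical,
# so two threads can never each hold one lock while waiting for the other.
with acquire_all(lock_a, lock_b):
    pass
with acquire_all(lock_b, lock_a):
    pass
```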

### Supported Object Types

Raceguard can wrap any mutable Python object:

```python
protect([])            # list
protect({})            # dict
protect(set())         # set
protect(bytearray())   # bytearray
protect(MyClass())     # any custom object
protect(Value(0))      # scalar via Value wrapper
```

### `protect()` is idempotent

Wrapping an already-protected object returns the **same proxy** — no double-wrapping:

```python
from raceguard import protect

my_list = []
p1 = protect(my_list)
p2 = protect(p1)   # same proxy as p1
assert p1 is p2    # True
```

### Concurrent Reads Are Safe

Two threads reading simultaneously do **not** trigger a race. Only write/write or read/write conflicts are flagged:

```python
shared = protect({"val": 42})

# Both threads reading at the same time — no RaceConditionError
def reader():
    _ = shared["val"]
```

---

## Advanced Features

### Automatic Nested Protection
Raceguard automatically protects child objects. You don't need to manually wrap every nested dictionary or list in your state tree.

```python
from raceguard import protect

# Wrap the parent object once
state = protect({"users": ["Alice", "Bob"]})

# The child list is automatically protected when accessed!
state["users"].append("Charlie")
```
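
One way to picture nested protection is a proxy that wraps mutable children at the moment they are read. The class below is a minimal sketch under that assumption; `NestedProxy` is a hypothetical name, it does no race tracking, and Raceguard's own proxy may work differently (for example by caching child proxies).

```python
class NestedProxy:
    """Toy proxy that wraps mutable children on access."""

    _WRAPPED = (list, dict, set, bytearray)

    def __init__(self, target):
        self._target = target

    def _wrap(self, value):
        # Mutable containers get their own proxy; everything else passes through
        return NestedProxy(value) if isinstance(value, self._WRAPPED) else value

    def __getitem__(self, key):
        return self._wrap(self._target[key])

    def __setitem__(self, key, value):
        self._target[key] = value

    def __getattr__(self, name):
        # Called only for names not found on the proxy itself, e.g. append()
        return getattr(self._target, name)


raw = {"users": ["Alice", "Bob"]}
state = NestedProxy(raw)
state["users"].append("Charlie")   # the child list was wrapped on access
```

Because the child proxy forwards to the original list, the append is visible through `raw["users"]` as well.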

### Iterator Race Detection

Raceguard catches writes that happen while another thread is mid-iteration:

```python
import time
from raceguard import protect

shared = protect([1, 2, 3])

def slow_reader():
    for item in shared:
        time.sleep(0.05)   # still iterating...

def writer():
    time.sleep(0.02)
    shared.append(4)       # RaceConditionError — write during iteration!
```
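
The idea behind this check can be sketched without Raceguard: record which threads are currently iterating, and refuse a write from any other thread while that record is non-empty. `IterGuardedList` and `IterationRace` are hypothetical names for illustration only.

```python
import threading


class IterationRace(Exception):
    """Hypothetical error type used only by this sketch."""


class IterGuardedList:
    """Toy list that rejects appends while another thread is iterating over it."""

    def __init__(self, items=()):
        self._items = list(items)
        self._meta = threading.Lock()   # guards the bookkeeping only
        self._readers = {}              # thread ident -> active iteration count

    def __iter__(self):
        me = threading.get_ident()
        with self._meta:
            self._readers[me] = self._readers.get(me, 0) + 1
        try:
            yield from self._items
        finally:
            # Runs when the loop finishes or the generator is closed
            with self._meta:
                self._readers[me] -= 1
                if not self._readers[me]:
                    del self._readers[me]

    def append(self, item):
        me = threading.get_ident()
        with self._meta:
            others = [ident for ident in self._readers if ident != me]
        if others:
            raise IterationRace(f"write while thread(s) {others} are iterating")
        self._items.append(item)
```

Unlike the timing window, this check does not depend on how long the iteration takes: the write is rejected for as long as the reader's loop is still open.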

### Actionable Error Reports
When a race condition occurs, Raceguard tells you exactly what went wrong, including the specific Thread IDs and Async Task names involved.

```text
RaceConditionError: Concurrent access detected on object <list> at 0x...
Thread-1 (ID: 12345) wrote to object at 10:05:01.001
Thread-2 (ID: 67890) accessed object at 10:05:01.003
Location: mymodule.py:42 in worker()
Missing synchronization lock during access.
```

### Asyncio & Threading Support
Raceguard safely tracks state even in hybrid architectures where standard threads and `asyncio` event loops are running simultaneously and modifying the same objects.

### Strict Mode — Catching Temporally Distant Unsynchronized Writes

By default, Raceguard flags accesses within a time window. With `strict=True`, **any lockless write from a different thread is flagged**, even if it happens much later:

```python
import time

from raceguard import protect, configure, Value

configure(strict=True)
shared = protect(Value("initial"))

def thread1():
    shared.value = "written by T1"  # First write

def thread2():
    time.sleep(0.5)                 # Waits well beyond the race window...
    shared.value = "written by T2"  # Still caught! No lock was used.
```

> **Tip**: In strict mode, use `reset(shared)` to manually clear access history when threads coordinate via a non-lock mechanism like a `queue.Queue`.

```python
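from raceguard import reset

# `shared` is the protected Value from the example above;
# `my_queue` is a queue.Queue that hands results from stage 1 to stage 2.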

def stage2():
    result = my_queue.get()   # synchronized via Queue
    reset(shared)             # tell Raceguard this is a fresh access point
    shared.value = result     # safe — no false positive
```

### AtomicGroups — Enforcing Logical Transactions

When multiple objects must stay in sync (e.g., Account A and B), individual locks are not enough. If Thread 1 is moving money from A to B, Thread 2 should not be allowed to read *either* A or B until the transaction is complete.

`AtomicGroup` creates a shared safety boundary:

```python
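from raceguard import protect, AtomicGroup, locked

class Account:
    """Minimal example class; any object with a mutable balance works."""
    def __init__(self, balance):
        self.balance = balance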

acc_a = protect(Account(100))
acc_b = protect(Account(0))
bank = AtomicGroup(acc_a, acc_b)

def transfer(amount):
    with locked(bank):
        acc_a.balance -= amount
        acc_b.balance += amount

def audit():
    # Attempting to read acc_a while transfer() is running 
    # will trigger a RaceConditionError!
    total = acc_a.balance + acc_b.balance
```
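
The shared safety boundary can be pictured as one lock plus a record of which thread currently holds it, consulted on every access to any member. The sketch below illustrates that idea with the standard library; `GroupGuard`, `GuardedAccount`, and `GroupRace` are hypothetical names and do not reflect how `AtomicGroup` is implemented.

```python
import threading
from contextlib import contextmanager


class GroupRace(Exception):
    """Hypothetical error type used only by this sketch."""


class GroupGuard:
    """One lock shared by several objects, with the holder's identity recorded."""

    def __init__(self):
        self._lock = threading.RLock()
        self._holder = None   # ident of the thread inside locked(), if any

    @contextmanager
    def locked(self):
        with self._lock:
            outer, self._holder = self._holder, threading.get_ident()
            try:
                yield
            finally:
                self._holder = outer

    def check(self):
        holder = self._holder
        if holder is not None and holder != threading.get_ident():
            raise GroupRace("group is locked by another thread")


class GuardedAccount:
    """Account whose balance consults the group guard on every read and write."""

    def __init__(self, balance, guard):
        self._balance = balance
        self._guard = guard

    @property
    def balance(self):
        self._guard.check()
        return self._balance

    @balance.setter
    def balance(self, amount):
        self._guard.check()
        self._balance = amount
```

With both accounts sharing one `GroupGuard`, a read of either balance from another thread fails while a transfer is inside `with guard.locked():`, even though that thread never touched the account being read.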

### Cross-Platform Verified
Fully supported and tested across:
*   **Windows**
*   **Linux**
*   **macOS**

---


## Environment Variables

Configure Raceguard without modifying code. Useful for CI/CD pipelines and deployment scripts.

| Variable | Default | Description |
|---|---|---|
| `RACEGUARD_ENABLED` | `1` | Set to `0` to completely disable detection (zero overhead). |
| `RACEGUARD_MODE` | `raise` | Detection mode: `raise`, `warn`, or `log`. |
| `RACEGUARD_STRICT` | `0` | Set to `1` to flag any unsynchronized access regardless of timing. |
| `RACEGUARD_WINDOW` | `0.01` | Time window (seconds) within which concurrent accesses are flagged. |
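
For reference, the table maps onto a settings dictionary roughly as follows. This is a sketch that uses the variable names and defaults listed above; `load_settings` is a hypothetical helper, and the exact parsing rules Raceguard applies (for example, which strings count as "disabled") are an assumption.

```python
import os

VALID_MODES = {"raise", "warn", "log"}


def load_settings(environ=None):
    """Read the RACEGUARD_* variables, applying the documented defaults."""
    env = os.environ if environ is None else environ
    mode = env.get("RACEGUARD_MODE", "raise")
    if mode not in VALID_MODES:
        raise ValueError(f"RACEGUARD_MODE must be one of {sorted(VALID_MODES)}")
    return {
        "enabled": env.get("RACEGUARD_ENABLED", "1") != "0",
        "mode": mode,
        "strict": env.get("RACEGUARD_STRICT", "0") == "1",
        "race_window": float(env.get("RACEGUARD_WINDOW", "0.01")),
    }


# A CI job might export RACEGUARD_STRICT=1 and RACEGUARD_MODE=warn:
ci = load_settings({"RACEGUARD_STRICT": "1", "RACEGUARD_MODE": "warn"})
```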

---

## Full `configure()` Reference

```python
from raceguard import configure

configure(
    enabled=True,        # Toggle detection on/off at runtime
    mode="raise",        # "raise" | "warn" | "log"
    strict=False,        # Bypass timing heuristic, flag all unsynchronized access
    race_window=0.01,    # Seconds — the sensitivity window for detecting races
    max_warnings=1000,   # Cap collected warnings in "warn" mode to prevent flooding
)
```

---

## Protecting Scalar Values

Use `Value()` to protect simple types like `int`, `float`, or `str` that cannot be proxied directly.

`Value` exposes three access patterns — use whichever fits your style:

```python
from raceguard import protect, Value, locked

counter = protect(Value(0))

def worker():
    with locked(counter):
        counter.value += 1   # attribute access
        counter.set(5)       # setter method
        x = counter.get()    # getter method
```
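
The reason scalars need a holder is that `int`, `float`, and `str` are immutable: `x += 1` rebinds a name instead of changing a shared object, so there is nothing for a proxy to intercept. The sketch below shows the difference with a hypothetical `Box` class that stands in for the role `Value` plays; it is not Raceguard's `Value`.

```python
class Box:
    """Hypothetical mutable holder for an immutable scalar."""

    def __init__(self, value):
        self.value = value

    def get(self):
        return self.value

    def set(self, value):
        self.value = value


count = 0
alias = count
count += 1          # rebinds the name `count`; `alias` still refers to 0

box = Box(0)
shared = box
box.value += 1      # mutates the single Box that both names refer to

print(alias, shared.value)   # 0 1
```

Once the scalar lives inside one shared object, every read and write goes through that object's attributes, which is where a proxy can observe them.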

---

## Utility Functions

```python
from raceguard import (
    get_config,       # Returns the current configuration dict
    clear_warnings,   # Returns and clears all collected RaceConditionWarning objects
    warnings,         # Direct access to the list of collected warnings
    reset,            # Resets library state (useful between test runs)
    unbind,           # Unwraps a proxy to retrieve the raw underlying object
)

# Example: Inspect warnings after a test run
from raceguard import configure, clear_warnings

configure(mode="warn")
# ... run concurrent code ...
collected = clear_warnings()
for w in collected:
    print(w)

# Example: Get the raw object for identity checks or serialization
from raceguard import protect, unbind

data = protect({"key": "value"})
raw = unbind(data)  # Returns the original dict
```

---

## Dev-Mode Overhead

In **production**, there are two ways to disable Raceguard. Both act as a kill switch: `protect()` skips proxy creation and returns your raw object directly, so a disabled Raceguard adds **zero overhead** to each access:

1. **Outside your code (Recommended):** Run your app with the environment variable `RACEGUARD_ENABLED=0`.
2. **Inside your code:** Call `configure(enabled=False)` at the very start of your application. *(Note: This must be called **before** any objects are wrapped. It does not retroactively remove the proxy from objects that are already protected.)*
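
The behaviour described above, including the note about objects that are already wrapped, can be sketched in a few lines. `SketchProxy` and `protect_sketch` are hypothetical stand-ins, and passing `enabled` as an argument is a simplification for illustration.

```python
class SketchProxy:
    """Hypothetical stand-in for a tracking proxy."""

    def __init__(self, target):
        self.target = target


def protect_sketch(obj, enabled=True):
    if not enabled:
        return obj                  # kill switch: hand back the object untouched
    if isinstance(obj, SketchProxy):
        return obj                  # already wrapped: return the same proxy
    return SketchProxy(obj)


data = [1, 2, 3]

# Disabled before wrapping: the caller gets the original list, no proxy involved
assert protect_sketch(data, enabled=False) is data

# Wrapped first, disabled later: the existing proxy is not removed
early = protect_sketch(data)
assert protect_sketch(early, enabled=False) is early
```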

In **development mode**, every attribute access on a protected object passes through the proxy layer, which performs a thread-identity check and a timestamp comparison. This is intentionally lightweight, but it is not free.

As a rough guide:

| Access frequency | Expected impact |
|---|---|
| Occasional (locks, shared status flags) | Negligible — use freely |
| Moderate (per-request shared state) | Minimal — order of microseconds per access |
| Tight hot loop (millions/sec) | Measurable — consider wrapping only during test runs, not benchmarks |

Lazy frame capture means **stack traces are only resolved when a race is actually detected**, keeping the common (no-race) path as fast as possible. If you are profiling performance of concurrent code, run with `RACEGUARD_ENABLED=0` to eliminate all proxy overhead.

---

## Known Limitations & Blindspots

Raceguard is effective at finding in-memory thread races, but some blind spots are inherent to proxying objects at the Python level:

1.  **Direct Memory Manipulation (Ghost Writes)**: Raceguard relies on Python's `__setattr__` and `__getattribute__` hooks. It **cannot see** memory changes made via:
    *   **C-Extensions**: Libraries like `numpy` or `lxml` that write directly to C-level pointers.
    *   **Buffer Access**: Using `ctypes` or `mmap` to modify memory addresses directly.
2.  **Unprotected Semantic Invariants**: While `AtomicGroup` helps detect races on multi-object transactions, it only works if the developer correctly groups the relevant objects. Logic races on hidden or non-proxied state (like a global internal C-level counter) remain invisible.
3.  **OS External State (TOCTOU)**: It cannot detect races between the Python process and the **Operating System**. For example, a "Time-of-Check to Time-of-Use" race on the file system (checking a file exists before opening it) is outside Raceguard's scope.
4.  **Inter-Process Contention**: Raceguard's tracking is local to the current process. It cannot detect races between two completely separate program instances (e.g., two different scripts racing for a database record).
5.  **Per-Interpreter Shared Memory (Python 3.12+)**: With PEP 684, multiple interpreters can have their own GIL. If they share a raw memory buffer, they can have true parallel data races that bypass the interpreter-local proxy.
6.  **Intentional Observer Blindspots**: To prevent recursion and "Heisenbugs," the library intentionally ignores metadata calls like `repr()`, `str()`, `id()`, and `type()`.
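
The first limitation is easy to reproduce with the standard library. In this sketch a `bytearray` subclass observes item assignment through `__setitem__`, yet a write made through a `memoryview` goes straight to the underlying buffer and is never seen. `WatchedBytes` is a hypothetical class used only to demonstrate the effect.

```python
class WatchedBytes(bytearray):
    """bytearray that records every index written through __setitem__."""

    def __init__(self, *args):
        super().__init__(*args)
        self.writes = []

    def __setitem__(self, index, value):
        self.writes.append(index)
        super().__setitem__(index, value)


buf = WatchedBytes(b"\x00\x00")

buf[0] = 1                  # goes through __setitem__: observed
memoryview(buf)[1] = 2      # goes through the buffer protocol: not observed

print(buf.writes)           # [0]
print(bytes(buf))           # b'\x01\x02'
```

Both bytes changed, but only one write was recorded. Any hook-based observer has the same gap for code that writes through the buffer protocol or raw pointers.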

We recommend using Raceguard as a **Heuristic Safety Net** for application logic. For hardware-level or kernel-level verification, consider low-level tools like **ThreadSanitizer**, **Helgrind**, or **eBPF**.

---

## Author

Developed by **Chukwunwike Obodo**.

---

## License

This project is licensed under the MIT License.
