Metadata-Version: 2.4
Name: async-runtime-auditor
Version: 0.1.0
Summary: Runtime audit CLI for detecting asyncio degradation and threadpool saturation.
Author: Priyanshu
License: MIT
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Topic :: Software Development :: Quality Assurance
Classifier: Topic :: System :: Monitoring
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: requests>=2.31.0
Requires-Dist: PyYAML>=6.0.1
Dynamic: license-file


# Async Runtime Auditor

A lightweight CI/CD runtime audit CLI for Python `asyncio` applications.

Standard APM metrics (like P99 HTTP latency) frequently hide severe async runtime degradation. Synchronous blocking calls and executor queue amplification can stall the event loop while external HTTP metrics still appear healthy.

**Async Runtime Auditor** is a heuristic-driven operational tool designed to run in staging environments and deployment pipelines. It queries a telemetry backend (such as Prometheus), evaluates runtime state against configurable thresholds, and fails the build before blocking code reaches production.

---

# The Problem: Telemetry Asymmetry

In many production Python systems:

```text
healthy HTTP latency != healthy runtime state
```

Standard infrastructure metrics frequently fail to detect:

- synchronous I/O executed on the main async thread  
- threadpool saturation and executor queue wait times  
- asynchronous scheduler starvation  
- hidden queue amplification  

This tool exposes these blind spots by evaluating event-loop and executor telemetry directly.

---

# Installation

Requires Python 3.9+

```bash
pip install async-runtime-auditor
```

For local development:

```bash
pip install -e .
```

from the repository root.

---

# Quick Start

## 1. Initialize Configuration

Generate a local `metrics.yaml` configuration file:

```bash
async-auditor --init
```

---

## 2. Run an Audit

Run the tool against a staging or local telemetry backend:

```bash
async-auditor --target http://localhost:9090
```

---

## 3. CI/CD Pipeline Gating

Run with strict exit semantics to fail the pipeline if critical degradation is detected:

```bash
async-auditor --target http://localhost:9090 --fail-on-critical
```

The CLI exits with code `1` if runtime state is:

- `CRITICAL`  
- `COLLAPSE_RISK`  

---

# Configuration (`metrics.yaml`)

The auditor is intentionally decoupled from specific telemetry naming conventions.

Prometheus queries are mapped into the heuristic engine through `metrics.yaml`.

```yaml
target: "http://localhost:9090"

queries:
  p99_latency: "histogram_quantile(0.99, rate(http_request_duration_seconds_bucket[5m]))"
  event_loop_lag: "max_over_time(asyncio_event_loop_lag_seconds[15m])"
  blocking_events_total: "blocking_events_total"
  blocking_duration_avg: "rate(blocking_duration_seconds_sum[15m]) / rate(blocking_duration_seconds_count[15m])"
  peak_active_requests: "max_over_time(active_requests[15m])"
  threadpool_queue_wait: "rate(threadpool_queue_wait_seconds_sum[15m]) / rate(threadpool_queue_wait_seconds_count[15m])"
  threadpool_tasks_started: "threadpool_tasks_started_total"
  threadpool_tasks_completed: "threadpool_tasks_completed_total"

thresholds:
  health_score_collapse: 2500
  health_score_critical: 1200
  health_score_degraded: 400
  max_event_loop_lag_s: 0.5
  max_blocking_duration_s: 1.0
  max_queue_wait_s: 1.0
  max_threadpool_backlog: 3
  max_active_requests: 150
```

---

# Output Formats

The CLI supports:

- human-readable terminal output  
- structured JSON output for CI pipelines  

---

## Standard Text Output

```bash
async-auditor --output-format text
```

---

## JSON Output

```bash
async-auditor --output-format json
```

A structured report is also written to:

```text
audit_results.json
```

inside the current working directory.

---

# CI/CD Integration Example

A reference GitHub Actions workflow is included in:

```text
examples/github-actions-gating.yml
```

This demonstrates how to use the auditor as a deployment quality gate inside pull request pipelines.

---

# Operational Model

The auditor computes a deterministic runtime health score using:

- event-loop lag  
- blocking duration  
- queue wait amplification  
- active request pressure  
- threadpool backlog behavior  

The scoring model is intentionally threshold-driven and explainable.

This project does not use:

- machine learning classification  
- anomaly detection systems  
- probabilistic runtime forecasting  

---

# Intended Usage

This tool is designed primarily for:

- deployment pipeline gating  
- staging environment validation  
- runtime regression detection  
- async infrastructure diagnostics  

It is not intended to replace full observability platforms.

---

# Scope

This project is intentionally narrow in scope.

## Included

- heuristic-based async runtime failure classification  
- CI/CD exit-code semantics  
- configurable operational thresholds  
- Prometheus-compatible telemetry querying  
- runtime degradation detection  

---

## Not Included

- distributed tracing infrastructure  
- observability data storage systems  
- automated remediation  
- Kubernetes orchestration  
- OpenTelemetry collectors  
- AI-driven diagnosis systems  

---

# Design Constraints

The project intentionally prioritizes:

- operational clarity  
- deterministic output  
- explainable runtime scoring  
- low dependency overhead  
- simple deployment integration  

over:

- infrastructure breadth  
- platform extensibility  
- distributed orchestration  
- autonomous remediation  

---

# License

MIT License
