Metadata-Version: 2.4
Name: failscope
Version: 0.3.0
Summary: AI-Powered Root Cause Analysis for pytest — Dual-Agent (Analyzer→Critic) pipeline that automatically triages test failures
Author-email: Yaniv <yaniv2809@gmail.com>
License: MIT
Project-URL: Homepage, https://github.com/Yaniv2809/failscope
Project-URL: Repository, https://github.com/Yaniv2809/failscope
Project-URL: Issues, https://github.com/Yaniv2809/failscope/issues
Keywords: pytest,testing,ai,rca,root-cause-analysis,flaky-tests,ci-cd,quality-assurance,automation
Classifier: Development Status :: 4 - Beta
Classifier: Framework :: Pytest
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Software Development :: Testing
Classifier: Topic :: Software Development :: Quality Assurance
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: pytest>=7.0
Requires-Dist: httpx>=0.24
Provides-Extra: dev
Requires-Dist: pytest-cov; extra == "dev"
Requires-Dist: ruff; extra == "dev"
Requires-Dist: mypy; extra == "dev"
Dynamic: license-file

# Failscope

**AI-Powered Root Cause Analysis for pytest**

Failscope is a zero-config pytest plugin that automatically triages test failures using a Dual-Agent AI pipeline. It deduplicates failures by fingerprint, analyses the unique ones in parallel with an LLM, and generates an HTML report your team can share instead of raw logs.

## Features

- **Dual-Agent RCA**: an Analyzer (creative, temp 0.4) drafts a diagnosis, then a Critic (deterministic, temp 0.0) cross-checks every claim against the raw evidence to reduce hallucinations
- **Parallel async analysis**: all unique failures are analysed concurrently, so CI is not blocked on serial API calls
- **Error fingerprinting**: clusters identical failures so the LLM sees only unique root causes
- **PII & secrets sanitization**: API keys, passwords, JWTs, and tokens are redacted before leaving your machine
- **HTML report**: self-contained single-file report, shareable in Slack or email
- **A–F stability scoring**: flakiness detection and trend analysis across the last 20 runs
- **Local LLM support**: run fully offline with Ollama (zero API cost, full data privacy)
- **Multi-provider**: Groq (free tier), OpenAI, Anthropic, or any Ollama model
- **Offline fallback**: rule-based analysis when no API key is available

## Quick Start

```bash
pip install failscope
```

### Cloud LLM (recommended for best results)

```bash
export GROQ_API_KEY=your-key   # free at console.groq.com
pytest --failscope
```

### Local LLM via Ollama (zero cost, full privacy)

```bash
ollama pull llama3.2
pytest --failscope --fs-provider=ollama
```

### Offline / no API key

```bash
pytest --failscope --fs-offline
```

## How It Works

```
Test Failure
    │
    ▼
┌──────────────────────┐
│   Log Preprocessor   │  Strip pytest noise, smart truncate
│                      │  (first 10% + last 90%), sanitize PII
└─────────┬────────────┘
          │
          ▼
┌──────────────────────┐
│  Error Fingerprinting│  SHA-256 hash per unique error class
│                      │  Deduplicates before reaching the LLM
└─────────┬────────────┘
          │
          ▼ (parallel — all unique failures at once)
┌──────────────────────┐    ┌──────────────────────┐
│  Analyzer (temp 0.4) │    │  Analyzer (temp 0.4) │  ...
│  [Actor Agent]       │    │  [Actor Agent]       │
└─────────┬────────────┘    └─────────┬────────────┘
          │                           │
          ▼                           ▼
┌──────────────────────┐    ┌──────────────────────┐
│  Critic (temp 0.0)   │    │  Critic (temp 0.0)   │
│  Validates claims    │    │  Validates claims    │
│  overrides hallucin. │    │  overrides hallucin. │
└─────────┬────────────┘    └─────────┬────────────┘
          └──────────┬────────────────┘
                     ▼
          HTML + JSON reports in .failscope/
```
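
The fingerprinting stage in the diagram can be pictured with a minimal sketch. This is not Failscope's actual implementation: the `fingerprint` and `cluster` helpers, the normalization rules, and the failure dictionaries are all hypothetical, and only the use of SHA-256 comes from the diagram.

```python
import hashlib
import re
from collections import defaultdict


def fingerprint(exc_type: str, message: str) -> str:
    """Hash the exception type plus a normalized message (assumed scheme)."""
    # Replace volatile details so equivalent errors hash identically.
    normalized = re.sub(r"0x[0-9a-fA-F]+", "<addr>", message)  # memory addresses
    normalized = re.sub(r"\d+", "<n>", normalized)  # ids, ports, line numbers
    payload = f"{exc_type}:{normalized.strip().lower()}"
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def cluster(failures: list[dict]) -> dict[str, list[str]]:
    """Group test ids by fingerprint so each cluster needs one analysis."""
    groups: dict[str, list[str]] = defaultdict(list)
    for failure in failures:
        key = fingerprint(failure["type"], failure["message"])
        groups[key].append(failure["test"])
    return dict(groups)


# Made-up failures for illustration only.
failures = [
    {"test": "test_auth.py::test_login", "type": "AssertionError",
     "message": "expected 200, got 401"},
    {"test": "test_auth.py::test_profile", "type": "AssertionError",
     "message": "expected 200, got 401"},
    {"test": "test_db.py::test_connect", "type": "ConnectionError",
     "message": "refused on port 5432"},
]
clusters = cluster(failures)
print(len(clusters))  # 2 unique fingerprints for 3 failures
```

Normalizing before hashing is the design choice that matters here: without it, two failures that differ only in a port number or object address would land in separate clusters and each cost an LLM call.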

> **Ollama note:** For local models (3B–8B params), Failscope automatically switches to a single-pass prompt to stay within context window limits.
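
To illustrate how the two-pass flow can run concurrently across failures, here is a rough sketch using `asyncio`. The `analyze`, `critique`, `triage_one`, and `triage_all` functions are hypothetical stand-ins that simulate latency instead of calling an LLM; the plugin's real internals may look quite different.

```python
import asyncio


async def analyze(log: str) -> dict:
    """Stand-in for the Analyzer call (temperature 0.4 in Failscope)."""
    await asyncio.sleep(0.01)  # simulates network latency
    return {"root_cause": f"draft diagnosis for: {log}", "confidence": 0.6}


async def critique(log: str, draft: dict) -> dict:
    """Stand-in for the Critic call (temperature 0.0)."""
    await asyncio.sleep(0.01)
    supported = log in draft["root_cause"]  # toy evidence check, not a real one
    return {**draft, "was_critic_override": not supported}


async def triage_one(log: str) -> dict:
    # The two stages are sequential for a single failure...
    draft = await analyze(log)
    return await critique(log, draft)


async def triage_all(logs: list[str]) -> list[dict]:
    # ...but independent failures run concurrently.
    return await asyncio.gather(*(triage_one(log) for log in logs))


results = asyncio.run(triage_all(["timeout in test_a", "401 in test_b"]))
print(len(results))  # one result per unique failure
```

With this shape, total wall-clock time is roughly that of the slowest single failure rather than the sum of all of them, which is the benefit the diagram's parallel branches describe.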

## CLI Options

| Flag | Default | Description |
|------|---------|-------------|
| `--failscope` | — | Enable Failscope analysis |
| `--fs-offline` | `false` | Rule-based analysis, no API key needed |
| `--fs-report` | `false` | Add stability report to output |
| `--fs-provider` | auto-detect | `groq` · `openai` · `anthropic` · `ollama` |
| `--fs-model` | provider default | Override model name (e.g. `llama3.1:8b`, `gpt-4o-mini`) |
| `--fs-max-log-size` | `80000` | Max log characters sent to LLM. Reduce for small local models |
| `--fs-output` | `.failscope/` | Output directory for reports |

## LLM Providers

| Provider | Default model | Env variable | Cost |
|----------|--------------|-------------|------|
| **Groq** (default) | `llama-3.3-70b-versatile` | `GROQ_API_KEY` | Free tier available |
| OpenAI | `gpt-4o` | `OPENAI_API_KEY` | Pay per token |
| Anthropic | `claude-haiku-4-5-20251001` | `ANTHROPIC_API_KEY` | Pay per token |
| **Ollama** | `llama3.2` | `OLLAMA_HOST` (optional) | Free, runs locally |

Auto-detection order: `OLLAMA_HOST` → `GROQ_API_KEY` → `OPENAI_API_KEY` → `ANTHROPIC_API_KEY`

Override the model without changing provider:

```bash
pytest --failscope --fs-provider=openai --fs-model=gpt-4o-mini
pytest --failscope --fs-provider=ollama --fs-model=mistral:7b
```

## Output

All reports are written to `.failscope/` (configurable with `--fs-output`).

### `rca_report.html` — interactive HTML report (always generated)

A self-contained file you can open in any browser or attach to a Slack message.

### `rca_report.json` — machine-readable RCA

```json
{
  "root_cause": "API endpoint /login returns 401 due to expired test token",
  "category": "assertion_failure",
  "severity": "high",
  "fix_suggestion": "Refresh auth token in conftest.py fixture before each test",
  "confidence": 0.87,
  "was_critic_override": false,
  "affected_tests": ["test_auth.py::test_login", "test_auth.py::test_profile"],
  "occurrence_count": 2
}
```
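
If you want to post-process the JSON report, for example to print a one-line summary in CI, something like the following may be a starting point. The `summarize` helper is hypothetical, and the sketch parses the sample entry above rather than a real file, because the top-level layout of `rca_report.json` (single object or list of entries) is not specified here.

```python
import json

# Sample entry copied from this README. Whether the real file wraps
# entries in a list is an assumption to verify against your own output.
SAMPLE = """
{
  "root_cause": "API endpoint /login returns 401 due to expired test token",
  "category": "assertion_failure",
  "severity": "high",
  "fix_suggestion": "Refresh auth token in conftest.py fixture before each test",
  "confidence": 0.87,
  "was_critic_override": false,
  "affected_tests": ["test_auth.py::test_login", "test_auth.py::test_profile"],
  "occurrence_count": 2
}
"""


def summarize(entry: dict) -> str:
    """Render one RCA entry as a single line (hypothetical helper)."""
    override = " (critic override)" if entry["was_critic_override"] else ""
    return (
        f"[{entry['severity'].upper()}] {entry['root_cause']} "
        f"| {len(entry['affected_tests'])} tests, "
        f"confidence {entry['confidence']:.0%}{override}"
    )


entry = json.loads(SAMPLE)
print(summarize(entry))
```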

### `stability_report.json` — A–F grading per test (requires `--fs-report`)

```json
{
  "test_name": "test_checkout.py::test_payment_flow",
  "grade": "C",
  "pass_rate": "72.0%",
  "flakiness_score": 58,
  "verdict": "Flaky",
  "trend": "degrading"
}
```
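
One detail worth noting in the sample is that `pass_rate` is a percentage string, not a number. A small sketch of consuming such an entry follows; `parse_pass_rate`, `needs_attention`, and the grade threshold are hypothetical and not part of Failscope.

```python
import json

# Sample entry copied from this README.
SAMPLE = """
{
  "test_name": "test_checkout.py::test_payment_flow",
  "grade": "C",
  "pass_rate": "72.0%",
  "flakiness_score": 58,
  "verdict": "Flaky",
  "trend": "degrading"
}
"""


def parse_pass_rate(value: str) -> float:
    """Convert a string such as '72.0%' into a fraction between 0 and 1."""
    return float(value.rstrip("%")) / 100


def needs_attention(entry: dict, worst_allowed_grade: str = "B") -> bool:
    """Flag tests graded worse than the threshold (letters sort A to F)."""
    return entry["grade"] > worst_allowed_grade


entry = json.loads(SAMPLE)
print(parse_pass_rate(entry["pass_rate"]))  # 0.72
print(needs_attention(entry))  # True
```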

## Security

Failscope sanitizes the following before sending any data to an LLM API:

- API keys and tokens (generic patterns, GitHub PATs, OpenAI/Anthropic/Stripe prefixes)
- Passwords and secrets in assignment context (`key=value`, `"key": "value"`, `key: value`)
- JWT tokens, Bearer tokens, AWS access keys
- Database connection strings containing credentials
- Email addresses and high-entropy hex strings

Redacted values appear as typed placeholders: `[REDACTED:api_key]`, `[REDACTED:password]`, etc.
A warning is printed to the terminal whenever a redaction occurs.
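
The general idea of typed-placeholder redaction can be sketched as follows. The patterns and the `sanitize` helper are illustrative assumptions, far simpler than what a production sanitizer needs, and the `jwt`, `aws_key`, and `email` labels are invented for this example; only `[REDACTED:password]` and `[REDACTED:api_key]` appear in this README.

```python
import re

# Illustrative patterns only; Failscope's real rules are not shown here.
PATTERNS = [
    ("jwt", re.compile(r"\beyJ[\w-]+\.[\w-]+\.[\w-]+")),
    ("aws_key", re.compile(r"\bAKIA[0-9A-Z]{16}\b")),
    ("password", re.compile(r"(?i)(?<=password=)\S+")),
    ("email", re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")),
]


def sanitize(text: str) -> tuple[str, int]:
    """Replace matches with typed placeholders; return text and match count."""
    total = 0
    for label, pattern in PATTERNS:
        text, count = pattern.subn(f"[REDACTED:{label}]", text)
        total += count
    return text, total


log = "login failed password=hunter2 user=bob@example.com key=AKIAABCDEFGHIJKLMNOP"
clean, redactions = sanitize(log)
print(clean)
print(redactions)  # 3
```

Returning the match count alongside the text makes it easy to emit a terminal warning whenever at least one redaction happened.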

## Environment Variables

| Variable | Description |
|----------|-------------|
| `GROQ_API_KEY` | Groq API key |
| `OPENAI_API_KEY` | OpenAI API key |
| `ANTHROPIC_API_KEY` | Anthropic API key |
| `OLLAMA_HOST` | Ollama server URL (default: `http://localhost:11434`) |
| `OLLAMA_MODEL` | Default Ollama model (default: `llama3.2`) |

## License

MIT
