Metadata-Version: 2.4
Name: logomesh
Version: 0.1.2
Summary: Reproduce Sentry crashes as failing pytest tests — sandbox execution, verified evidence
Project-URL: Repository, https://github.com/LogoMesh/LogoMesh-Dev
License: MIT
Keywords: crash-reproduction,debugging,pytest,sentry,testing
Classifier: Development Status :: 3 - Alpha
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Software Development :: Testing
Requires-Python: >=3.11
Requires-Dist: docker>=7.0.0
Requires-Dist: fastapi>=0.100.0
Requires-Dist: httpx>=0.24.0
Requires-Dist: hypothesis>=6.0.0
Requires-Dist: langchain-core>=0.3
Requires-Dist: langchain-openai>=0.2
Requires-Dist: langgraph>=0.2
Requires-Dist: openai>=2.8.1
Requires-Dist: pydantic>=2.11.9
Requires-Dist: pyjwt[crypto]>=2.8.0
Requires-Dist: python-dotenv>=1.1.1
Requires-Dist: rank-bm25>=0.2.2
Requires-Dist: sentry-sdk[fastapi]>=2.0.0
Requires-Dist: uvicorn>=0.35.0
Provides-Extra: anthropic
Requires-Dist: anthropic>=0.40.0; extra == 'anthropic'
Provides-Extra: capture
Provides-Extra: cloud
Requires-Dist: supabase>=2.30.0; extra == 'cloud'
Provides-Extra: embed
Requires-Dist: numpy>=1.24.0; extra == 'embed'
Requires-Dist: voyageai>=0.2.0; extra == 'embed'
Description-Content-Type: text/markdown

# logomesh

Paste a Sentry URL. Get a failing pytest back.

```bash
pip install logomesh
logomesh repro https://sentry.io/organizations/your-org/issues/12345678/
```

---

## What it does

Takes the innermost in-app frame from a Sentry crash and the locals Sentry captured at crash time, builds a pytest that calls that function with those exact values, runs it in a Docker sandbox, and reports whether the crash still reproduces on your current branch.

The test is synthesized deterministically — no LLM touches the test bytes. LLM reasoning is used for context recovery and strategy (advisory only, never in the evidence path).

If the sandbox raises the same exception type Sentry captured → reproduced, you get the test. If not → explicit refusal with a structured reason.
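
To make "synthesized deterministically" concrete, here is a minimal sketch of rendering a test from captured locals with plain string formatting. The helper name `render_test`, the `billing.calc.rate` target, and the assumption that captured locals map one-to-one onto keyword arguments are all hypothetical; logomesh's real synthesis is not shown in this README.

```python
def render_test(module: str, function: str, captured_locals: dict) -> str:
    """Render pytest source that replays one call with the captured values.

    Hypothetical helper: only repr() and string formatting are involved,
    so identical inputs always yield byte-identical test source.
    """
    # Sort the names so dict ordering cannot change the output.
    kwargs = ", ".join(
        f"{name}={value!r}" for name, value in sorted(captured_locals.items())
    )
    lines = [
        f"from {module} import {function}",
        "",
        "",
        f"def test_repro_{function}():",
        "    # Replays the captured call; the test fails while the bug is present.",
        f"    {function}({kwargs})",
        "",
    ]
    return "\n".join(lines)


print(render_test("billing.calc", "rate", {"total": 100, "count": 0}))
```

This only works for values whose `repr()` is valid Python (ints, strings, lists, dicts, `None`); anything richer would need its own serialization step.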

---

## Requirements

- Python 3.11+
- Docker running locally
- A Sentry auth token with `event:read` scope (`Settings → API Keys`)
- An OpenAI API key (used for advisory context recovery, not test synthesis)

```bash
export SENTRY_AUTH_TOKEN=sntryu_...
export OPENAI_API_KEY=sk-...
```

Or drop them in a `.env` file in your project root.

---

## Usage

```bash
# reproduce a crash
logomesh repro https://sentry.io/organizations/your-org/issues/12345678/

# skip LLM entirely — deterministic frame-locals replay only
logomesh repro <url> --no-llm

# emit a sealed audit artifact (SOC2 CC7.3 / PCI DSS 12.10.5)
logomesh repro <url> --artifact

# open a GitHub draft PR with the failing test attached
logomesh repro <url> --draft-pr

# machine-readable JSON output
logomesh repro <url> --json

# point at a local repo (default: cwd)
logomesh repro <url> --repo /path/to/repo

# set wall-clock timeout (default: 60s)
logomesh repro <url> --timeout 120
```

Supported Sentry URL formats:
```
https://sentry.io/organizations/{org}/issues/{id}/
https://sentry.io/issues/{id}/
https://{org}.sentry.io/issues/{id}/
```

---

## Example output

Reproduced:
```
  ✓ Reproduced: ZeroDivisionError at billing/calc.py:18
     division by zero
     rate = total / count
```

Not reproduced:
```
  ✗ Cannot reproduce ValueError at checkout.py:42
     The synthesized test passed against the current branch.
     Either the bug is fixed, or the captured locals are insufficient.
```

---

## How it works

The orchestrator is a LangGraph supervisor graph with 11 tools. Only `deterministic_repro` can produce the evidence that ends up in the artifact and PR; the evidence path is contract-enforced.

**The 11 tools:**

| Tool | What it does |
|---|---|
| `fetch_sentry_event` | Fetches event + frame locals from Sentry API. PII redaction runs here before anything else sees the data. |
| `deterministic_repro` | Builds pytest from frame locals, runs it in Docker sandbox. Only tool that produces sealed evidence. Zero LLM. |
| `critic_validate` | Scores fidelity (0.0–1.0). Checks same exception type, same function, locals match. Min 0.9 to ship. Up to 3 attempts. |
| `context_reconstructor` | Called when repro falls short. Handles: `no_repro`, `async_state`, `db_state`, `globals`, `c_ext`, `missing_fixture`. |
| `hypothesis_invariant_suggester` | Suggests Hypothesis property tests for the crashed function. Advisory only — never touches artifact. |
| `web_search` | Searches PyPI / GitHub / StackOverflow / CVE / general. Used to recover source paths and dep advisories. |
| `rag_search` | Searches codebase / past runs / docs / memory. |
| `prepare_environment` | Builds a dependency snapshot. Pins exact prod versions from `event.modules` if available. CVE lookup per package. |
| `introspect_repo` | RAG window into the repo — imports, decorators, class shapes, manifests, entrypoints. Helps the agent decide how to bootstrap the sandbox. |
| `build_artifact` | Seals the artifact: SHA-256 stamp, sandbox image digest, `llm_in_evidence_path: false` attestation, SOC2/PCI control mapping. |
| `create_draft_pr` | Opens a GitHub draft PR with the failing test attached. |

**Scientific context engine:** When repro falls short, a deterministic Observe → Hypothesize → Experiment → Verify loop probes breadcrumbs, RAG, and PyPI to explain why the crash can't reproduce (DB state, async runtime, missing globals, dep drift). It never produces test code, only structured notes that direct the supervisor toward a different deterministic env state.

**Source resolution:** Fuzzy resolver tries absolute path, repo-relative, leading-component-strip, and `rglob`-basename matches — so prod paths like `/app/src/billing/x.py` find dev `src/billing/x.py`. On first failure, the supervisor searches for the path and retries with hints. After two attempts, it refuses to ship and flags for human review.

**Verified exception match:** The sandbox exception type must match the Sentry-captured exception type exactly. Anything else refuses to ship as evidence.
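
One way to read "exactly" is that a subclass or parent class does not count as a match. The sketch below assumes the captured type arrives as a string name, as in Sentry's event payload; `exception_matches` is a hypothetical helper.

```python
def exception_matches(captured_type: str, raised: BaseException) -> bool:
    """True only if the raised exception's own class name equals the captured name."""
    cls = type(raised)
    qualified = f"{cls.__module__}.{cls.__qualname__}"
    # Compare names of the concrete class only; isinstance() would also
    # accept parent classes, which is looser than an exact match.
    return captured_type in (cls.__qualname__, qualified)
```

Under this rule a sandbox `ZeroDivisionError` does not satisfy a captured `ArithmeticError`, even though one inherits from the other.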

---

## Docker sandbox

- 128 MB RAM cap, 50% CPU, 50 PIDs
- Airgapped (no network)
- `nobody` user, read-only rootfs
- 15s default per-test timeout
- `PYTHONHASHSEED` retry: if the test passes when a failure is expected, it is rerun under several hash seeds
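
For orientation, the limits above map fairly directly onto keyword arguments of `containers.run` in the Docker SDK for Python. The function name `sandbox_run_kwargs` and the seed handling are assumptions of this sketch; the per-test timeout is not covered because it is enforced outside these arguments.

```python
def sandbox_run_kwargs(hash_seed: int = 0) -> dict:
    """Keyword arguments for docker-py's containers.run() mirroring the limits above."""
    return {
        "mem_limit": "128m",            # 128 MB RAM cap
        "nano_cpus": 500_000_000,       # 50% of one CPU, in units of 1e-9 CPUs
        "pids_limit": 50,               # at most 50 processes/threads
        "network_disabled": True,       # airgapped
        "user": "nobody",
        "read_only": True,              # read-only root filesystem
        "environment": {"TZ": "UTC", "PYTHONHASHSEED": str(hash_seed)},
        "remove": True,                 # clean up the container afterwards
    }


# One configuration per seed, for the retry-with-multiple-seeds step.
retry_configs = [sandbox_run_kwargs(seed) for seed in (0, 1, 2)]
```

With the Docker SDK installed, these could be passed as `docker.from_env().containers.run(image, command, **sandbox_run_kwargs())`, where the image and command are left to the caller.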

---

## PII redaction

Runs before any LLM call and before any byte lands in the artifact:
- PAN (Luhn-validated credit card numbers)
- SSN, email, JWT, API keys
- Field-name scrubbing (e.g. `password`, `token`, `secret`, `card_number`)
- Request headers and query strings
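
A rough sketch of two of these checks, Luhn-validated card numbers and field-name scrubbing, plus a simple email pattern. The regular expressions, placeholder strings, and function names are illustrative assumptions and far less thorough than a production redactor.

```python
import re

_CARD_CANDIDATE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")  # 13 to 19 digits
_EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
_SENSITIVE_KEYS = {"password", "token", "secret", "card_number"}


def luhn_valid(digits: str) -> bool:
    """Luhn checksum: double every second digit from the right, sum, mod 10."""
    total = 0
    for index, char in enumerate(reversed(digits)):
        digit = int(char)
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def redact_text(text: str) -> str:
    def mask_card(match: re.Match) -> str:
        digits = re.sub(r"\D", "", match.group())
        # Only digit runs that pass the Luhn check are treated as card numbers.
        return "[REDACTED_PAN]" if luhn_valid(digits) else match.group()

    return _EMAIL.sub("[REDACTED_EMAIL]", _CARD_CANDIDATE.sub(mask_card, text))


def redact_locals(frame_locals: dict) -> dict:
    """Scrub by field name first, then by content for string values."""
    cleaned = {}
    for name, value in frame_locals.items():
        if name.lower() in _SENSITIVE_KEYS:
            cleaned[name] = "[REDACTED]"
        elif isinstance(value, str):
            cleaned[name] = redact_text(value)
        else:
            cleaned[name] = value
    return cleaned
```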

---

## Audit trail

All LLM reasoning, tool calls, critic scores, and strategy outcomes are recorded in an `AuditSession`. The artifact preamble explicitly attests: `llm_in_evidence_path: false`. Control mapping on every artifact: `SOC2-CC7.3`, `SOC2-CC7.4`, `PCI-DSS-4.0-12.10.5`.

---

## What reproduces well

- Input validation bugs
- `NoneType` mismatches
- Decimal / type coercion errors
- Off-by-one, ordering, idempotency issues
- Anything where the inputs that crashed the call are captured in the Sentry frame

## What doesn't

- Race conditions (frame locals don't capture thread interleaving)
- Bugs that depend on live DB rows or Redis state not in the frame
- C extension crashes
- Distributed failures spanning services
- Timezone/DST edge cases (sandbox runs `TZ=UTC`)

When it can't reproduce cleanly, it says so with a structured reason. It never guesses.

---

## Sentry setup

Frame locals need to be enabled:

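`Project Settings → SDK Setup → Enable "Send default PII"`, or in your SDK config (the Python SDK sends frame locals as long as `include_local_variables` is left at its default of `True`):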

```python
import sentry_sdk

sentry_sdk.init(dsn="...", send_default_pii=True)
```

---

## License

MIT
