Metadata-Version: 2.4
Name: sentinel-agent
Version: 0.1.7
Summary: AI-driven functional testing agent. Point it at a URL; it explores the app, generates a test plan, runs it, and reports findings. Web (Playwright) + visual regression + accessibility (axe-core) + REST API contract tests, all driven by your choice of LLM.
Project-URL: Homepage, https://thinknextsoftware.com
Project-URL: Documentation, https://github.com/Thinknext-Software-Solutions/Sentinel
Project-URL: Repository, https://github.com/Thinknext-Software-Solutions/Sentinel
Project-URL: Issues, https://github.com/Thinknext-Software-Solutions/Sentinel/issues
Project-URL: Changelog, https://github.com/Thinknext-Software-Solutions/Sentinel/releases
Author-email: ThinkNext Software Solutions <hello@thinknextsoftware.com>
License: MIT License
        
        Copyright (c) 2026 ThinkNext Software Solutions
        
        Permission is hereby granted, free of charge, to any person obtaining a copy
        of this software and associated documentation files (the "Software"), to deal
        in the Software without restriction, including without limitation the rights
        to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
        copies of the Software, and to permit persons to whom the Software is
        furnished to do so, subject to the following conditions:
        
        The above copyright notice and this permission notice shall be included in all
        copies or substantial portions of the Software.
        
        THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
        IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
        FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
        AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
        LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
        OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
        SOFTWARE.
License-File: LICENSE
Keywords: a11y,accessibility,agent,agentic-ai,ai,ai-agent,anthropic,axe-core,claude,e2e,gemini,llm,open-source,openai,playwright,qa,self-hosted,test-automation,testing,ui-testing,visual-regression,web-testing
Classifier: Development Status :: 4 - Beta
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Software Development
Classifier: Topic :: Software Development :: Quality Assurance
Classifier: Topic :: Software Development :: Testing
Requires-Python: >=3.11
Requires-Dist: click>=8.1.0
Requires-Dist: httpx>=0.27
Requires-Dist: pillow>=10.0
Requires-Dist: playwright>=1.40
Requires-Dist: pydantic>=2.0
Requires-Dist: pyyaml>=6.0
Provides-Extra: all
Requires-Dist: anthropic>=0.30; extra == 'all'
Requires-Dist: claude-agent-sdk>=0.1; extra == 'all'
Requires-Dist: google-generativeai>=0.7; extra == 'all'
Requires-Dist: openai>=1.30; extra == 'all'
Provides-Extra: anthropic
Requires-Dist: anthropic>=0.30; extra == 'anthropic'
Provides-Extra: claude-code
Requires-Dist: claude-agent-sdk>=0.1; extra == 'claude-code'
Provides-Extra: dev
Requires-Dist: black>=24.0; extra == 'dev'
Requires-Dist: build>=1.0; extra == 'dev'
Requires-Dist: mypy>=1.10; extra == 'dev'
Requires-Dist: pytest-asyncio>=0.23; extra == 'dev'
Requires-Dist: pytest-cov>=5.0; extra == 'dev'
Requires-Dist: pytest>=8.0; extra == 'dev'
Requires-Dist: ruff>=0.5; extra == 'dev'
Requires-Dist: twine>=5.0; extra == 'dev'
Provides-Extra: google
Requires-Dist: google-generativeai>=0.7; extra == 'google'
Provides-Extra: openai
Requires-Dist: openai>=1.30; extra == 'openai'
Description-Content-Type: text/markdown

# Sentinel

> Point it at a URL. It explores the app, generates a test plan, runs it, and reports findings. Covers web, visual regression, accessibility, and REST API contract tests; mobile is planned for a later release.

[![PyPI](https://img.shields.io/pypi/v/sentinel-agent.svg?label=PyPI&color=22d3ee)](https://pypi.org/project/sentinel-agent/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![Status](https://img.shields.io/badge/status-alpha-22d3ee.svg)](#roadmap)
[![Built by ThinkNext](https://img.shields.io/badge/built%20by-ThinkNext-22d3ee.svg)](https://thinknextsoftware.com)

> **Status**: alpha, live on PyPI as `sentinel-agent==0.1.7`. Web, visual regression, accessibility, and REST API contract testing ship today; mobile (React Native) is planned for a future release.
>
> **Install**: `pip install sentinel-agent` &middot; **Repo**: [GitHub](https://github.com/Thinknext-Software-Solutions/Sentinel) &middot; **Issues**: [file one](https://github.com/Thinknext-Software-Solutions/Sentinel/issues)

## What it does

Point Sentinel at a URL:

```bash
sentinel run https://your-app.com
```

In one command, the agent:

1. **Opens** the URL in headless Chromium
2. **Reads** the rendered HTML + visible text
3. **Asks the LLM** to generate a focused test plan (2-5 scenarios, 3-8 steps each)
4. **Runs** the plan in fresh browser sessions per scenario
5. **Captures** screenshots and compares against baselines (visual regression)
6. **Scans** each page state for WCAG 2.1 AA violations (axe-core)
7. **Reports** findings: failed scenarios, visual diffs, accessibility issues, with cost
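
The plan shape in step 3 (2-5 scenarios, 3-8 steps each) can be checked before anything runs. The sketch below is an illustration only: `Step`, `Scenario`, `Plan`, and `plan_shape_errors` are hypothetical names, not Sentinel's actual models, and plain dataclasses are used here to keep the example dependency-free.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    action: str       # hypothetical action name, e.g. "goto", "click", "screenshot"
    target: str = ""  # selector or URL; empty for steps that need none


@dataclass
class Scenario:
    name: str
    steps: list[Step] = field(default_factory=list)


@dataclass
class Plan:
    scenarios: list[Scenario] = field(default_factory=list)


def plan_shape_errors(plan: Plan) -> list[str]:
    """Return human-readable problems; an empty list means the shape is fine."""
    errors = []
    if not 2 <= len(plan.scenarios) <= 5:
        errors.append(f"expected 2-5 scenarios, got {len(plan.scenarios)}")
    for scenario in plan.scenarios:
        if not 3 <= len(scenario.steps) <= 8:
            errors.append(
                f"scenario {scenario.name!r}: expected 3-8 steps, "
                f"got {len(scenario.steps)}"
            )
    return errors
```

Rejecting an out-of-range plan before the runner starts avoids spending browser time on output the LLM got structurally wrong.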

## Why this exists

The same teams that need [Cascade](https://cascadeagent.dev) (meeting-to-PR) and [Relay](https://github.com/Thinknext-Software-Solutions/Relay) (issue-to-PR) need a way to verify that the PRs those agents produce actually work. Hand-writing Playwright tests for every feature is the bottleneck. Sentinel removes the bottleneck: generate tests with the same LLM that writes the code.

Sentinel is fully standalone. It carries its own LLM-client layer and config so it does not depend on any other ThinkNext package at runtime.

## Install

```bash
# Core install + the LLM provider you want:
pip install 'sentinel-agent[anthropic]'        # Anthropic Claude
pip install 'sentinel-agent[openai]'           # OpenAI
pip install 'sentinel-agent[google]'           # Google Gemini
pip install 'sentinel-agent[claude-code]'      # Local Claude Code subscription, no API key
pip install 'sentinel-agent[all]'              # All providers

# One-time: install the Chromium binary Playwright needs
playwright install chromium
```

## Configure

```bash
# Set up an LLM provider. Credentials live at ~/.config/sentinel/config.yaml.
sentinel configure llm anthropic --key sk-ant-xxx --set-default

# Or, if you have Claude Code installed locally (no API key needed):
sentinel configure llm claude_code --set-default
```

For a project-local config (recommended: it lets you set the viewport, baseline directory, and accessibility thresholds), run:

```bash
sentinel init
```

This scaffolds `sentinel.yaml` with sensible defaults you can edit.

## Run

```bash
sentinel run https://cascadeagent.dev

# Output (truncated):
#   ✓  3/3 scenarios passed, 0 visual diff(s), 2 a11y violation(s)
#
#   ✓  Homepage loads and primary CTA is visible  (1.42s)
#   ✓  Get-started link navigates to /getting-started/  (1.83s)
#   ✓  Docs sidebar contains all expected sections  (2.10s)
#
#   Accessibility violations:
#     [moderate] color-contrast: Elements must meet minimum color contrast...
#       sample: .text-slate-500
#       (3 node(s) affected)
#     [minor] image-alt: Images must have alt text...
#       sample: img.hero-illustration
#       (1 node(s) affected)
#
#   cost:    $0.04 (5,210 in / 980 out tokens)
```

## What ships in v0.1.0

| Capability | Module |
|---|---|
| Web testing via Playwright | `sentinel.browser`, `sentinel.runner` |
| LLM-driven test plan generation | `sentinel.planner` |
| Self-healing tests (LLM re-plan on failed step + retry once) | `sentinel.planner.regenerate_step` |
| Multi-page exploration (up to 4 same-origin links) | `sentinel.agent` |
| Visual regression (PIL pixel diff) | `sentinel.visual` |
| Accessibility scan (axe-core 4.10, WCAG 2.1 AA) | `sentinel.a11y` |
| REST API contract testing (OpenAPI + URL-probe modes) | `sentinel.api_*` |
| Multi-LLM (Anthropic / OpenAI / Google / Claude Code / Ollama) | `sentinel.llm` |
| Mobile (React Native via Detox) | planned for a future release |
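
As a rough illustration of the "up to 4 same-origin links" rule in the table above, the sketch below filters a list of `href` values using only the standard library. The function name `same_origin_links`, the fragment stripping, and the de-duplication are assumptions made for this example; `sentinel.agent` may select links differently.

```python
from urllib.parse import urljoin, urlsplit


def same_origin_links(start_url: str, hrefs: list[str], limit: int = 4) -> list[str]:
    """Resolve hrefs against start_url and keep up to `limit` unique same-origin pages."""
    start = urlsplit(start_url)
    origin = (start.scheme, start.netloc)
    seen = {start_url.split("#")[0]}  # never revisit the start page itself
    picked = []
    for href in hrefs:
        absolute = urljoin(start_url, href).split("#")[0]  # drop fragments
        parts = urlsplit(absolute)
        if (parts.scheme, parts.netloc) != origin or absolute in seen:
            continue  # other origin, non-HTTP scheme, or already queued
        seen.add(absolute)
        picked.append(absolute)
        if len(picked) == limit:
            break
    return picked
```

Comparing scheme and host together means `mailto:` links and links to other domains are skipped without special cases.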

## How it differs from existing tools

| | Playwright Codegen | Pytest + Playwright | Percy / Chromatic | Sentinel |
|---|---|---|---|---|
| Generates tests from a URL | partial (record/replay) | ❌ | ❌ | ✅ |
| Self-hosted | ✅ | ✅ | ❌ | ✅ |
| Bring your own LLM | n/a | n/a | n/a | ✅ |
| Visual regression | ❌ | ❌ | ✅ | ✅ |
| Accessibility scan | ❌ | partial (plugin) | ❌ | ✅ |
| Open source | ✅ | ✅ | ❌ | ✅ |

Sentinel is for teams who want test coverage without spending the engineering hours to author it. The trade-off is that AI-generated tests have failure modes that hand-written tests do not (e.g. an LLM picks a fragile selector). Self-healing is the answer to that: when a step fails, the LLM re-plans that step and Sentinel retries it once.
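
The re-plan-and-retry-once idea behind self-healing can be sketched as follows. This is a minimal illustration under stated assumptions: the dict step shape, `run_with_self_heal`, and both callables are hypothetical, and `regenerate` merely stands in for `sentinel.planner.regenerate_step`, whose real signature is not shown in this README.

```python
from typing import Callable

Step = dict  # hypothetical step shape, e.g. {"action": "click", "selector": "#buy"}


def run_with_self_heal(
    step: Step,
    execute: Callable[[Step], None],
    regenerate: Callable[[Step, Exception], Step],
) -> tuple[bool, Step]:
    """Run a step; on failure ask for one replacement step and retry exactly once."""
    try:
        execute(step)
        return True, step
    except Exception as first_error:
        healed = regenerate(step, first_error)  # stand-in for the LLM re-plan call
    try:
        execute(healed)
        return True, healed
    except Exception:
        return False, healed  # no second retry: report the failure


# Usage with fake callables: the first selector fails, the regenerated one works.
attempts = []


def fake_execute(step: Step) -> None:
    attempts.append(step["selector"])
    if step["selector"] == "#old-button":
        raise RuntimeError("selector not found")


def fake_regenerate(step: Step, error: Exception) -> Step:
    return {**step, "selector": "#new-button"}


passed, final_step = run_with_self_heal(
    {"action": "click", "selector": "#old-button"}, fake_execute, fake_regenerate
)
```

Capping the loop at one retry bounds the extra LLM cost per failing step and keeps a genuinely broken page from being masked by repeated re-planning.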

## Configuration

`sentinel.yaml` (after `sentinel init`):

```yaml
version: 1

agent:
  provider: anthropic
  model: claude-opus-4-7
  temperature: 0.2

browser:
  headless: true
  viewport_width: 1280
  viewport_height: 720
  timeout_ms: 30000

visual:
  enabled: true
  baseline_dir: sentinel-baselines
  diff_threshold_percent: 0.5

a11y:
  enabled: true
  fail_on:
    - critical
    - serious
```
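
To make the two thresholds above concrete, here is one way they could gate a run. Everything in this sketch is an assumption for illustration: the function names are hypothetical, the diff is computed as the share of differing pixels over plain Python sequences (Sentinel itself uses PIL), and the strict `>` comparison is a guess at how `diff_threshold_percent` is applied. The `impact` key matches the field axe-core reports for each violation.

```python
Pixel = tuple[int, int, int]


def diff_percent(baseline: list[Pixel], current: list[Pixel]) -> float:
    """Percentage of pixels that differ between two equal-sized pixel sequences."""
    if len(baseline) != len(current):
        raise ValueError("images must have the same dimensions")
    if not baseline:
        return 0.0
    changed = sum(1 for a, b in zip(baseline, current) if a != b)
    return 100.0 * changed / len(baseline)


def visual_regressed(
    baseline: list[Pixel], current: list[Pixel], diff_threshold_percent: float = 0.5
) -> bool:
    """True when more pixels changed than the configured threshold allows."""
    return diff_percent(baseline, current) > diff_threshold_percent


def blocking_violations(violations: list[dict], fail_on: list[str]) -> list[dict]:
    """Keep only violations whose axe-core impact level is listed in fail_on."""
    return [v for v in violations if v.get("impact") in fail_on]
```

With `fail_on` set to `critical` and `serious`, the `moderate` and `minor` findings in the sample output under "Run" would be reported without failing the run.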

## Architecture

```
   sentinel run <url>
          │
          ▼
   ┌──────────────┐
   │ explore page │  Playwright opens URL, grabs HTML + visible text
   └──────┬───────┘
          │
          ▼
   ┌──────────────┐
   │   planner    │  LLM produces TestPlan (2-5 scenarios, 3-8 steps each)
   └──────┬───────┘
          │
          ▼
   ┌──────────────┐
   │    runner    │  Fresh browser session per scenario
   │              │  Each step is one Playwright action
   │              │  screenshot steps → visual regression check
   │              │  a11y_scan steps → axe-core injection
   └──────┬───────┘
          │
          ▼
   ┌────────────────┐
   │ SentinelReport │  Scenarios + visual diffs + a11y violations + cost
   └────────────────┘
```
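
The final box in the diagram aggregates scenarios, visual diffs, accessibility violations, and cost. A minimal sketch of such a container is shown below; `ScenarioResult`, `Report`, and their fields are hypothetical and are not the actual `SentinelReport` class. The summary format mirrors the first line of the sample output under "Run".

```python
from dataclasses import dataclass, field


@dataclass
class ScenarioResult:
    name: str
    passed: bool
    duration_s: float


@dataclass
class Report:
    scenarios: list[ScenarioResult] = field(default_factory=list)
    visual_diffs: int = 0
    a11y_violations: int = 0
    cost_usd: float = 0.0

    def summary(self) -> str:
        """One-line overview in the style of the CLI output."""
        passed = sum(1 for s in self.scenarios if s.passed)
        return (
            f"{passed}/{len(self.scenarios)} scenarios passed, "
            f"{self.visual_diffs} visual diff(s), "
            f"{self.a11y_violations} a11y violation(s)"
        )
```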

## Roadmap

| Version | Status | Highlights |
|---|---|---|
| **v0.1.0a1** | Shipped (2026-05-26) | Web testing, visual regression, accessibility |
| v0.1.0a2 | Planned | Multi-page exploration, self-healing tests, API contract testing |
| v0.1.0a3 | Planned | Mobile (React Native via Detox or Maestro) |
| v0.2 | Q4 2026 | CI integration (GitHub Actions / GitLab CI / Bitbucket / Azure), parallel execution |
| v1.0 | Mid-2027 | Stable API, full coverage of web + API + mobile + visual + a11y, baselined |

## License

MIT. See [LICENSE](LICENSE).

## About

Built and maintained by [ThinkNext Software Solutions](https://thinknextsoftware.com), alongside our other open-source projects [Cascade](https://cascadeagent.dev) (meeting-to-PR) and [Relay](https://github.com/Thinknext-Software-Solutions/Relay) (issue-to-PR).

Follow along: [@ThinkNextHQ](https://twitter.com/ThinkNextHQ) &middot; [LinkedIn](https://linkedin.com/company/thinknextsoftware) &middot; [Blog](https://thinknextsoftware.com/blog/)
