Metadata-Version: 2.4
Name: yare
Version: 0.1.0
Summary: Yare — SQL-aware firewall for AI agents. Don't give your agent psql. Give it Yare.
Project-URL: Homepage, https://github.com/gauthierpiarrette/yare
Project-URL: Repository, https://github.com/gauthierpiarrette/yare
Project-URL: Issues, https://github.com/gauthierpiarrette/yare/issues
Author: Yare contributors
License-Expression: Apache-2.0
License-File: LICENSE
License-File: NOTICE
Keywords: agent,ai,claude,cursor,guardrails,llm,mcp,security,sql,warehouse
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: System Administrators
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Database
Classifier: Topic :: Security
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.10
Requires-Dist: click>=8.1
Requires-Dist: duckdb>=0.10
Requires-Dist: mcp>=1.0
Requires-Dist: pydantic>=2.0
Requires-Dist: pyyaml>=6.0
Requires-Dist: rich>=13.0
Requires-Dist: sqlglot>=23.0
Provides-Extra: all
Requires-Dist: google-cloud-bigquery>=3.0; extra == 'all'
Requires-Dist: litellm>=1.50; extra == 'all'
Requires-Dist: psycopg[binary]>=3.1; extra == 'all'
Requires-Dist: snowflake-connector-python>=3.0; extra == 'all'
Provides-Extra: bigquery
Requires-Dist: google-cloud-bigquery>=3.0; extra == 'bigquery'
Provides-Extra: dev
Requires-Dist: pytest-cov>=4.0; extra == 'dev'
Requires-Dist: pytest>=7.0; extra == 'dev'
Requires-Dist: ruff>=0.5; extra == 'dev'
Provides-Extra: litellm
Requires-Dist: litellm>=1.50; extra == 'litellm'
Provides-Extra: postgres
Requires-Dist: psycopg[binary]>=3.1; extra == 'postgres'
Provides-Extra: redshift
Requires-Dist: psycopg[binary]>=3.1; extra == 'redshift'
Provides-Extra: snowflake
Requires-Dist: snowflake-connector-python>=3.0; extra == 'snowflake'
Description-Content-Type: text/markdown

# Yare

**SQL-aware firewall for AI agents.**

Don't give your agent psql. Give it Yare.

Yare sits between an AI agent (Claude Code, Cursor, internal agents) and your data warehouse. It parses the SQL the agent generates, joins it to your schema metadata, applies policy at the query+result level, and returns a structured decision the agent can recover from.

Yare composes with protocol-layer agent firewalls like [Pipelock](https://github.com/luckyPipewrench/pipelock) and the [Microsoft agent-governance-toolkit](https://github.com/microsoft/agent-governance-toolkit). Those scan tool calls for network and prompt-injection risks. Yare adds the data-semantic layer: it understands SQL, your schema, and what's safe to put into model context.

## Quickstart

```bash
pip install yare
yare quickstart
```

Seeds a local DuckDB with realistic-looking warehouse data, scans the schema, flags PII columns, and walks an agent through five SQL queries — three get blocked with structured retry guidance; two execute cleanly. No external warehouse required.

## Install

```bash
pip install yare                        # core + DuckDB
pip install "yare[postgres]"            # + Postgres
pip install "yare[redshift]"            # + Redshift (Postgres protocol)
pip install "yare[snowflake]"           # + Snowflake
pip install "yare[bigquery]"            # + BigQuery
pip install "yare[litellm]"             # + LiteLLM guardrail
pip install "yare[all]"                 # everything
```

## Supported warehouses

| Warehouse | Status | Connect command |
|---|---|---|
| DuckDB | tested | `yare connect duckdb ./warehouse.duckdb` |
| Postgres | tested (server-side cursor, batched scan) | `yare connect postgres "$DATABASE_URL"` |
| Redshift | tested (internal + Spectrum external schemas) | `yare connect redshift "$REDSHIFT_DSN"` |
| Snowflake | works, untested in CI | `yare connect snowflake "$SNOWFLAKE_DSN" -o warehouse=COMPUTE_WH` |
| BigQuery | works, untested in CI | `yare connect bigquery my-gcp-project` |

Connection strings can reference env vars with `${VAR}` — the value is resolved at runtime, the state file stays free of secrets.

## Real usage

```bash
yare init                                       # writes yare.yml policy
yare connect postgres '${DATABASE_URL}'         # validates the connection
yare scan                                       # loads information_schema, tags PII
yare run "SELECT email FROM users"              # check + execute if allowed
yare check "SELECT email FROM users"            # check only, don't execute
```

## Plug into Claude Code / Cursor / Claude Desktop (MCP)

Yare ships as an MCP server. Once installed, your agent gets four tools: `run_sql`, `list_tables`, `describe_table`, `explain_policy`.

One-shot installer:

```bash
yare mcp install claude-desktop     # ~/Library/Application Support/Claude/...
yare mcp install claude-code        # .claude/mcp.json (project)
yare mcp install cursor             # ~/.cursor/mcp.json
yare mcp install cursor-project     # .cursor/mcp.json (project)
yare mcp install                    # list all known clients
```

The installer merges Yare into any existing `mcpServers` config and backs up the original.

Or paste manually into the client's config:

```json
{
  "mcpServers": {
    "yare": { "command": "yare", "args": ["mcp"] }
  }
}
```

Then in your agent: *"Show me revenue trends for the last week."* The agent calls `list_tables` → `describe_table` → builds a query → calls `run_sql`. Safe queries get rows back. Unsafe queries get a structured block + safe_alternative, and the agent retries.

## Library usage

```python
from yare import check
from yare.warehouse.duckdb import DuckDBAdapter

adapter = DuckDBAdapter("./warehouse.duckdb")
schema = adapter.load_schema()
schema.tag_pii(["email", "phone", "ssn"])

decision = check(
    "SELECT email FROM analytics.users LIMIT 10",
    intent="investigate signup dropoff",
    agent="claude_code",
    schema=schema,
)

if decision.allowed:
    result = adapter.execute("SELECT email FROM analytics.users LIMIT 10")
    print(result.columns, result.rows)
else:
    print(decision.reasons)
    if decision.safe_alternative and decision.safe_alternative.query:
        print("Retry with:", decision.safe_alternative.query)
```

## LiteLLM integration

Drop-in for `litellm.completion(...)`:

```python
from yare.integrations.litellm import safe_completion, LiteLLMConfig

resp = safe_completion(
    model="claude-opus-4-7",
    messages=...,
    tools=[{"function": {"name": "run_sql", ...}}],
    yare_config=LiteLLMConfig(),
)
```

LiteLLM Proxy guardrail (`config.yaml`):

```yaml
guardrails:
  - guardrail_name: yare
    litellm_params:
      guardrail: yare.integrations.litellm.YareGuardrail
      mode: post_call
```

## What Yare does

- Parses SQL across dialects (sqlglot). Detects CTE-wrapped writes, multi-statement bypasses, and vendor commands (VACUUM, COPY, REINDEX) — not just bare `DROP TABLE`.
- Loads schema from `information_schema` (DuckDB, Postgres, Redshift, Snowflake, BigQuery). Postgres/Redshift use a server-side cursor + batched fetch so 10k+ tables don't OOM.
- Auto-tags PII columns from name patterns; engine uses those tags to evaluate joined tables, not just projected ones.
- Expands `SELECT *` against the loaded schema before deciding.
- Applies YAML policy: PII column patterns, blocked schemas, required LIMIT, no `SELECT *`, approval rules.
- Returns a `Decision` with verdict, risk, reasons, and a structured `safe_alternative` (text explanation + best-effort rewritten SQL).
- Executes the SQL against the connected warehouse when allowed; caps result rows; coerces Decimal/datetime/bytes to JSON-safe values for MCP transport.
- **Result firewall**: scans returned rows for PII values (email, phone, SSN, Luhn-validated credit cards, IPv4, IBAN). Masks, blocks, or warns per policy.
- Terminal approval flow for `approval_required` verdicts.
- JSONL audit log of every action — agent identity, intent, SQL, decision, reasons, what reached LLM context. Size-based rotation (default 100MB × 5 files).
- MCP server (`yare mcp`) exposing `run_sql`, `list_tables`, `describe_table`, `explain_policy`. `yare mcp install <client>` auto-wires it into Claude Code/Cursor/Claude Desktop.
- LiteLLM guardrail + drop-in `safe_completion()` wrapper.

## Out of scope

Yare is intentionally narrow. Adjacent concerns are handled by other tools or planned for later:

- **Agent process isolation / credential stripping** — out of scope; partner with a sandboxing layer.
- **EXPLAIN-plan cost estimation** — planned.
- **dbt manifest ingestion** — planned.
- **Hosted dashboard / SaaS control plane** — out of scope; Yare is OSS, self-hosted.

## Project layout

```
src/yare/
  analyzer.py        # SQL parsing → QueryFacts
  schema.py          # Column / Table / Schema data structures
  policy.py          # YAML → Policy
  engine.py          # facts + policy + schema → Decision
  decision.py        # Decision / Verdict / SafeAlternative
  audit.py           # JSONL audit log
  approval.py        # terminal approval flow
  state.py           # local connection + schema cache (.yare/state.json)
  quickstart.py      # seeded DuckDB demo
  mcp_server.py      # FastMCP server
  cli.py             # yare init | connect | scan | check | run | mcp | quickstart
  scanner.py         # result firewall — PII pattern scan of returned rows
  mcp_install.py     # auto-write MCP config into Claude Code/Cursor/Claude Desktop
  warehouse/
    __init__.py      # adapter protocol + factory + env expansion
    duckdb.py        # DuckDB adapter
    postgres.py      # Postgres adapter (server-side cursor, batched scan)
    redshift.py      # Redshift adapter (internal + Spectrum)
    snowflake.py     # Snowflake adapter
    bigquery.py      # BigQuery adapter
    _common.py       # JSON-safe value coercion
  integrations/
    litellm.py       # LiteLLM guardrail + safe_completion()
yare.yml             # example policy
tests/               # pytest suite
```

## License

Apache 2.0. See [LICENSE](LICENSE) and [NOTICE](NOTICE).

## Contributing

See [CONTRIBUTING.md](CONTRIBUTING.md). New warehouse adapters and PII detectors are the highest-leverage contributions.
