Metadata-Version: 2.4
Name: saibre
Version: 0.1.6
Summary: ProxyShield local agent for selective AI egress interception and masking.
Author-email: ProxyShield <security@example.com>
Requires-Python: >=3.10
Description-Content-Type: text/markdown
Requires-Dist: typer==0.12.3
Requires-Dist: click==8.1.7
Requires-Dist: fastapi==0.115.0
Requires-Dist: uvicorn==0.30.3
Requires-Dist: h11==0.14.0
Requires-Dist: mitmproxy>=10.3.0
Requires-Dist: ruamel.yaml>=0.17.40
Requires-Dist: keyring>=25.2.0
Requires-Dist: platformdirs>=4.2.0
Requires-Dist: cryptography>=42.0.0
Requires-Dist: pykeepass>=4.0.6
Requires-Dist: requests>=2.32.0
Requires-Dist: hvac>=2.3.0
Requires-Dist: google-cloud-secret-manager>=2.20.0
Requires-Dist: azure-identity>=1.17.0
Requires-Dist: azure-keyvault-secrets>=4.8.0
Requires-Dist: pydantic==2.7.4
Requires-Dist: zstandard>=0.22.0
Provides-Extra: presidio
Requires-Dist: presidio-analyzer; extra == "presidio"
Requires-Dist: presidio-anonymizer; extra == "presidio"

# ProxyShield Agent (V1)

**ProxyShield Agent** is the Windows-first background service that selectively intercepts outbound HTTPS traffic destined for AI tools (Copilot, ChatGPT/OpenAI, Claude) and masks sensitive keywords pulled from the ProxyShield desktop app keychain. It runs a local explicit proxy, publishes a PAC file, and exposes a small local API for health/config/logs.

## Capabilities
- Explicit proxy on `127.0.0.1:3128` with CONNECT+TLS MITM for AI endpoints only.
- PAC server on `127.0.0.1:3129/proxy.pac` that routes only AI domains through the proxy; everything else is `DIRECT`.
- Masking of URLs, headers, JSON bodies, and x-www-form-urlencoded bodies using keywords stored by the desktop app (`service=saibre`, accounts like `entry:<uuid>` in the OS keychain).
- Local API (`/health`, `/config`, `/keywords`, `/logs/recent`, `/pac`) served over HTTP for debugging and control, including `POST /keywords` for desktop keyword sync.
- Policy API on `127.0.0.1:3130` for keyword policy status/updates, protected by an `X-ProxyShield-Token` header (token stored at `C:\ProgramData\ProxyShield\config\ui_token.dpapi`, override with `PROXYSHIELD_UI_TOKEN` for dev).
- Windows helpers to install/uninstall the ProxyShield Root CA, set/clear system PAC AutoConfigURL, and manage proxy-related environment variables.
- `proxyshield-agent doctor` to validate PAC, env vars, CA install, and connectivity to the main AI endpoints.
- CLI helpers to install/uninstall the OS auto-start service (`proxyshield-agent install-service`, `uninstall-service`, `enable-autostart`, `disable-autostart`).
- The macOS installer attempts to expose the bundled agent as `ProxyShieldAgent` via a symlink at `/usr/local/bin/ProxyShieldAgent`.
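
To make the masking capability more concrete, here is a minimal sketch of keyword masking applied to a JSON body and an x-www-form-urlencoded body. The function names (`mask_text`, `mask_json_body`, `mask_form_body`) and the `[MASKED]` placeholder are assumptions for illustration; they are not taken from `masking.py`, and the real engine may behave differently.

```python
import json
import re
from urllib.parse import parse_qsl, urlencode

PLACEHOLDER = "[MASKED]"  # hypothetical placeholder, not the agent's real label


def mask_text(text: str, keywords: list[str]) -> str:
    """Replace every literal occurrence of any keyword in one pass."""
    # Longest keywords first so overlapping keywords mask the widest span.
    ordered = sorted({k for k in keywords if k}, key=len, reverse=True)
    if not ordered:
        return text
    pattern = re.compile("|".join(re.escape(k) for k in ordered))
    return pattern.sub(lambda _match: PLACEHOLDER, text)


def mask_json_value(value, keywords: list[str]):
    """Walk a decoded JSON value and mask every string leaf."""
    if isinstance(value, str):
        return mask_text(value, keywords)
    if isinstance(value, list):
        return [mask_json_value(v, keywords) for v in value]
    if isinstance(value, dict):
        return {k: mask_json_value(v, keywords) for k, v in value.items()}
    return value  # numbers, booleans, and null pass through unchanged


def mask_json_body(body: bytes, keywords: list[str]) -> bytes:
    return json.dumps(mask_json_value(json.loads(body), keywords)).encode("utf-8")


def mask_form_body(body: str, keywords: list[str]) -> str:
    """Decode the form pairs, mask the values, and re-encode them."""
    pairs = parse_qsl(body, keep_blank_values=True)
    return urlencode([(k, mask_text(v, keywords)) for k, v in pairs])
```

Masking the decoded values rather than the raw bytes means a keyword is still caught when it is percent-encoded or split by JSON escaping on the wire.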


## OpenClaw integration direction
- The agent no longer exposes a dedicated OpenClaw HTTP scan endpoint.
- OpenClaw support is being reworked around an in-process wrapper architecture rather than an endpoint layered onto the agent API.
- The standard local control endpoints remain available for agent operations: `/health`, `/config`, `/keywords`, `/logs/recent`, and `/pac`.

## Layout
```
proxyshield_agent/
  pyproject.toml
  proxyshield_agent/
    main.py
    service.py
    proxy_runtime.py
    masking.py
    keyword_source.py
    pac.py
    windows/
      trust_store.py
      system_proxy.py
      env_vars.py
  tests/
    test_masking.py
    test_pac.py
```

## Development
1. Install deps (from repo root):
   ```bash
   python -m venv .venv
   source .venv/bin/activate  # PowerShell: .venv\Scripts\Activate.ps1
   pip install -r requirements.txt
   ```
2. Install the CLI entrypoint into the venv:
   ```bash
   pip install -e proxyshield_agent  # from repo root
   # or, if you're already in the proxyshield_agent directory:
   pip install -e .
   ```
3. Run unit tests: `pytest proxyshield_agent/tests`
4. Run macOS-specific tests (marked with `macos`) locally on macOS:
   ```bash
   pytest proxyshield_agent/tests -m macos
   ```
5. In CI, run the same command on a macOS runner. On non-macOS runners, skip the macOS-only tests:
   ```bash
   pytest proxyshield_agent/tests -m "not macos"
   ```
6. Run the agent API (dev) with uvicorn:
   ```bash
   uvicorn proxyshield_agent.api:app --reload --port 8787
   ```
7. Run the full agent in dev mode (foreground):
   ```bash
   python -m proxyshield_agent.main enable --dev
   ```

If `proxyshield-agent` still reports "command not found", confirm the venv is active and run:
```bash
python -m proxyshield_agent enable --dev
```

If the console script exists but errors with `ModuleNotFoundError: No module named 'proxyshield_agent'`,
reinstall the package in the active venv:
```bash
python -m pip install -e .  # run from proxyshield_agent/ directory
python -m pip show saibre  # distribution name declared in the package metadata
```

On Windows, also ensure you're running the command from the repo root or the
`proxyshield_agent/` project directory (not the inner `proxyshield_agent/` package
folder), and force-reinstall the entrypoint if needed:
```powershell
python -m pip install -e .\proxyshield_agent --force-reinstall
Get-Command proxyshield-agent
```

## Service management (CLI)
The agent ships cross-platform commands to register and manage auto-start behavior:
- Install the service/LaunchAgent (autostart enabled by default): `proxyshield-agent install-service`
- Disable autostart: `proxyshield-agent disable-autostart`
- Re-enable autostart if needed: `proxyshield-agent enable-autostart`
- Remove the service/LaunchAgent: `proxyshield-agent uninstall-service`

On Windows, the installer uses `sc.exe` and configures the service to restart on failure. On macOS, the CLI installs the LaunchAgent plist (based on `scripts/com.saibre.proxyshield.agent.plist`) with `RunAtLoad` and `KeepAlive` enabled.
The onboarding scripts (`scripts/windows_onboard.ps1` and `scripts/macos_onboard.sh`) also register autostart automatically, so the CLI is optional if you're following those flows.

## Packaging
- Build a wheel: `pip wheel .` from the `proxyshield_agent` directory.
- The Windows installer (PowerShell/Inno Setup) should place binaries under Program Files, generate/install the CA in `%ProgramData%\ProxyShield\ca`, set PAC AutoConfigURL, and register the service for autostart. See `/docs/V1-onboarding.md` for scripts and steps. On macOS, data lives under `~/Library/Application Support/ProxyShield`.

### macOS packaging (PyInstaller)
We package the macOS agent as a standalone PyInstaller binary that the desktop app ships inside its app bundle and launches directly (for example at `ProxyShield.app/Contents/Resources/ProxyShieldAgent`). This keeps the agent version locked to the desktop build and avoids requiring a separate pip/pipx install.

**Build steps (macOS):**
1. From the repo root:
   ```bash
   python -m venv .venv
   source .venv/bin/activate
   pip install -r requirements.txt
   pip install pyinstaller
   ```
2. (Optional) Preseed assets to bundle into the binary:
   - Add a `config.json` and/or a `ca/` directory under `proxyshield_agent/packaging/seed/`.
   - These will be copied into `~/Library/Application Support/ProxyShield` on first run if missing.
3. Build the agent binary:
   ```bash
   cd proxyshield_agent
   pyinstaller packaging/ProxyShieldAgent.spec
   ```
4. The output binary will be at `proxyshield_agent/dist/ProxyShieldAgent`. Bundle this file into the desktop app.

**Runtime behavior:**
- When frozen on macOS, the agent sets its working directory to `~/Library/Application Support/ProxyShield` and continues to write config/logs/CA assets there.
- The agent stores config at `~/Library/Application Support/ProxyShield/config.json`, CA assets under `~/Library/Application Support/ProxyShield/ca/`, and logs at `~/Library/Application Support/ProxyShield/logs/agent.log`.

## Notes
- Only AI domains are routed through the proxy; general browsing remains direct because the PAC returns `DIRECT` for everything else.
- If an app breaks, add its domains to `bypass_domains` in `%ProgramData%\ProxyShield\config.json` (Windows) or `~/Library/Application Support/ProxyShield/config.json` (macOS), or use the "Launch Unprotected" wrapper that clears proxy env vars for that process.
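
The routing rule described in these notes can be sketched as a small PAC generator. This is an illustration only: `build_pac` is a hypothetical name, the example domains are placeholders, and checking `bypass_domains` before the AI domain list is an assumption about precedence rather than documented behavior of `pac.py`.

```python
import json


def build_pac(ai_domains, bypass_domains=(), proxy="127.0.0.1:3128") -> str:
    """Return PAC JavaScript that proxies only the AI domains."""
    # json.dumps produces valid JavaScript array literals for the domain lists.
    ai = json.dumps(sorted(set(ai_domains)))
    bypass = json.dumps(sorted(set(bypass_domains)))
    return (
        "function FindProxyForURL(url, host) {\n"
        "  function matches(list) {\n"
        "    for (var i = 0; i < list.length; i++) {\n"
        "      if (host === list[i] || dnsDomainIs(host, '.' + list[i])) return true;\n"
        "    }\n"
        "    return false;\n"
        "  }\n"
        # Assumption: bypass entries win over the AI domain list.
        f'  if (matches({bypass})) return "DIRECT";\n'
        f'  if (matches({ai})) return "PROXY {proxy}";\n'
        '  return "DIRECT";\n'
        "}\n"
    )


if __name__ == "__main__":
    print(build_pac(["api.openai.com"], ["internal.example.com"]))
```

Matching on the exact host or a dot-prefixed suffix avoids treating an unrelated host that merely ends with the same characters as an AI domain.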

## Quick validation without `diagnose`
Use these checks when the agent appears to run but traffic is not being intercepted.

1. Confirm LaunchAgent is loaded (macOS):
   ```bash
   launchctl print "gui/$(id -u)/com.saibre.proxyshield.agent" | head -n 40
   ```

2. Confirm PAC endpoint is reachable:
   ```bash
   curl -sv "http://127.0.0.1:3129/proxy.pac" | head -n 20
   ```

3. Confirm macOS network service uses the PAC URL:
   ```bash
   networksetup -listallnetworkservices
   networksetup -getautoproxyurl "Wi-Fi"
   ```
   `Enabled: Yes` and `URL: http://127.0.0.1:3129/proxy.pac` should be present on the active service.

4. Confirm proxy listener is accepting connections:
   ```bash
   nc -vz 127.0.0.1 3128
   ```

5. Force-test a proxied request path manually:
   ```bash
   curl -x http://127.0.0.1:3128 -I https://api.openai.com
   ```
   If this fails, the local explicit proxy path is not healthy yet.

6. Verify masking engine sees your stored keywords:
   ```bash
   proxyshield-agent keyword-list
   proxyshield-agent mask-test --sample "my-secret-value"
   ```
   `mask-test` output includes `tier` and `enabled_detectors` so you can confirm
   whether entropy/API-key heuristics are active (Pro+) or only keyword masking is enabled (Free).

7. Tail agent logs while sending a test request:
   ```bash
   tail -f "${HOME}/Library/Application Support/ProxyShield/logs/agent.log"
   ```
   Then trigger traffic to an AI domain. If nothing appears, traffic is likely not entering the local proxy.
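
If `nc` is not available, the listener check in step 4 can be approximated from Python. The sketch below only opens a TCP connection; the helper name `is_listening` is an assumption and is not part of the agent CLI.

```python
import socket


def is_listening(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


# Example usage against the ports documented above:
#   is_listening("127.0.0.1", 3128)  # explicit proxy
#   is_listening("127.0.0.1", 3129)  # PAC server
```

A successful connection only shows that something is bound to the port; it does not prove that the process is the agent or that masking is active.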

## macOS policy + token storage
- Machine policy keywords are stored as an encrypted payload at `/Library/Application Support/ProxyShield/policy/machine.dpapi`.
- User overlay keywords are stored as an encrypted payload at `~/Library/Application Support/ProxyShield/policy/user.dpapi` for the active policy user.
- To keep policy visible to system-account processes (for example LaunchDaemons running as root), the agent also writes a synchronized encrypted user copy at `/Library/Application Support/ProxyShield/policy/users/<uid>/user.dpapi`.
- All policy files above are encrypted at rest using the macOS DPAPI helper (`proxyshield_agent.macos.dpapi`) and never persisted as plaintext JSON.
- The policy API UI token on macOS is stored at `/Library/Application Support/ProxyShield/config/ui_token.enc` (override path with `PROXYSHIELD_UI_TOKEN_PATH`, or bypass file storage in dev with `PROXYSHIELD_UI_TOKEN`).
- Legacy keychain-based UI tokens are migrated automatically on first read into `ui_token.enc` and then removed from keychain.
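
As a rough illustration of the storage layout above, the following sketch builds the three policy paths. The helper names are hypothetical; only the path shapes come from this section.

```python
from pathlib import PurePosixPath

# System-wide root used for machine policy and the per-user synchronized copies.
SYSTEM_ROOT = PurePosixPath("/Library/Application Support/ProxyShield")


def machine_policy_path() -> PurePosixPath:
    return SYSTEM_ROOT / "policy" / "machine.dpapi"


def home_user_policy_path(home: str) -> PurePosixPath:
    """User overlay stored under the user's own Library directory."""
    return (
        PurePosixPath(home)
        / "Library" / "Application Support" / "ProxyShield"
        / "policy" / "user.dpapi"
    )


def system_user_policy_path(uid: int) -> PurePosixPath:
    """Synchronized copy that system-account processes can read."""
    return SYSTEM_ROOT / "policy" / "users" / str(uid) / "user.dpapi"
```

`PurePosixPath` keeps the sketch runnable on any platform, since it only manipulates path strings and never touches the filesystem.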

### Resetting or migrating policy/UI token state (macOS)
- Reset policy UI token (forces token regeneration on next agent start):
  ```bash
  sudo rm -f "/Library/Application Support/ProxyShield/config/ui_token.enc"
  ```
- Migrate user policy from home-only storage to a system-visible copy (safe to run repeatedly):
  ```bash
  sudo python3 -m proxyshield_agent.main policy status-dump >/dev/null
  ```
  Loading policy triggers sync from `~/Library/.../policy/user.dpapi` to `/Library/.../policy/users/<uid>/user.dpapi` when needed.

### Troubleshooting (permissions + policy visibility)
- If onboarding fails with `Operation not permitted` or `Permission denied`, re-run onboarding with an admin account and ensure `/Library/Application Support/ProxyShield` is root-writable:
  ```bash
  sudo mkdir -p "/Library/Application Support/ProxyShield"
  sudo chown -R root:wheel "/Library/Application Support/ProxyShield"
  sudo chmod -R u+rwX,g+rX "/Library/Application Support/ProxyShield"
  ```
- If policy appears empty when the agent runs as a system account, check both user and system policy copies exist and are non-empty:
  ```bash
  ls -l "${HOME}/Library/Application Support/ProxyShield/policy/user.dpapi"
  sudo ls -l "/Library/Application Support/ProxyShield/policy/users/$(id -u)/user.dpapi"
  ```
- Verify effective policy source/visibility:
  ```bash
  proxyshield-agent policy status-dump
  ```
  Confirm `source_used` contains expected entries (`machine`, `user`) and counts are non-zero when policy is present.

## Configuration flow (UI -> Agent)
- The desktop UI sends a full config payload to `POST /config` on the local agent API.
- The agent validates `allowed_domains` and `bypass_domains` as lists of strings, trims whitespace, and removes duplicates before persisting to `%ProgramData%\ProxyShield\config.json` (Windows) or `~/Library/Application Support/ProxyShield/config.json` (macOS).
- The in-memory config used by `GET /config` and PAC generation is updated immediately after a successful `POST /config`.
- The proxy runtime keeps a lightweight reference to the latest config; masking and PAC rules will reflect the new lists without restarting the agent.
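
The validation step for `allowed_domains` and `bypass_domains` can be sketched as follows. The function name `normalize_domains` is an assumption, as is the choice to drop entries that are empty after trimming; the agent's actual validator may differ.

```python
def normalize_domains(value, field: str) -> list[str]:
    """Validate a domain list, trim whitespace, and drop duplicates in order."""
    if not isinstance(value, list) or not all(isinstance(v, str) for v in value):
        raise ValueError(f"{field} must be a list of strings")
    seen: set[str] = set()
    result: list[str] = []
    for item in value:
        domain = item.strip()
        # Assumption: blank entries are discarded rather than rejected.
        if domain and domain not in seen:
            seen.add(domain)
            result.append(domain)
    return result


if __name__ == "__main__":
    payload = {"bypass_domains": [" internal.example.com ", "internal.example.com"]}
    print(normalize_domains(payload["bypass_domains"], "bypass_domains"))
```

Preserving first-seen order keeps the persisted `config.json` stable across repeated saves of the same payload.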

### Masking configuration (PHI patterns)
Enterprise tenants can inject additional regex patterns for PHI MRNs and member IDs via the `masking` config object. The patterns are compiled into regex detections and mapped to the PHI redaction labels.

Example:
```json
{
  "masking": {
    "tier": "enterprise",
    "mrn_patterns": ["\\bMRN\\d{6,10}\\b", "\\b\\d{8}\\b"],
    "member_id_patterns": ["\\bID-[A-Z0-9]{6,12}\\b"]
  }
}
```

Notes:
- The patterns are interpreted as Python regular expressions.
- They are appended to the existing regex detector policy, so the regex detector must be enabled for the active tier.
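
A quick way to sanity-check patterns before shipping them in config is to compile and apply them locally. The sketch below assumes hypothetical redaction labels (`[PHI_MRN]`, `[PHI_MEMBER_ID]`) and helper names; the agent's real labels and detector wiring are not shown in this README.

```python
import re

# Same patterns as the JSON example above, written as Python raw strings.
masking = {
    "tier": "enterprise",
    "mrn_patterns": [r"\bMRN\d{6,10}\b", r"\b\d{8}\b"],
    "member_id_patterns": [r"\bID-[A-Z0-9]{6,12}\b"],
}

# Hypothetical labels, used here for illustration only.
LABELS = {"mrn_patterns": "[PHI_MRN]", "member_id_patterns": "[PHI_MEMBER_ID]"}


def compile_phi_patterns(config: dict) -> list[tuple[re.Pattern, str]]:
    """Compile each configured pattern; an invalid pattern raises re.error."""
    compiled = []
    for key, label in LABELS.items():
        for pattern in config.get(key, []):
            compiled.append((re.compile(pattern), label))
    return compiled


def redact(text: str, compiled: list[tuple[re.Pattern, str]]) -> str:
    for regex, label in compiled:
        text = regex.sub(label, text)
    return text


if __name__ == "__main__":
    rules = compile_phi_patterns(masking)
    print(redact("Chart MRN1234567, member ID-AB12CD", rules))
```

Note that a broad pattern such as `\b\d{8}\b` matches any standalone eight-digit number, so it is worth testing against representative non-PHI text as well.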

## Integrations (1Password)
- Integrations are enabled through the `integrations` config object and only run when keywords are loaded.
- The 1Password provider relies on the `op` CLI and either a service account token or a desktop session token.
- Use `GET /integrations` or `GET /integrations/status` to see registered integrations and their last known status.

Example configuration:
```json
{
  "integrations": {
    "onepassword": {
      "enabled": true,
      "settings": {
        "vaults": "My Vault,another-vault-id",
        "account": "my",
        "token_ref": "prod"
      }
    }
  }
}
```

Authentication options:
- Service account: set `OP_SERVICE_ACCOUNT_TOKEN` or store it under the keychain entry `proxyshield.onepassword` with `token_ref`.
- Desktop session: set `OP_SESSION` or `OP_SESSION_<account>`; the agent will store the token in the OS keychain for reuse.
