Metadata-Version: 2.2
Name: scim-sanity
Version: 0.6.0
Summary: Live SCIM server conformance testing and payload validation (RFC 7643/7644)
Author-email: Thomas Betz <thomas@seattlecodestudio.com>
License: MIT
Project-URL: Homepage, https://github.com/thomaselliottbetz/scim-sanity
Project-URL: Documentation, https://github.com/thomaselliottbetz/scim-sanity#readme
Project-URL: Repository, https://github.com/thomaselliottbetz/scim-sanity
Project-URL: Issues, https://github.com/thomaselliottbetz/scim-sanity/issues
Keywords: scim,scim2,validation,rfc7643,rfc7644,cli,identity,agent,agentic
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: System :: Systems Administration :: Authentication/Directory
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: click>=8.0.0
Provides-Extra: dev
Requires-Dist: pytest>=7.0.0; extra == "dev"
Requires-Dist: jsonschema>=4.0.0; extra == "dev"
Provides-Extra: web
Requires-Dist: fastapi; extra == "web"
Requires-Dist: uvicorn; extra == "web"

# scim-sanity

Test your SCIM server's RFC 7643/7644 conformance and validate payloads before they hit production — from the command line or a browser-based GUI built with React and Cloudscape.

[![Python 3.9+](https://img.shields.io/badge/python-3.9+-blue.svg)](https://www.python.org/downloads/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![pre-commit.ci status](https://results.pre-commit.ci/badge/github/thomaselliottbetz/scim-sanity/main.svg)](https://results.pre-commit.ci/latest/github/thomaselliottbetz/scim-sanity/main)

## Features

**scim-sanity** is a **pragmatic, production-oriented SCIM conformance and interoperability harness**:
- **Server conformance probe** — CRUD lifecycle tests against a live SCIM endpoint covering discovery, User/Group/Agent/AgenticApplication operations, search, pagination, and error handling. Failures produce a prioritised Fix Summary — each issue with a trouble description, fix, and rationale.
- **Payload validation** — Static SCIM JSON analysis before sending data to a server. Catches missing required attributes, immutable field violations, null value misuse, and schema URN errors.
- **Example payload library** — Curated valid and invalid SCIM resources with RFC citations, loadable directly into the validator.
- **Web GUI** — Browser-based interface built with React and Cloudscape Design System. Validate, probe, and browse examples without touching the terminal.
- **Agentic identity support** — Agent and AgenticApplication resources per IETF `draft-abbey-scim-agent-extension-00`.
- **Strict and compat modes** — Strict mode (default) treats all spec deviations as failures. Compat mode downgrades known real-world deviations (e.g., `application/json` instead of `application/scim+json`) to warnings.
- **Minimal dependencies** — Core CLI requires only Click. The web GUI is opt-in via `pip install scim-sanity[web]`.

## Installation

```bash
pip install scim-sanity
```

With the optional web GUI:

```bash
pip install 'scim-sanity[web]'
```

Or from source:

```bash
git clone https://github.com/thomaselliottbetz/scim-sanity.git
cd scim-sanity
python -m venv venv
source venv/bin/activate
pip install -e ".[web,dev]"
```

## Web GUI

scim-sanity includes an optional browser-based interface built with React and the [AWS Cloudscape Design System](https://cloudscape.design/). The CLI remains the primary interface and is unchanged.

### Installation

```bash
pip install 'scim-sanity[web]'
```

### Usage

```bash
scim-sanity web
```

Then open **http://127.0.0.1:8000** in your browser. To use a different port or bind address:

```bash
scim-sanity web --port 8080 --host 0.0.0.0
```

### Pages

| Page | Path | Description |
|------|------|-------------|
| **Validate** | `/validate` | Paste or load a SCIM JSON payload and validate it against RFC 7643/7644 rules. Supports full resources and PATCH operations. Load any example from the built-in library. |
| **Probe** | `/probe` | Configure and run a live server conformance probe. Results are grouped by test phase with status indicators and a prioritised Fix Summary when failures are present. |
| **Examples** | `/examples` | Browse 16 curated RFC example payloads. Filter by resource type (User, Group, Agent, AgenticApplication, PATCH) or validity. Load any example directly into the Validate page. |

## Server Conformance Probe

Test a live SCIM server for RFC 7643/7644 conformance. The probe **creates, modifies, and deletes real resources** on the target server, then cleans up after itself.

⚠️ Warning: This tool performs destructive operations. Do not run against production tenants without explicit authorization.

```bash
# Basic probe with bearer token
scim-sanity probe https://example.com/scim/v2 --token <token> --i-accept-side-effects

# Basic auth
scim-sanity probe https://example.com/scim/v2 --username admin --password secret --i-accept-side-effects

# Compat mode (known deviations become warnings, not failures)
scim-sanity probe <url> --token <token> --compat --i-accept-side-effects

# JSON output for CI/CD
scim-sanity probe <url> --token <token> --json-output --i-accept-side-effects

# Test only a specific resource type
scim-sanity probe <url> --token <token> --resource Agent --i-accept-side-effects

# Self-signed certificates
scim-sanity probe <url> --token <token> --tls-no-verify --i-accept-side-effects

# Leave test resources on the server for inspection
scim-sanity probe <url> --token <token> --skip-cleanup --i-accept-side-effects

# Custom timeout and proxy
scim-sanity probe <url> --token <token> --timeout 60 --proxy http://proxy:8080 --i-accept-side-effects

# Custom CA bundle
scim-sanity probe <url> --token <token> --ca-bundle /path/to/ca-cert.pem --i-accept-side-effects
```

### Probe Options

| Option | Description |
|--------|-------------|
| `--token` | Bearer token for authentication |
| `--username` / `--password` | Basic auth credentials |
| `--i-accept-side-effects` | **Required.** Acknowledge that the probe creates/deletes resources |
| `--strict` / `--compat` | Strict (default) or compat validation mode |
| `--json-output` | Output results as JSON |
| `--resource` | Test a specific resource type (User, Group, Agent, AgenticApplication) |
| `--skip-cleanup` | Leave test resources on the server |
| `--tls-no-verify` | Skip TLS certificate verification |
| `--timeout` | Per-request timeout in seconds (default: 30) |
| `--proxy` | HTTP/HTTPS proxy URL |
| `--ca-bundle` | Path to custom CA certificate bundle |

### Safety Guardrails

The probe implements several safety measures to prevent accidental damage:

- **Explicit consent** — Refuses to run without `--i-accept-side-effects`.
- **Namespace isolation** — All test resources are prefixed with `scim-sanity-test-` to avoid collisions with real data.
- **Resource caps** — Hard limit of 10 agents in rapid lifecycle tests.
- **429 retry** — Automatically retries on 429 Too Many Requests, honoring `Retry-After` headers (max 3 retries).
- **500 transience detection** — When a POST returns 500, the probe retries once after a brief delay using the same request headers. If the retry succeeds, the result is recorded as a warning ("transient instability") and the CRUD lifecycle continues with the resource created by the retry. If both attempts fail, content-type rejection diagnosis runs before reporting the final failure.
- **Timeouts** — Per-request timeouts prevent hung runs.
- **Cleanup** — Deletes all created test resources in reverse order (groups before users). Skippable with `--skip-cleanup`.
- **Failure semantics** — If the process is interrupted, partial cleanup may occur; orphaned test resources are possible and should be removed manually.
- **Secret redaction** — Authorization headers are redacted in any JSON output or logs.
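
The 429 retry guardrail can be sketched roughly as follows. This illustrates the behavior described above and is not the probe's actual code: `send_with_retry`, the `send` callable, and the fallback delay are hypothetical, and only the delta-seconds form of `Retry-After` is parsed.

```python
import time
from typing import Callable, Mapping, Tuple

# Hypothetical response shape for this sketch: (status_code, headers).
Response = Tuple[int, Mapping[str, str]]


def send_with_retry(
    send: Callable[[], Response],
    max_retries: int = 3,
    fallback_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call send(); while the answer is 429, wait and retry up to max_retries times."""
    status, headers = send()
    for _ in range(max_retries):
        if status != 429:
            break
        raw = headers.get("Retry-After", "")
        # Only whole seconds are parsed; HTTP-date values use the fallback delay.
        delay = float(raw) if raw.isdigit() else fallback_delay
        sleep(delay)
        status, headers = send()
    return status, headers


# Usage with a fake server that rate-limits the first two calls.
responses = iter([(429, {"Retry-After": "2"}), (429, {}), (201, {})])
waits = []
result = send_with_retry(lambda: next(responses), sleep=waits.append)
print(result[0], waits)
```

Injecting `sleep` keeps the sketch testable without real delays; a real client would also need case-insensitive header lookup.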

### Test Sequence

The probe runs 7 phases, plus an Agent rapid-lifecycle sub-phase (5a). Each phase tests specific RFC clauses against real HTTP traffic, with no mocking.

1. **Discovery** (RFC 7644 §4)
   - GET `/ServiceProviderConfig`, `/Schemas`, `/ResourceTypes`
   - Asserts: HTTP 200, `Content-Type: application/scim+json`, parseable JSON body
   - A server that omits these endpoints forces clients to hardcode assumptions about server capabilities

2. **User CRUD Lifecycle** (RFC 7644 §3.3, §3.4.1, §3.5.1, §3.6; RFC 7643 §4.1)
   - POST → asserts 201, `Content-Type: application/scim+json`, `Location` header, `id`, `meta.created`, `meta.lastModified`
   - GET by id → asserts 200, same Content-Type and meta fields
   - PUT → asserts 200, same Content-Type and meta fields
   - GET after PUT → asserts the updated field value persisted
   - PATCH `active=false` → asserts 200 or 204
   - GET after PATCH → asserts `active` is `false`
   - DELETE → asserts 204 No Content (RFC 7644 §3.6)
   - GET after DELETE → asserts 404

3. **Group CRUD Lifecycle** (RFC 7644 §3.3; RFC 7643 §4.2)
   - Same sequence as User
   - Additional PATCH: add a member, then remove all members — asserts 200 each

4. **Agent CRUD Lifecycle** (draft-abbey-scim-agent-extension-00)
   - Same sequence as User
   - Skipped if server does not advertise Agent support in `/ResourceTypes`

5. **AgenticApplication CRUD Lifecycle** (draft-abbey-scim-agent-extension-00)
   - Same sequence as User
   - Skipped if server does not advertise AgenticApplication support

5a. **Agent Rapid Lifecycle** (draft-abbey-scim-agent-extension-00)
   - Create and immediately delete multiple agents (default 10) to test ephemeral provisioning at machine speed
   - Skipped if server does not support Agents

6. **Search** (RFC 7644 §3.4.2, §8.1)
   - GET `/Users` → asserts ListResponse envelope (`schemas`, `totalResults`, `Resources`), `Content-Type: application/scim+json`
   - GET `/Users?filter=...` → asserts 200 (or 400 if partial filter support)
   - GET `/Users?startIndex=1&count=1` → asserts pagination parameters honored
   - GET `/Users?count=0` → asserts `totalResults` present with empty `Resources`

7. **Error Handling** (RFC 7644 §3.12)
   - GET nonexistent resource → asserts 404 with SCIM error schema (`schemas`, `status`)
   - POST invalid JSON body → asserts 400 with SCIM error schema
   - POST missing required field (`userName`) → asserts 400 with SCIM error schema
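
As an illustration of the Phase 6 and Phase 7 assertions above, body checks of this kind might look like the following sketch. The function names are hypothetical and this is not the probe's implementation; the two schema URNs are the message URNs defined in RFC 7644.

```python
from typing import Any, Dict, List

LIST_RESPONSE_URN = "urn:ietf:params:scim:api:messages:2.0:ListResponse"
ERROR_URN = "urn:ietf:params:scim:api:messages:2.0:Error"


def check_list_response(body: Dict[str, Any]) -> List[str]:
    """Return problems found in a ListResponse envelope (empty list if none)."""
    problems = []
    if LIST_RESPONSE_URN not in (body.get("schemas") or []):
        problems.append("schemas must include the ListResponse URN")
    total = body.get("totalResults")
    if not isinstance(total, int) or isinstance(total, bool):
        problems.append("totalResults must be an integer")
    elif total > 0 and not isinstance(body.get("Resources"), list):
        # Resources is required whenever totalResults is non-zero.
        problems.append("Resources must be an array when totalResults > 0")
    return problems


def check_error_response(status_code: int, body: Dict[str, Any]) -> List[str]:
    """Return problems found in a SCIM error body for the given HTTP status."""
    problems = []
    if ERROR_URN not in (body.get("schemas") or []):
        problems.append("schemas must include the Error URN")
    # RFC 7644 §3.12 carries the HTTP status in the body as a JSON string.
    if str(body.get("status")) != str(status_code):
        problems.append("status must match the HTTP status code")
    return problems


print(check_list_response({"schemas": [LIST_RESPONSE_URN], "totalResults": 0}))
print(check_error_response(404, {"schemas": [ERROR_URN], "status": "404"}))
```

Both calls print an empty list, since each body satisfies the checks in this sketch.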

### Strict vs Compat Mode

**Strict mode** (`--strict`, default) treats all RFC deviations as failures.

**Compat mode** (`--compat`) applies a curated **Deviation Policy**: known, widespread ecosystem deviations are downgraded to warnings instead of failures. This list is intentional and versioned.

Current compat warnings include:

- `application/json` instead of `application/scim+json`
- DELETE 204 with response body
- Location header mismatch with `meta.location`
- Missing error schema in error responses
- ETag/meta.version mismatch

Warnings appear in output but don't cause a non-zero exit code.

**Always failures (not compat-eligible):** Some deviations are reported as `FAIL` in both strict and compat mode because they fundamentally break RFC-compliant clients:
- Server rejects `Content-Type: application/scim+json` requests (e.g., with 500) but accepts `application/json` — diagnosed automatically and cited against RFC 7644 §8.2.
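
One way to picture the Deviation Policy is as a lookup from a deviation to a severity that depends on the mode. The sketch below is a hypothetical model: the identifiers, `classify`, and `exit_code` are illustrative names derived from the lists above, not the tool's internal representation.

```python
from typing import Iterable

# Deviations that compat mode downgrades to warnings (identifiers are made up).
COMPAT_WARNINGS = {
    "content-type-application-json",
    "delete-204-with-body",
    "location-meta-mismatch",
    "missing-error-schema",
    "etag-meta-version-mismatch",
}
# Reported as FAIL in both strict and compat mode.
ALWAYS_FAIL = {"rejects-scim-json-request"}


def classify(deviation: str, compat: bool = False) -> str:
    """Return 'fail' or 'warn' for a detected deviation."""
    if deviation in ALWAYS_FAIL:
        return "fail"
    if compat and deviation in COMPAT_WARNINGS:
        return "warn"
    return "fail"


def exit_code(deviations: Iterable[str], compat: bool = False) -> int:
    """Warnings leave the exit code at 0; any failure yields 1."""
    return 1 if any(classify(d, compat) == "fail" for d in deviations) else 0


print(classify("delete-204-with-body"), classify("delete-204-with-body", compat=True))
```

Keeping the always-fail set separate from the compat set means a new deviation defaults to a failure until someone deliberately adds it to the policy.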

**Error response reporting:** When a server returns a 4xx or 5xx status for a resource endpoint, only the unexpected status code is reported. Predictable side effects (missing `id`, `meta`, `schemas` in the error body) are suppressed to avoid obscuring the root cause with cascade noise.

#### Real-World Server Behavior

Enterprise SCIM servers often exhibit:

- **Rate limiting** (429 + Retry-After)
- **Eventual consistency** (a GET immediately after PUT may briefly return stale data)
- **Partial filter support** or restricted query capabilities

scim-sanity attempts to behave accordingly by retrying on 429, validating boundary cases, and clearly reporting unsupported or nonconformant behavior.

### Fix Summary

When failures are present, the probe appends a prioritised **Fix Summary** after the results. Each entry has three fields (trouble, fix, and rationale):

```
  [P1] Trouble: Wrong Content-Type on SCIM responses (12 tests affected)
       Fix: Set Content-Type: application/scim+json on all responses served from /scim/v2/
       Rationale: Compliant clients inspect Content-Type before parsing — every response
                  is rejected regardless of whether the body is otherwise correct.
```

Issues are ordered by severity (P1 most critical). The fix summary is omitted when all tests pass. In JSON output mode, the same information is available as an `issues` array (see below).

### JSON Output (Stable Interface)

```bash
scim-sanity probe <url> --token <token> --json-output --i-accept-side-effects
```

```json
{
  "scim_sanity_version": "0.6.0",
  "mode": "strict",
  "timestamp": "2026-02-24 09:15:00",
  "summary": {
    "total": 32,
    "passed": 14,
    "failed": 15,
    "warnings": 0,
    "skipped": 3,
    "errors": 0
  },
  "issues": [
    {
      "priority": "P1",
      "title": "Wrong Content-Type on SCIM responses",
      "rationale": "Compliant clients inspect Content-Type before parsing — every response is rejected regardless of whether the body is otherwise correct.",
      "fix": "Set Content-Type: application/scim+json on all responses served from /scim/v2/",
      "affected_tests": 12
    }
  ],
  "results": [
    {"name": "GET /ServiceProviderConfig", "status": "fail", "message": "Content-Type should be application/scim+json, got 'text/html; charset=utf-8'", "phase": "Phase 1 — Discovery"}
  ]
}
```

The JSON schema is treated as a public interface and is stable within major versions.
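
Because the report is machine-readable, a CI step can post-process it. The sketch below assumes the report shape shown above; the `gate` function and its policy (block on any failure or error, list issues by priority) are hypothetical choices, not part of scim-sanity.

```python
import json
from typing import List, Tuple


def gate(report_text: str) -> Tuple[bool, List[str]]:
    """Return (ok, messages) for a probe report in the format shown above."""
    report = json.loads(report_text)
    summary = report["summary"]
    ok = summary["failed"] == 0 and summary["errors"] == 0
    # String sort puts P1 before P2; adequate for single-digit priorities.
    issues = sorted(report.get("issues", []), key=lambda issue: issue["priority"])
    messages = [
        f'[{i["priority"]}] {i["title"]} ({i["affected_tests"]} tests affected): {i["fix"]}'
        for i in issues
    ]
    return ok, messages


# Usage with a trimmed-down report; real output carries more fields.
sample = json.dumps({
    "summary": {"total": 2, "passed": 1, "failed": 1,
                "warnings": 0, "skipped": 0, "errors": 0},
    "issues": [{
        "priority": "P1",
        "title": "Wrong Content-Type on SCIM responses",
        "fix": "Set Content-Type: application/scim+json",
        "affected_tests": 1,
    }],
})
ok, messages = gate(sample)
print(ok)
for line in messages:
    print(line)
```

In a pipeline, the text passed to `gate` would be the captured output of the `--json-output` command above.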

## Payload Validation (Linting)

Statically validate (lint) SCIM resource payloads and PATCH operations before sending them to a server. Resource type is auto-detected from schema URNs. This is a spec-driven validator with linter-style ergonomics: fast, offline, and suitable for CI/CD gating.

```bash
# Validate a resource file
scim-sanity user.json

# Validate a PATCH operation
scim-sanity --patch patch.json

# Validate from stdin
echo '{"schemas":["urn:ietf:params:scim:schemas:core:2.0:User"],"userName":"user@example.com"}' | scim-sanity --stdin

# Use in CI/CD pipelines
scim-sanity payload.json || exit 1
```

### Validation Rules

**Required attributes:**
- User: `userName`
- Group: `displayName`
- Agent: `name`
- AgenticApplication: `name`

**What it checks:**
- Schema URN validity and presence
- Required attributes per resource type
- Server-assigned read-only attributes (`id`, `meta`) not set by client (reported by the tool as "immutable")
- Null values (use PATCH `remove` instead)
- PATCH operation structure (`op`, `path`, `value` correctness)
- Complex and multi-valued attribute structure
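
The core of these checks can be approximated in a few lines. This is a simplified sketch for illustration, not scim-sanity's validator: `lint` is a hypothetical function, it knows only the User and Group URNs, looks only at top-level attributes, and its messages do not match the tool's output.

```python
from typing import Any, Dict, List

# Required attribute per core schema URN (extension URNs are not handled here).
REQUIRED = {
    "urn:ietf:params:scim:schemas:core:2.0:User": "userName",
    "urn:ietf:params:scim:schemas:core:2.0:Group": "displayName",
}
READ_ONLY = ("id", "meta")


def lint(resource: Dict[str, Any]) -> List[str]:
    """Return a list of error strings; an empty list means the checks passed."""
    errors = []
    schemas = resource.get("schemas") or []
    if not schemas:
        errors.append("missing 'schemas'")
    for urn in schemas:
        if urn not in REQUIRED:
            errors.append(f"unknown schema URN: {urn}")
        elif REQUIRED[urn] not in resource:
            errors.append(f"missing required attribute: {REQUIRED[urn]}")
    for name in READ_ONLY:
        if name in resource:
            errors.append(f"read-only attribute set by client: {name}")
    for name, value in resource.items():
        if value is None:
            errors.append(f"null value for {name}; use PATCH remove instead")
    return errors


payload = {
    "schemas": ["urn:ietf:params:scim:schemas:core:2.0:User"],
    "id": "123",
    "name": {"givenName": "John"},
}
for error in lint(payload):
    print(error)
```

For this payload the sketch reports the missing `userName` and the client-set `id`.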

### Exit Codes

- `0` — Validation passed (or all probe tests passed)
- `1` — Validation failed, probe failures detected, or error

## Payload Examples

### What the linter catches

Given a payload with a missing required field and a client-set immutable attribute:

```json
{
  "schemas": ["urn:ietf:params:scim:schemas:core:2.0:User"],
  "id": "123",
  "name": {"givenName": "John"}
}
```

```
Found 3 error(s):

❌ Missing required attribute: 'userName' (schema: urn:ietf:params:scim:schemas:core:2.0:User) at userName
❌ User resource missing required attribute: 'userName'
❌ Immutable attribute 'id' should not be set by client (mutability: readOnly) at id
```

### Minimal valid examples

**User**
```json
{
  "schemas": ["urn:ietf:params:scim:schemas:core:2.0:User"],
  "userName": "john.doe@example.com"
}
```

**Group**
```json
{
  "schemas": ["urn:ietf:params:scim:schemas:core:2.0:Group"],
  "displayName": "Engineering Team"
}
```

**Agent**
```json
{
  "schemas": ["urn:ietf:params:scim:schemas:core:2.0:Agent"],
  "name": "automation-agent"
}
```

**PATCH operation**
```json
{
  "schemas": ["urn:ietf:params:scim:api:messages:2.0:PatchOp"],
  "Operations": [{"op": "replace", "path": "displayName", "value": "New Name"}]
}
```
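
For PATCH bodies, the structural rules from RFC 7644 §3.5.2 can be sketched as below. `check_patch` is a hypothetical helper written for illustration; it checks less than the real validator and does not parse `path` filter expressions.

```python
from typing import Any, Dict, List

PATCH_URN = "urn:ietf:params:scim:api:messages:2.0:PatchOp"
VALID_OPS = {"add", "remove", "replace"}


def check_patch(body: Dict[str, Any]) -> List[str]:
    """Return structural problems in a PatchOp body (empty list if none)."""
    errors = []
    if PATCH_URN not in (body.get("schemas") or []):
        errors.append("schemas must include the PatchOp URN")
    operations = body.get("Operations")
    if not isinstance(operations, list) or not operations:
        return errors + ["Operations must be a non-empty array"]
    for index, op in enumerate(operations):
        if not isinstance(op, dict):
            errors.append(f"Operations[{index}] must be an object")
            continue
        name = op.get("op")
        if name not in VALID_OPS:
            errors.append(f"Operations[{index}]: unsupported op {name!r}")
        elif name == "remove" and "path" not in op:
            # A remove without a target path is rejected by the spec.
            errors.append(f"Operations[{index}]: remove requires path")
        elif name in ("add", "replace") and "value" not in op:
            errors.append(f"Operations[{index}]: {name} requires value")
    return errors


patch = {
    "schemas": [PATCH_URN],
    "Operations": [{"op": "replace", "path": "displayName", "value": "New Name"}],
}
print(check_patch(patch))
```

The example body above passes these checks, so the call prints an empty list.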

## Pre-commit Integration

```yaml
repos:
  - repo: local
    hooks:
      - id: scim-sanity
        name: Validate SCIM resources
        entry: python -m scim_sanity
        language: system
        types: [json]
        exclude: |
          (?x)^(
            .*/node_modules/.*|
            .*/\.venv/.*|
            .*/venv/.*|
            .*package\.json$|
            .*package-lock\.json$|
            .*tsconfig.*\.json$|
            .*jsconfig\.json$
          )$
        pass_filenames: true
        stages: [pre-commit]  # this stage was named `commit` before pre-commit 3.2.0
```

## Ansible Integration

Action plugin for SCIM validation in Ansible playbooks. See [ansible/README.md](ansible/README.md).

```yaml
- name: Validate SCIM payload
  scim_validate:
    payload: "{{ user_payload }}"
    operation: full
  register: validation_result
```

## Identity Provider Guides

- [Microsoft Entra ID Integration](docs/integrations/entra-id.md)
- [Google Workspace Integration](docs/integrations/google-workspace.md)

## Security and Compliance

- [Security and Compliance Guide](docs/security/compliance.md)

## Development

```bash
git clone https://github.com/thomaselliottbetz/scim-sanity.git
cd scim-sanity
python -m venv venv
source venv/bin/activate
pip install -e ".[web,dev]"
pytest -v
```

### Web GUI development

Two processes run during development:

```bash
# Terminal 1 — Python API (auto-reloads on source changes)
pip install -e ".[web]"
uvicorn scim_sanity.api:app --reload --port 8000

# Terminal 2 — Vite dev server with hot reload
cd web
npm install
npm run dev          # http://localhost:5173, proxies /api/* to :8000
```

To build the frontend for production:

```bash
cd web
npm run build        # outputs to web/dist/
```

The built static files are served automatically by `scim-sanity web` — no separate frontend process needed.

## Planned Improvements

**PATCH filter expression testing** (RFC 7644 §3.5.2) — The probe currently tests simple PATCH paths (`active`, `members`). Complex filter-based paths such as `emails[type eq "work"].value` are a known interop pain point and are not yet covered.

**Phase 1 schema content validation** — Discovery endpoint tests currently verify HTTP 200 and correct Content-Type but do not validate that the returned schema bodies are well-formed or consistent with the resources the server actually implements.

**Phase 6 resource body validation** — The search phase validates the ListResponse envelope structure but does not inspect individual resources within the `Resources` array. A server returning well-formed envelopes with non-conformant resource bodies would currently pass.

**GitHub Action** — A ready-to-use GitHub Action for running the probe or linter in CI/CD pipelines without requiring a local Python environment.

**Docker image** — A zero-setup container image for running the probe against any reachable SCIM endpoint without installing Python or pip.

## Related Projects

**[entra-google-security-bridge](https://github.com/thomaselliottbetz/entra-google-security-bridge)** — Post-provisioning security monitoring and attribute synchronization for hybrid Microsoft Entra ID + Google Workspace environments. Covers what scim-sanity doesn't: once provisioning is verified as spec-compliant, this tool handles ongoing risky sign-in detection, guest user auditing, OAuth token scanning, and OU synchronization driven by Entra ID attribute changes.

## Contributing

Contributions are welcome via pull request.

## License

MIT License - see [LICENSE](LICENSE) file.

## References

- [RFC 7643 - SCIM: Core Schema](https://tools.ietf.org/html/rfc7643)
- [RFC 7644 - SCIM: Protocol](https://tools.ietf.org/html/rfc7644)
- [draft-abbey-scim-agent-extension-00](https://datatracker.ietf.org/doc/draft-abbey-scim-agent-extension/)
