Metadata-Version: 2.4
Name: ai-agentbom
Version: 0.1.7
Summary: Minimal bill of materials generator for AI agents
Author: AgentBOM contributors
License: MIT
Requires-Python: >=3.11
Description-Content-Type: text/markdown
License-File: LICENSE
Provides-Extra: dev
Requires-Dist: build>=1; extra == "dev"
Requires-Dist: pre-commit>=4; extra == "dev"
Requires-Dist: pytest>=8; extra == "dev"
Requires-Dist: ruff>=0.6; extra == "dev"
Dynamic: license-file

# AgentBOM

![CI](https://github.com/vlcak27/agentbom/actions/workflows/ci.yml/badge.svg)

AI Agent Attack Surface Analysis.

AgentBOM is a minimal CLI for statically analyzing AI agent repositories. It generates a bill of materials focused on the security-relevant parts of an agent: providers, models, frameworks, prompts, MCP configuration, risky capabilities, reachable capabilities, policy findings, and SARIF results.

AgentBOM does not execute or import scanned code. It works offline and uses deterministic static analysis techniques in the current implementation.

## Quick Start

```bash
pip install ai-agentbom

agentbom scan ./my-agent --sarif --cyclonedx --pretty
```

## What is AgentBOM

AgentBOM scans a repository and writes:

- `agentbom.json`: machine-readable findings
- `agentbom.md`: human-readable report
- `agentbom.html`: optional self-contained offline HTML report
- `agentbom.sarif`: optional SARIF 2.1.0 output for code scanning tools
- `agentbom.cdx.json`: optional CycloneDX JSON output

The goal is to make AI agent attack surface review repeatable. AgentBOM reports what it found, where it found it, and how confident the scanner is.

## Why AI agent repositories need analysis

AI agents combine model output with software capabilities. Security review needs to identify more than package dependencies:

- which model providers and concrete models are referenced
- which agent frameworks route prompts and tool calls
- whether MCP servers or prompt files are present
- whether shell, code execution, network, database, cloud, or autonomous execution capabilities exist
- whether risky capabilities appear reachable from a model, framework, or tool
- whether policy documentation is missing for sensitive behavior
- whether secret names are referenced without exposing secret values

Static analysis is intentionally conservative. Findings are review signals, not proof of exploitable vulnerabilities.

## Features

- Offline repository scanning
- Provider detection: `openai`, `anthropic`, `gemini`
- Model identifier detection in code and configuration
- Framework detection: LangChain, LlamaIndex, CrewAI, AutoGen, Semantic Kernel
- Dependency analysis for `pyproject.toml` and `requirements.txt`
- MCP config detection: `mcp.json`, `claude_desktop_config.json`
- Prompt file detection: `AGENTS.md`, `CLAUDE.md`, prompt YAML, `prompts/*.md`
- Risky capability detection
- Reachability inference from models, frameworks, and MCP config to capabilities
- Capability graph output
- Policy findings for missing or weak controls
- Secret reference detection by name only
- JSON, Markdown, optional HTML, optional SARIF, and optional CycloneDX output

Scanner limits:

- does not execute scanned code
- does not import scanned modules
- skips binary-looking files
- skips files larger than 1 MB
- does not follow symlink loops
- skips common dependency, build, cache, and VCS directories
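
The skip rules above could be sketched roughly like this (a minimal illustration; `SKIP_DIRS`, `MAX_SIZE`, and `should_scan` are hypothetical names for this sketch, not AgentBOM's API):

```python
from pathlib import Path

SKIP_DIRS = {".git", "node_modules", "__pycache__", ".venv", "dist", "build"}
MAX_SIZE = 1_000_000  # roughly 1 MB

def should_scan(path: Path) -> bool:
    """Return True if a file passes the skip rules sketched above."""
    if any(part in SKIP_DIRS for part in path.parts):
        return False
    if path.is_symlink():
        return False
    if path.stat().st_size > MAX_SIZE:
        return False
    # Treat files with NUL bytes in the first block as binary-looking.
    with path.open("rb") as f:
        return b"\x00" not in f.read(1024)
```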

## Reachability Analysis

Reachability analysis connects an actor to a capability.

Actors include:

- concrete models, such as `gpt-4o`
- frameworks, such as `langchain`
- tool configuration, such as `mcp.json`

Capabilities include:

- `shell_execution`
- `code_execution`
- `network_access`
- `cloud_access`
- `autonomous_execution`

AgentBOM first looks for an actor and capability in the same source file. If no same-file actor exists, it falls back to detected models, then frameworks, then tool configuration with lower confidence. This keeps inference deterministic and auditable.
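
The fallback order can be sketched as a small deterministic function (the name `pick_actor` and the confidence labels are illustrative assumptions, not AgentBOM internals):

```python
def pick_actor(same_file_actors, models, frameworks, tools):
    """Choose one actor deterministically: same-file matches win, then
    detected models, frameworks, and tool configuration, in that order,
    each fallback carrying lower confidence."""
    if same_file_actors:
        return same_file_actors[0], "high"
    if models:
        return models[0], "medium"
    if frameworks:
        return frameworks[0], "medium"
    if tools:
        return tools[0], "low"
    return None, "none"
```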

Reachability findings also include static path evidence:

- `prompt_input`: prompt variables, prompt templates, chat messages, or direct input
- `tool_invocation`: tool calls or framework invocation patterns
- `shell_execution`: subprocess, shell, exec, or eval execution paths
- `network_execution`: HTTP, cloud, or network client execution paths

`confidence_score` is a deterministic 0-100 score derived from source-file locality, actor confidence, capability confidence, and path evidence.
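
A deterministic score of that shape could look like the following sketch (the specific weights are an assumption for illustration, not AgentBOM's actual formula):

```python
CONF_WEIGHT = {"low": 1, "medium": 2, "high": 3}

def confidence_score(same_file: bool, actor_conf: str, cap_conf: str,
                     path_count: int) -> int:
    """Illustrative 0-100 score built from the four inputs named above."""
    score = 40 if same_file else 10          # source-file locality
    score += 10 * CONF_WEIGHT[actor_conf]    # actor confidence
    score += 5 * CONF_WEIGHT[cap_conf]       # capability confidence
    score += min(15, 5 * path_count)         # static path evidence
    return min(100, score)
```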

## Capability Graph

The JSON report includes a graph representation of the detected attack surface.

Nodes represent providers, models, frameworks, and capabilities. Edges describe relationships:

- `uses`: a model or framework uses a provider
- `enables`: a framework enables a capability
- `reaches`: an actor reaches a capability

Example:

```json
{
  "nodes": [
    {"id": "provider:openai", "type": "provider", "name": "openai"},
    {"id": "framework:langchain", "type": "framework", "name": "langchain"},
    {"id": "capability:code_execution", "type": "capability", "name": "code_execution"}
  ],
  "edges": [
    {"source": "framework:langchain", "target": "capability:code_execution", "type": "reaches"}
  ]
}
```

## SARIF Export

Use `--sarif` to write `agentbom.sarif` alongside the JSON and Markdown reports. SARIF output includes scanner risks, reachable capabilities, and policy findings.

Example SARIF result:

```json
{
  "ruleId": "reachable.code_execution",
  "level": "error",
  "message": {
    "text": "langchain reaches code_execution with high risk"
  },
  "locations": [
    {
      "physicalLocation": {
        "artifactLocation": {
          "uri": "agent.py"
        }
      }
    }
  ]
}
```

## CycloneDX Export

Use `--cyclonedx` to write `agentbom.cdx.json` alongside the native AgentBOM JSON and Markdown reports.

The CycloneDX export is separate from the native AgentBOM schema and includes detected providers, models, frameworks, capabilities, and dependencies as CycloneDX components with `agentbom:*` properties.
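
For illustration only, a detected framework might surface as a component roughly like this (the exact property names, such as `agentbom:kind`, are an assumption about the export, not a documented contract):

```json
{
  "bomFormat": "CycloneDX",
  "specVersion": "1.5",
  "components": [
    {
      "type": "library",
      "name": "langchain",
      "properties": [
        {"name": "agentbom:kind", "value": "framework"},
        {"name": "agentbom:confidence", "value": "high"}
      ]
    }
  ]
}
```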

## Example Output

Given an agent that imports LangChain, reads `OPENAI_API_KEY`, and calls `subprocess.run`, AgentBOM can produce:

```json
{
  "schema_version": "0.1.0",
  "repository": "examples/simple_agent",
  "providers": [
    {"name": "openai", "path": "agent.py", "confidence": "high"}
  ],
  "frameworks": [
    {"name": "langchain", "path": "agent.py", "confidence": "high"}
  ],
  "models": [],
  "mcp_servers": [
    {"name": "mcp.json", "path": "mcp.json", "confidence": "medium"}
  ],
  "capabilities": [
    {"name": "shell", "path": "agent.py", "confidence": "high"}
  ],
  "dependencies": [
    {
      "name": "langchain",
      "category": "ai_framework",
      "path": "requirements.txt",
      "confidence": "low"
    }
  ],
  "reachable_capabilities": [
    {
      "capability": "code_execution",
      "reachable_from": "langchain",
      "source_file": "agent.py",
      "risk": "high",
      "confidence": "high",
      "confidence_score": 100,
      "paths": ["shell_execution"]
    }
  ],
  "policy_findings": [
    {
      "severity": "high",
      "message": "shell execution detected without restrictions",
      "source_file": "agent.py"
    }
  ],
  "repository_risk": {
    "score": 90,
    "severity": "critical",
    "rationale": [
      "high-risk reachable capability detected: code_execution",
      "shell or code execution is present or reachable",
      "secret references were detected",
      "policy controls are missing or incomplete"
    ]
  },
  "risks": [
    {
      "severity": "high",
      "reason": "shell, code execution, or autonomous execution capability detected"
    }
  ],
  "secret_references": [
    {"name": "OPENAI_API_KEY", "path": "agent.py", "confidence": "high"}
  ]
}
```

Secret values are not stored or printed.
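
Name-only detection can be approximated with a pattern over secret-style identifiers (this regex and `find_secret_references` are a sketch of the idea, not AgentBOM's implementation):

```python
import re

# Flag env-var-style identifiers ending in a secret-like suffix and record
# only the name, never any value found near it.
SECRET_NAME = re.compile(r"\b([A-Z][A-Z0-9_]*(?:API_KEY|SECRET|TOKEN|PASSWORD))\b")

def find_secret_references(source_text: str) -> list[str]:
    return sorted(set(SECRET_NAME.findall(source_text)))
```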

## Installation

Install from PyPI:

```bash
pip install ai-agentbom
```

Development install:

```bash
pip install -e ".[dev]"
```

Install the pre-commit hooks:

```bash
pre-commit install
```

Run the hooks manually:

```bash
pre-commit run --all-files
```

## Usage

Scan the example repository:

```bash
agentbom scan examples/simple_agent --pretty
```

Write reports to a dedicated directory:

```bash
agentbom scan /path/to/agent-repo --output-dir ./agentbom-report --pretty
```

Generate SARIF for code scanning systems:

```bash
agentbom scan /path/to/agent-repo --output-dir ./agentbom-report --pretty --sarif
```

Generate CycloneDX JSON:

```bash
agentbom scan /path/to/agent-repo --output-dir ./agentbom-report --pretty --cyclonedx
```

Generate an offline HTML security report:

```bash
agentbom scan /path/to/agent-repo --output-dir ./agentbom-report --pretty --html
```

Apply a custom YAML policy:

```bash
agentbom scan /path/to/agent-repo --policy agentbom-policy.yaml --sarif --pretty
```

Example policy:

```yaml
deny_capabilities:
  - shell_execution
  - autonomous_execution

require:
  sandboxing: true
  human_approval: true
```

Custom policy violations are emitted as `policy_findings` in JSON, included in the Markdown report, and exported as SARIF policy results.

`sandboxing` is satisfied by a detected sandbox/runtime dependency.

`human_approval` is satisfied by repository text such as `human approval`, `human-in-the-loop`, or `approval required`.
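
A minimal sketch of how such a policy could be evaluated against a report (function and field names here are illustrative assumptions, not AgentBOM internals):

```python
import re

# Text evidence that satisfies the human_approval requirement.
APPROVAL_EVIDENCE = re.compile(
    r"human[ -]approval|human-in-the-loop|approval required", re.IGNORECASE
)

def evaluate_policy(policy: dict, report: dict, repo_text: str) -> list[str]:
    findings = []
    present = {r["capability"] for r in report.get("reachable_capabilities", [])}
    present |= {c["name"] for c in report.get("capabilities", [])}
    for cap in policy.get("deny_capabilities", []):
        if cap in present:
            findings.append(f"denied capability present: {cap}")
    if policy.get("require", {}).get("human_approval") and not APPROVAL_EVIDENCE.search(repo_text):
        findings.append("human_approval required but no approval language found")
    return findings
```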

Typical output:

```text
Wrote agentbom-report/agentbom.json
Wrote agentbom-report/agentbom.md
Wrote agentbom-report/agentbom.html
Wrote agentbom-report/agentbom.sarif
```

## JSON Schema

The output schema is documented in:

```text
docs/output-schema.json
```

The schema uses JSON Schema draft 2020-12 and defines the stable report fields for `schema_version`, providers, models, frameworks, capabilities, reachable capabilities, dependencies, capability graph, policy findings, repository risk, and risks.

Minimal shape:

```json
{
  "schema_version": "0.1.0",
  "providers": [],
  "models": [],
  "frameworks": [],
  "capabilities": [],
  "dependencies": [],
  "reachable_capabilities": [],
  "capability_graph": {
    "nodes": [],
    "edges": []
  },
  "policy_findings": [],
  "repository_risk": {
    "score": 0,
    "severity": "low",
    "rationale": []
  },
  "risks": []
}
```

## Architecture

AgentBOM uses a small static-analysis pipeline:

1. Walk the target directory without following symlink loops.
2. Skip generated, dependency, cache, VCS, binary-looking, and oversized files.
3. Run simple text detectors over source and configuration files.
4. Infer reachable capabilities from source-file locality and detected actors.
5. Build a capability graph.
6. Score scanner-level risks, policy findings, and aggregate repository risk.
7. Write JSON, Markdown, optional HTML, optional SARIF, and optional CycloneDX reports.

The implementation is dependency-light and deterministic so it can run in local development, CI, and restricted review environments.

## Roadmap

- Better package and configuration parsing
- More model and framework detectors
- Deeper MCP transport and command analysis
- Tool permission classification
- Policy allowlists and denylists
- Baseline comparison
- CI examples
- Expanded SARIF coverage
- SPDX export
