Metadata-Version: 2.4
Name: agent-artifacts
Version: 0.1.0
Summary: Transactional memory + skill artifacts for AI agents
Author: HengYpinn
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: PyYAML>=6.0
Requires-Dist: jinja2>=3.1
Requires-Dist: jsonschema>=4.21
Requires-Dist: semver>=3.0.0
Provides-Extra: langgraph
Requires-Dist: langgraph>=0.0.50; extra == "langgraph"
Provides-Extra: langchain
Requires-Dist: langchain>=0.1.0; extra == "langchain"
Provides-Extra: postgres
Requires-Dist: psycopg[binary]>=3.1; extra == "postgres"
Provides-Extra: service
Requires-Dist: fastapi>=0.110; extra == "service"
Requires-Dist: uvicorn>=0.24; extra == "service"
Dynamic: license-file

# Agent Artifacts: Transactional Memory and Skill Artifacts for AI Agents
Agent Artifacts is an open-source **memory correctness + procedural skill + auditability layer** for AI agents.
It makes agents more reliable by turning "memory" into **structured artifacts** that can be **validated, versioned, and replayed**.

Most agent systems can generate fluent responses.
Agent Artifacts helps them **remember safely**, **reuse skills**, and **explain decisions**.


---

## What You Get

- **Transactional memory** so hallucinations don't become permanent facts.
- **Skill artifacts** (prompt + workflow) that are versioned, tagged, and reusable.
- **Decision traces** that make agent behavior auditable and debuggable.
- **Bounded prompt overhead** with global injection caps.
- **MCP + adapter integrations** so teams don't need to switch stacks.

_Agent Artifacts is a plug-in layer, not a full agent framework._
_Use only what you need: memory, skills, or traces can be adopted independently._

---

## Quickstart (2 Minutes)

Install and import a prompt skill:

```bash
pip install agent-artifacts
agent-artifacts import examples/prompt_skills/api-tester.md --name api_tester --version 0.1.0
agent-artifacts list
agent-artifacts run api_tester@0.1.0 --inputs '{"base_url": "https://api.example.com", "endpoints": ["/health"]}'
```

More: [Quickstart guide](examples/quickstart.md) and [examples index](examples/README.md).

---

## Core Capabilities (Why It Matters)

**1) Transactional Memory**
- Stage -> validate -> commit (or rollback) memory writes.
- Prevents "false facts" from sticking in production.
- Details: [validation policy](docs/validation-policy.md) and [memory types](docs/memory-types.md).
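
The stage → validate → commit/rollback flow can be illustrated with a minimal pure-Python sketch. This is not the library's API, only the control flow the bullets above describe:

```python
class TransactionalMemory:
    """Hypothetical sketch of stage -> validate -> commit (or rollback).

    The real library exposes its own storage API; this only shows the pattern.
    """

    def __init__(self, validator):
        self.committed = {}   # durable facts, visible to the agent
        self.staged = {}      # pending writes, not yet visible
        self.validator = validator

    def stage(self, key, value):
        self.staged[key] = value

    def commit(self):
        # Validate every staged write before anything becomes permanent.
        for key, value in self.staged.items():
            if not self.validator(key, value):
                self.staged.clear()  # rollback: discard the whole batch
                return False
        self.committed.update(self.staged)
        self.staged.clear()
        return True


# Reject empty values so a hallucinated blank "fact" never persists.
mem = TransactionalMemory(validator=lambda k, v: bool(v))
mem.stage("user_name", "Ada")
mem.stage("user_email", "")   # invalid -> the whole batch rolls back
assert mem.commit() is False
assert mem.committed == {}
```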

**2) Skill Artifacts (Procedural Memory)**
- Store workflows + role prompts as versioned artifacts (`name@version`, `@stable` tags).
- Enforce typed inputs with JSON Schema for safer execution.
- Prompt skills can be plain `.md` files with optional YAML front-matter.
- Details: [prompt skills examples](examples/prompt_skills/), [workflow skills examples](examples/skills/), and [tool adapters](docs/tools.md).
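
A prompt skill file might look like the following. The `inputs` schema shape matches the MCP example later in this README; the other front-matter fields and the Jinja-style placeholder are illustrative assumptions:

```markdown
---
name: api_tester
version: 0.1.0
tags: [stable]
inputs:
  base_url:
    type: string
    description: Base URL of the API under test.
---

You are an API tester. Probe {{ base_url }} and report each endpoint's status.
```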

**3) Decision Traces**
- Structured logs for "why did the agent do that?"
- Supports replay and regression debugging.
- Details: [memory redaction](docs/memory-redaction.md) (privacy) and trace CLI examples below.
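
A trace entry carries fields like those the trace CLI accepts (`decision`, `skill-ref`, `reason`, `confidence`, `result`, `tx-id`). A minimal sketch of such a record, with field names mirroring the CLI flags rather than the library's internal schema:

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class DecisionTrace:
    # Field names mirror the CLI flags; the library's schema may differ.
    decision: str
    skill_ref: str
    reason: str
    confidence: float
    result: str
    tx_id: str


trace = DecisionTrace(
    decision="execute_skill",
    skill_ref="deploy_fastapi@1.0.0",
    reason="deploy requested",
    confidence=0.9,
    result="success",
    tx_id="tx-001",
)
# One structured JSON line per decision: queryable later for replay/debugging.
print(json.dumps(asdict(trace)))
```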

**4) Bounded Context Overhead**
- Injection is capped by default (`AdapterPipeline(max_injected_tokens=1000)`).
- Keeps prompt size predictable instead of dumping full histories.
- Details: [context budgeting](docs/memory-consumption.md).
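
The budgeting idea is simple: stop injecting once a token cap is hit. A rough sketch of that greedy cutoff, where token counting is a naive word split purely for illustration:

```python
def inject_with_budget(snippets, max_injected_tokens=1000):
    """Greedily include memory snippets until the token budget is spent."""
    included, used = [], 0
    for snippet in snippets:
        cost = len(snippet.split())  # naive token estimate for illustration
        if used + cost > max_injected_tokens:
            break  # keep the prompt bounded instead of dumping everything
        included.append(snippet)
        used += cost
    return included


memories = ["user prefers dark mode", "project uses FastAPI", "long history " * 600]
assert inject_with_budget(memories, max_injected_tokens=10) == memories[:2]
```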

---

## When Agent Artifacts Shines
- You maintain **a library of agent prompts** and want versioning + metadata.
- You run **repeatable workflows** (deploys, QA checks, data extraction).
- You need **auditability** for production agent behavior.
- You want **bounded context overhead** instead of raw history dumps.

---

## Benefits & Use-Cases

See [benefits and use-cases](docs/benefits-use-cases.md) for persona-based examples (solo devs, vibe coders,
production teams) and modular adoption guidance.

---

## Integrations

- LangGraph adapter (stable): [LangGraph adapter guide](docs/langgraph-adapter.md)
- LangChain adapter (stable): [LangChain adapter guide](docs/langchain-adapter.md)
- MCP server (stdio + HTTP/SSE): [MCP server docs](docs/mcp.md)
- MCP client guides: [MCP client setup](docs/mcp-clients.md)

---

## Docs (Start Here)

Documentation index: [Docs (start here)](docs/README.md)

---

## CLI Quickstart

```bash
agent-artifacts import examples/prompt_skills/api-tester.md --name api_tester --version 0.1.0
agent-artifacts list
agent-artifacts run api_tester@0.1.0 --inputs '{"base_url": "https://api.example.com", "endpoints": ["/health"]}'
```

Full CLI + injection examples: [CLI reference](docs/cli.md).

### Skill tool integration (callable by LLMs)

Expose skills as tool/function definitions and execute tool calls:

```python
from agent_artifacts.skills import (
    SkillToolConfig,
    SkillToolRegistry,
    SkillQueryConfig,
    execute_tool_call,
)

# `storage` is your configured artifact store (see docs/storage-service.md)
registry = SkillToolRegistry.from_storage(
    storage,
    query=SkillQueryConfig(tags=["stable"]),
    config=SkillToolConfig(name_strategy="name_version"),
)

tool_defs = registry.definitions()  # pass to your LLM runtime as tool/function specs

# ...when the model calls a tool:
result = execute_tool_call(storage, tool_name, tool_arguments, registry=registry)
print(result.to_dict())
```

Provider-specific tool adapters and SDK call examples live in [tool adapters](docs/tools.md).

Tools quickstart:

- Build tool definitions with `SkillToolRegistry.from_storage(...)`
- Convert them for your provider via `to_openai_tools` / `to_anthropic_tools` / `to_gemini_tools`
- Execute tool calls with `execute_tool_call(...)`
- Auto-run from model responses with `auto_execute_with_model(...)` (see [tool adapters](docs/tools.md))
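
Independent of the SDK helpers above, the conversion itself is straightforward: a skill's `inputs` JSON Schema maps onto a function-tool spec. A hand-rolled sketch of the shape (illustrative only; the library's `to_openai_tools` does this for you, and its exact output may differ):

```python
def skill_to_openai_tool(name, version, description, inputs):
    """Map a skill's JSON-Schema inputs onto an OpenAI-style function tool.

    Hypothetical sketch; mirrors the "name_version" naming strategy.
    """
    return {
        "type": "function",
        "function": {
            "name": f"{name}_{version.replace('.', '_')}",
            "description": description,
            "parameters": {
                "type": "object",
                "properties": inputs,
                "required": list(inputs),
            },
        },
    }


tool = skill_to_openai_tool(
    "api_tester", "0.1.0", "Probe API endpoints.",
    {"base_url": {"type": "string", "description": "Base URL of the API."}},
)
assert tool["function"]["name"] == "api_tester_0_1_0"
```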

Runnable demo (no external SDKs required):

```bash
python examples/skills/tool_adapters_demo.py
```

### MCP server (stdio)

Expose skills + memory + traces via Model Context Protocol:

```bash
agent-artifacts-mcp --backend sqlite --db ~/.agent-artifacts/agent-artifacts.db
```

HTTP/SSE transport (optional):

```bash
agent-artifacts-mcp-http --host 127.0.0.1 --port 8001 --backend sqlite --db ~/.agent-artifacts/agent-artifacts.db
```

#### MCP quickstart (60 seconds)

Copy/paste MCP client config (Cursor, Claude Desktop, etc.):

```json
{
  "mcpServers": {
    "agent-artifacts": {
      "command": "agent-artifacts-mcp",
      "args": ["--backend", "sqlite", "--db", "/path/to/agent-artifacts.db"]
    }
  }
}
```

60-second smoke test (HTTP): see [MCP HTTP demo](examples/mcp/README.md).

See [MCP server docs](docs/mcp.md) for tool inventory and request/response examples.
Client setup: [MCP clients](docs/mcp-clients.md).
Cursor quickstart config: [Cursor config template](configs/mcp-cursor.json) and [Cursor guide](docs/mcp-cursor.md).
Windsurf and Claude guides: [Windsurf guide](docs/mcp-windsurf.md) and [Claude guide](docs/mcp-claude.md).
Compatibility matrix and example app: [MCP compatibility](docs/mcp-compatibility.md) and [MCP examples](examples/mcp/).

Prompt skills surfaced via MCP include argument metadata. If your skill `inputs` use JSON Schema
fields like `description` (or `title`), MCP clients can render richer prompt UIs:

```yaml
inputs:
  text:
    type: string
    description: Text to summarize.
```

```bash
# Decision traces + audit journal
agent-artifacts trace log --decision execute_skill --skill-ref deploy_fastapi@1.0.0 --reason "deploy requested" --confidence 0.9 --result success --tx-id <tx_id>
agent-artifacts trace query --decision execute_skill --limit 50
agent-artifacts trace query --skill-ref deploy_fastapi@1.0.0 --result success --created-after 2026-01-01T00:00:00Z --correlation-id corr-123
agent-artifacts journal query --tx-id <tx_id> --limit 50 --show-payload
agent-artifacts journal query --tx-id <tx_id> --limit 10 --format json
agent-artifacts replay --tx-id <tx_id> --limit 50 --show-payload
```

CLI run retries/timeouts:

```bash
agent-artifacts run deploy_fastapi@1.0.0 \
  --inputs '{"repo_path": ".", "retries": 1}' \
  --max-attempts 3 --retry-on timeout,failure --backoff-ms 250,500 \
  --total-timeout-s 60 --step-timeout-s 10 \
  --idempotency-key deploy-2026-01-27-001 \
  --trace-inputs-preview --trace-output-preview --trace-preview-max-chars 120
```

### Config file (optional)

Configuration can be stored in `~/.agent-artifacts/agent-artifacts.yaml` (or `AGENT_ARTIFACTS_CONFIG`) with
precedence: CLI args > env vars > config file > defaults.

Starter template: [config template](configs/agent-artifacts.yaml).

```yaml
storage:
  backend: sqlite
  db: ~/.agent-artifacts/agent-artifacts.db
  backend_config: {}
```
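
The precedence chain (CLI args > env vars > config file > defaults) amounts to a first-non-None lookup across sources. A minimal sketch; the `AGENT_ARTIFACTS_` env-var prefix is an assumption for illustration (the docs only confirm `AGENT_ARTIFACTS_CONFIG`):

```python
import os


def resolve(key, cli_args, config, defaults, env_prefix="AGENT_ARTIFACTS_"):
    """Return the first value found in: CLI args, env vars, config file, defaults."""
    for value in (
        cli_args.get(key),
        os.environ.get(env_prefix + key.upper()),
        config.get(key),
        defaults.get(key),
    ):
        if value is not None:
            return value
    return None


defaults = {"backend": "sqlite"}
config = {"backend": "postgres"}
# Config file overrides defaults; CLI args override everything.
assert resolve("backend", cli_args={}, config=config, defaults=defaults) == "postgres"
assert resolve("backend", cli_args={"backend": "sqlite"}, config=config, defaults=defaults) == "sqlite"
```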

Inspect the resolved configuration and sources:

```bash
agent-artifacts config show --format json
```

---

## Storage + Python API

Storage service, Postgres backend setup, and programmatic API examples live in:
[storage service docs](docs/storage-service.md) and [Python API](docs/python-api.md).

---

## Examples

- LangGraph starter app: [starter app README](examples/langgraph/starter_app/README.md)
- LangGraph migration guide: [migration guide](docs/langgraph-migration.md)
- LangGraph demos: [demo](examples/langgraph/demo.py), [parallel demo](examples/langgraph/parallel_demo.py), [reference demo](examples/langgraph/reference_demo.py)
- LangGraph ergonomics RFC: [ergonomics RFC](docs/rfcs/0001-langgraph-adapter-ergonomics.md)
- Benchmark harness: [benchmarks](benchmarks/README.md)
- Docs index: [Docs (start here)](docs/README.md)
- Examples index: [examples/README.md](examples/README.md)

---

## Contributing

Contributions are welcome. See [contributing guide](docs/contributing.md).

Open items we would love help with:
- LeTTA / memU / ReMe adapters
- Adapter compatibility notes + deprecation policy
- Adapter conformance tests in CI
- Memory pollution benchmark + trace replay regression tests

---

## FAQ

### Q: Is this just another RAG memory system?

No. Agent Artifacts focuses on:
- transactional memory correctness
- procedural skills as artifacts
- decision trace auditability

### Q: Why "Agent Artifacts"?
Because it describes the core idea: reusable agent behavior stored as versioned artifacts.

---

## License

MIT. See `LICENSE`.
