Metadata-Version: 2.4
Name: openterms-py
Version: 0.3.1
Summary: Python SDK for the OpenTerms Protocol — query machine-readable AI agent permissions.
Author-email: OpenTerms <hello@openterms.com>
License: MIT
Project-URL: Homepage, https://openterms.com
Project-URL: Documentation, https://openterms.com/docs
Project-URL: Repository, https://github.com/openterms/openterms-py
Project-URL: Bug Tracker, https://github.com/openterms/openterms-py/issues
Keywords: openTerms,ai,agents,permissions,llm,langchain,crewai
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Typing :: Typed
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: requests>=2.28
Provides-Extra: async
Requires-Dist: httpx>=0.24; extra == "async"
Provides-Extra: dev
Requires-Dist: pytest>=7; extra == "dev"
Requires-Dist: pytest-cov; extra == "dev"
Requires-Dist: responses>=0.23; extra == "dev"
Requires-Dist: mypy>=1.0; extra == "dev"
Requires-Dist: ruff>=0.1; extra == "dev"
Dynamic: license-file

# openterms-py

Python SDK for the [OpenTerms Protocol](https://openterms.com).

Query machine-readable AI agent permissions from `openterms.json` files before
your agent acts on a domain.

```bash
pip install openterms-py
```

---

## Core API

```python
import openterms

# Fetch the full openterms.json (cached in memory, TTL 1h by default)
terms = openterms.fetch("github.com")

# Check a single permission
result = openterms.check("github.com", "api_access")
# result.decision → "allow" | "deny" | "not_specified"
# bool(result) → True when decision is "allow"

# Get the discovery block (MCP servers, OpenAPI specs)
disc = openterms.discover("github.com")

# Generate a local compliance receipt
rec = openterms.receipt("github.com", "api_access", result.decision)
print(rec.to_dict())
```

---

## Installation

Requires Python 3.9+ and [`requests`](https://pypi.org/project/requests/)
(installed automatically).

```bash
pip install openterms-py
```

Optional async support via [`httpx`](https://pypi.org/project/httpx/):

```bash
pip install "openterms-py[async]"
```

---

## Functions

### `fetch(domain) → dict | None`

Fetches `/.well-known/openterms.json` from the domain, falling back to
`/openterms.json`. Returns the parsed JSON dict or `None` if unreachable.
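
The lookup order described above can be sketched as a small helper. This is an illustration only: `candidate_urls` is a hypothetical name, and the use of `https://` is an assumption rather than something the SDK documents here.

```python
from typing import List


def candidate_urls(domain: str) -> List[str]:
    """Return the two lookup locations in the order they would be tried."""
    # Assumption: the file is requested over HTTPS from the bare domain.
    base = f"https://{domain.strip().rstrip('/')}"
    return [
        f"{base}/.well-known/openterms.json",  # preferred location
        f"{base}/openterms.json",              # fallback location
    ]


print(candidate_urls("stripe.com"))
```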

Results are cached in memory. The TTL is taken from the server's
`Cache-Control: max-age=N` header when present; otherwise the configured
default (3600s) applies.

```python
terms = openterms.fetch("stripe.com")
if terms:
    print(terms.get("service"))
    print(terms.get("permissions"))
```
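
To make the TTL rule concrete, here is a minimal sketch of deriving a cache lifetime from a `Cache-Control` header. The helper name `ttl_from_cache_control` and the regular expression are assumptions for illustration; the SDK's own parsing may differ.

```python
import re
from typing import Optional

# Matches a max-age directive at the start of the header or after a comma/space,
# so that "s-maxage" is not mistaken for it.
_MAX_AGE = re.compile(r"(?:^|[\s,])max-age\s*=\s*\"?(\d+)", re.IGNORECASE)


def ttl_from_cache_control(header: Optional[str], default_ttl: int = 3600) -> int:
    """Return max-age in seconds, or default_ttl when the directive is missing."""
    if not header:
        return default_ttl
    match = _MAX_AGE.search(header)
    return int(match.group(1)) if match else default_ttl


print(ttl_from_cache_control("public, max-age=600"))
print(ttl_from_cache_control(None))
```

Directives such as `no-store` are ignored in this sketch and simply fall back to the default.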

---

### `check(domain, action) → CheckResult`

Returns the decision for a single permission key: `"allow"`, `"deny"`, or
`"not_specified"`. The result evaluates to `True` in boolean context only
when the decision is `"allow"`.

```python
result = openterms.check("stripe.com", "api_access")

if result:
    print("Access allowed")
else:
    print(f"Blocked: {result.decision}")  # "deny" or "not_specified"

# Access all fields
print(result.domain)     # "stripe.com"
print(result.action)     # "api_access"
print(result.decision)   # "allow" | "deny" | "not_specified"
print(result.raw_value)  # the raw value from permissions block
print(result.source)     # "cache" | "network"
```

Canonical permission keys: `read_content`, `scrape_data`, `api_access`,
`create_account`, `make_purchases`, `post_content`, `allow_training`.
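
The three decisions can be illustrated with a small stand-alone function. This is a sketch under stated assumptions: `decide` is a hypothetical helper, and it treats only the booleans `True` and `False` as explicit answers. How the SDK interprets richer values (for example a dict) is not covered here.

```python
from typing import Any, Mapping, Optional


def decide(permissions: Optional[Mapping[str, Any]], action: str) -> str:
    """Map a raw permissions block to 'allow', 'deny', or 'not_specified'."""
    if not permissions:
        return "not_specified"  # no file, or no permissions block
    value = permissions.get(action)
    if value is True:
        return "allow"
    if value is False:
        return "deny"
    # Assumption: missing keys, null, and non-boolean values are all unspecified.
    return "not_specified"


perms = {"api_access": True, "scrape_data": False, "allow_training": None}
for key in ("api_access", "scrape_data", "allow_training", "post_content"):
    print(key, "->", decide(perms, key))
```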

---

### `discover(domain) → DiscoveryResult | None`

Returns the `discovery` block from the domain's `openterms.json`, or
`None` if absent.

```python
disc = openterms.discover("acme-corp.com")
if disc:
    for server in disc.mcp_servers:
        print(server.url, server.transport, server.description)
    for spec in disc.api_specs:
        print(spec.url, spec.type)
```

`DiscoveryResult` fields:
- `mcp_servers` — list of `McpServer(url, transport, description)`
- `api_specs` — list of `ApiSpec(url, type, description)`
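
As a rough picture of how these fields could be populated, the sketch below parses a `discovery` block into local stand-in dataclasses. The JSON key names (`mcp_servers`, `api_specs`) are assumed to mirror the field names above, and `parse_discovery` is a hypothetical helper, not part of the SDK.

```python
from dataclasses import dataclass, field
from typing import Any, List, Mapping, Optional


@dataclass
class McpServer:  # local stand-in for the SDK model of the same name
    url: str
    transport: str
    description: Optional[str] = None


@dataclass
class ApiSpec:  # local stand-in for the SDK model of the same name
    url: str
    type: str
    description: Optional[str] = None


@dataclass
class DiscoveryResult:
    mcp_servers: List[McpServer] = field(default_factory=list)
    api_specs: List[ApiSpec] = field(default_factory=list)


def parse_discovery(terms: Mapping[str, Any]) -> Optional[DiscoveryResult]:
    """Build a DiscoveryResult from a parsed openterms.json dict, or None if absent."""
    block = terms.get("discovery")
    if not block:
        return None
    return DiscoveryResult(
        mcp_servers=[
            McpServer(s["url"], s["transport"], s.get("description"))
            for s in block.get("mcp_servers", [])
        ],
        api_specs=[
            ApiSpec(s["url"], s["type"], s.get("description"))
            for s in block.get("api_specs", [])
        ],
    )


sample = {
    "discovery": {
        "mcp_servers": [{"url": "https://acme-corp.com/mcp/sse", "transport": "sse"}],
        "api_specs": [{"url": "https://acme-corp.com/openapi.json", "type": "openapi_3"}],
    }
}
disc = parse_discovery(sample)
print(disc.mcp_servers[0].transport, disc.api_specs[0].type)
```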

---

### `receipt(domain, action, decision) → Receipt`

Generates a minimal ORS compliance receipt. Local artifact only — nothing
is sent to any server.

```python
result = openterms.check("github.com", "scrape_data")
rec = openterms.receipt("github.com", "scrape_data", result.decision)

print(rec.to_dict())
# {
#   "domain": "github.com",
#   "action": "scrape_data",
#   "decision": "deny",
#   "timestamp": "2026-04-11T10:40:00Z",
#   "openterms_hash": "a3f2...c91d"
# }

# Log it, write to a file, store in your DB — your choice
import json
with open("compliance_log.jsonl", "a") as f:
    f.write(json.dumps(rec.to_dict()) + "\n")
```
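
For readers curious how such a receipt could be assembled locally, here is a minimal sketch. It assumes the hash is a SHA-256 digest over a canonical JSON serialisation of the fetched terms; the SDK may hash the raw response bytes or something else entirely, and `sketch_receipt` is a hypothetical name.

```python
import hashlib
import json
from datetime import datetime, timezone
from typing import Any, Dict, Mapping, Optional


def sketch_receipt(
    domain: str,
    action: str,
    decision: str,
    terms: Optional[Mapping[str, Any]],
) -> Dict[str, str]:
    """Build a receipt-shaped dict without contacting any server."""
    if terms is None:
        digest = ""  # mirrors the documented "empty if domain was unreachable"
    else:
        # Assumption: sort keys so logically equal documents hash identically.
        canonical = json.dumps(terms, sort_keys=True, separators=(",", ":"))
        digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return {
        "domain": domain,
        "action": action,
        "decision": decision,
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "openterms_hash": digest,
    }


print(sketch_receipt("github.com", "scrape_data", "deny", {"permissions": {"scrape_data": False}}))
```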

---

### `configure(default_ttl, timeout, user_agent)`

Adjusts the shared client settings and clears the existing cache.

```python
openterms.configure(
    default_ttl=600,   # 10-minute cache
    timeout=5,         # 5-second HTTP timeout
)
```

### `clear_cache(domain=None)`

Flush cached entries. Pass a domain to evict a single entry, or call with
no args to flush everything.

```python
openterms.clear_cache("github.com")  # evict one domain
openterms.clear_cache()              # flush all
```
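
The caching behaviour described in this section (per-domain entries, a TTL, single-entry eviction, and a full flush) can be pictured with a small in-memory structure. `TtlCache` below is a hypothetical illustration, not the SDK's `TermsCache`; the injectable clock exists only to make expiry easy to test.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TtlCache:
    """Per-domain in-memory cache whose entries expire after a TTL in seconds."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def set(self, domain: str, value: Any, ttl: float) -> None:
        self._entries[domain] = (self._clock() + ttl, value)

    def get(self, domain: str) -> Optional[Any]:
        entry = self._entries.get(domain)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[domain]  # drop the stale entry on read
            return None
        return value

    def clear(self, domain: Optional[str] = None) -> None:
        if domain is None:
            self._entries.clear()           # flush everything
        else:
            self._entries.pop(domain, None)  # evict a single domain


cache = TtlCache()
cache.set("github.com", {"permissions": {}}, ttl=600)
cache.clear("github.com")
print(cache.get("github.com"))
```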

---

## Plain Python example

No framework, just a permission gate before an HTTP call.

```python
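from typing import Optional

import requests
import openterms

TARGET_DOMAIN = "data-provider.com"

# Optional[dict] keeps the annotation valid on Python 3.9 (`dict | None` needs 3.10+).
def fetch_data_if_permitted(url: str) -> Optional[dict]: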
    result = openterms.check(TARGET_DOMAIN, "api_access")

    # Record the decision
    rec = openterms.receipt(TARGET_DOMAIN, "api_access", result.decision)
    print("Receipt:", rec.to_dict())

    if not result:
        print(f"api_access is {result.decision} for {TARGET_DOMAIN}. Aborting.")
        return None

    resp = requests.get(url, timeout=10)
    resp.raise_for_status()
    return resp.json()


data = fetch_data_if_permitted("https://data-provider.com/api/items")
```

---

## LangChain integration

Gate a web-interaction tool behind an OpenTerms permission check.

### Option 1 — Custom Tool with permission guard

```python
from langchain_core.tools import tool
import openterms

@tool
def fetch_page_content(url: str) -> str:
    """Fetch the text content of a web page.

    Only proceeds if the domain's openterms.json permits scraping.
    """
    from urllib.parse import urlparse
    import requests

    domain = urlparse(url).hostname or url

    result = openterms.check(domain, "scrape_data")

    # Log the compliance receipt
    rec = openterms.receipt(domain, "scrape_data", result.decision)
    print(f"[OpenTerms] receipt: {rec.to_dict()}")

    if not result:
        return (
            f"Cannot fetch {url}: scrape_data is '{result.decision}' "
            f"for {domain} per their openterms.json."
        )

    resp = requests.get(url, timeout=10)
    resp.raise_for_status()
    return resp.text[:4000]


# Use in an agent
from langchain_anthropic import ChatAnthropic
from langgraph.prebuilt import create_react_agent

llm = ChatAnthropic(model="claude-3-5-sonnet-20241022")
agent = create_react_agent(llm, tools=[fetch_page_content])

result = agent.invoke({
    "messages": [("user", "Summarise the content at https://example.com")]
})
```
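
One detail worth checking in isolation is the domain extraction. `urlparse(url).hostname` lowercases the host and drops any port, but it returns `None` for a scheme-less input such as `example.com/pricing`, in which case the `or url` fallback passes the whole string through as the "domain". The sketch below is a slightly more defensive variant; `domain_from_url` is a hypothetical helper and the choice to assume `https://` is an illustration, not SDK behaviour.

```python
from urllib.parse import urlparse


def domain_from_url(url: str) -> str:
    """Extract a bare hostname, tolerating inputs that lack a scheme."""
    candidate = url.strip()
    if "://" not in candidate:
        # Assumption: scheme-less input is meant as an HTTPS URL.
        candidate = "https://" + candidate
    host = urlparse(candidate).hostname
    if not host:
        raise ValueError(f"Cannot determine a domain from {url!r}")
    return host


print(domain_from_url("https://Example.com:8443/path"))
print(domain_from_url("example.com/pricing"))
```

Raising on unparseable input keeps the permission check from being run against a string that is not a domain.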

### Option 2 — Pre-action callback on any browser tool

Wrap an existing tool class to inject the permission check transparently:

```python
from langchain_core.tools import BaseTool
from langchain_core.callbacks import CallbackManagerForToolRun
from typing import Optional, Type, Any
from pydantic import BaseModel
import openterms


class OpenTermsGuard(BaseTool):
    """Wraps any web tool and gates execution on OpenTerms permission."""

    name: str = "openterms_guarded_browser"
    description: str = "Fetch a URL, checking OpenTerms permissions first."
    permission: str = "scrape_data"
    wrapped_tool: Any  # the underlying LangChain browser/fetch tool

    class ArgsSchema(BaseModel):
        url: str

    args_schema: Type[BaseModel] = ArgsSchema

    def _run(
        self,
        url: str,
        run_manager: Optional[CallbackManagerForToolRun] = None,
    ) -> str:
        from urllib.parse import urlparse
        domain = urlparse(url).hostname or url
        result = openterms.check(domain, self.permission)

        rec = openterms.receipt(domain, self.permission, result.decision)
        print(f"[OpenTerms] {rec.to_dict()}")

        if not result:
            return (
                f"Blocked by OpenTerms: '{self.permission}' is "
                f"'{result.decision}' for {domain}."
            )
        return self.wrapped_tool.run(url)


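# Usage
# `BrowserTool` is a placeholder for whichever URL-fetching LangChain tool you use;
# substitute the class and import path that match your installation.
# from langchain_community.tools import BrowserTool
# browser = BrowserTool()
# guarded = OpenTermsGuard(wrapped_tool=browser)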
```

### Option 3 — Discover MCP servers for a domain before connecting

```python
import openterms

def get_mcp_servers_for_domain(domain: str) -> list[dict]:
    disc = openterms.discover(domain)
    if not disc:
        return []
    return [
        {"url": s.url, "transport": s.transport}
        for s in disc.mcp_servers
    ]

servers = get_mcp_servers_for_domain("acme-corp.com")
# [{"url": "https://acme-corp.com/mcp/sse", "transport": "sse"}]
```

---

## CrewAI integration

Gate a CrewAI agent's web tasks behind OpenTerms permission checks.

### Option 1 — Custom tool for CrewAI

```python
from crewai.tools import BaseTool
import openterms
import requests


class OpenTermsWebTool(BaseTool):
    name: str = "web_fetch_with_permissions"
    description: str = (
        "Fetch content from a URL. "
        "Checks the domain's OpenTerms permissions before proceeding. "
        "Returns an error string if the domain denies the requested action."
    )
    permission: str = "scrape_data"
    # fail_closed: if True (default), blocks on null / no_openterms_json / low-confidence.
    # Set fail_closed=False only when you explicitly want permissive fallback.
    fail_closed: bool = True

    def _run(self, url: str) -> str:
        from urllib.parse import urlparse
        domain = urlparse(url).hostname or url

        result = openterms.check(domain, self.permission)

        # Store receipt for audit trail
        rec = openterms.receipt(domain, self.permission, result.decision)
        # In production: write rec.to_dict() to your audit log

        # An explicit deny always blocks, regardless of fail_closed.
        if result.decision == "deny":
            return (
                f"Blocked by OpenTerms: '{self.permission}' is denied for {domain}."
            )

        # Fail-closed default: block unless explicitly allowed.
        # not_specified covers: null, no openterms.json found, unknown action.
        if not result and self.fail_closed:
            return (
                f"Blocked by OpenTerms (fail-closed): '{self.permission}' "
                f"is '{result.decision}' for {domain}. "
                "Publish an openterms.json to permit this action."
            )
        # Permissive fallback: only reached when fail_closed=False and the
        # decision is not_specified. The caller accepts responsibility for
        # proceeding without explicit permission.

        resp = requests.get(url, timeout=10)
        resp.raise_for_status()
        return resp.text[:4000]


# Use in a CrewAI agent
from crewai import Agent, Task, Crew

web_tool = OpenTermsWebTool(permission="scrape_data")  # fail_closed=True by default

researcher = Agent(
    role="Web Researcher",
    goal="Research topics from web sources that permit scraping.",
    backstory="You respect site permissions and only access allowed content.",
    tools=[web_tool],
    verbose=True,
)

task = Task(
    description="Find and summarise pricing information from competitor websites.",
    expected_output="A bullet-point comparison of competitor pricing.",
    agent=researcher,
)

crew = Crew(agents=[researcher], tasks=[task], verbose=True)
result = crew.kickoff()
```

### Option 2 — Callback hook for CrewAI task lifecycle

Run a pre-flight check before calling `crew.kickoff()` to validate every domain a task will touch:

```python
from crewai import Crew, Agent, Task, Process
import openterms


def check_domain_permissions(
    domains: list[str],
    action: str,
    fail_closed: bool = True,
) -> dict[str, str]:
    """
    Returns {domain: decision} for all domains.

    Fail-closed by default:
    - Raises PermissionError if any domain explicitly denies the action.
    - Raises PermissionError if any domain has no openterms.json (not_specified).
      During public alpha, no_openterms_json is expected for most domains —
      this is not an error condition, but it blocks by default.
      Pass fail_closed=False to allow proceeding when policy is absent.

    Only pass fail_closed=False when you explicitly accept proceeding without
    explicit permission from the target domain.
    """
    results = {}
    blocked = []
    for domain in domains:
        r = openterms.check(domain, action)
        results[domain] = r.decision
        if r.decision == "deny":
            blocked.append((domain, "denied"))
        elif r.decision == "not_specified" and fail_closed:
            # not_specified covers: null, no openterms.json, unknown action.
            # During public alpha, many domains haven't published yet —
            # blocks or escalates by default; permissive fallback must be explicit.
            blocked.append((domain, "no_openterms_json / not_specified"))
    if blocked:
        details = ", ".join(f"{d} ({reason})" for d, reason in blocked)
        raise PermissionError(
            f"OpenTerms: {action!r} blocked for: {details}. "
            "Pass fail_closed=False to allow proceeding without explicit permission."
        )
    return results


# Before running your Crew, validate the target domains
target_domains = ["competitor-a.com", "competitor-b.com"]

try:
    permissions = check_domain_permissions(target_domains, "scrape_data")
    print(f"All domains permitted: {permissions}")
    # safe to proceed
    # crew.kickoff(...)
except PermissionError as e:
    print(f"Aborting: {e}")

# Permissive fallback — only when you explicitly accept proceeding without permission:
# permissions = check_domain_permissions(target_domains, "scrape_data", fail_closed=False)
```

### Option 3 — API discovery for CrewAI MCP tool selection

```python
import openterms
from crewai import Agent

def build_agent_for_domain(domain: str) -> Agent:
    disc = openterms.discover(domain)

    tools = []
    if disc and disc.api_specs:
        # Dynamically load tools from discovered OpenAPI specs
        for spec in disc.api_specs:
            print(f"Found API spec: {spec.url} ({spec.type})")
            # Load spec and generate tools here (e.g. via openapi-core)

    return Agent(
        role="Domain Specialist",
        goal=f"Interact with {domain} using its declared API.",
        backstory=f"You have been given the API specs for {domain}.",
        tools=tools,
    )
```

### Option 4 — Guarded-wrapper pattern (fail-closed by default)

Use this pattern to wrap any downstream CrewAI tool so the OpenTerms check
runs as a gate before the tool executes. Execution is **blocked by default**
unless the check returns `allow`. Permissive fallback requires explicit opt-in.

```python
from crewai.tools import BaseTool
from typing import Any, Type
from pydantic import BaseModel
import openterms


class OpenTermsGuardedTool(BaseTool):
    """
    Wraps any CrewAI BaseTool and gates execution on an OpenTerms permission check.

    Fail-closed by default:
    - allowed  → downstream tool executes
    - denied   → raises PermissionError, tool does not execute
    - not_specified (includes no_openterms_json, null, unknown) → raises PermissionError
      unless fail_closed=False is set explicitly

    During public alpha, no_openterms_json is expected for most domains.
    It is not an error, but it blocks by default. Set fail_closed=False only
    when you explicitly accept proceeding without an openterms.json.
    """

    name: str = "openterms_guarded_tool"
    description: str = "Executes the wrapped tool only if OpenTerms permits the action."
    permission: str = "scrape_data"
    fail_closed: bool = True  # block on null / no_openterms_json by default
    wrapped_tool: Any        # the downstream BaseTool to guard

    class ArgsSchema(BaseModel):
        url: str

    args_schema: Type[BaseModel] = ArgsSchema

    def _run(self, url: str) -> str:
        from urllib.parse import urlparse
        domain = urlparse(url).hostname or url

        result = openterms.check(domain, self.permission)
        rec = openterms.receipt(domain, self.permission, result.decision)
        # In production: persist rec.to_dict() to your audit log

        if result.decision == "allow":
            return self.wrapped_tool.run(url)
        elif result.decision == "deny":
            raise PermissionError(
                f"OpenTerms: '{self.permission}' is denied for {domain}. "
                "Not proceeding."
            )
        else:
            # not_specified: null, no openterms.json, or unknown action
            if self.fail_closed:
                raise PermissionError(
                    f"OpenTerms: '{self.permission}' is not_specified for {domain} "
                    "(no openterms.json or null value — expected during public alpha). "
                    "Blocked by default. Set fail_closed=False to allow proceeding "
                    "without explicit permission."
                )
            # Permissive fallback — only when fail_closed=False is explicit
            return self.wrapped_tool.run(url)


# Usage — fail-closed (default): blocks on no_openterms_json
# `BrowserTool` is a placeholder; import whichever BaseTool subclass you want to guard.
# browser = BrowserTool()
# guarded = OpenTermsGuardedTool(
#     wrapped_tool=browser,
#     permission="scrape_data",
#     fail_closed=True,   # explicit — this is the default
# )

# Permissive fallback — only use when you accept proceeding without permission:
# guarded_permissive = OpenTermsGuardedTool(
#     wrapped_tool=browser,
#     permission="scrape_data",
#     fail_closed=False,  # explicit opt-in required
# )
```

---

## Fail-closed defaults

**The integration examples are fail-closed by default.** Only `allow` proceeds.
`deny` always blocks, and the remaining results block unless `fail_closed=False`
is explicitly passed:

| Result | Meaning | Default behavior |
|--------|---------|-----------------|
| `allow` | Permission explicitly granted | Proceed |
| `deny` | Permission explicitly denied | Block (always) |
| `not_specified` | Key absent, null value, or low-confidence | Block (fail-closed) |
| `no_openterms_json` | Domain hasn't published yet — expected during public alpha | Block (fail-closed) |

`not_specified` is returned when:
- The domain has no `openterms.json` (most domains during public alpha)
- The permission key is absent from the file
- The value is `null` / `None`
- An unknown action was requested

This is not an error. During public alpha, a missing `openterms.json` (the
`no_openterms_json` case, reported as `not_specified`) is the expected outcome
for the majority of domains. Your agent must decide how to handle it.
The canonical decision is to block or escalate to a human. Proceeding anyway
requires `fail_closed=False` and is an explicit choice the caller must make.
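
The table and rules above reduce to a few lines of logic. The sketch below is one way to express them; `should_proceed` is a hypothetical helper, and its choice to block any unrecognised decision string is an added assumption in the fail-closed spirit.

```python
def should_proceed(decision: str, fail_closed: bool = True) -> bool:
    """Return True only when the gate should let the action run."""
    if decision == "allow":
        return True
    if decision == "deny":
        return False  # an explicit deny is never overridden
    if decision == "not_specified":
        return not fail_closed  # proceeding requires an explicit opt-in
    return False  # assumption: unknown decision strings are blocked


for decision in ("allow", "deny", "not_specified"):
    print(decision, should_proceed(decision), should_proceed(decision, fail_closed=False))
```

Keeping this logic in one function means every tool wrapper applies the same policy, and the permissive path is visible at each call site.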

**Canonical permission keys** (the only documented keys):

- `read_content`
- `scrape_data`
- `api_access`
- `create_account`
- `make_purchases`
- `post_content`
- `allow_training`

---

## Models reference

```python
# CheckResult
result.domain      # str
result.action      # str
result.decision    # "allow" | "deny" | "not_specified"
result.raw_value   # Any — the raw permissions value (bool, dict, None)
result.source      # "cache" | "network"
bool(result)       # True iff decision == "allow"

# DiscoveryResult
disc.mcp_servers   # list[McpServer]
disc.api_specs     # list[ApiSpec]

# McpServer
server.url          # str
server.transport    # str  ("sse" | "stdio" | "streamable-http")
server.description  # str | None

# ApiSpec
spec.url            # str
spec.type           # str  ("openapi_3" | "swagger_2" | "graphql_schema")
spec.description    # str | None

# Receipt
rec.domain          # str
rec.action          # str
rec.decision        # "allow" | "deny" | "not_specified"
rec.timestamp       # str  (ISO 8601 UTC)
rec.openterms_hash  # str  (SHA-256 hex, empty if domain was unreachable)
rec.to_dict()       # → dict
```

---

## Advanced configuration

```python
import openterms

# Shorter cache, stricter timeout
openterms.configure(default_ttl=300, timeout=5)

# Per-request: bypass cache by clearing first
openterms.clear_cache("github.com")
result = openterms.check("github.com", "api_access")

# Use your own client instance (e.g. for testing with a mock cache)
from openterms.client import OpenTermsClient
from openterms.cache import TermsCache

custom_cache = TermsCache()
client = OpenTermsClient(default_ttl=0, cache=custom_cache)
result = client.check("github.com", "api_access")
```

---

## License

MIT
