Metadata-Version: 2.4
Name: hb-eval-sdk
Version: 2.1.0
Summary: HB-Eval SDK for reliable agent evaluation, semantic memory, and LangChain integration
Author-email: Abuelgasim Mohamed Ibrahim Adam <abuelgasim.hbeval@outlook.com>
License: MIT
Project-URL: Homepage, https://github.com/hb-evalSystem/HB-System
Project-URL: Documentation, https://github.com/hb-evalSystem/HB-System/blob/main/docs
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Intended Audience :: Developers
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.9
Description-Content-Type: text/markdown
Requires-Dist: requests>=2.28.0
Requires-Dist: cryptography>=41.0.0
Requires-Dist: pydantic>=2.0.0
Provides-Extra: langchain
Requires-Dist: langchain-core>=0.1.0; extra == "langchain"
Provides-Extra: dev
Requires-Dist: pytest>=7.0.0; extra == "dev"
Requires-Dist: flake8>=6.0.0; extra == "dev"

# HB-Eval SDK

The official Python SDK for HB-Eval OS, the reliability operating system for
agentic AI. Evaluate any agent trajectory against five reliability metrics and
receive a tier certification in a few lines of code.

## Install

```bash
pip install hb-eval-sdk
```

For the LangChain integration:

```bash
pip install "hb-eval-sdk[langchain]"
```

## Quick start

```python
from hb_eval_sdk import HBEvalClient

client = HBEvalClient(
    api_key="...",          # identifies your project
    aes_key="...",          # encrypts your payload (base64, 32 bytes)
    signing_secret="...",   # signs your request (base64; never transmitted)
)

result = client.evaluate({
    "trajectory": [
        {"step": 1, "action": "chain_start"},
        {"step": 2, "action": "tool_call", "tool": "search"},
        {"step": 3, "action": "chain_end"},
    ],
    "sub_tasks": 3,
    "constraint_violations": 0,
    "recovery_episodes": [],
    "agent_id": "my-agent",
})

print(result.verdict, result.tier)
print(result.metrics)   # pei, irs, frr, ti, csi
```
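
The `"..."` placeholders above stand in for real credentials. One common way to keep them out of source code is to read them from environment variables. The sketch below is an assumption, not part of the SDK: the helper `load_credentials` and the variable names `HB_EVAL_API_KEY`, `HB_EVAL_AES_KEY`, and `HB_EVAL_SIGNING_SECRET` are hypothetical and can be renamed freely.

```python
import os
from typing import Dict, Mapping

# Hypothetical environment variable names; the SDK does not prescribe them.
ENV_VARS = {
    "api_key": "HB_EVAL_API_KEY",
    "aes_key": "HB_EVAL_AES_KEY",
    "signing_secret": "HB_EVAL_SIGNING_SECRET",
}


def load_credentials(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Collect the three credentials, failing loudly if any is missing or empty."""
    missing = [var for var in ENV_VARS.values() if not env.get(var)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    # Keys match the keyword arguments shown in the quick start.
    return {arg: env[var] for arg, var in ENV_VARS.items()}
```

With the variables set, the client could then be built as `HBEvalClient(**load_credentials())`, since the returned keys mirror the constructor arguments used in the quick start.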

## The five metrics

Every evaluation returns five reliability metrics. Any of them may be `None`
when it is genuinely undefined for a given run, and `None` always means
"not measured" — never "scored zero".

- **PEI**: Planning Efficiency Index
- **IRS**: Intentional Recovery Score (`None` when the run had no faults)
- **FRR**: Failure Resilience Rate
- **TI**: Traceability Index (`None` when no judge evaluation was made)
- **CSI**: Consistency Stability Index (`None` without enough history)
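
Because `None` and `0.0` mean different things here, code that consumes the metrics should test for `None` explicitly rather than rely on truthiness. The sketch below assumes the metrics are available as a mapping keyed by the lowercase names `pei`, `irs`, `frr`, `ti`, and `csi`; the helper `format_metrics` and the sample values are illustrative only, and an attribute-style result object would need a small adaptation.

```python
from typing import Mapping, Optional

METRIC_NAMES = ("pei", "irs", "frr", "ti", "csi")


def format_metrics(metrics: Mapping[str, Optional[float]]) -> str:
    """Render each metric, keeping "not measured" distinct from a real 0.0."""
    parts = []
    for name in METRIC_NAMES:
        value = metrics.get(name)
        # Compare with `is None`: a plain `if not value` would also hide 0.0.
        shown = "n/a" if value is None else f"{value:.2f}"
        parts.append(f"{name.upper()}={shown}")
    return ", ".join(parts)


# Made-up values for illustration, not real evaluation output.
sample = {"pei": 0.0, "irs": None, "frr": 0.75, "ti": None, "csi": 0.5}
print(format_metrics(sample))
# PEI=0.00, IRS=n/a, FRR=0.75, TI=n/a, CSI=0.50
```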

## LangChain

```python
from hb_eval_sdk import HBEvalCallback

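callback = HBEvalCallback(api_key="...", aes_key="...", signing_secret="...")
# `agent` is your existing LangChain agent or chain; `task` is its input.
# Newer LangChain releases deprecate `run`; there, pass the callback via
# agent.invoke(task, config={"callbacks": [callback]}).
agent.run(task, callbacks=[callback])
print(callback.last_result.verdict)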
```

The callback observes the real run — counting genuine tool errors and
detecting actual fault-and-recovery patterns — rather than assuming a clean
execution.

## Credentials

Your project has three credentials, issued together when the project is
created. The API key is sent on each request to identify you. The AES key
encrypts your payload locally. The signing secret signs each request and is
never transmitted, so a valid signature proves the request came from you even
if someone else has seen your API key.
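
To illustrate why a secret that never leaves your machine can still prove authorship, here is a minimal sketch using only the standard library. It makes two assumptions. First, the AES key check relies only on the "base64, 32 bytes" format noted in the quick start. Second, the signing scheme shown is generic HMAC-SHA256, chosen purely for illustration; this README does not document which scheme the SDK actually uses. The helper names `is_valid_aes_key` and `sign` are hypothetical.

```python
import base64
import binascii
import hashlib
import hmac
import os


def is_valid_aes_key(key_b64: str) -> bool:
    """Return True if key_b64 is valid base64 that decodes to exactly 32 bytes."""
    try:
        return len(base64.b64decode(key_b64, validate=True)) == 32
    except (binascii.Error, ValueError):
        return False


def sign(secret_b64: str, message: bytes) -> str:
    """Illustrative HMAC-SHA256 tag; not necessarily the SDK's actual scheme."""
    secret = base64.b64decode(secret_b64)
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


# Throwaway demo values generated on the spot, not real credentials.
demo_aes_key = base64.b64encode(os.urandom(32)).decode()
demo_secret = base64.b64encode(os.urandom(32)).decode()
other_secret = base64.b64encode(os.urandom(32)).decode()
body = b'{"agent_id": "my-agent"}'

assert is_valid_aes_key(demo_aes_key)

# Only the tag travels with the request; the secret itself stays local.
tag = sign(demo_secret, body)

# A receiver holding the same secret recomputes the tag and compares it
# in constant time.
assert hmac.compare_digest(tag, sign(demo_secret, body))

# A party without the secret cannot produce a matching tag.
assert not hmac.compare_digest(tag, sign(other_secret, body))
```

Checking the key format before constructing a client can turn a confusing downstream encryption error into an immediate, readable failure.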

## Links

- Documentation: https://github.com/hb-evalSystem/HB-System/blob/main/docs
- Repository: https://github.com/hb-evalSystem/HB-System
- Platform: https://hbeval.com

## License

MIT
