Metadata-Version: 2.1
Name: exemplar-harness-sdk
Version: 0.2.0
Summary: Exemplar harness eval SDK — ingest framework callbacks to the Exemplar platform API
License: LicenseRef-Proprietary
Keywords: exemplar,harness,eval,llm,agents,telemetry,observability
Author: Exemplar Dev LLC
Requires-Python: >=3.10,<3.14
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: Other/Proprietary License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Provides-Extra: agno
Provides-Extra: all
Provides-Extra: anthropic
Provides-Extra: autogen
Provides-Extra: claude-agent
Provides-Extra: crewai
Provides-Extra: google-adk
Provides-Extra: google-genai
Provides-Extra: haystack
Provides-Extra: langchain
Provides-Extra: langgraph
Provides-Extra: litellm
Provides-Extra: llamaindex
Provides-Extra: openai
Provides-Extra: portkey
Provides-Extra: pydantic-ai
Provides-Extra: semantic-kernel
Provides-Extra: smolagents
Requires-Dist: agno (>=1.0) ; extra == "agno" or extra == "all"
Requires-Dist: anthropic (>=0.40) ; extra == "anthropic" or extra == "all"
Requires-Dist: autogen-agentchat (>=0.4) ; extra == "autogen" or extra == "all"
Requires-Dist: autogen-ext[openai] (>=0.4) ; extra == "autogen" or extra == "all"
Requires-Dist: claude-agent-sdk (>=0.2) ; extra == "claude-agent" or extra == "all"
Requires-Dist: crewai (>=0.80) ; extra == "crewai" or extra == "all"
Requires-Dist: google-adk (>=0.1) ; extra == "google-adk" or extra == "all"
Requires-Dist: google-genai (>=1.52.0) ; extra == "google-genai" or extra == "all"
Requires-Dist: haystack-ai (>=2.0) ; extra == "haystack" or extra == "all"
Requires-Dist: httpx (>=0.27.0)
Requires-Dist: langchain-core (>=0.3) ; extra == "langchain" or extra == "langgraph" or extra == "all"
Requires-Dist: langgraph (>=0.2) ; extra == "langgraph" or extra == "all"
Requires-Dist: litellm (>=1.55) ; extra == "litellm" or extra == "all"
Requires-Dist: llama-index-core (>=0.10.20) ; extra == "llamaindex" or extra == "all"
Requires-Dist: openai (>=1.0) ; extra == "openai" or extra == "all"
Requires-Dist: portkey-ai (>=1.9) ; extra == "portkey" or extra == "all"
Requires-Dist: pydantic-ai-slim[google] (>=0.2) ; extra == "pydantic-ai" or extra == "all"
Requires-Dist: semantic-kernel (>=1.0) ; extra == "semantic-kernel" or extra == "all"
Requires-Dist: smolagents (>=1.0) ; extra == "smolagents" or extra == "all"
Project-URL: Documentation, https://exemplar.dev
Project-URL: Homepage, https://exemplar.dev
Project-URL: Issue Tracker, https://github.com/Exemplar-Dev/exemplar-harness-sdk/issues
Project-URL: Repository, https://github.com/Exemplar-Dev/exemplar-harness-sdk
Description-Content-Type: text/markdown

# exemplar-harness-sdk

Python SDK for ingesting framework-native agent telemetry into Exemplar harness eval.

**Multiple framework integrations** — LangChain, LangGraph, OpenAI, Anthropic, Google GenAI SDK, Google ADK, and more. See [Framework integrations](#framework-integrations).

## Install

```bash
pip install exemplar-harness-sdk

# One framework
pip install "exemplar-harness-sdk[langchain]"

# Everything
pip install "exemplar-harness-sdk[all]"
```

Licensed for **non-commercial use** only. Commercial use requires a separate license from [Exemplar Dev LLC](https://exemplar.dev). See [LICENSE](LICENSE).

---

## Quick start

Follow these steps to send your first agent turn to Exemplar harness eval. The walkthrough uses **LangChain**; [other frameworks](#framework-integrations) follow the same pattern.

### Step 1 — Install

```bash
pip install "exemplar-harness-sdk[langchain]"
```

Need a different framework? Install the matching extra instead (e.g. `[anthropic]`, `[openai]`, `[google-adk]`). See [optional extras](#optional-extras).

### Step 2 — Configure credentials

Set your Exemplar org API key:

```bash
export EXEMPLAR_API_KEY="eis_your_org_api_key"
```

| Variable | Purpose |
|----------|---------|
| `EXEMPLAR_API_KEY` | Org API key (**required**) |

By default, the SDK sends requests to `https://production-api.exemplar.dev`. Set `EXEMPLAR_BASE_URL` to override the host.
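
A missing key is easiest to catch before the client is constructed. The sketch below is a hypothetical pre-flight check that uses only the standard library. `require_api_key` is not part of the SDK, and the `eis_` prefix test is an assumption based on the example key above.

```python
import os
from collections.abc import Mapping


def require_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Return EXEMPLAR_API_KEY or raise a readable error (hypothetical helper)."""
    key = env.get("EXEMPLAR_API_KEY", "").strip()
    if not key:
        raise RuntimeError("EXEMPLAR_API_KEY is not set; export it before creating Harness")
    # Assumption: org API keys start with "eis_", as in the example key above.
    if not key.startswith("eis_"):
        raise RuntimeError("EXEMPLAR_API_KEY does not look like an org API key")
    return key
```

Calling `require_api_key()` at startup, before `Harness.from_env()`, turns a misconfigured environment into an immediate, descriptive error.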

### Step 3 — Create a `Harness` client

```python
from exemplar_harness import Harness

harness = Harness.from_env()
```

Or pass the API key explicitly:

```python
from exemplar_harness import Harness

harness = Harness(api_key="eis_...")
```

Pick one **`session_id` per conversation** — all turns with the same ID are grouped into a single eval session.
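
One way to keep a stable ID per conversation is to derive it once and look it up on later turns. The sketch below is a minimal in-memory example. `SessionIds`, its method name, and the ID format are hypothetical and not part of the SDK; any string that stays constant for the conversation should serve the same purpose.

```python
import uuid


class SessionIds:
    """Maps an app-level conversation key to a stable session_id (hypothetical helper)."""

    def __init__(self, prefix: str = "sess") -> None:
        self._prefix = prefix
        self._ids: dict[str, str] = {}

    def for_conversation(self, conversation_key: str) -> str:
        # Create the ID on first use, then return the same one for later turns.
        if conversation_key not in self._ids:
            self._ids[conversation_key] = f"{self._prefix}-{uuid.uuid4().hex[:12]}"
        return self._ids[conversation_key]


ids = SessionIds(prefix="support")
first = ids.for_conversation("ticket-1001")
again = ids.for_conversation("ticket-1001")  # same conversation, same ID
other = ids.for_conversation("ticket-1002")  # different conversation, new ID
```

A process-local dictionary is lost on restart, so a long-lived chat would need the mapping stored wherever the conversation itself is stored.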

### Step 4 — Wire the LangChain integration

Create a callback handler and pass it to your LLM or chain:

```python
from exemplar_harness.integrations.langchain import make_langchain_callback_handler

SESSION_ID = "support-sess-001"  # reuse for every turn in this chat

handler = make_langchain_callback_handler(
    harness,
    session_id=SESSION_ID,
    chain_name="support-bot",
    source_app="my-app",
)
```

Each completed LLM call invokes `Harness.ingest()` automatically — you do not call `ingest()` yourself for callback-based integrations.

### Step 5 — Run your agent

```python
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI  # or ChatGoogleGenerativeAI, etc.

llm = ChatOpenAI(model="gpt-4o", callbacks=[handler])
chain = ChatPromptTemplate.from_messages([("human", "{question}")]) | llm

chain.invoke({"question": "Summarize our refund policy."})
chain.invoke({"question": "What is the return window?"})  # same SESSION_ID → same eval session
```

### Step 6 — (Optional) Trigger eval on ingest

Pass `auto_run=True` when you call `harness.ingest()` directly, or bind it on a session helper:

```python
session = harness.session(
    SESSION_ID,
    agent_name="support-bot",
    source_app="my-app",
    auto_run=True,  # run eval after each ingest through this session
)
```

Callback-based integrations (LangChain, LangGraph, Agno, etc.) call `harness.ingest()` internally and rely on platform defaults, so they do not take the `auto_run` setting shown above.

### What happens under the hood

```mermaid
flowchart LR
  Agent[Your agent] --> Integration[SDK integration]
  Integration --> Harness[Harness.ingest]
  Harness --> API[Exemplar platform API]
  API --> Eval[Harness eval]
```

Each ingest POSTs an [envelope v1](#envelope-v1) payload (`sourceType` + framework-native `data`). The Exemplar platform API maps it to eval session turns.

A successful ingest returns metadata such as `sessionId`, `turnCount`, and `evalStatus`.
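
As a rough illustration of reading that metadata, the sketch below parses the three named fields from a dictionary. `IngestResult` is a hypothetical type, the sample values are made up, and the value types (string, integer, string) are assumptions. The source only names the keys.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class IngestResult:
    session_id: str
    turn_count: int
    eval_status: str

    @classmethod
    def from_response(cls, body: dict[str, Any]) -> "IngestResult":
        # Only the three documented keys are read; any other keys are ignored.
        return cls(
            session_id=str(body["sessionId"]),
            turn_count=int(body["turnCount"]),
            eval_status=str(body["evalStatus"]),
        )


# Illustrative payload, not a captured API response.
sample = {"sessionId": "support-sess-001", "turnCount": 2, "evalStatus": "pending"}
result = IngestResult.from_response(sample)
```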

### Complete example (steps 3–5)

```python
from exemplar_harness import Harness
from exemplar_harness.integrations.langchain import make_langchain_callback_handler
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

harness = Harness.from_env()
session_id = "support-sess-001"

handler = make_langchain_callback_handler(
    harness,
    session_id=session_id,
    chain_name="support-bot",
    source_app="my-app",
)

llm = ChatOpenAI(model="gpt-4o", callbacks=[handler])
chain = ChatPromptTemplate.from_messages([("human", "{question}")]) | llm

for question in (
    "Summarize our refund policy.",
    "What is the return window?",
):
    chain.invoke({"question": question})
```

### No framework? Ingest turns directly

For fixtures, batch replay, or custom pipelines:

```python
harness.ingest(
    "generic",
    session_id="sess-abc",
    event="turns",
    data={
        "turns": [
            {
                "input": "What is harness eval?",
                "output": "Automated judge over agent sessions.",
                "model": "gpt-4o",
            }
        ]
    },
    agent_name="my-agent",
    source_app="my-app",
)
```
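
For batch replay, the `data` argument usually has to be assembled from stored transcripts. The sketch below builds the same `{"turns": [...]}` shape from (input, output) pairs. `build_turns_data` is a hypothetical helper, and the transcript contents are fixture text.

```python
from collections.abc import Iterable
from typing import Any


def build_turns_data(pairs: Iterable[tuple[str, str]], model: str) -> dict[str, Any]:
    """Shape (input, output) pairs like the `data` argument above (hypothetical helper)."""
    return {
        "turns": [
            {"input": user_text, "output": agent_text, "model": model}
            for user_text, agent_text in pairs
        ]
    }


transcript = [
    ("What is harness eval?", "Automated judge over agent sessions."),
    ("Summarize our refund policy.", "Our policy allows returns within 30 days."),
]
data = build_turns_data(transcript, model="gpt-4o")
# Pass the result as `data=` to harness.ingest("generic", ..., event="turns").
```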

### Choose a different framework

| Framework | Extra | Wire this |
|-----------|-------|-----------|
| LangChain | `langchain` | `make_langchain_callback_handler` → `callbacks=[handler]` |
| LangGraph | `langgraph` | `HarnessLangGraphHandler.make_graph_callback_handler()` → `invoke(..., config={"callbacks": [...]})` |
| LiteLLM | `litellm` | `register_litellm_handler` + `metadata={"session_id": ...}` |
| OpenAI SDK | `openai` | `HarnessOpenAICallback.on_completion` after `chat.completions.create` |
| Anthropic SDK | `anthropic` | `make_anthropic_middleware` → `Anthropic(middleware=[...])` |
| Claude Agent SDK | `claude-agent` | `run_query()` or `record_agent_result` after `query()` |
| Portkey | `portkey` | `HarnessPortkeyCallback.on_completion` after `chat.completions.create` |
| Agno | `agno` | `harness_agno_post_hook` → `Agent.post_hooks` |
| Haystack | `haystack` | `HarnessHaystackHandler.run_and_record` or `record_generator_run` |
| LlamaIndex | `llamaindex` | `register_llamaindex_handler` (auto) or `record_llm_chat_end` |
| AutoGen | `autogen` | `HarnessAutoGenHandler.on_agent_run_complete` after `agent.run` |
| CrewAI | `crewai` | `make_crewai_listener` (auto) or `on_task_complete` |
| Google GenAI SDK | `google-genai` | `instrument_google_genai_client` or `sdk_generate_content` |
| Google ADK | `google-adk` | `ingest_adk_session` after `session.model_dump()` |
| Pydantic AI | `pydantic-ai` | `instrument_pydantic_ai_agent` or `record_run` after `agent.run` |
| Semantic Kernel | `semantic-kernel` | `register_semantic_kernel_filter` on `Kernel` |
| smolagents | `smolagents` | `harness_smolagents_step_callback` → `step_callbacks` |

Copy-paste snippets for each framework are in [Framework integrations](#framework-integrations) below.

### Ingest parameters

| Parameter | Purpose |
|-----------|---------|
| `session_id` | Groups turns into one eval session |
| `agent_name` | Agent identifier on the ingest body |
| `source_app` | Your application name (defaults to `source_type`) |
| `auto_run` | Run eval immediately after ingest (`True` / `False` / omit) |
| `harness` | Optional eval metadata (scenario tags, categories, etc.) |
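
Because `auto_run` has three states, a wrapper that forwards these options probably should leave the key out when it is unset rather than pass a placeholder. The sketch below assumes that "omit" means not passing the keyword at all. `ingest_options` is a hypothetical helper, not an SDK function.

```python
from typing import Any, Optional


def ingest_options(
    session_id: str,
    agent_name: str,
    source_app: Optional[str] = None,
    auto_run: Optional[bool] = None,
) -> dict[str, Any]:
    """Collect keyword arguments, dropping the ones left unset (hypothetical helper)."""
    options: dict[str, Any] = {"session_id": session_id, "agent_name": agent_name}
    if source_app is not None:
        options["source_app"] = source_app
    if auto_run is not None:  # True and False are both forwarded; None means "omit"
        options["auto_run"] = auto_run
    return options
```

The result can then be splatted into a direct call, for example `harness.ingest("generic", event="turns", data=data, **options)`.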

---

## Harness memory (long-term recall)

The SDK can add and search long-term memories via the integration-service memory API (`/api/harness-memory`). Eval ingest (`Harness.ingest()`) is unchanged.

### Configure

```bash
export EXEMPLAR_API_KEY="eis_your_org_api_key"
# Optional, for JWT auth or an explicit org override:
# export EXEMPLAR_ORGANIZATION_ID="your_org_id"
# Optional, for local development:
# export EXEMPLAR_BASE_URL="http://localhost:8000"
```

| Variable | Purpose |
|----------|---------|
| `EXEMPLAR_API_KEY` | Org API key (**required**) |
| `EXEMPLAR_ORGANIZATION_ID` | Optional org override (JWT auth); omitted when using org API keys |
| `EXEMPLAR_BASE_URL` | Override API host (default: production) |
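
To make the override behaviour concrete, the sketch below resolves the memory API URL from the environment. It assumes the `/api/harness-memory` path is appended directly to the base URL and that the production default is `https://production-api.exemplar.dev`. `memory_endpoint` is a hypothetical helper, and the SDK may resolve the URL differently.

```python
import os
from collections.abc import Mapping

DEFAULT_BASE_URL = "https://production-api.exemplar.dev"
MEMORY_PATH = "/api/harness-memory"


def memory_endpoint(env: Mapping[str, str] = os.environ) -> str:
    """Resolve the memory API URL from the environment (hypothetical helper)."""
    # An unset or empty EXEMPLAR_BASE_URL falls back to the production host.
    base = env.get("EXEMPLAR_BASE_URL") or DEFAULT_BASE_URL
    # Strip a trailing slash so the join never produces "//".
    return base.rstrip("/") + MEMORY_PATH
```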

### Add, search, recall

```python
from exemplar_harness import Harness

harness = Harness.from_env()  # org API key is enough; organization_id optional
memory = harness.memory(user_id="user-123", session_id="chat-abc", app_id="my-app")

memory.add("User prefers bullet-point answers.", memory_type="preference")
results = memory.search("formatting preferences", top_k=5)
context = memory.recall("how should I format answers?")  # for system prompts
```

### Generic hook (any agent loop)

```python
from exemplar_harness.integrations.memory import MemoryHook

hook = MemoryHook(memory, recall_top_k=5, auto_add=False)
recall = hook.before_turn(user_input)          # inject into system prompt
# ... run LLM ...
hook.after_turn(user_input, assistant_output)  # no-op unless auto_add=True
```
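
The sketch below shows where the two hook calls might sit in a single agent turn. It runs offline against a stand-in: `FakeHook`, `TurnHook`, `run_turn`, and `echo_llm` are all hypothetical, and it assumes `before_turn` returns a string that can be prepended to the system prompt.

```python
from typing import Callable, Protocol


class TurnHook(Protocol):
    def before_turn(self, user_input: str) -> str: ...
    def after_turn(self, user_input: str, assistant_output: str) -> None: ...


def run_turn(hook: TurnHook, llm: Callable[[str, str], str], system: str, user_input: str) -> str:
    recall = hook.before_turn(user_input)
    # Prepend recalled context to the system prompt only when there is any.
    system_prompt = f"{recall}\n\n{system}" if recall else system
    output = llm(system_prompt, user_input)
    hook.after_turn(user_input, output)
    return output


class FakeHook:
    """Stand-in for MemoryHook so the sketch runs without network access."""

    def __init__(self) -> None:
        self.seen: list[tuple[str, str]] = []

    def before_turn(self, user_input: str) -> str:
        return "User prefers bullet-point answers."

    def after_turn(self, user_input: str, assistant_output: str) -> None:
        self.seen.append((user_input, assistant_output))


def echo_llm(system_prompt: str, user_input: str) -> str:
    # Placeholder model call that echoes its inputs.
    return f"[{system_prompt}] {user_input}"


hook = FakeHook()
reply = run_turn(hook, echo_llm, "You are a support bot.", "How do refunds work?")
```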

### LangChain

```python
from exemplar_harness.integrations.memory.langchain import make_langchain_memory_handler

handler = make_langchain_memory_handler(harness, user_id="user-123", session_id="chat-abc")
recall = handler.recall_for("What do you know about me?")
# pass handler in callbacks=[handler]; prepend handler.system_prompt_prefix() to system message
```

### LiteLLM

```python
from exemplar_harness.integrations.memory.litellm import make_litellm_memory_logger

mem_logger = make_litellm_memory_logger(harness, user_id="user-123")
messages = mem_logger.prepare_messages([{"role": "user", "content": "Hello"}])
# register_litellm_memory_logger(harness, user_id="user-123") for global callbacks
```

### Framework memory helpers

Each eval integration has a matching memory helper under `exemplar_harness.integrations.memory.*`:

| Framework | Import | Wire this |
|-----------|--------|-----------|
| Generic | `MemoryHook` | `before_turn` / `after_turn` on any loop |
| LangChain | `make_langchain_memory_handler` | `callbacks=[handler]` + `recall_for()` |
| LangGraph | `make_langgraph_memory_handler` | same as LangChain in graph `config` |
| LiteLLM | `make_litellm_memory_logger` | `prepare_messages()` or `register_litellm_memory_logger` |
| OpenAI SDK | `make_openai_memory_helper` | `sdk_chat_completion_with_memory(helper, client, ...)` |
| Anthropic SDK | `make_anthropic_memory_helper` | `make_anthropic_memory_middleware(helper)` |
| Portkey | `make_portkey_memory_helper` | same pattern as OpenAI |
| Google GenAI | `make_google_genai_memory_helper` | `prepare_contents()` / `sdk_generate_content_with_memory` |
| Agno | `make_agno_memory_helper` | `harness_agno_memory_hooks(helper)` pre/post hooks |
| Pydantic AI | `make_pydantic_ai_memory_helper` | `prepare_prompt()` before `agent.run()` |
| Haystack | `make_haystack_memory_helper` | `prepare_generator_messages()` before generator run |
| LlamaIndex | `make_llamaindex_memory_helper` | `prepare_chat_messages()` before chat |
| AutoGen | `make_autogen_memory_helper` | `prepare_task()` before agent run |
| CrewAI | `make_crewai_memory_helper` | `prepare_task_context()` before task |
| smolagents | `make_smolagents_memory_helper` | `harness_smolagents_memory_step_callback(helper)` |
| Semantic Kernel | `make_semantic_kernel_memory_helper` | `prepare_arguments()` before invoke |
| Claude Agent SDK | `make_claude_agent_memory_helper` | `prepare_prompt()` before `query()` |
| Google ADK | `make_google_adk_memory_helper` | `prepare_user_message()` before turn |

Live examples: `examples/live/memory_demo.py`, `examples/live/langchain_memory_demo.py`, and:

```bash
python -m examples.live.run_memory_demos --list
HARNESS_EXAMPLES_SIMULATED=1 python -m examples.live.run_memory_demos
EXEMPLAR_API_KEY=... GOOGLE_API_KEY=... python -m examples.live.run_memory_demos --only memory_google_genai
```

---

## Framework integrations

The step-by-step [quick start](#quick-start) above uses LangChain. The snippets below show the equivalent wiring for each supported framework.

### LangChain

Already covered in [Quick start](#quick-start). Minimal reference:

```python
from exemplar_harness.integrations.langchain import make_langchain_callback_handler
from langchain_openai import ChatOpenAI

handler = make_langchain_callback_handler(harness, session_id="sess-abc", chain_name="support-bot")
llm = ChatOpenAI(model="gpt-4o", callbacks=[handler])
```

### LangGraph

```bash
pip install "exemplar-harness-sdk[langgraph]"
```

Pass `make_graph_callback_handler()` in your graph `invoke` config:

```python
from typing import Annotated, TypedDict

from exemplar_harness import Harness
from exemplar_harness.integrations.langgraph import HarnessLangGraphHandler
from langchain_core.messages import HumanMessage
from langchain_openai import ChatOpenAI
from langgraph.graph import END, START, StateGraph
from langgraph.graph.message import add_messages

harness = Harness(api_key="eis_...")
session = HarnessLangGraphHandler(harness, session_id="sess-abc", graph_name="support-bot")
handler = session.make_graph_callback_handler()
config = {"callbacks": [handler], "configurable": {"thread_id": "sess-abc"}}

llm = ChatOpenAI(model="gpt-4o")


class State(TypedDict):
    messages: Annotated[list, add_messages]


def agent(state: State, config):  # LangGraph injects the run config into a parameter named `config`
    return {"messages": [llm.invoke(state["messages"], config=config)]}


graph = StateGraph(State)
graph.add_node("agent", agent)
graph.add_edge(START, "agent")
graph.add_edge("agent", END)
app = graph.compile()

app.invoke(
    {"messages": [HumanMessage(content="Summarize our refund policy.")]},
    config=config,
)
```

### LiteLLM

```bash
pip install "exemplar-harness-sdk[litellm]"
```

Register a custom logger once; pass `session_id` in completion `metadata`:

```python
import litellm
from exemplar_harness import Harness
from exemplar_harness.integrations.litellm import register_litellm_handler

harness = Harness(api_key="eis_...")
register_litellm_handler(harness, source_app="my-app", agent_name="support-bot")

litellm.completion(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Summarize our refund policy."}],
    metadata={"session_id": "sess-abc"},
)
```

### Anthropic SDK

```bash
pip install "exemplar-harness-sdk[anthropic]"
```

Create one `HarnessAnthropicCallback` per session. Prefer client middleware for automatic ingest:

```python
import anthropic
from exemplar_harness import Harness
from exemplar_harness.integrations.anthropic import (
    HarnessAnthropicCallback,
    make_anthropic_middleware,
)

harness = Harness(api_key="eis_...")
callback = HarnessAnthropicCallback(harness, session_id="sess-abc", agent_name="support-bot")

client = anthropic.Anthropic(middleware=[make_anthropic_middleware(callback)])
response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Summarize our refund policy."}],
)
```

Or call `on_completion` manually after each `messages.create` if you cannot use middleware.

### Claude Agent SDK

```bash
pip install "exemplar-harness-sdk[claude-agent]"
```

Use `run_query` to run the agent and ingest automatically:

```python
import asyncio

from exemplar_harness import Harness
from exemplar_harness.integrations.claude_agent import HarnessClaudeAgentHandler

harness = Harness(api_key="eis_...")
handler = HarnessClaudeAgentHandler(harness, session_id="sess-abc", agent_name="research-agent")

asyncio.run(handler.run_query("Summarize harness eval in one sentence."))
```

Or call `record_agent_result` manually after `query()` returns a `ResultMessage`:

```python
import asyncio

from claude_agent_sdk import ResultMessage, query
from exemplar_harness import Harness
from exemplar_harness.integrations.claude_agent import HarnessClaudeAgentHandler

harness = Harness(api_key="eis_...")
handler = HarnessClaudeAgentHandler(harness, session_id="sess-abc", agent_name="research-agent")


async def main() -> None:
    prompt = "Summarize harness eval in one sentence."
    result = None
    async for message in query(prompt=prompt):
        if isinstance(message, ResultMessage):
            result = message
    if result is not None:  # skip ingest if the query produced no ResultMessage
        handler.record_agent_result(prompt=prompt, result=result)


asyncio.run(main())
```

### OpenAI SDK

```bash
pip install "exemplar-harness-sdk[openai]"
```

Call `on_completion` after each `chat.completions.create`:

```python
from openai import OpenAI
from exemplar_harness import Harness
from exemplar_harness.integrations.openai import HarnessOpenAICallback

harness = Harness(api_key="eis_...")
callback = HarnessOpenAICallback(harness, session_id="sess-abc", agent_name="support-bot")

client = OpenAI()
messages = [{"role": "user", "content": "Summarize our refund policy."}]
response = client.chat.completions.create(model="gpt-4o", messages=messages)

callback.on_completion(messages=messages, response=response, model="gpt-4o")
```

### Portkey

```bash
pip install "exemplar-harness-sdk[portkey]"
```

Call `on_completion` after each `chat.completions.create`:

```python
import os

from portkey_ai import Portkey
from exemplar_harness import Harness
from exemplar_harness.integrations.portkey import HarnessPortkeyCallback

harness = Harness(api_key="eis_...")
callback = HarnessPortkeyCallback(harness, session_id="sess-abc", agent_name="support-bot")

client = Portkey(api_key=os.environ["PORTKEY_API_KEY"])
messages = [{"role": "user", "content": "Summarize our refund policy."}]
response = client.chat.completions.create(model="@openai/gpt-4o", messages=messages)

callback.on_completion(messages=messages, response=response, model="@openai/gpt-4o")
```

### Agno

```bash
pip install "exemplar-harness-sdk[agno]"
```

Pass `harness_agno_post_hook` to `Agent.post_hooks` — each `agent.run` ingests automatically:

```python
from agno.agent import Agent
from agno.models.openai import OpenAIChat
from exemplar_harness import Harness
from exemplar_harness.integrations.agno import harness_agno_post_hook

harness = Harness(api_key="eis_...")
agent = Agent(
    name="support-bot",
    model=OpenAIChat(id="gpt-4o"),
    post_hooks=[
        harness_agno_post_hook(harness, session_id="sess-abc", agent_name="support-bot")
    ],
)

agent.run("Summarize our refund policy.")
```

### Haystack

```bash
pip install "exemplar-harness-sdk[haystack]"
```

Use `run_and_record` to run the pipeline and ingest in one call:

```python
from haystack import Pipeline
from haystack.components.builders import ChatPromptBuilder
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.dataclasses import ChatMessage
from exemplar_harness import Harness
from exemplar_harness.integrations.haystack import HarnessHaystackHandler

harness = Harness(api_key="eis_...")
handler = HarnessHaystackHandler(harness, session_id="sess-abc", pipeline_name="support-bot")

prompt_builder = ChatPromptBuilder(
    template=[ChatMessage.from_user("{{ question }}")],
    required_variables=["question"],
)
llm = OpenAIChatGenerator(model="gpt-4o")
pipe = Pipeline()
pipe.add_component("prompt_builder", prompt_builder)
pipe.add_component("llm", llm)
pipe.connect("prompt_builder.prompt", "llm.messages")

question = "Summarize our refund policy."
handler.run_and_record(pipe, {"prompt_builder": {"question": question}}, question=question)
```

You can also call `record_generator_run` manually after `pipe.run`, or pass `handler.make_snapshot_callback(question=...)` to `pipe.run(..., snapshot_callback=...)`.

### LlamaIndex

```bash
pip install "exemplar-harness-sdk[llamaindex]"
```

Register instrumentation once; LlamaIndex LLM calls ingest automatically:

```python
from exemplar_harness import Harness
from exemplar_harness.integrations.llamaindex import register_llamaindex_handler
from llama_index.core.llms import ChatMessage
from llama_index.llms.openai import OpenAI

harness = Harness(api_key="eis_...")
register_llamaindex_handler(harness, session_id="sess-abc", workflow_name="support-workflow")

llm = OpenAI(model="gpt-4o")
messages = [ChatMessage(role="user", content="Summarize our refund policy.")]
llm.chat(messages)
```

Or call `record_llm_chat_end` manually after each `llm.chat` if you prefer explicit control.

### AutoGen

```bash
pip install "exemplar-harness-sdk[autogen]"
```

Call `on_agent_run_complete` after each `agent.run`:

```python
import asyncio

from autogen_agentchat.agents import AssistantAgent
from autogen_ext.models.openai import OpenAIChatCompletionClient
from exemplar_harness import Harness
from exemplar_harness.integrations.autogen import HarnessAutoGenHandler

harness = Harness(api_key="eis_...")
handler = HarnessAutoGenHandler(
    harness, session_id="sess-abc", agent_name="assistant", team_name="support-team"
)


async def run() -> None:
    client = OpenAIChatCompletionClient(model="gpt-4o")
    agent = AssistantAgent(name="assistant", model_client=client)
    try:
        task = "Summarize our refund policy."
        result = await agent.run(task=task)
        handler.on_agent_run_complete(task=task, result=result, model="gpt-4o")
    finally:
        await client.close()


asyncio.run(run())
```

### CrewAI

```bash
pip install "exemplar-harness-sdk[crewai]"
```

Register a CrewAI event listener for automatic ingest on `TaskCompletedEvent`:

```python
from exemplar_harness import Harness
from exemplar_harness.integrations.crewai import make_crewai_listener

harness = Harness(api_key="eis_...")
monitor = make_crewai_listener(
    harness, session_id="sess-abc", crew_name="research-crew", model="gpt-4o"
)

# Run your crew — task completions ingest automatically via the listener.
```

Or call `on_task_complete` manually when each task finishes:

```python
from exemplar_harness.integrations.crewai import HarnessCrewAIMonitor

monitor = HarnessCrewAIMonitor(
    harness, session_id="sess-abc", crew_name="research-crew", model="gpt-4o"
)
monitor.on_task_complete(
    agent_role="Researcher",
    task_description="Summarize our refund policy.",
    output="Our policy allows returns within 30 days.",
    prompt_tokens=120,
    completion_tokens=45,
)
```

### Google GenAI SDK

```bash
pip install "exemplar-harness-sdk[google-genai]"
```

Use `instrument_google_genai_client` so each `generate_content` call ingests automatically:

```python
from google import genai
from exemplar_harness import Harness
from exemplar_harness.integrations.google_genai import (
    HarnessGoogleGenAICallback,
    instrument_google_genai_client,
)

harness = Harness(api_key="eis_...")
callback = HarnessGoogleGenAICallback(harness, session_id="sess-abc", agent_name="support-bot")

client = genai.Client(api_key="your-google-api-key")
instrument_google_genai_client(callback, client)

response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents="Summarize our refund policy.",
)
print(response.text)
```

Or wrap a single call with `sdk_generate_content` if you prefer not to patch the client.

### Google ADK

```bash
pip install "exemplar-harness-sdk[google-adk]"
```

Run your ADK agent, then export and ingest the session:

```python
import asyncio

from exemplar_harness import Harness
from exemplar_harness.integrations.google_adk import ingest_adk_session
from google.adk.agents import LlmAgent
from google.adk.runners import Runner
from google.adk.sessions import InMemorySessionService
from google.genai import types

harness = Harness(api_key="eis_...")
session_id, app_name, user_id = "sess-abc", "my_app", "user-1"

agent = LlmAgent(
    model="gemini-2.5-flash",
    name="my_agent",  # must be a valid Python identifier
    instruction="Answer concisely.",
)
sessions = InMemorySessionService()


async def run_and_ingest() -> None:
    await sessions.create_session(app_name=app_name, user_id=user_id, session_id=session_id)
    runner = Runner(agent=agent, app_name=app_name, session_service=sessions)

    message = types.Content(role="user", parts=[types.Part(text="Hello")])
    async for _ in runner.run_async(user_id=user_id, session_id=session_id, new_message=message):
        pass

    session = await sessions.get_session(app_name=app_name, user_id=user_id, session_id=session_id)
    ingest_adk_session(
        harness,
        session.model_dump(mode="json", exclude_none=True),
        session_id=session_id,
        agent_name="my-agent",
    )


asyncio.run(run_and_ingest())
```

If you already have a session export dict:

```python
ingest_adk_session(harness, adk_session, session_id="sess-abc", agent_name="my-agent")
```

### Pydantic AI

```bash
pip install "exemplar-harness-sdk[pydantic-ai]"
```

Instrument the agent so each `run` / `run_sync` ingests automatically:

```python
from pydantic_ai import Agent
from exemplar_harness import Harness
from exemplar_harness.integrations.pydantic_ai import (
    HarnessPydanticAIHandler,
    instrument_pydantic_ai_agent,
)

harness = Harness(api_key="eis_...")
handler = HarnessPydanticAIHandler(harness, session_id="sess-abc", agent_name="support-bot")
agent = Agent("google:gemini-2.5-flash")
instrument_pydantic_ai_agent(handler, agent)

agent.run_sync("Summarize our refund policy.")
```

### Semantic Kernel

```bash
pip install "exemplar-harness-sdk[semantic-kernel]"
```

Register a function-invocation filter on your kernel:

```python
from semantic_kernel import Kernel
from exemplar_harness import Harness
from exemplar_harness.integrations.semantic_kernel import register_semantic_kernel_filter

harness = Harness(api_key="eis_...")
kernel = Kernel()
register_semantic_kernel_filter(kernel, harness, session_id="sess-abc", plugin_name="my-plugin")

# Each kernel.invoke(...) ingests automatically after the function runs.
```

### smolagents

```bash
pip install "exemplar-harness-sdk[smolagents]"
```

Pass a step callback when creating your agent:

```python
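from smolagents import CodeAgent, LiteLLMModel
from exemplar_harness import Harness
from exemplar_harness.integrations.smolagents import harness_smolagents_step_callback

harness = Harness(api_key="eis_...")
model = LiteLLMModel(model_id="gpt-4o")  # any smolagents model works; this one needs `smolagents[litellm]`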
agent = CodeAgent(
    tools=[],
    model=model,
    step_callbacks=[
        harness_smolagents_step_callback(harness, session_id="sess-abc", agent_name="support-bot")
    ],
)
agent.run("Summarize our refund policy.")
```

---

## Reference

### Optional extras

| Extra | Module |
|-------|--------|
| `langchain` | `exemplar_harness.integrations.langchain` |
| `langgraph` | `exemplar_harness.integrations.langgraph` |
| `litellm` | `exemplar_harness.integrations.litellm` |
| `openai` | `exemplar_harness.integrations.openai` |
| `anthropic` | `exemplar_harness.integrations.anthropic` |
| `claude-agent` | `exemplar_harness.integrations.claude_agent` |
| `portkey` | `exemplar_harness.integrations.portkey` |
| `agno` | `exemplar_harness.integrations.agno` |
| `haystack` | `exemplar_harness.integrations.haystack` |
| `llamaindex` | `exemplar_harness.integrations.llamaindex` |
| `autogen` | `exemplar_harness.integrations.autogen` |
| `crewai` | `exemplar_harness.integrations.crewai` |
| `google-adk` | `exemplar_harness.integrations.google_adk` |
| `google-genai` | `exemplar_harness.integrations.google_genai` |
| `pydantic-ai` | `exemplar_harness.integrations.pydantic_ai` |
| `semantic-kernel` | `exemplar_harness.integrations.semantic_kernel` |
| `smolagents` | `exemplar_harness.integrations.smolagents` |

The Vercel AI SDK integration is **not** included in the Python package; it is covered by the TypeScript SDK path.

### Envelope v1

Each ingest sends a `schemaVersion: 1` envelope to `POST /api/harness-eval/v1/sessions`, with a `sourceType` and a framework-native `data` payload. The Exemplar platform API maps envelopes to `EvalSession` turns.
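
For orientation, the sketch below assembles only the three envelope fields named here and serializes them as JSON. `make_envelope` is a hypothetical helper, and using `"generic"` as the `sourceType` value is an assumption based on the direct-ingest example. Session, agent, and event fields are left out because their wire names are not documented in this README.

```python
import json
from typing import Any


def make_envelope(source_type: str, data: dict[str, Any]) -> dict[str, Any]:
    """Build the three documented envelope fields (hypothetical helper)."""
    return {"schemaVersion": 1, "sourceType": source_type, "data": data}


envelope = make_envelope(
    "generic",
    {
        "turns": [
            {
                "input": "What is harness eval?",
                "output": "Automated judge over agent sessions.",
                "model": "gpt-4o",
            }
        ]
    },
)
body = json.dumps(envelope)  # JSON text suitable for a POST body
```

In normal use `Harness.ingest()` builds and sends the envelope, so this shape mainly matters when inspecting traffic or writing fixtures.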

