Metadata-Version: 2.4
Name: my-react-agent
Version: 1.5.0
Summary: ReAct plan-execute agent with memory
Author-email: Zhaniya Abzhanova <zhaniya.abzhanova@gmail.com>
License: MIT License
        
        Copyright (c) <2026> <Zhaniya Abzhanova>
        
        Permission is hereby granted, free of charge, to any person obtaining a copy
        of this software and associated documentation files (the "Software"), to deal
        in the Software without restriction, including without limitation the rights
        to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
        copies of the Software, and to permit persons to whom the Software is
        furnished to do so, subject to the following conditions:
        
        The above copyright notice and this permission notice shall be included in all
        copies or substantial portions of the Software.
        
        THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
        IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
        FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
        AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
        LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
        OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
        SOFTWARE.
        
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: ollama>=0.5.0
Requires-Dist: regex>=2023.0.0
Provides-Extra: tools
Requires-Dist: wikipedia>=1.4.0; extra == "tools"
Requires-Dist: Wikipedia-API>=0.8.1; extra == "tools"
Requires-Dist: google-search-results>=2.4.2; extra == "tools"
Requires-Dist: python-docx>=1.1.0; extra == "tools"
Requires-Dist: pdfminer.six>=2023.0.0; extra == "tools"
Requires-Dist: beautifulsoup4; extra == "tools"
Provides-Extra: vector
Requires-Dist: numpy>=1.24; extra == "vector"
Requires-Dist: scikit-learn>=1.3; extra == "vector"
Dynamic: license-file

# my-react-agent

A **ReAct (Reason + Act)** agent framework for Python with **step-by-step traceability**, **evidence-first answering**, and **confidence-gated retries**.  
It plans a multi-step solution, executes each step via actions/tools, evaluates quality, and produces a final answer **grounded in collected observations and evidence**.

---

## Table of Contents

- [Why this project](#why-this-project)
  - [What you get here that’s harder to guarantee in langchainllamaindex](#what-you-get-here-thats-harder-to-guarantee-in-langchainllamaindex)
  - [When my-react-agent is the better choice](#when-my-react-agent-is-the-better-choice)
- [Key features](#key-features)
- [License](#license)
- [Requirements](#requirements)
- [From PyPI](#from-pypi)
- [From Source](#from-source)
- [Install Ollama](#install-ollama)
- [Quickstart](#quickstart)
  - [How to define a tool](#how-to-define-a-tool)
  - [Logger Messages Configuration](#logger-messages-configuration)
- [Architecture (precise)](#architecture-precise)
  - [High-level flow](#high-level-flow)
  - [Component map (modules and responsibilities)](#component-map-modules-and-responsibilities)
- [Prompts](#prompts)
  - [How to add new Prompt](#how-to-add-new-prompt)
  - [How to override existing Prompt](#how-to-override-existing-prompt)
- [Actions (Decision Catalogue)](#actions-decision-catalogue)
  - [Action interface](#action-interface)
  - [Built-in Actions](#built-in-actions)
  - [How to override an Action](#how-to-override-an-action)
  - [How to create and add a Custom Action](#how-to-create-and-add-a-custom-action)
- [How to create and add a Custom ParameterAssessor](#how-to-create-and-add-a-custom-parameterassessor)
- [LLM Adapters](#llm-adapters)
  - [Why adapters exist](#why-adapters-exist)
  - [LLM roles in this project](#llm-roles-in-this-project)
  - [How to pass adapters](#how-to-pass-adapters)

---

## Why this project

Similar projects such as LangChain and LlamaIndex are strong frameworks, but they're optimised for *different priorities*:

- **LangChain** is an **integration + composition** system (chains, agents, tool wrappers, retrievers, many providers). It’s great when you want to assemble an app quickly from lots of building blocks.
- **LlamaIndex** is a **data/RAG framework** (ingestion, indexing, retrieval, routing, structured querying). It’s great when your core problem is “connect LLMs to your data” at scale.

`my-react-agent` exists for a different goal: **a small, inspectable agent runtime where traceability, evidence and reliability are first-class — not optional add-ons**.

### What you get here that’s harder to guarantee in LangChain/LlamaIndex

- **Traceability as a core invariant (not a plugin / external service dependency)**  
  Every step *must* produce a structured record: action decision → tool input/output → observation → evidence → confidence.  
  This makes debugging and evaluation predictable because the “paper trail” is built into the runtime.

- **Evidence-first answering as a default design**  
  The final answer is synthesised from collected **observations + `Evidence` objects**, making it straightforward to enforce “don’t invent facts” policies and to display citations/snippets in a consistent format.

- **Confidence-gated retries with a controlled recovery loop**  
  Low-confidence step results trigger a deterministic retry policy (switch action/tool, adjust input, or stop/clarify).  
  Many frameworks can *evaluate*, but `my-react-agent` treats step-level confidence as an orchestration primitive.

- **Cleaner extension points for research/prototyping**  
  Instead of customising a big graph of components, you can add a new behavior by implementing:
  - an `Action` (LLM-visible selection rule + instructions)
  - an `ActionHandler` (runtime execution)
  
  This makes it easier to experiment with new “agent behaviors” (like GreetingAction, guardrails, special routing) without rewriting the core loop.

### When `my-react-agent` is the better choice

Use this project when you care most about:
- **auditing** (exactly what happened and why, step-by-step),
- **reproducible debugging** (structured traces you can log or test),
- **grounded outputs** (final answer constrained to collected evidence),
- **reliability under uncertainty** (confidence gating + retries),
- **lightweight core** (clear orchestration over large ecosystem complexity).

---

## Key features

- **Plan → Execute → Finalise pipeline**  
  Creates a step plan, runs each step deterministically, then synthesises a final answer.
- **Explicit traceability**  
  Step transcript + evidence pack per step (what happened, why, and what was found).
- **Evidence-first design**  
  Uses structured `Evidence` objects; final answer can be constrained to what was observed.
- **Confidence gating + retry loops**  
  Evaluates each step (alignment/quality/realism) and retries when confidence is below threshold.
- **Pluggable tools**  
  Tools are registered once and invoked through a single boundary (`ToolExecutor` / tool interface).
- **Modular actions**  
  Actions like `USE_TOOL`, `ANSWER_BY_ITSELF`, `CLARIFY`, `STOP`, and `NEED_CONTEXT` are isolated modules.
- **Memory**  
  `QueryMemory` (per question) + `ConversationMemory` (cross-turn) for entities, steps, and observations.
- **Prompt registry**  
  Centralised prompt management (`PromptRegistry`) with overridable defaults.
- **Plugin support**  
  Optional runtime extension via `REACT_AGENT_PLUGINS`.

---

## License
MIT

## Requirements
- Python 3.10+
- Ollama (local LLM runtime)

## From PyPI
```bash
pip install my-react-agent
```

## From Source
```bash
pip install git+https://git01lab.cs.univie.ac.at/zhaniyaa77/my-react-agent.git
```

## Install Ollama
##### Download and install Ollama:
- https://ollama.com/download

##### Pull a model (example used below: llama3:8b):

```bash
ollama pull llama3:8b
```

## Quickstart

To run the agent you need two things:
1. At least one tool (otherwise the agent raises an error)
2. An LLM (so the agent can reason and decide when/how to use tools)

#### How to define a tool
> **Goal:** Implement tool called multiplier that multiplies two numbers.

A tool must be a subclass of `AgentTool` and implement:
- **name**: unique string ID
- **description**: shown to the model in the tool catalog
- **refiner_instructions**: how the model should format tool input
- **refiner_input_format**: human-readable format spec
- **refiner_input_regex**: validation regex for tool input
- **execute(input: str) -> Evidence**: run the tool and return an `Evidence` record (tool name, content holding the tool result, a structured/extracted form of the result, timestamp, confidence)

These properties drive the **refiner**, a prompt-based step that converts natural language into the exact input format the tool expects. That's why in this example you can run not just the query **4*5** but also **calculate 4*5**, **what is 4 times 5?**, or **multiply 4 and 5**.

```python
from my_react_agent.agent_memory.data_structures import Evidence
from my_react_agent.tool_management.tools.agent_tool import AgentTool

from datetime import datetime, timezone
from typing import Optional

class MultiplicationTool(AgentTool):
    @property
    def name(self) -> str:
        return "multiplier"

    @property
    def description(self) -> str:
        return "Multiplies two numbers. Tool input must be '<number>*<number>' (e.g. 45*7)."

    @property
    def refiner_instructions(self) -> str:
        return "Output ONLY '<number>*<number>' (example: 45*7). No words."

    @property
    def refiner_input_format(self) -> str:
        return "<number>*<number>"

    @property
    def refiner_input_regex(self) -> Optional[str]:
        return r"^\s*-?\d+(\.\d+)?\s*\*\s*-?\d+(\.\d+)?\s*$"

    def execute(self, input: str) -> Evidence:
        left, right = input.split("*", 1)
        result = float(left) * float(right)
        return Evidence(
            tool=self.name,
            content=f"{left.strip()} * {right.strip()} = {result}",
            extracted={"expression": input, "result": result},
            as_of=datetime.now(timezone.utc),
            confidence=0.95,
        )
```
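
You can sanity-check the tool on its own before wiring it into the agent (using the `execute` signature from the class above):

```python
# Quick standalone check of the tool, outside the agent loop.
tool = MultiplicationTool()
evidence = tool.execute("4*5")
print(evidence.content)    # "4 * 5 = 20.0"
print(evidence.extracted)  # {"expression": "4*5", "result": 20.0}
```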

**To make a tool easier to select**:
1. Use a very specific description
2. Provide strict refiner_instructions and refiner_input_regex
3. Keep the input format simple 

**To run the agent with this tool**: initialise the agent with the tool from the example above and any model (here we use the "llama3:8b" model):

```python
from __future__ import annotations

from my_react_agent.agent_heart.react_agent import ReActAgent

def main() -> None:
    tools = [MultiplicationTool()]

    agent = ReActAgent.create(llm="llama3:8b", tools=tools)

    print("\nQuickStart: my-react-agent \nType 'exit' to quit.\n")
    while True:
        q = input("You: ").strip()
        if q.lower() in {"exit", "quit"}:
            break
        if not q:
            continue
        print(agent.handle(q))
        print()

if __name__ == "__main__":
    main()
```

#### Logger Messages Configuration
What the logging setup gives you:
1. **Shows INFO/DEBUG logs**: you see step planning, tool refinement, retries, confidence gating, etc.
2. **Adds a readable prefix**: timestamp + level + logger name, e.g. `21:07:30 INFO [agent_heart.react_agent] ...`
3. **Keeps logs and prints in order**: a stdout handler avoids odd interleaving where logs appear before/after `print(...)` unexpectedly.

```python
import logging
import sys

def _setup_logging() -> None:
    logging.basicConfig(
        level=logging.INFO,  # change to logging.DEBUG to see everything
        format="%(asctime)s %(levelname)s [%(name)s] %(message)s",
        datefmt="%H:%M:%S",
        handlers=[logging.StreamHandler(sys.stdout)],  # keep logs + prints in order
    )

    # Optional: make your package extra chatty
    logging.getLogger("my_react_agent").setLevel(logging.DEBUG)
    logging.getLogger("agent_heart").setLevel(logging.DEBUG)

def main() -> None:
    _setup_logging()

    tools = [MultiplicationTool()]
    agent = ReActAgent.create(llm="llama3:8b", tools=tools)
    ...
```

## Architecture (precise)
### High-level flow

1. **Planning (planner LLM)**
   - Input: user question (+ optional conversation state)
   - Output: one or more **step tasks** (plan)

2. **Execution loop (per step)**
   - Select an **action** (e.g. `USE_TOOL`, `ANSWER_BY_ITSELF`, `CLARIFY`, `NEED_CONTEXT`, `STOP`)
   - If tool is needed:
     - Optional **tool query refinement** produces strict tool input
     - Execute tool
   - Save **observation** + **evidence** to memory

3. **Confidence assessment**
   - Parameter assessors score the step (e.g. entity alignment, answer quality, realism)
   - If confidence < threshold → recovery loop chooses a better next action

4. **Finalisation (summariser LLM)**
   - Synthesises a final answer from step observations/evidence.

### Component map (modules and responsibilities)
**Core orchestration**
- `ReActAgent`  
  Owns the plan/execute/finalise loop, action selection, retries, and memory writes.

**Actions (step-level behaviours)**
- `NeedContextAction`  
  Resolves missing entities / pronouns using the entity extractor and memory.
- `UseToolAction`  
  Invokes exactly one tool (via the tool execution boundary), stores observation/evidence.
- `AnswerByItselfAction`  
  Uses LLM-only knowledge for stable facts (no tools).
- `ClarifyAction`  
  Asks a single clarification question when the step is underspecified.
- `StopAction`  
  Terminates after repeated failures or user cancellation.

**Tools**
- `AgentTool` (interface/base class)  
  Tools implement `execute(tool_input: str) -> Evidence`.
- `ToolExecutor` (execution boundary)  
  The only place where the agent invokes tools. Keeps tool I/O consistent and traceable.

**Memory**
- `QueryMemory`  
  Per-question state: plan, step trace, transcript, observations.
- `ConversationMemory`  
  Cross-question state: extracted entities and references you want to persist.

**Evidence**
- `Evidence` (structured record)  
  `tool`, `content`, `url`, `extracted` dict, `as_of`, `confidence`.
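
  For illustration, a tool-produced record might look like this (a sketch; the keyword arguments match the fields listed above and the tool examples in this README):

```python
from datetime import datetime, timezone

from my_react_agent.agent_memory.data_structures import Evidence

ev = Evidence(
    tool="multiplier",                                 # which tool produced this
    content="4 * 5 = 20.0",                            # human-readable result
    url=None,                                          # no source URL for a local computation
    extracted={"expression": "4*5", "result": 20.0},   # structured form of the result
    as_of=datetime.now(timezone.utc),                  # when the observation was made
    confidence=0.95,                                   # tool-reported confidence
)
```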

**Confidence**
- Parameter assessors (factory-driven)  
  Examples: `EntityAlignmentAssessor`, `AnswerQualityAssessor`, `AnswerRealismAssessor`.

**Tool input refinement (ToolQueryRefiner)**
- Before calling a tool, the agent converts the current step's task into the exact tool input string expected by that tool. This prevents "LLM prose" from being fed into tools and standardises tool calls.
- `ToolQueryRefiner` relies on `AgentTool` exposing a "refiner contract". Tools can implement these properties to constrain/refine the model's output:
  - `refiner_instructions: str`: tool-specific rules ("Return a normal search query…", etc.)
  - `refiner_input_format: str`: short format spec for the expected input
  - `refiner_input_regex: Optional[str]`: strict regex for allowed inputs
  - `refiner_forbidden: str`: explicit forbidden patterns
  - `refiner_examples: str` / `get_examples()`: optional examples to guide the refiner
  - `refiner_max_chars: int`: maximum tool input length (hard cap)
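
  For example, the Quickstart `MultiplicationTool` could be tightened with the optional parts of the contract (a sketch; the property names come from the list above, the specific values are illustrative):

```python
class StrictMultiplicationTool(MultiplicationTool):
    @property
    def refiner_forbidden(self) -> str:
        # Patterns the refiner must never produce as tool input
        return "No words, no units, no '=' sign, no trailing punctuation."

    @property
    def refiner_examples(self) -> str:
        return "Task: 'what is 4 times 5?' -> Tool input: '4*5'"

    @property
    def refiner_max_chars(self) -> int:
        # Hard cap on the length of the refined tool input
        return 32
```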

**Prompts**
- `PromptRegistry`  
  Stores prompt templates for planning, refinement, confidence assessment, summarisation.

**Plugins**
- Loaded via `REACT_AGENT_PLUGINS` environment variable  
  A plugin module exposes `plugin.register(ctx)` and can add tools/actions/assessors/prompts.
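
  A minimal plugin module might look like this (a sketch: this README only guarantees that the module exposes `register(ctx)`; the `ctx` methods shown are hypothetical and may differ from the real registration API):

```python
# my_plugins/multiplier_plugin.py
# Loaded when REACT_AGENT_PLUGINS includes this module's import path.

def register(ctx) -> None:
    # Hypothetical hooks -- substitute the real ctx API of your installed version.
    ctx.add_tool(MultiplicationTool())       # add a tool (from the Quickstart)
    ctx.add_action(StarWarsAnswerAction())   # add a custom action (see below)
```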

## Prompts
Prompts are small instruction templates used by the agent for planning, tool-use decisions, summarisation, and confidence checks.
- All built-in prompts live in `my_react_agent/agent_prompts/defaults_prompts.py` as `DEFAULT_PROMPTS`.
- Each prompt is keyed by `PromptId` (see `agent_prompts/prompts_ids.py`).
- Prompts are rendered via `PromptTemplate.render()` using Python `str.format(**kwargs)`.
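
A small rendering example (a sketch using the `PromptTemplate` constructor shown in the override examples below):

```python
from my_react_agent.agent_prompts.prompt_template import PromptTemplate

template = PromptTemplate(
    text="User question:\n{question}\n\nReturn STRICT JSON:\n{schema}",
    required_vars={"question", "schema"},
    description="toy example",
    version="1",
)
# render() fills the template via str.format(**kwargs)
prompt = template.render(question="What is 4*5?", schema='{"answer": "string"}')
```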

##### How to add a new Prompt
Adding a new prompt is shown in the **Custom ParameterAssessor** example below (it registers a new prompt id and template).

##### How to override existing Prompt

**Option A** - Override the whole Prompt

```python
from __future__ import annotations
from dataclasses import replace
from datetime import datetime, timezone
from typing import Optional, Set

from my_react_agent.agent_heart.react_agent import ReActAgent
from my_react_agent.agent_memory.data_structures import Evidence
from my_react_agent.tool_management.tools.agent_tool import AgentTool

from my_react_agent.agent_prompts.prompt_registry import PromptRegistry
from my_react_agent.agent_prompts.prompt_template import PromptTemplate
from my_react_agent.agent_prompts.defaults_prompts import DEFAULT_PROMPTS
from my_react_agent.agent_prompts.prompts_ids import PromptId


def build_prompts_with_override() -> PromptRegistry:
    prompts = PromptRegistry(_defaults=dict(DEFAULT_PROMPTS))

    custom_template = PromptTemplate(
        text=(
            "You decide if the user question should be split into multiple sequential steps.\n\n"
            "Return true in 'separate' when ANY of these are present:\n"
            "- the question contains multiple asks joined by 'and' / commas / bullet-like phrasing\n"
            "- there are multiple entities to compare\n"
            "- there are multiple time periods, calculations, or dependent sub-questions\n\n"
            "Return false when the question is a single direct lookup or a single short explanation.\n\n"
            "User question:\n{question}\n\n"
            "Return STRICT JSON with this shape:\n{schema}\n\n"
            "JSON only, no explanation:"
        ),
        required_vars={"question", "schema"},
        description="(override) Decide if the question should be split into multiple steps",
        version="override-1",
    )

    # Override the built-in prompt in place
    prompts.set(PromptId.SHOULD_SEPARATE_QUESTION, custom_template)

    return prompts


def main() -> None:
    # Add MultiplicationTool from QuickStart
    tools = [MultiplicationTool()]

    # Build PromptRegistry with a single override
    prompts = build_prompts_with_override()

    agent = ReActAgent.create(
        llm="llama3:8b",
        tools=tools,
        prompts=prompts,  # <-- plug in your overridden prompt registry
    )

    print("\nReAct Agent (override SHOULD_SEPARATE_QUESTION). Type 'exit' to quit.\n")
    while True:
        q = input("You: ").strip()
        if q.lower() in {"exit", "quit"}:
            break
        if not q:
            continue
        print(agent.handle(q))
        print()


if __name__ == "__main__":
    main()
```

**Option B** - Override just the text of the Prompt

This is the safest option for simple changes.
```python
from my_react_agent.agent_prompts.prompt_registry import PromptRegistry
from my_react_agent.agent_prompts.defaults_prompts import DEFAULT_PROMPTS
from my_react_agent.agent_prompts.prompts_ids import PromptId

prompts = PromptRegistry(_defaults=dict(DEFAULT_PROMPTS))

prompts.set_text(
    PromptId.SHOULD_SEPARATE_QUESTION,
    "New text here...\nUser question:\n{question}\n\nSchema:\n{schema}\nJSON only:"
)
```

## Actions (Decision Catalogue)
- An **Action** is a small “capability label” the planner LLM can choose during `_decide_action_for_step()`. Actions are not executed directly; they are advertised to the LLM in the “actions catalogue” (via `Action.to_catalog_entry()`). So you can think of an Action as a “menu item” shown to the planner.

##### Action interface
All actions implement the same base class:
- `name`: unique identifier used by the planner and handlers
- `default_instructions`: default guidance shown to the planner
- `default_when_to_pick`: default “selection rule” shown to the planner

This project keeps actions **simple and overridable**: you can tweak *when* an action is chosen (via `when_to_pick`) and *how* it should behave (via `instructions`) **without editing the agent core**. Overrides are done by setting:
- `action.instructions = "..."` or `action.set_instructions("...")`
- `action.when_to_pick = "..."` or `action.set_when_to_pick("...")`
To revert to defaults:
- `action.clear_overrides()`

##### Built-in Actions
- `NeedContextAction`  
  Resolves missing entities / pronouns using the entity extractor and memory.
- `UseToolAction`  
  Invokes exactly one tool (via the tool execution boundary), stores observation/evidence.
- `AnswerByItselfAction`  
  Uses LLM-only knowledge for stable facts (no tools).
- `ClarifyAction`  
  Asks a single clarification question when the step is underspecified.
- `StopAction`  
  Terminates after repeated failures or user cancellation.

##### How to override an Action
There are two supported ways to change behavior:
1. **Override an instance** (recommended): easiest and does not require subclassing
2. **Subclass**: changes defaults globally for that new class

**Option A** — Override an existing action instance (recommended)
This is the simplest approach: keep the built-in action class, but modify `when_to_pick` and/or `instructions` on the instance you pass to the agent.

```python
from my_react_agent.agent_heart.react_agent import ReActAgent
from my_react_agent.agent_core.agent_actions import (
    AnswerByItselfAction, ClarifyAction, UseToolAction, StopAction
)
from my_react_agent.agent_core.agent_actions.need_context_action import NeedContextAction

# Create an action instance
answer = AnswerByItselfAction()

# Override the selection rule shown to the planner LLM
answer.when_to_pick = (
    "Prefer ANSWER_BY_ITSELF for conceptual/definition questions and timeless facts.\n"
    "Avoid if the step needs current/live info, private data, or tool-only capabilities."
)

# (Optional) override the instructions too
# answer.instructions = "Answer in 1-2 sentences. No speculation."

step_actions = [
    NeedContextAction(),
    answer,              # <-- overridden instance used here
    ClarifyAction(),
    UseToolAction(),
    StopAction(),
]

agent = ReActAgent.create(
    llm="llama3:8b",
    tools=[...],
    step_actions=step_actions,
)

```
Why this works:
- the planner sees an “action catalogue” built from `Action.to_catalog_entry()`
- `to_catalog_entry()` uses `action.when_to_pick` and `action.instructions`
- your overrides become part of the prompt, influencing action selection

**Option B** — Subclass an Action (change defaults)

If you want a reusable action that always has the new default, subclass and override `default_when_to_pick` / `default_instructions`.

```python
from my_react_agent.agent_core.agent_actions import AnswerByItselfAction

class AnswerByItselfMoreOften(AnswerByItselfAction):
    @property
    def default_when_to_pick(self) -> str:
        return (
            "Prefer ANSWER_BY_ITSELF for conceptual questions, definitions, and timeless explanations.\n"
            "Avoid if it needs current/live facts or tool-only capabilities."
        )

step_actions = [
    NeedContextAction(),
    AnswerByItselfMoreOften(),
    ClarifyAction(),
    UseToolAction(),
    StopAction(),
]
```

##### Full runnable example (override when_to_pick)
> **Goal:** Override the `when_to_pick` selection rule of an existing action (`AnswerByItselfAction`).
```python
from __future__ import annotations

from datetime import datetime, timezone
from typing import Optional, List

from my_react_agent.agent_heart.react_agent import ReActAgent
from my_react_agent.agent_memory.data_structures import Evidence
from my_react_agent.tool_management.tools.agent_tool import AgentTool

from my_react_agent.agent_core.agent_actions import (
    AnswerByItselfAction,
    ClarifyAction,
    UseToolAction,
    StopAction,
)
from my_react_agent.agent_core.agent_actions.need_context_action import NeedContextAction


def main() -> None:
    # Add MultiplicationTool from QuickStart
    tools = [MultiplicationTool()]

    answer = AnswerByItselfAction()
    answer.when_to_pick = (
        "Prefer ANSWER_BY_ITSELF for conceptual/definition questions and timeless facts.\n"
        "Avoid if the step needs current/live info, private data, or tool-only capabilities."
    )

    step_actions: List[object] = [
        NeedContextAction(),
        answer,              # <- overridden instance used here
        ClarifyAction(),
        UseToolAction(),
        StopAction(),
    ]

    agent = ReActAgent.create(
        llm="llama3:8b",     
        tools=tools,
        step_actions=step_actions,
    )

    print("\nReAct Agent (custom when_to_pick). Type 'exit' to quit.\n")
    while True:
        q = input("You: ").strip()
        if q.lower() in {"exit", "quit"}:
            break
        if not q:
            continue
        print(agent.handle(q))
        print()


if __name__ == "__main__":
    main()
```


## How to create and add a Custom Action
> **Goal:** If the user asks about Star Wars (e.g. “Who is Anakin Skywalker?”), the agent should include the phrase "May the Force be with you!" in the final answer.

Your new class should provide:
1. `name` (string): must be stable and unique
2. `default_when_to_pick`: plain-English routing rule
3. `default_instructions`: what it will do at runtime
4. optional `examples`: short example user queries

These fields matter because **`Action.to_catalog_entry()`** is what the planner sees. If they are vague, the model won’t reliably pick your action.

Practical guidance:
1. Keep `name` short
2. Make `default_when_to_pick` crisp (“Pick when X and only when X”)
3. Make `default_instructions` procedural and testable (“Return final answer immediately, do not call tools, include phrase …”)

#### What an ActionHandler is
An **ActionHandler** is the concrete implementation that runs after the planner selects an action. The agent resolves `decision.action -> handler = _get_handler_for_action(action_name)`:
- If a matching handler is found in `self._action_handlers`, it runs.
- If not found, it defaults to the **USE_TOOL** handler.

**The handler's registration key must match the action name** (uppercase and trimmed), and the handler must:
- expose `action_name` matching the action name
- implement `run(ctx)` and return:
    1. a `StepToolCall` (or `empty_tool_call()` if no tool is used)
    2. a `StepResult` (with `final_answer`, `observation`, etc.)
    3. the updated `Step`

Important behavior decisions:
- If your action should finish the whole query immediately, set `StepResult.should_stop = True`.
- If your action is deterministic or rule-based and you don’t want confidence gating to trigger retries, override `should_assess_result(...)` to return `False`.

#### In this example:
- `StarWarsAnswerAction` advertises a new action the agent can choose: "STARWARS_ANSWER".
- `StarWarsAnswerHandler` runs the prompt to answer the question based on LLM's knowledge and returns a final answer immediately, without calling any tools.

How to use:
1. Register the handler in `action_handlers` with the same key as `Action.name`.
2. Include `StarWarsAnswerAction()` in `step_actions` (and also in `low_conf_actions`, the action list used for low-confidence retry decisions).

```python
from __future__ import annotations

from datetime import datetime, timezone
from typing import Tuple

from my_react_agent.agent_heart.react_agent import ReActAgent
from my_react_agent.agent_heart.action_handler_base import ActionHandler, empty_tool_call

from my_react_agent.tool_management.tools.agent_tool import AgentTool
from my_react_agent.agent_memory.data_structures import Evidence, Step, StepResult, StepToolCall, step_set_result

from my_react_agent.agent_core.agent_actions.action import Action
from my_react_agent.agent_core.agent_actions import (
    AnswerByItselfAction,
    ClarifyAction,
    UseToolAction,
    StopAction,
)
from my_react_agent.agent_core.agent_actions.need_context_action import NeedContextAction


class StarWarsAnswerAction(Action):
    @property
    def name(self) -> str:
        # Must match the handler registration key below
        return "STARWARS_ANSWER"

    @property
    def default_when_to_pick(self) -> str:
        return "Pick when the user's question is about Star Wars (characters, movies, lore, etc.)."

    @property
    def default_instructions(self) -> str:
        return (
            "Answer directly from general knowledge (no tools). "
            "The final answer MUST start with: 'May the Force be with you!' "
            "Then provide the answer."
        )

    @property
    def examples(self) -> list[str]:
        return [
            "User: Who is Luke Skywalker?",
            "User: Explain the Jedi and Sith.",
            "User: What is the Force in Star Wars?",
        ]


class StarWarsAnswerHandler(ActionHandler):
    @property
    def action_name(self) -> str:
        return "STARWARS_ANSWER"

    def run(self, ctx) -> Tuple[StepToolCall, StepResult, Step]:
        q = (ctx.question or "").strip()

        # Generate the Star Wars answer (LLM-only)
        prompt = (
            "You answer questions about Star Wars using general knowledge.\n"
            "Rules:\n"
            "- Be accurate; if uncertain, say so briefly.\n"
            "- Be concise unless the user asked for detail.\n\n"
            f"Question: {q}\n"
            "Answer:"
        )

        try:
            body = (ctx.agent.summariser_llm.generate(prompt) or "").strip()
        except Exception:
            body = ""

        if not body:
            body = "I’m not sure, but I can try to help if you rephrase the question."

        final = "May the Force be with you!\n\n" + body

        # Make the transcript show the final answer even though we stop early
        try:
            ctx.transcript_lines.append(f"Final answer: {final}")
        except Exception:
            pass

        ev = Evidence(
            tool="starwars_answer",
            content=final,
            url=None,
            extracted={"matched": True},
            as_of=datetime.now(timezone.utc),
            confidence=0.9,
        )
        try:
            if getattr(ctx.step, "evidence", None) is not None:
                ctx.step.evidence.append(ev)
        except Exception:
            pass

        sr = StepResult(
            observation=final,
            final_answer=final,
            should_stop=True,   # <-- stop immediately once we produce the Star Wars answer
            success=True,
        )
        updated = step_set_result(ctx.step, sr)
        return empty_tool_call(tool=""), sr, updated

    def should_assess_result(self, ctx, *, step: Step, decision, step_result: StepResult) -> bool:
        # Deterministic routing; skip confidence gating for this.
        return False


def main() -> None:
    # Add the MultiplicationTool from the Quickstart
    tools = [MultiplicationTool()]

    # Put the custom action alongside the default actions
    step_actions = [
        StarWarsAnswerAction(),
        NeedContextAction(),
        AnswerByItselfAction(),
        ClarifyAction(),
        UseToolAction(),
    ]

    # Also add the custom action to low_conf_actions; otherwise low-confidence re-decisions use the default list (which won't include your custom action)
    low_conf_actions = [
        StarWarsAnswerAction(),
        NeedContextAction(),
        AnswerByItselfAction(),
        ClarifyAction(),
        UseToolAction(),
        StopAction(),
    ]

    action_handlers = {
        "STARWARS_ANSWER": StarWarsAnswerHandler(),
    }

    agent = ReActAgent.create(
        llm="llama3:8b",              
        tools=tools,
        step_actions=step_actions,
        low_conf_actions=low_conf_actions,
        action_handlers=action_handlers,
    )

    print("\nMultiplier tool + StarWarsAnswerAction\nType 'exit' to quit.\n")
    while True:
        user_input = input("You: ").strip()
        if user_input.lower() in {"exit", "quit"}:
            break
        if not user_input:
            continue
        print(agent.handle(user_input))
        print()


if __name__ == "__main__":
    main()
```


## How to create and add a Custom ParameterAssessor
> **Goal:** add a `MathExpressionAssessor` that scores how correct the result of a calculation is.

In `my-react-agent`, confidence gating works like this:

1. After a step runs, the agent creates **step summary evidence** (a short factual summary).
2. `ConfidenceAssessor.assess_step_summary(...)` runs **all registered ParameterAssessors** on:
   - `query_text` = step task
   - `answer_text` = step summary content
3. It aggregates the per-assessor `ParameterRating.score` values into one confidence score (a sketch of this aggregation follows this list).
4. If confidence is below threshold, the agent triggers a recovery loop (tries different actions/tools).
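
A minimal sketch of step 3, assuming an unweighted mean (the real `ConfidenceAssessor` aggregation may weight assessors differently):

```python
from my_react_agent.confidence_assessment.models import ParameterRating

def aggregate_confidence(ratings: list[ParameterRating]) -> float:
    # Illustrative aggregation: plain mean of the per-assessor scores.
    if not ratings:
        return 0.0
    return sum(r.score for r in ratings) / len(ratings)

# if aggregate_confidence(ratings) < threshold: the recovery loop picks a new action
```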

So, to add an assessor you must:

1. Create a prompt template

Define a prompt template. It must contain only the variables listed in `required_vars`.
Example: a prompt for an assessor that checks whether a math result is correct:

```python
from my_react_agent.agent_prompts.prompt_template import PromptTemplate

CONF_MATH_EXPR_PROMPT_ID = "confidence_math_expression_min"

CONF_MATH_EXPR_TEMPLATE = PromptTemplate(
    text=(
        "You are checking the correctness of a math result.\n"
        "Given the QUESTION and the ANSWER, decide if the calculation is correct.\n\n"
        "Rules:\n"
        "- score=1.0 if the math in ANSWER matches the correct computation.\n"
        "- score=0.0 if it is incorrect.\n"
        "- If the QUESTION/ANSWER do not contain a clear computable expression/result, score=0.5.\n"
        "- reason must be short.\n"
        "- computed should be the correct result if you can compute it, otherwise \"unknown\".\n\n"
        "QUESTION:\n{question}\n\n"
        "ANSWER:\n{answer}\n\n"
        "Return ONLY JSON.\n"
        "JSON:"
    ),
    required_vars={"question", "answer"},
    description="Minimal math correctness scoring",
    version="2",
)

```

2. Define a JSON Schema (recommended) 

**Why**: If the model outputs extra text (e.g., “Here is the response: {...}”), strict `json.loads()` fails, your assessor falls back, and the agent may repeatedly retry the step.

Use a schema that exactly matches what you want back:

```python
MATH_ASSESS_SCHEMA: Dict[str, Any] = {
    "type": "object",
    "properties": {
        "score": {"type": "number", "minimum": 0.0, "maximum": 1.0},
        "reason": {"type": "string"},
        "computed": {"type": "string"},
    },
    "required": ["score", "reason", "computed"],
    "additionalProperties": False,
}
```

3. Implement the **ParameterAssessor** subclass 

Your assessor must implement:
- `name`
- `prompt_id`
- `default_prompt_template()`
- `assess(...) -> ParameterRating`

```python
import json
import re
from typing import Any, Dict, Optional

from my_react_agent.agent_prompts.prompt_template import PromptTemplate
from my_react_agent.confidence_assessment.parameter_assessor import ParameterAssessor
from my_react_agent.confidence_assessment.models import ParameterRating

class MathExpressionAssessor(ParameterAssessor):
    @property
    def name(self) -> str:
        return "math_expression_correctness"

    @property
    def prompt_id(self) -> str:
        return CONF_MATH_EXPR_PROMPT_ID

    def default_prompt_template(self) -> PromptTemplate:
        return CONF_MATH_EXPR_TEMPLATE

    def _parse_json_strict_or_extract(self, raw: str) -> Dict[str, Any]:
        """
        Even with schema, keep a tiny safety net:
        if the model ever wraps JSON with extra text, extract first {...}.
        """
        s = (raw or "").strip()
        if not s:
            return {}

        try:
            obj = json.loads(s)
            return obj if isinstance(obj, dict) else {}
        except Exception:
            pass

        m = re.search(r"\{.*\}", s, flags=re.DOTALL)
        if m:
            try:
                obj = json.loads(m.group(0))
                return obj if isinstance(obj, dict) else {}
            except Exception:
                return {}

        return {}

    def assess(
        self,
        *,
        query_text: str,
        answer_text: str,
        tool_result_text: str = "",
        knowledge_cutoff: Optional[str] = None,
        result_timestamp: Optional[str] = None,
    ) -> ParameterRating:
        prompt = self._render_prompt(
            question=(query_text or "").strip(),
            answer=(answer_text or "").strip(),
        )

        raw = (self.llm.generate(prompt, format=MATH_ASSESS_SCHEMA) or "").strip()

        obj = self._parse_json_strict_or_extract(raw)

        try:
            score = float(obj.get("score", self.default_fallback))
        except Exception:
            score = float(self.default_fallback)

        score = max(0.0, min(1.0, score))

        reason = str(obj.get("reason", "fallback")).strip()[:160] or "fallback"
        computed = str(obj.get("computed", "")).strip()[:80]

        print(f"\n[MathExpressionAssessor] score={score:.2f} computed={computed or 'unknown'} reason={reason}\n")

        return ParameterRating(
            name=self.name,
            score=score,
            reason=reason,
            meta={"computed": computed} if computed else {},
        )
```

##### Notes on assessor inputs
The base class calls an assessor with:
- `query_text`: the step task or question text being assessed
- `answer_text`: the step’s produced answer/summary
- `tool_result_text`: optional raw tool output, if your confidence framework passes it through
- `knowledge_cutoff` and `result_timestamp`: optional metadata (often unused in simple assessors)

4. Register the new prompt in PromptRegistry

Update `create_prompts()` to register the assessor prompt id and template:
```python
def create_prompts() -> PromptRegistry:
    prompts = PromptRegistry(_defaults=dict(DEFAULT_PROMPTS))
    prompts.register_default(CONF_MATH_EXPR_PROMPT_ID, CONF_MATH_EXPR_TEMPLATE)
    return prompts
```

5. Build the agent: pass an assessor factory into `ReActAgent.create(...)`

The code expects **parameter_assessors** to contain either:
- a `ParameterAssessor` instance, or
- a factory `(llm, prompts) -> ParameterAssessor`

##### Full runnable example: 

```python
from __future__ import annotations

import json
import re
from datetime import datetime, timezone
from typing import Any, Dict, Optional

from my_react_agent.agent_heart.react_agent import ReActAgent
from my_react_agent.agent_memory.data_structures import Evidence
from my_react_agent.tool_management.tools.agent_tool import AgentTool

from my_react_agent.confidence_assessment.parameter_assessor import ParameterAssessor
from my_react_agent.confidence_assessment.models import ParameterRating

from my_react_agent.agent_prompts.prompt_template import PromptTemplate
from my_react_agent.agent_prompts.prompt_registry import PromptRegistry
from my_react_agent.agent_prompts.defaults_prompts import DEFAULT_PROMPTS

CONF_MATH_EXPR_PROMPT_ID = "confidence_math_expression_min"

CONF_MATH_EXPR_TEMPLATE = PromptTemplate(
    text=(
        "You are checking the correctness of a math result.\n"
        "Given the QUESTION and the ANSWER, decide if the calculation is correct.\n\n"
        "Rules:\n"
        "- score=1.0 if the math in ANSWER matches the correct computation.\n"
        "- score=0.0 if it is incorrect.\n"
        "- If the QUESTION/ANSWER do not contain a clear computable expression/result, score=0.5.\n"
        "- reason must be short.\n"
        "- computed should be the correct result if you can compute it, otherwise \"unknown\".\n\n"
        "QUESTION:\n{question}\n\n"
        "ANSWER:\n{answer}\n\n"
        "Return ONLY JSON.\n"
        "JSON:"
    ),
    required_vars={"question", "answer"},
    description="Minimal math correctness scoring",
    version="2",
)

MATH_ASSESS_SCHEMA: Dict[str, Any] = {
    "type": "object",
    "properties": {
        "score": {"type": "number", "minimum": 0.0, "maximum": 1.0},
        "reason": {"type": "string"},
        "computed": {"type": "string"},
    },
    "required": ["score", "reason", "computed"],
    "additionalProperties": False,
}


class MathExpressionAssessor(ParameterAssessor):
    @property
    def name(self) -> str:
        return "math_expression_correctness"

    @property
    def prompt_id(self) -> str:
        return CONF_MATH_EXPR_PROMPT_ID

    def default_prompt_template(self) -> PromptTemplate:
        return CONF_MATH_EXPR_TEMPLATE

    def _parse_json_strict_or_extract(self, raw: str) -> Dict[str, Any]:
        s = (raw or "").strip()
        if not s:
            return {}

        try:
            obj = json.loads(s)
            return obj if isinstance(obj, dict) else {}
        except Exception:
            pass

        m = re.search(r"\{.*\}", s, flags=re.DOTALL)
        if m:
            try:
                obj = json.loads(m.group(0))
                return obj if isinstance(obj, dict) else {}
            except Exception:
                return {}

        return {}

    def assess(
        self,
        *,
        query_text: str,
        answer_text: str,
        tool_result_text: str = "",
        knowledge_cutoff: Optional[str] = None,
        result_timestamp: Optional[str] = None,
    ) -> ParameterRating:
        prompt = self._render_prompt(
            question=(query_text or "").strip(),
            answer=(answer_text or "").strip(),
        )
        raw = (self.llm.generate(prompt, format=MATH_ASSESS_SCHEMA) or "").strip()

        obj = self._parse_json_strict_or_extract(raw)

        try:
            score = float(obj.get("score", self.default_fallback))
        except Exception:
            score = float(self.default_fallback)

        score = max(0.0, min(1.0, score))

        reason = str(obj.get("reason", "fallback")).strip()[:160] or "fallback"
        computed = str(obj.get("computed", "")).strip()[:80]

        print(f"\n[MathExpressionAssessor] score={score:.2f} computed={computed or 'unknown'} reason={reason}\n")

        return ParameterRating(
            name=self.name,
            score=score,
            reason=reason,
            meta={"computed": computed} if computed else {},
        )

def create_prompts() -> PromptRegistry:
    prompts = PromptRegistry(_defaults=dict(DEFAULT_PROMPTS))
    prompts.register_default(CONF_MATH_EXPR_PROMPT_ID, CONF_MATH_EXPR_TEMPLATE)
    return prompts


def main() -> None:
    # Pass Multiplication tool from the Quickstart
    tools = [MultiplicationTool()]
    prompts = create_prompts()

    # factory so ReActAgent.create() uses its confidence_llm to build the assessor
    def math_assessor_factory(llm, prompts):
        return MathExpressionAssessor(llm=llm, prompts=prompts)

    agent = ReActAgent.create(
        llm="llama3:8b",
        tools=tools,
        prompts=prompts,
        parameter_assessors=[math_assessor_factory],
    )

    print("\nMinimal Assessor Demo (math_expression_correctness + multiplier). Type 'exit' to quit.\n")

    while True:
        q = input("You: ").strip()
        if q.lower() in {"exit", "quit"}:
            break
        if not q:
            continue
        print(agent.handle(q))
        print()


if __name__ == "__main__":
    main()
```

## LLM Adapters
LLM adapters are the thin integration layer between the agent and any model backend. The agent only depends on one interface:
- `LLMBase.generate(prompt: str, **kwargs) -> str`
This keeps the agent provider-agnostic (Ollama today, something else tomorrow) and lets models be swapped without touching the planning/tool/action logic.

##### Why adapters exist
- **Decouple providers**: the agent never imports Ollama/OpenAI/etc. directly.
- **Centralise backend params**: `temperature`, `stop`, `num_ctx`, and optionally `format` (for structured JSON outputs).
- **Enable testing**: you can inject a fake adapter that returns deterministic outputs.
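
For example, a deterministic fake adapter for tests (a sketch; it relies only on the `generate(prompt, **kwargs) -> str` contract above, and assumes duck typing is accepted wherever an adapter is expected):

```python
class FakeLLM:
    """Test adapter: always returns a canned string, ignoring prompt and params."""

    def __init__(self, canned: str) -> None:
        self._canned = canned

    def generate(self, prompt: str, **kwargs) -> str:
        return self._canned

# e.g. force a fixed planner decision in a test:
# agent = ReActAgent.create(llm=lambda role: FakeLLM('{"separate": false}'), tools=tools)
```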

##### LLM roles in this project
`ReActAgent` uses four LLM roles so each part of the loop can have different defaults (especially temperature):

- `planner_llm`  
  Decides how to proceed for each step (e.g. `USE_TOOL`, `CLARIFY`, `ANSWER_BY_ITSELF`, `NEED_CONTEXT`, `STOP`) and may split a question into steps.

- `summariser_llm`  
  Produces the final user-facing answer and step synthesis. Also used by default for entity extraction (`LLMEntityExtractor`).

- `confidence_llm`  
  Evaluates step result quality via `ConfidenceAssessor` / `ParameterAssessors` and can trigger retries on low-confidence steps.

- `refiner_llm`  
  Converts natural-language tool intent into strict tool inputs (often regex-constrained).

##### How to pass adapters
You can pass the `llm` into `ReActAgent.create()` in four common ways:
1. **Single model string** (uses default factory, builds per-role LLMs):

```python
agent = ReActAgent.create(llm="llama3:8b", tools=tools)
```

2. A **single adapter instance** (same adapter for all roles):

```python
agent = ReActAgent.create(llm=DefaultOllamaLLM("llama3:8b"), tools=tools)
```

3. A **role factory** (best for role-specific adapters):

```python
agent = ReActAgent.create(llm=lambda role: DefaultOllamaLLM("llama3:8b"), tools=tools)
```
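
Because the factory receives the role name, it can also return different adapters per role (a sketch using the role names listed above and `DefaultOllamaLLM` from option 2):

```python
def make_llm(role: str):
    # Smaller model for strict tool-input refinement, the default elsewhere.
    model = "gemma3:4b" if role == "refiner_llm" else "llama3:8b"
    return DefaultOllamaLLM(model)

agent = ReActAgent.create(llm=make_llm, tools=tools)
```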

4. **Role mapping** (recommended for different models per role)
Provide a dict with "default" and optional per-role overrides:

```python
from __future__ import annotations

from datetime import datetime, timezone
from typing import Optional, Dict

from my_react_agent.agent_heart.react_agent import ReActAgent
from my_react_agent.agent_memory.data_structures import Evidence
from my_react_agent.tool_management.tools.agent_tool import AgentTool

from my_react_agent.agent_prompts.prompt_registry import PromptRegistry
from my_react_agent.agent_prompts.defaults_prompts import DEFAULT_PROMPTS


def create_role_models() -> Dict[str, str]:
    return {
        "default": "llama3:8b",
        "planner_llm": "llama3:8b",
        "summariser_llm": "mistral",
        "confidence_llm": "deepseek-r1:8b",
        "refiner_llm": "gemma3:4b",
    }


def main() -> None:
    # Add MultiplicationTool from QuickStart
    tools = [MultiplicationTool()]

    agent = ReActAgent.create(
        llm=create_role_models(),   
        tools=tools,
    )

    print("\nReAct Agent (4-role models). Type 'exit' to quit.\n")
    while True:
        q = input("You: ").strip()
        if q.lower() in {"exit", "quit"}:
            break
        if not q:
            continue
        print(agent.handle(q))
        print()


if __name__ == "__main__":
    main()
```

#### Don't forget to pull models:

```bash
ollama pull llama3:8b
ollama pull mistral
ollama pull deepseek-r1:8b
ollama pull gemma3:4b
```


