jeevesagent.architecture.tree_of_thoughts
=========================================

.. py:module:: jeevesagent.architecture.tree_of_thoughts

.. autoapi-nested-parse::

   Tree of Thoughts: branching exploration with per-node evaluation.

   Yao et al. 2023 — `Tree of Thoughts: Deliberate Problem Solving with
   Large Language Models <https://arxiv.org/abs/2305.10601>`_. Useful
   for combinatorial reasoning, multi-step planning, math (Game of 24),
   puzzle solving — anywhere a single straight-shot ReAct trajectory
   would commit too early.

   Pattern (BFS beam search)
   -------------------------

   1. **Root** is the problem statement.
   2. **For each level up to ``max_depth``:**

      a. **Expand:** for every frontier node, the proposer generates
         ``branch_factor`` candidate "thoughts" (next steps toward a
         solution).
      b. **Evaluate:** the evaluator scores each candidate 0-1 (how
         promising is this branch?).
      c. **Prune:** keep only the top ``beam_width`` scored
         candidates as the next frontier.
      d. **Early exit:** if any candidate scores ``>= solved_threshold``,
         we stop early and use that branch.

   3. **Best leaf wins.** The highest-scoring leaf across the whole
      tree is the final answer (its content goes to ``session.output``).

   This is the "BFS-with-beam" variant — DFS with backtracking is a
   follow-up. For a structured combinatorial task, BFS-beam covers most
   of what users need.
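
   In code, the loop is small. The sketch below is a minimal,
   self-contained rendering of the pattern, not this module's
   implementation: ``propose`` and ``evaluate`` stand in for the
   proposer and evaluator model calls, and the ``min_score`` and
   ``parallel`` options are omitted.

   .. code-block:: python

      from dataclasses import dataclass
      from typing import Callable

      @dataclass
      class Node:
          content: str
          depth: int
          score: float = 0.0
          parent: "Node | None" = None

      def bfs_beam(problem: str,
                   propose: Callable[[Node], list[str]],
                   evaluate: Callable[[str], float],
                   branch_factor: int = 3, beam_width: int = 2,
                   max_depth: int = 3, solved_threshold: float = 1.0) -> Node:
          root = Node(content=problem, depth=0)
          frontier, best = [root], root
          for depth in range(1, max_depth + 1):
              # Expand: every frontier node yields branch_factor candidates.
              candidates = [Node(content=t, depth=depth, parent=node)
                            for node in frontier
                            for t in propose(node)[:branch_factor]]
              # Evaluate: score each candidate in [0, 1].
              for cand in candidates:
                  cand.score = evaluate(cand.content)
                  if cand.score >= solved_threshold:
                      return cand  # early exit: this branch is solved
              # Prune: only the top beam_width candidates survive.
              candidates.sort(key=lambda n: n.score, reverse=True)
              frontier = candidates[:beam_width]
              best = max([best, *candidates], key=lambda n: n.score)
          return best  # highest-scoring node seen anywhere in the tree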

   Cost
   ----
   ``branch_factor × beam_width × max_depth × 2`` model calls (one
   proposer + one evaluator per candidate). With defaults
   ``(3, 2, 3)`` that's at most 36 calls; the level-one frontier is just
   the root, so the exact count comes in lower (worked through below).
   Reserve ToT for problems where the search structure earns the cost —
   math/planning tasks where ReAct visibly meanders.
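
   Worked through for the defaults (the exact figure follows from the
   pattern above, not from a separate accounting inside the module):

   .. code-block:: python

      branch_factor, beam_width, max_depth = 3, 2, 3
      upper_bound = branch_factor * beam_width * max_depth * 2          # 36
      # The level-1 frontier is just the root; later frontiers hold at
      # most beam_width nodes. Each candidate costs 2 calls.
      frontier_sizes = [1] + [beam_width] * (max_depth - 1)             # [1, 2, 2]
      exact = sum(branch_factor * size * 2 for size in frontier_sizes)  # 30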

   Strengths
   ---------
   * **Explicit search tree.** Every candidate, score, and decision is
     observable through ``architecture_event`` events.
   * **Composable.** Wrap inside :class:`Reflexion` to learn which
     evaluation patterns predict real success.
   * **Replay-correct.** Each proposer / evaluator call is a named
     ``runtime.step``, so journaled runtimes replay deterministically.

   Weaknesses
   ----------
   * **Expensive.** 30-50× a single ReAct turn for typical settings.
   * **Evaluator-quality bound.** A weak evaluator picks weak branches
     and the search wastes budget on dead ends.
   * **Domain-specific.** Branch-and-evaluate makes sense for
     combinatorial problems; for open-ended writing tasks, use
     Self-Refine or Actor-Critic.



Attributes
----------

.. autoapisummary::

   jeevesagent.architecture.tree_of_thoughts.DEFAULT_EVALUATOR_PROMPT
   jeevesagent.architecture.tree_of_thoughts.DEFAULT_PROPOSER_PROMPT


Classes
-------

.. autoapisummary::

   jeevesagent.architecture.tree_of_thoughts.ThoughtNode
   jeevesagent.architecture.tree_of_thoughts.TreeOfThoughts


Module Contents
---------------

.. py:class:: ThoughtNode(/, **data: Any)

   Bases: :py:obj:`pydantic.BaseModel`


   One node in the Tree-of-Thoughts search tree.

   Children are stored implicitly (each node has a ``parent_id``).
   The full tree is reconstructable from the node list ToT keeps in
   its session metadata.

   Create a new model by parsing and validating input data from keyword arguments.

   Raises :exc:`pydantic_core.ValidationError` if the input data cannot
   be validated to form a valid model.

   ``self`` is explicitly positional-only to allow ``self`` as a field
   name.


   .. py:attribute:: content
      :type:  str


   .. py:attribute:: depth
      :type:  int


   .. py:attribute:: id
      :type:  str


   .. py:attribute:: parent_id
      :type:  str | None


   .. py:attribute:: score
      :type:  float
      :value: 0.0
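
   For example, the tree can be rebuilt from a flat node list by
   grouping on ``parent_id``. A sketch (how the list is fetched from
   session metadata is runtime-specific and not shown):

   .. code-block:: python

      from collections import defaultdict

      from jeevesagent.architecture.tree_of_thoughts import ThoughtNode

      def children_of(nodes: list[ThoughtNode]) -> dict[str | None, list[ThoughtNode]]:
          """Group nodes by parent_id; the root sits under the None key."""
          edges: dict[str | None, list[ThoughtNode]] = defaultdict(list)
          for node in nodes:
              edges[node.parent_id].append(node)
          return dict(edges)

      def path_to_root(leaf: ThoughtNode, nodes: list[ThoughtNode]) -> list[ThoughtNode]:
          """Walk parent_id links from a leaf back up to the root."""
          by_id = {node.id: node for node in nodes}
          path, node = [leaf], leaf
          while node.parent_id is not None:
              node = by_id[node.parent_id]
              path.append(node)
          return path[::-1]  # root first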



.. py:class:: TreeOfThoughts(*, branch_factor: int = 3, max_depth: int = 3, beam_width: int = 2, solved_threshold: float = 1.0, min_score: float = 0.0, parallel: bool = True, proposer_prompt: str | None = None, evaluator_prompt: str | None = None)

   Branch + evaluate + prune. BFS beam search over thoughts.


   .. py:method:: declared_workers() -> dict[str, jeevesagent.agent.api.Agent]


   .. py:method:: run(session: jeevesagent.architecture.base.AgentSession, deps: jeevesagent.architecture.base.Dependencies, prompt: str) -> collections.abc.AsyncIterator[jeevesagent.core.types.Event]
      :async:



   .. py:attribute:: name
      :value: 'tree-of-thoughts'
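
   A usage sketch, assuming an ``AgentSession`` and ``Dependencies``
   already obtained from your runtime (their construction is omitted
   here), on a Game-of-24 style prompt:

   .. code-block:: python

      from jeevesagent.architecture.tree_of_thoughts import TreeOfThoughts

      async def solve(session, deps) -> str:
          tot = TreeOfThoughts(branch_factor=3, beam_width=2, max_depth=3)
          # run() streams Events; architecture_event entries expose every
          # candidate, score, and pruning decision as the search proceeds.
          async for event in tot.run(session, deps,
                                     "Use 4, 9, 10, 13 to make 24."):
              print(event)
          return session.output  # content of the best leaf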



.. py:data:: DEFAULT_EVALUATOR_PROMPT
   :value: Multiline-String

   .. raw:: html

      <details><summary>Show Value</summary>

   .. code-block:: python

      """You evaluate a candidate reasoning step. Given the original problem
      and the proposed thought, score how promising this thought is for
      arriving at the correct solution.
      
      Output exactly one line:
      score: <number between 0 and 1>
      
      Then optionally one line of brief justification. The first line
      must match the score format exactly so it can be parsed.
      
      - 1.0 = this thought is correct and final / will obviously lead to a
        correct answer
      - 0.7-0.9 = strong direction, likely correct
      - 0.4-0.6 = plausible but uncertain
      - 0.0-0.3 = wrong direction or contradicts the problem
      """

   .. raw:: html

      </details>
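
   The prompt pins the evaluator's first output line to
   ``score: <number>`` precisely so it can be parsed mechanically. A
   sketch of such a parser (the module's actual parsing code is not
   shown on this page):

   .. code-block:: python

      import re

      _SCORE_RE = re.compile(r"^score:\s*(\d*\.?\d+)", re.IGNORECASE)

      def parse_score(reply: str) -> float | None:
          """Read the mandated 'score: <0..1>' first line; None if malformed."""
          lines = reply.strip().splitlines()
          match = _SCORE_RE.match(lines[0].strip()) if lines else None
          if match is None:
              return None
          return min(1.0, max(0.0, float(match.group(1))))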



.. py:data:: DEFAULT_PROPOSER_PROMPT
   :value: Multiline-String

   .. raw:: html

      <details><summary>Show Value</summary>

   .. code-block:: python

      """You are exploring possible reasoning paths to solve a problem.
      
      Given the problem and any prior steps, propose ONE next step (a
      "thought") toward a solution. A thought can be a sub-step,
      intermediate calculation, sub-decision, or partial answer.
      
      Output only the thought itself — concise, one paragraph at most.
      Do not number it; do not preface with "Thought:".
      """

   .. raw:: html

      </details>
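
   Both defaults are overridable through the constructor's
   ``proposer_prompt`` and ``evaluator_prompt`` parameters. An
   illustrative nudge (the appended wording is made up for the example):

   .. code-block:: python

      from jeevesagent.architecture.tree_of_thoughts import (
          DEFAULT_PROPOSER_PROMPT,
          TreeOfThoughts,
      )

      planner = TreeOfThoughts(
          proposer_prompt=DEFAULT_PROPOSER_PROMPT
          + "\nPrefer thoughts that are concrete arithmetic steps.",
      )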



