jeevesagent.architecture.tree_of_thoughts¶
Tree of Thoughts: branching exploration with per-node evaluation.
Yao et al. 2023 — Tree of Thoughts: Deliberate Problem Solving with Large Language Models. Useful for combinatorial reasoning, multi-step planning, math (Game of 24), puzzle solving — anywhere a single straight-shot ReAct trajectory would commit too early.
Pattern (BFS beam search)¶
Root is the problem statement.
For each level up to ``max_depth``:

- Expand: for every frontier node, the proposer generates ``branch_factor`` candidate “thoughts” (next steps toward a solution).
- Evaluate: the evaluator scores each candidate 0-1 (how promising is this branch?).
- Prune: keep only the top ``beam_width`` scored candidates as the next frontier.
- Early exit: if any candidate scores >= ``solved_threshold``, we stop early and use that branch.

Best leaf wins. The highest-scoring leaf across the whole tree is the final answer (its content goes to ``session.output``).
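The loop above can be sketched in plain Python. The ``propose`` / ``evaluate`` callables stand in for the proposer and evaluator model calls, and the ``Node`` class and function names here are illustrative, not the module's API:

```python
from dataclasses import dataclass


@dataclass
class Node:
    """One candidate thought; illustrative stand-in for ThoughtNode."""
    content: str
    score: float = 0.0
    parent: "Node | None" = None


def tot_bfs(problem, propose, evaluate, branch_factor=3, beam_width=2,
            max_depth=3, solved_threshold=1.0):
    root = Node(problem)
    frontier, best = [root], root
    for _ in range(max_depth):
        candidates = []
        for node in frontier:
            # Expand: branch_factor thoughts per frontier node.
            for thought in propose(problem, node, branch_factor):
                # Evaluate: score each candidate 0-1.
                child = Node(thought, evaluate(problem, thought), node)
                candidates.append(child)
                if child.score > best.score:
                    best = child
                if child.score >= solved_threshold:
                    return child  # early exit: good-enough branch found
        # Prune: keep the top beam_width candidates as the next frontier.
        candidates.sort(key=lambda n: n.score, reverse=True)
        frontier = candidates[:beam_width]
    return best  # best leaf wins
```

With ``solved_threshold=1.0``, any candidate the evaluator scores a perfect 1.0 short-circuits the search; lowering it trades answer quality for fewer model calls.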
This is the “BFS-with-beam” variant — DFS with backtracking is a follow-up. For a structured combinatorial task, BFS-beam covers most of what users need.
Cost¶
``branch_factor × beam_width × max_depth × 2`` model calls (one proposer + one evaluator per candidate). With defaults (3, 2, 3) that’s 36 calls. Reserve ToT for problems where the search structure earns the cost — math/planning tasks where ReAct visibly meanders.
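As a quick budgeting aid, the formula above can be computed directly; note it is an upper bound, since the first level's frontier is just the root rather than a full beam (the function name is illustrative):

```python
def tot_call_budget(branch_factor=3, beam_width=2, max_depth=3):
    """Upper bound on model calls: one proposer + one evaluator
    per candidate, frontier capped at beam_width per level."""
    return branch_factor * beam_width * max_depth * 2
```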
Strengths¶
- Explicit search tree. Every candidate, score, and decision is observable through ``architecture_event`` events.
- Composable. Wrap inside Reflexion to learn which evaluation patterns predict real success.
- Replay-correct. Each proposer / evaluator call is a named ``runtime.step``, so journaled runtimes replay deterministically.
Weaknesses¶
- Expensive. 30-50× a single ReAct turn for typical settings.
- Evaluator-quality bound. A weak evaluator picks weak branches, and the search wastes budget on dead ends.
- Domain-specific. Branch-and-evaluate makes sense for combinatorial problems; for open-ended writing tasks, use Self-Refine or Actor-Critic.
Attributes¶
Classes¶
| ``ThoughtNode`` | One node in the Tree-of-Thoughts search tree. |
| ``TreeOfThoughts`` | Branch + evaluate + prune. BFS beam search over thoughts. |
Module Contents¶
- class jeevesagent.architecture.tree_of_thoughts.ThoughtNode(/, **data: Any)[source]¶
Bases: ``pydantic.BaseModel``

One node in the Tree-of-Thoughts search tree.

Children are stored implicitly (each node has a ``parent_id``). The full tree is reconstructable from the node list ToT keeps in its session metadata.

Create a new model by parsing and validating input data from keyword arguments.
Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
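Since children are implicit via ``parent_id``, the tree can be rebuilt from the flat node list with one pass. A sketch, assuming plain dicts with ``id`` / ``parent_id`` keys as stand-ins for the actual ``ThoughtNode`` models (the function name is illustrative):

```python
from collections import defaultdict


def children_index(nodes: list[dict]) -> dict[str, list[dict]]:
    """Map each parent_id to its list of child nodes."""
    index: dict[str, list[dict]] = defaultdict(list)
    for node in nodes:
        if node.get("parent_id") is not None:
            index[node["parent_id"]].append(node)
    return dict(index)
```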
- class jeevesagent.architecture.tree_of_thoughts.TreeOfThoughts(*, branch_factor: int = 3, max_depth: int = 3, beam_width: int = 2, solved_threshold: float = 1.0, min_score: float = 0.0, parallel: bool = True, proposer_prompt: str | None = None, evaluator_prompt: str | None = None)[source]¶
Branch + evaluate + prune. BFS beam search over thoughts.
- declared_workers() dict[str, jeevesagent.agent.api.Agent][source]¶
- async run(session: jeevesagent.architecture.base.AgentSession, deps: jeevesagent.architecture.base.Dependencies, prompt: str) collections.abc.AsyncIterator[jeevesagent.core.types.Event][source]¶
- name = 'tree-of-thoughts'¶
- jeevesagent.architecture.tree_of_thoughts.DEFAULT_EVALUATOR_PROMPT = Multiline-String¶
"""You evaluate a candidate reasoning step. Given the original problem and the proposed thought, score how promising this thought is for arriving at the correct solution. Output exactly one line: score: <number between 0 and 1> Then optionally one line of brief justification. The first line must match the score format exactly so it can be parsed. - 1.0 = this thought is correct and final / will obviously lead to a correct answer - 0.7-0.9 = strong direction, likely correct - 0.4-0.6 = plausible but uncertain - 0.0-0.3 = wrong direction or contradicts the problem """
- jeevesagent.architecture.tree_of_thoughts.DEFAULT_PROPOSER_PROMPT = Multiline-String¶
"""You are exploring possible reasoning paths to solve a problem. Given the problem and any prior steps, propose ONE next step (a "thought") toward a solution. A thought can be a sub-step, intermediate calculation, sub-decision, or partial answer. Output only the thought itself — concise, one paragraph at most. Do not number it; do not preface with "Thought:". """