Metadata-Version: 2.4
Name: pocket-joe
Version: 0.2.0.5
Summary: A correct, simple, performant, and pythonic framework for building durable AI agents
Requires-Python: >=3.12
Requires-Dist: fastmcp>=2.13.2
Provides-Extra: examples
Requires-Dist: anthropic>=0.20.0; extra == 'examples'
Requires-Dist: beautifulsoup4>=4.11.0; extra == 'examples'
Requires-Dist: ddgs>=9.9.1; extra == 'examples'
Requires-Dist: httpx>=0.27.0; extra == 'examples'
Requires-Dist: openai>=1.0.0; extra == 'examples'
Requires-Dist: pyyaml>=6.0; extra == 'examples'
Requires-Dist: requests>=2.28.0; extra == 'examples'
Requires-Dist: youtube-transcript-api>=0.6.0; extra == 'examples'
Description-Content-Type: text/markdown

# PocketJoe

**LLM Agents are just agents...**

- Agents are policies
- A policy reasons over observations and chooses a batch of options
- A policy can be any mix of LLM-based, human-in-the-loop, or heuristic

## Semantics

An agent system grounded in Reinforcement Learning theory, with LLM semantics as first-class concepts:

- `policy`: all code, logic, and LLM calls are policies
- `observations`: the set of observations for the policy to reason over
- `options`: additional action spaces available to the policy
- `selected_actions`: the set of concurrent actions the policy chose to take
- `Message`: a shared dataclass for `observations` and `actions` that aligns with LLM semantics
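
To make the vocabulary above concrete, here is a minimal sketch of a heuristic (non-LLM) policy. It does not use the PocketJoe API: the `Message` dataclass, its fields, and `threshold_policy` are hypothetical stand-ins, and options are reduced to plain strings for brevity.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Message:
    """Stand-in for the shared message type (hypothetical fields)."""
    policy: str   # name of the policy that produced the message
    payload: Any  # text, an option call, or an option result


def threshold_policy(observations: list[Message], options: list[str]) -> list[Message]:
    """A heuristic policy: no LLM involved, same input/output shape."""
    latest = observations[-1].payload
    if isinstance(latest, (int, float)) and latest > 30 and "cool_down" in options:
        # Choose one of the offered options as the selected action.
        return [Message(policy="threshold", payload={"call": "cool_down"})]
    # Otherwise answer directly, without using any option.
    return [Message(policy="threshold", payload="ok")]


selected_actions = threshold_policy(
    observations=[Message(policy="sensor", payload=35)],
    options=["cool_down"],
)
```

An LLM-backed or human-in-the-loop policy would keep the same shape (observations and options in, selected actions out), which is what lets the three kinds be mixed.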

### LLM semantics as platform semantics

In LLM APIs, everything is a `Message`. We adopt this as our universal unit:

- **Input:** `observations: list[Message]` (what the policy sees)
- **Output:** `selected_actions`, the actions the policy chose from its action space (the policy owns its outputs)

**Key insight:** When options are provided, they expand the policy's action space. The runtime automatically invokes all option calls and injects the results back as observations.
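
The invoke-and-inject step described above can be sketched as follows. This is an illustration under stated assumptions, not PocketJoe's runtime: `OptionCall`, `OptionResult`, and `run_option_calls` are hypothetical names, and observations are held in a plain list.

```python
import asyncio
from dataclasses import dataclass
from typing import Any, Awaitable, Callable


@dataclass
class OptionCall:
    """Hypothetical stand-in for an action that requests an option."""
    name: str
    args: dict[str, Any]


@dataclass
class OptionResult:
    """Hypothetical stand-in for the result injected back as an observation."""
    name: str
    result: Any


Option = Callable[..., Awaitable[Any]]


async def run_option_calls(selected_actions: list[Any], options: dict[str, Option]) -> list[OptionResult]:
    """Invoke every requested option concurrently and collect the results."""
    calls = [a for a in selected_actions if isinstance(a, OptionCall)]
    results = await asyncio.gather(*(options[c.name](**c.args) for c in calls))
    return [OptionResult(c.name, r) for c, r in zip(calls, results)]


async def add(a: int, b: int) -> int:
    return a + b


async def demo() -> list[Any]:
    observations: list[Any] = ["What is 2 + 3?"]
    # Pretend a policy selected this action from its expanded action space.
    selected_actions = [OptionCall("add", {"a": 2, "b": 3})]
    observations.extend(selected_actions)
    # The results come back as new observations for the next policy call.
    observations.extend(await run_option_calls(selected_actions, {"add": add}))
    return observations


history = asyncio.run(demo())
```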

### Everything is a Policy

**Universal Return Types:** Policies can return any JSON-serializable type - the framework automatically wraps results when called as options.
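
As a rough illustration of what such wrapping could look like, the sketch below checks that a value is JSON-serializable and places it in a result container. `OptionResult` and `wrap_result` are hypothetical names chosen for this example; the framework's actual payload type and wrapping logic may differ.

```python
import json
from dataclasses import dataclass
from typing import Any


@dataclass
class OptionResult:
    """Stand-in for the framework's result payload (hypothetical shape)."""
    option_name: str
    result: Any


def wrap_result(option_name: str, value: Any) -> OptionResult:
    """Wrap a JSON-serializable return value so it can travel as a payload."""
    json.dumps(value)  # raises TypeError if the value is not JSON-serializable
    return OptionResult(option_name=option_name, result=value)


wrapped_text = wrap_result("web_search", "Title: ...")
wrapped_dict = wrap_result("transcribe", {"video_id": "abc123"})
```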

An LLM policy that can call other policies:

```python
@policy.tool(description="Calls LLM with tool support")
async def openai_llm_policy_v1(observations: list[Message], options: list[OptionSchema]) -> list[Message]:
    """LLM policy that calls OpenAI GPT-4 with tool support."""
    openai = AsyncOpenAI()
    response = await openai.chat.completions.create(
        model="gpt-4",
        messages=observations_to_completions_messages(observations),
        tools=options_to_completions_tools(options))
    return completions_response_to_messages(response)
```

A simple helper policy returning primitives:

```python
@policy.tool(description="Performs a web search and returns results.")
async def web_search_policy(query: str) -> str:
    """Performs a web search and returns results."""
    results = DDGS().text(query, max_results=5)
    return "\n\n".join([f"Title: {r['title']}\nURL: {r['href']}\nSnippet: {r['body']}" for r in results])
```

A policy returning structured data:

```python
@policy.tool(description="Transcribe YouTube video")
async def transcribe_youtube_policy(url: str) -> dict[str, str]:
    """Get video title, transcript and metadata from YouTube URL."""
    video_id = _extract_video_id(url)
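    transcript = YouTubeTranscriptApi().fetch(video_id)
    # `_fetch_video_title` is a hypothetical helper (not shown), like `_extract_video_id`.
    title = _fetch_video_title(video_id)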
    return {
        "title": title,
        "transcript": " ".join([snippet.text for snippet in transcript]),
        "video_id": video_id
    }
```

An orchestrator policy that coordinates LLM + search:

```python
@policy.tool(description="Orchestrates LLM with web search tool")
async def search_agent(prompt: str, max_iterations: int = 3) -> list[Message]:
    """Orchestrator that gives the LLM access to web search."""
    ctx = AppContext.get_ctx()

    system_builder = MessageBuilder(policy="system", role_hint_for_llm="system")
    system_builder.add_text("You are an AI assistant that can use tools.")
    system_message = system_builder.to_message()

    prompt_builder = MessageBuilder(policy="user", role_hint_for_llm="user")
    prompt_builder.add_text(prompt)
    prompt_message = prompt_builder.to_message()

    history = [system_message, prompt_message]
    for _ in range(max_iterations):
        selected_actions = await ctx.llm(
            observations=history,
            options=OptionSchema.from_func([ctx.web_search])
        )
        history.extend(selected_actions)
        if not any(msg.payload and isinstance(msg.payload, OptionCallPayload) for msg in selected_actions):
            break

    return history
```

Use an `AppContext` as the policy registry (this gives you IDE type hints):

```python
class AppContext(BaseContext):
    def __init__(self, runner):
        super().__init__(runner)
        self.llm = self._bind(openai_llm_policy_v1)
        self.web_search = self._bind(web_search_policy)
        self.search_agent = self._bind(search_agent)
```

Enjoy:

```python
import asyncio


async def main():
    runner = InMemoryRunner()
    ctx = AppContext(runner)
    result = await ctx.search_agent(prompt="What is the latest Python version?")

    # Get final text message (Message.__str__ extracts text automatically)
    final_msg = next((msg for msg in reversed(result) if msg.parts), '')
    print(f"\nFinal Result: {final_msg}")


asyncio.run(main())
```

**Why this matters:**

- **Universal Composability:** Decorate any function - it works like FastAPI/FastMCP endpoints
- **Flexible Return Types:** Return primitives (`str`, `dict`, `list`), or `list[Message]` for complex flows
- **Auto-wrapping:** Framework automatically wraps results in `OptionResultPayload` when called as options
- **Type-safe:** Full IDE support with typed context and message payloads
- **Evolution-friendly:** Start simple (primitives) → add complexity (messages) with no refactoring

A correct, simple, performant, and pythonic framework for building durable AI agents.

> "There is no flow, only Policies and Actions."

## Getting Started

### Prerequisites

- Python 3.12+

### Installation

```bash
uv add pocket-joe
```

Or with pip:

```bash
pip install pocket-joe
```

To install with example dependencies:

```bash
uv add pocket-joe --extra examples
# or
pip install "pocket-joe[examples]"
```

### Development Setup

```bash
git clone https://github.com/Sohojoe/pocket-joe.git
cd pocket-joe
uv sync --dev --all-extras
```

### Running Examples

Set your API key:

```bash
export OPENAI_API_KEY=sk-...
```

#### Search Agent (ReAct)

```bash
uv run python examples/search_agent.py
```

#### YouTube Summarizer

```bash
uv run python examples/youtube_summarizer.py
```

## Dev Status

Still in prerelease; things will change.

Initial version:

- [ ] Tidy up code - add partly refactored code
- [ ] Proper tests
- [ ] Implement more examples from Pocket-Flow

Durable System:

- [ ] Ledger - Temporal style 'at least once, only one result' replay semantics
- [ ] Durable Storage wrapper - For long running tasks & replay
- [ ] Distributed - worker model
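
One way to picture the 'at least once, only one result' idea is a ledger that records the first result for each call and returns it on replay. The sketch below is purely illustrative and is not the planned implementation: `Ledger`, its in-memory dictionary, and `fetch_price` are hypothetical, and a durable version would persist the records instead.

```python
import asyncio
import json
from typing import Any, Awaitable, Callable


class Ledger:
    """In-memory stand-in for a durable ledger (hypothetical design)."""

    def __init__(self) -> None:
        self._results: dict[str, Any] = {}
        self.executions = 0

    async def call(self, fn: Callable[..., Awaitable[Any]], **kwargs: Any) -> Any:
        # Key each call by function name and arguments.
        key = json.dumps([fn.__name__, kwargs], sort_keys=True)
        if key in self._results:
            return self._results[key]  # replay: reuse the recorded result
        self.executions += 1
        result = await fn(**kwargs)
        # If the same call ran more than once, the first recorded result wins.
        self._results.setdefault(key, result)
        return self._results[key]


async def fetch_price(symbol: str) -> float:
    return 42.0


async def demo() -> tuple[float, float, int]:
    ledger = Ledger()
    first = await ledger.call(fetch_price, symbol="ABC")
    replayed = await ledger.call(fetch_price, symbol="ABC")
    return first, replayed, ledger.executions


first, replayed, executions = asyncio.run(demo())
```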

## Background

Inspired by [PocketFlow](https://github.com/The-Pocket/PocketFlow)... I loved PocketFlow but it fell short in a couple of key areas. This is my rewrite that I can actually use.
