NeuroAgent AI

Core Concepts

The Agent is the atom. Everything else — teams, workflows, database agents, API agents — is a composition or specialization of Agent, never a parallel system.

                          ┌─────────────────────────────┐
        Your code  ─────► │            Agent            │   (public API, sync + async)
                          └──────────────┬──────────────┘
                                         │ orchestrates
        ┌───────────┬───────────┬────────┼────────┬───────────┬───────────┐
        ▼           ▼           ▼        ▼        ▼           ▼           ▼
    Providers     Tools      Memory     RAG    Context     Cost       Tracing
   (LLM calls) (functions) (history) (retrieval)(prompt)  (tokens→$)  (spans)

  Higher-order compositions, all built ON TOP of Agent (never around it):
     Workflow ── DAG of steps       Team ── collaborating agents      Server ── agent.serve()
     DatabaseAgent ── NL→SQL        APIAgent ── call HTTP APIs        Integrations ── Shopify/SAP/…

The agent run loop

When you call agent.run(prompt) (sync) or await agent.arun(prompt) (async):

1. Assemble context  →  system prompt + memory history + (optional) retrieved RAG context + prompt
2. Call the provider →  normalized LLM request (with your tools' JSON schemas)
3. Tool calls?       →  validate args, execute the tool, feed the result back ──┐
                        (loop, bounded by max_steps)  ◄────────────────────────┘
4. Final answer      →  return AgentResponse(content, messages, usage, steps, trace_id)
5. Along the way     →  record a span tree (trace) + token usage → cost; persist to memory

Tools auto-generate their JSON schema from your function's signature, type hints, and docstring — you never hand-author schemas. Every run is traced and costed.

1. Agents

from neuroagent import Agent

agent = Agent(provider="openai", model="gpt-4.1", system_prompt="Answer in one sentence.")

resp = agent.run("Who wrote Pride and Prejudice?")
print(resp.content)          # final text
print(resp.usage)            # Usage(prompt_tokens=..., completion_tokens=...)
print(resp.steps)            # number of model/tool iterations
print(resp.trace_id)         # links to this run's trace

# Async + streaming variants
import asyncio
asyncio.run(agent.arun("..."))            # await agent.arun(...)
# agent.ask(...) / agent.aask(...) are aliases that read naturally with RAG

2. Providers

One API, many models. Switch with a string:

Agent(provider="openai",    model="gpt-4.1")
Agent(provider="anthropic", model="claude-opus-4-8")
Agent(provider="gemini",    model="gemini-2.0-flash")
Agent(provider="groq",      model="llama-3.3-70b-versatile")
Agent(provider="echo")      # offline, no key — for tests & demos

OpenAI-compatible endpoints (Azure OpenAI, OpenRouter, Together, vLLM, …) work via base_url. You can also register a custom provider:

from neuroagent.providers import register_provider, LLMProvider
# class MyProvider(LLMProvider): ...   (implement complete/stream/embed)
register_provider("myprovider", MyProvider)
Agent(provider="myprovider")

3. Tools

Decorate any function with @tool. The agent decides when to call it, runs it, and continues reasoning with the result. Sync and async tools are both supported.

from neuroagent import Agent, tool

@tool
def get_weather(city: str) -> str:
    """Return the current weather for a city."""
    return f"It's 22°C and sunny in {city}."

@tool(read_only=True)
async def search_docs(query: str) -> str:
    """Search internal documentation."""
    ...

agent = Agent(provider="openai", tools=[get_weather, search_docs])
print(agent.run("What should I wear in Paris today?").content)

# Add tools after construction:
agent.add_tool(get_weather)

4. Streaming

Stream tokens as they arrive with astream:

import asyncio
from neuroagent import Agent

async def main():
    agent = Agent(provider="openai")
    async for token in agent.astream("Tell me a short story about a robot."):
        print(token, end="", flush=True)

asyncio.run(main())