Metadata-Version: 2.4
Name: nb_llm
Version: 0.2.0
Summary: Powerful LLM framework — 3 lines of code to do what LangChain takes 30. Chat, Tools, RAG, Agents, Streaming.
Author: ydf0509
License: MIT
Project-URL: Homepage, https://github.com/ydf0509/nb_llm
Project-URL: Repository, https://github.com/ydf0509/nb_llm
Keywords: llm,ai,chatgpt,deepseek,agent,rag,tool-calling,framework
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Classifier: Topic :: Software Development :: Libraries
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.7
Description-Content-Type: text/markdown
Requires-Dist: openai>=1.0.0
Provides-Extra: all
Requires-Dist: httpx>=0.24.0; extra == "all"
Requires-Dist: python-dotenv>=0.19.0; extra == "all"
Requires-Dist: tiktoken>=0.5.0; extra == "all"
Requires-Dist: pydantic>=2.0.0; extra == "all"
Requires-Dist: anthropic>=0.18.0; extra == "all"
Requires-Dist: redis>=4.0.0; extra == "all"
Requires-Dist: chromadb>=0.4.0; extra == "all"
Provides-Extra: rag
Requires-Dist: chromadb>=0.4.0; extra == "rag"
Provides-Extra: dev
Requires-Dist: pytest>=7.0.0; extra == "dev"
Requires-Dist: pytest-asyncio>=0.20.0; extra == "dev"

# nb_llm

**A powerful LLM framework** — 3 lines of code to do what LangChain takes 30 lines.

[![Python 3.7+](https://img.shields.io/badge/python-3.7+-blue.svg)](https://www.python.org/downloads/)
[![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](https://opensource.org/licenses/MIT)

[中文文档 (Chinese)](https://github.com/ydf0509/nb_llm/blob/main/README_CN.md)

---

## Why nb_llm?

| Comparison | nb_llm | LangChain + LangGraph |
|--------|--------|-----------------------|
| Core abstraction | **1** (Chat) | 10+ (LLM, Chain, Agent, Tool, Prompt, Parser, Memory...) |
| Learning curve | 1 hour | 1 week+ |
| Install | `pip install nb_llm` | langchain + langchain-core + langchain-openai + langgraph + ... |
| Workflow | **Native Python** if/for/while | StateGraph DSL + conditional edges + state machines |
| Tool calls | `@chat.tool` one line | @tool + Prompt + Agent + AgentExecutor |

---

## Installation

```bash
pip install nb_llm
```

## Get Started in 30 Seconds

```python
from nb_llm import Chat, ChatConfig

chat = Chat(ChatConfig("deepseek"))
print(chat.ask("Explain Python in one sentence"))
```

That's it. Auto-detects model provider, auto-discovers API Key.

---

## Core Features

### Multi-turn Conversation

```python
chat = Chat(ChatConfig("deepseek"))
chat.send("My name is Alice")
chat.send("I'm 25 years old")
print(chat.send("What's my name? How old am I?"))
# "Your name is Alice, and you are 25 years old"
```

`send()` remembers context, `ask()` is stateless — they don't interfere with each other.

### Streaming Output

```python
for chunk in chat.stream("Write a poem about spring"):
    print(chunk, end="", flush=True)
```

### Tool Calls (Agent)

```python
chat = Chat(ChatConfig("deepseek"))

@chat.tool
def get_weather(city: str) -> str:
    """Get weather for a city"""
    return f"{city}: Sunny, 25°C"

answer = chat.send("What's the weather in Beijing?")
# Automatically calls get_weather("Beijing"), generates answer based on result
```

Compare with LangChain's @tool + AgentExecutor + Prompt template — nb_llm only needs `@chat.tool`.

### Structured Output (Pydantic)

```python
from pydantic import BaseModel, Field

class Movie(BaseModel):
    title: str = Field(description="Movie title")
    year: int = Field(description="Release year")
    rating: float = Field(description="Rating, 1-10")

result = chat.ask("Recommend a sci-fi movie", SendOptions(response_type=Movie))
movie = result.parsed  # Movie object with full IDE completion
print(movie.title, movie.year, movie.rating)
```

### Pipeline Composition

```python
translator = Chat(ChatConfig("deepseek", system="Translate to English"))
summarizer = Chat(ChatConfig("deepseek", system="Summarize in one sentence"))

pipeline = translator >> summarizer
result = pipeline("AI is changing the world")
```

### Batch Processing

```python
answers = chat.batch(
    ["What is Python?", "What is Java?", "What is Go?"],
    concurrency=5,
)
```

### Multi-Agent Collaboration

```python
from nb_llm import Team

pm = Chat(ChatConfig("deepseek", system="You are a product manager", name="PM"))
dev = Chat(ChatConfig("deepseek", system="You are a developer", name="Dev"))
qa = Chat(ChatConfig("deepseek", system="You are a QA engineer", name="QA"))

result = Team(pm, dev, qa).discuss("Design login feature", rounds=3)
print(result.conclusion)
```

### Vision / Multimodal

```python
# Analyze an image (URL or local file)
result = chat.ask(
    "What's in this image?",
    SendOptions(image="https://example.com/photo.jpg")
)

# Multiple images + local file
result = chat.ask(
    "Compare these two diagrams",
    SendOptions(images=["diagram1.png", "diagram2.png"])
)
```

Local files are automatically converted to base64 data URLs.

### Router (Intelligent Routing)

```python
from nb_llm import Router

math_chat = Chat(ChatConfig("deepseek", system="You are a math expert"))
code_chat = Chat(ChatConfig("deepseek", system="You are a coding expert"))
classifier = Chat(ChatConfig("deepseek"))

router = Router(
    experts={"math": math_chat, "code": code_chat},
    classifier=classifier,
)

# Automatically routes to the appropriate expert
answer = router.send("Solve x^2 + 3x - 4 = 0")  # → math expert
answer = router.send("Write a Python sort function")  # → code expert
```

### RAG (Retrieval-Augmented Generation)

```python
from nb_llm import RAG, RAGConfig

# Basic usage
rag = RAG(RAGConfig(model="deepseek"))
rag.add("./docs/")  # Load a directory
rag.add("manual.pdf")  # Load a single file
answer = rag.chat("How to install this product?")
print(answer)
for src in answer.sources:
    print(f"{src.file}: {src.chunk[:50]}...")
```

#### RAG with ChromaDB Persistence + Custom Embedding

```python
rag = RAG(RAGConfig(
    model="deepseek",
    embedding_model="BAAI/bge-m3",
    embedding_api_key="your-key",
    embedding_base_url="https://api.siliconflow.cn/v1",
    chunk_size=5000,
    chunk_overlap=500,
    top_k=15,
    vectorstore="chromadb",
    vectorstore_path="./my_vectordb",
))

# First run: vectorizes and persists to disk
# Second run: loads from disk, skips re-vectorization
if len(rag.vectorstore) == 0:
    rag.add("large_document.txt")

answer = rag.chat("How does it work?")
```

#### Standalone Embedding

```python
from nb_llm import Embedding

emb = Embedding(model="BAAI/bge-m3", api_key="...", base_url="...")
vector = emb.embed("Hello world")  # Single text → List[float]
vectors = emb.embed(["Hello", "World"])  # Batch → List[List[float]]
```

---

## Advanced Features

### ChatConfig — Reusable Configuration

```python
from dataclasses import dataclass

@dataclass
class ProductionConfig(ChatConfig):
    model: str = "deepseek"
    retry: int = 3
    cache: bool = True
    temperature: float = 0.3

chat = Chat(ProductionConfig())
```

### SendOptions — Per-call Options

```python
from nb_llm import SendOptions

# IDE shows all available fields after typing SendOptions(
result = chat.ask("Give a precise answer", SendOptions(
    temperature=0,
    json=True,
    max_tokens=100,
))
```

### Async Support

```python
import asyncio

async def main():
    answer = await chat.aio_send("Hello")
    async for chunk in chat.aio_stream("Write a poem"):
        print(chunk, end="")
    answers = await chat.aio_batch(["Q1", "Q2"], concurrency=5)

asyncio.run(main())
```

### History Persistence

```python
# File / SQLite / Redis backends
chat = Chat(ChatConfig("deepseek",
    history_backend="sqlite",
    history_url="./history.db",
))
```

### Fault Tolerance & Fallback

```python
chat = Chat(ChatConfig("deepseek",
    retry=3,
    fallback="qwen",
    cache=True,
    cache_ttl=3600,
))
```

### Conversation Save / Load

```python
# Save conversation history to file
chat.save("conversation.json")

# Load conversation from file
chat.load("conversation.json")

# View current history
print(chat.history)

# Use as context manager (auto-clears history on exit)
with Chat(ChatConfig("deepseek")) as chat:
    chat.send("Hello")
    chat.send("How are you?")
# history is cleared here
```

### Clone

```python
# Create an independent copy with same config and tools
chat2 = chat.clone()
```

### Token Counting

```python
token_count = chat.count_tokens("Some long text...")
is_ok = chat.check_tokens("Some text", max_tokens=4000)
```

### Multi-model Strategy

```python
# Race mode: use whichever responds first
chat = Chat(ChatConfig(
    model=["gpt-4o", "deepseek", "qwen"],
    strategy="fastest",
))
```

### Session Management

```python
session_a = chat.session("user_001")
session_b = chat.session("user_002")
session_a.send("My name is Alice")
session_b.send("My name is Bob")
```

### Cost Tracking

```python
with chat.track_cost() as tracker:
    chat.send("Question 1")
    chat.send("Question 2")
print(f"Total tokens: {tracker.total_tokens}")
```

### Hook System

Three ways to register hooks — choose the best fit for your scenario:

#### Method 1: Decorators (most concise)

```python
chat = Chat(ChatConfig("deepseek"))

@chat.on_before
def log_request(messages, options):
    print(f"[Sending] {messages[-1]['content'][:50]}...")

@chat.on_after
def log_response(response, usage):
    print(f"[Received] tokens: {usage.total_tokens}")

@chat.on_error
def handle_error(error):
    print(f"[Error] {error}")

@chat.on_tool_call
def log_tool(name, args):
    print(f"[Tool] {name}({args})")
```

#### Method 2: Manual Registration (dynamic)

The decorator syntax `@chat.on_before` is sugar for `chat.on_before(func)` — you can call it directly:

```python
def my_logger(messages, options):
    print(f"[LOG] {messages[-1]['content'][:50]}...")

chat.on_before(my_logger)     # same as @chat.on_before
chat.on_after(my_callback)    # same as @chat.on_after
chat.on_error(my_handler)     # same as @chat.on_error
chat.on_tool_call(my_tracer)  # same as @chat.on_tool_call
```

Useful for runtime dynamic hook registration, or batch-registering the same hooks across multiple Chat instances.

#### Method 3: Inherit Chat (unified behavior)

Subclass Chat to define hooks once — all instances get them automatically:

```python
class LoggedChat(Chat):
    def __init__(self, config):
        super().__init__(config)
        self.on_before(self._log_before)
        self.on_after(self._log_after)

    def _log_before(self, messages, options):
        print(f"[Sending] {messages[-1]['content'][:50]}...")

    def _log_after(self, response, usage):
        print(f"[Received] tokens: {usage.total_tokens}")

chat = LoggedChat(ChatConfig("deepseek"))
# No need to repeat registration for each instance
```

| Hook | Trigger | Callback Signature |
|------|---------|-------------------|
| `on_before` | Before sending request | `(messages, options)` |
| `on_after` | After receiving response | `(response, usage)` |
| `on_error` | On exception | `(error)` |
| `on_tool_call` | After each tool call | `(name, args)` |

### Tool Management

Beyond `@chat.tool`, you can manage tools programmatically:

```python
def search(query: str) -> str:
    """Search the web"""
    return f"Results for {query}"

chat.add_tool(search)            # Register a tool
chat.add_tools([func1, func2])   # Register multiple tools
chat.remove_tool("search")       # Remove by name
chat.remove_tool(search)         # Remove by function
chat.clear_tools()               # Remove all tools
print(chat.tools)                # List all tool schemas
```

### @step Observability

```python
from nb_llm import step

@step("translate")
def translate(text):
    return chat.ask(f"Translate to English: {text}")

@step("summarize")
def summarize(text):
    return chat.ask(f"Summarize: {text}")

# Each step emits events (start/end/error) with timing info
result = translate("你好世界")
```

Register listeners to capture step events:

```python
from nb_llm.workflow.step import on_step_event

@on_step_event
def log_steps(event_type, data):
    print(f"[{event_type}] {data['name']} ({data.get('elapsed', 0):.2f}s)")
```

### Custom Provider Registration

```python
from nb_llm import register_provider

register_provider(
    name="my-llm",
    model="my-model-v1",
    base_url="https://my-api.com/v1",
    api_key_env="MY_API_KEY",
)

# Now use it like any built-in model
chat = Chat(ChatConfig("my-llm"))
```

---

## CLI Tools

```bash
# Version
python -m nb_llm version

# List models
python -m nb_llm models

# Single query (requires provider's API Key)
python -m nb_llm ask "Hello" --model deepseek --api-key "$DEEPSEEK_API_KEY"

# Interactive chat
python -m nb_llm chat --model deepseek --stream --api-key "$DEEPSEEK_API_KEY"

# Real-world example using SiliconFlow
python -m nb_llm ask "Hello"  --model "tencent/Hunyuan-MT-7B" --base-url "https://api.siliconflow.cn/v1" --api-key "$YOUR_SILICONFLOW_API_KEY"
```

---

## Built-in Model Support

| Alias | Actual Model | Provider |
|------|----------|--------|
| `deepseek` | deepseek-chat | DeepSeek |
| `gpt-4o` | gpt-4o | OpenAI |
| `claude` | claude-sonnet-4 | Anthropic |
| `qwen` | qwen-plus | Qwen (Alibaba) |
| `glm` | glm-4 | Zhipu GLM |
| `siliconflow` | DeepSeek-V3 | SiliconFlow |
| `ollama` | llama3 | Ollama (local) |

Also supports any OpenAI-compatible API:

```python
chat = Chat(ChatConfig(
    model="any-model-name",
    base_url="https://your-api.com/v1",
    api_key="sk-xxx",
))
```

---

## Response Objects

All response objects support `.attribute` access (IDE completion) + `.to_dict()` conversion.

```python
result = chat.send("Hello")
print(result)                    # Use as string
print(result.text)               # Explicit text access
print(result.usage.total_tokens) # Token usage
print(result.model)              # Model used
print(result.to_dict())          # Convert to dict
print(result.to_json_str())      # Formatted JSON string (indent=4)
```

| Object | Purpose | Key Attributes |
|------|------|----------|
| `ChatResponse(str)` | Response value | `.text` `.usage` `.tool_calls_made` `.parsed` `.model` |
| `UsageInfo` | Token usage | `.prompt_tokens` `.completion_tokens` `.total_tokens` |
| `ToolCallRecord` | Tool call record | `.name` `.args` `.result` `.elapsed` |
| `StreamResponse` | Streaming response | `.text` `.usage` `.finish_reason` |
| `TeamResult` | Multi-agent result | `.conclusion` `.transcript` `.rounds` |

---

## Project Structure

```
nb_llm/
├── core/           # Chat core, config, response, history backends
├── providers/      # OpenAI / Anthropic adapters, model registry
├── tools/          # Tool Schema generation, execution engine
├── agents/         # Router, Team multi-agent
├── rag/            # RAG retrieval-augmented generation
├── embedding/      # Embedding vectorization
├── middleware/      # Cache middleware
├── workflow/       # @step observability
└── __main__.py     # CLI entry point
```

---

## Documentation

- [API Documentation (CN)](https://github.com/ydf0509/nb_llm/blob/main/tests/ai_docs/ai_cn_doc/nb_llm_api_doc.md) — Full API usage + LangChain comparison
- [Design Documentation (CN)](https://github.com/ydf0509/nb_llm/blob/main/tests/ai_docs/ai_cn_doc/nb_llm_design_doc.md) — Architecture + module details
- [API Documentation (EN)](https://github.com/ydf0509/nb_llm/blob/main/tests/ai_docs/ai_en_doc/nb_llm_api_doc.md) — Full API usage (English)
- [Design Documentation (EN)](https://github.com/ydf0509/nb_llm/blob/main/tests/ai_docs/ai_en_doc/nb_llm_design_doc.md) — Architecture design (English)
- [Changelog](https://github.com/ydf0509/nb_llm/blob/main/tests/ai_docs/ai_cn_doc/ai_update.md) — All major changes

---

## Compatibility

- Python 3.7+
- Core dependency: `openai`
- Optional: `httpx`, `python-dotenv`, `tiktoken`, `pydantic`, `anthropic`, `redis`

## License

MIT
