Metadata-Version: 2.4
Name: tokenstretcher
Version: 1.1.0
Summary: TokenStretcher — Hierarchical AI task manager that breaks complex prompts into specialized agents for massive token & cost savings
Author: TokenStretcher Contributors
License: Business Source License 1.1 (BSL 1.1)
        
        Copyright (c) 2026 TokenSaverAI Contributors
        
        Parameters:
        - Licensor: TokenSaverAI Contributors
        - Licensed Work: TokenSaverAI (all files in this repository)
        - Additional Use Grant: You may use the Licensed Work for non-commercial purposes,
          including personal use and use inside internal tools at companies.
        - Change Date: Four years from the date this file was first published.
        - Change License: MIT License
        
        On the fourth anniversary of the date this file was first published, this license
        will automatically convert to the MIT License.
        
        Until the Change Date, the following restrictions apply:
        
        **Commercial Use Restriction:**
        You may NOT use this software, in whole or in substantial part, to provide a
        commercial service, hosted product, or resold offering (including as part of an
        AI coding agent, devtool, or proxy service) without a separate commercial license
        from the Licensor.
        
        This includes, but is not limited to:
        - Offering TokenSaverAI (or a derivative) as a paid or freemium service
        - Embedding it into a commercial AI coding product and charging for usage
        - Reselling access to the decomposition, routing, or proxy capabilities
        
        The intent is to keep the local client and core ideas open for individuals and
        builders, while protecting the commercial rights around the prepaid proxy model
        and high-quality orchestration intelligence.
        
        For commercial licensing inquiries, contact the maintainers.
        
        THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
        IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
        FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
        AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
        LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
        OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
        SOFTWARE.
        
        ADDITIONAL DISCLAIMER FOR PAID / PROXY USAGE:
        
        This software includes functionality that can be used to operate a paid proxy
        service that resells or intermediates access to third-party large language model
        APIs (such as those provided by xAI, OpenAI, Anthropic, etc.).
        
        THE AUTHORS AND COPYRIGHT HOLDERS PROVIDE NO WARRANTY that:
        - Use of this software (including the proxy, virtual key, and prepaid billing
          components) complies with the terms of service of any third-party LLM provider.
        - The prepaid balance tracking, virtual key system, or billing logic is accurate,
          secure, or free from errors.
        - Any funds collected through the use of this software will be honored, refunded,
          or result in continued service.
        - The software will be available, error-free, or suitable for any particular
          commercial purpose.
        
        USERS WHO OPERATE PAID INSTANCES OF THIS SOFTWARE (including via Lemon Squeezy,
        Stripe, or any other payment processor) ASSUME ALL RISK AND LIABILITY for
        compliance with applicable laws, third-party terms of service, data protection
        regulations, and consumer protection obligations.
        
        IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY INDIRECT,
        INCIDENTAL, SPECIAL, CONSEQUENTIAL, OR PUNITIVE DAMAGES, INCLUDING BUT NOT
        LIMITED TO LOSS OF PROFITS, DATA, USE, GOODWILL, OR OTHER INTANGIBLE LOSSES,
        RESULTING FROM (A) THE USE OR INABILITY TO USE THE SOFTWARE; (B) ANY CONDUCT OR
        CONTENT OF ANY THIRD PARTY; (C) ANY CONTENT OBTAINED FROM THE SOFTWARE; OR (D)
        UNAUTHORIZED ACCESS, USE, OR ALTERATION OF YOUR TRANSMISSIONS OR CONTENT,
        WHETHER BASED ON WARRANTY, CONTRACT, STATUTE, TORT (INCLUDING NEGLIGENCE), OR
        ANY OTHER LEGAL THEORY, WHETHER OR NOT WE HAVE BEEN INFORMED OF THE POSSIBILITY
        OF SUCH DAMAGE.
        
        YOU ARE SOLELY RESPONSIBLE FOR ENSURING THAT YOUR USE OF THIS SOFTWARE, INCLUDING
        ANY PAID PROXY OR BILLING FUNCTIONALITY, COMPLIES WITH ALL APPLICABLE LAWS AND
        THE TERMS OF SERVICE OF ANY LLM PROVIDERS WHOSE MODELS ARE ACCESSED THROUGH IT.
        
Project-URL: Homepage, https://github.com/ResponsiveDesignLLC/TokenSaverAI
Project-URL: Documentation, https://github.com/ResponsiveDesignLLC/TokenSaverAI#readme
Project-URL: Repository, https://github.com/ResponsiveDesignLLC/TokenSaverAI
Project-URL: Issues, https://github.com/ResponsiveDesignLLC/TokenSaverAI/issues
Keywords: ai,llm,agents,token-optimization,cost-saving,orchestration,grok,litellm
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: Other/Proprietary License
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Internet :: Proxy Servers
Classifier: Topic :: Office/Business :: Financial :: Point-Of-Sale
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: litellm>=1.40.0
Requires-Dist: pydantic>=2.7.0
Requires-Dist: pyyaml>=6.0
Requires-Dist: rich>=13.7.0
Requires-Dist: python-dotenv>=1.0.0
Requires-Dist: aiofiles>=23.0.0
Requires-Dist: tomli>=2.0.0; python_version < "3.11"
Provides-Extra: mcp
Requires-Dist: mcp>=1.0.0; extra == "mcp"
Provides-Extra: proxy
Requires-Dist: fastapi>=0.110.0; extra == "proxy"
Requires-Dist: uvicorn[standard]>=0.29.0; extra == "proxy"
Requires-Dist: python-multipart>=0.0.9; extra == "proxy"
Requires-Dist: resend>=0.8.0; extra == "proxy"
Provides-Extra: dev
Requires-Dist: pytest>=8.0; extra == "dev"
Requires-Dist: pytest-asyncio; extra == "dev"
Requires-Dist: ruff; extra == "dev"
Requires-Dist: mypy; extra == "dev"
Requires-Dist: black; extra == "dev"
Requires-Dist: build; extra == "dev"
Requires-Dist: twine; extra == "dev"
Dynamic: license-file

# TokenStretcher

**A hierarchical AI task orchestrator that delivers higher quality results for a fraction of the token cost.**

Instead of throwing your entire complex prompt at the most expensive model, TokenStretcher intelligently decomposes it, routes each piece to the cheapest competent specialist, executes in parallel where safe, and synthesizes a superior final answer.

Typical real-world savings: **40–70%** on complex multi-part tasks.

---

## Why TokenStretcher Exists

Large prompts to powerful models are brutally expensive and often produce bloated or unfocused output.

TokenStretcher wins by intelligently breaking down complex tasks and routing work to the most suitable models for the job.

---

## Prepay Model — Zero Risk for You

**You preload money. You can never get a surprise bill.**

This is the core safety promise of TokenStretcher in production use.

### How It Works

1. You add funds to your prepaid wallet (`tokenstretcher topup` or via Stripe/Lemon Squeezy).
2. When you run a task, TokenStretcher checks your balance first.
3. Every LLM call (via the optional LiteLLM proxy) is tracked and deducted in real time.
4. If your balance is too low, it gracefully refuses to run and tells you exactly how much to add.

### Local vs Proxy Mode

- **Local / BYOK mode** (default): You use your own API keys. No prepay required. Great for individuals and testing.
- **Proxy Mode** (recommended for teams/production): All traffic goes through a LiteLLM proxy. Virtual keys with hard `max_budget` are issued. The proxy automatically deducts from your prepaid wallet.

You can switch between the two at any time.

### Wallet Commands

```bash
tokenstretcher balance                 # See current prepaid balance + history
tokenstretcher topup                   # Show payment instructions
tokenstretcher add-funds 25            # Manually credit (local or after receiving payment)
tokenstretcher proxy start             # Start the budget-enforcing proxy server
```

This model completely eliminates the risk of runaway costs that has hurt many developers using raw LLM APIs.

---

## Setting Up Your Proxy + Virtual Keys (Monetization Mode)

This is how you turn TokenStretcher into a real business that sells prepaid AI access safely.

### Step-by-step Setup

1. **Get your master key**
   ```bash
   export XAI_API_KEY="xai-your-real-key-here"
   ```

2. **Configure the proxy**
   Edit `.tokensaver/config.toml`:

   ```toml
   [models]
   powerful = "xai/grok-4"

   proxy_default_model = "xai/grok-3"
   email_provider = "smtp"
   email_from = "keys@yourdomain.com"
   ```

3. **Start the proxy server**
   ```bash
   tokenstretcher proxy start
   # Then in another terminal:
   uvicorn tokensaver.proxy.server:app --port 8000
   ```

4. **Create and deliver virtual keys**
   ```bash
   tokenstretcher proxy create-key customer@example.com 50
   ```

   This will:
   - Generate a virtual key (`tsai_...`)
   - Link it to the user's prepaid balance
   - Automatically email the key to the customer (if email is configured)

5. **Users use the virtual key**
   They point their OpenAI client at your proxy:

   ```python
   from openai import OpenAI

   client = OpenAI(
       base_url="http://your-proxy:8000/v1",
       api_key="tsai_xxxxxxxxxxxxxxxxxxxx"
   )
   ```

Every call is authenticated, budget-checked, and deducted from their prepaid balance in real time.

### How Billing Works

- You control the real `XAI_API_KEY`
- Customers only ever receive virtual keys with limited budgets
- When their prepaid balance hits zero, their key stops working automatically
- You never have to chase invoices

This is the safest possible way to sell access to powerful models.

### Quick Commands (After Setup)

```bash
# Start the full server (now launches automatically)
tokenstretcher proxy start

# Create a key manually + email it
tokenstretcher proxy create-key customer@example.com 25

# View everything
tokenstretcher proxy dashboard
```

**Security Recommendation:**
Set an `ADMIN_TOKEN` in your environment. All `/admin/*` routes will then require:
```
Authorization: Bearer your-admin-token
```

### Making TokenStretcher Available on PyPI

The project is now reasonably well packaged for PyPI.

To publish:

```bash
pip install build twine
python -m build
twine upload dist/*
```

See `DEPLOYMENT.md` for more details on distribution and hosting the proxy.

---

## Installation

```bash
# From source (recommended during early development)
git clone https://github.com/yourname/tokensaverai
cd TokenSaver
pip install -e .

# Or after publishing
pip install tokensaverai
```

### API Keys

TokenStretcher uses [LiteLLM](https://docs.litellm.ai), so it supports virtually every provider.

Recommended environment variables:

```bash
export XAI_API_KEY="xai-..."           # Grok models (best experience)
export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="sk-ant-..."
export GROQ_API_KEY="gsk_..."
```

---

## Quick Start

### CLI (Recommended)

```bash
# Basic usage
tokenstretcher "Build a production FastAPI service with JWT auth, user CRUD, and rate limiting"

# Use a specific powerful model for planning
tokenstretcher "Design a multi-tenant SaaS permissions system" --model grok-4

# Plan only (see the decomposition without spending money)
tokenstretcher --plan-only "Refactor the entire payment module for better testability"

# Interactive mode (great for exploration)
tokenstretcher interactive
```

### Python API

```python
import asyncio
from tokenstretcher import Manager, load_config

async def main():
    config = load_config()
    config.default_powerful_model = "grok-4"
    config.verbose = True

    manager = Manager(config=config)
    result = await manager.run(
        "Create a complete, secure FastAPI authentication system with refresh tokens and role-based access"
    )

    print(result.final_output)
    print("\n" + result.savings.summary())

asyncio.run(main())
```

---

## How It Works

1. **Decomposition** — A frontier model (your choice of powerful model) receives an elite prompt that forces it to break the task into the smallest possible high-quality subtasks while choosing the cheapest viable model tier for each.
2. **Specialized Agents** — Each subtask gets a narrowly scoped expert role + only the project context that is actually relevant to it.
3. **Parallel Execution** — Tasks with no dependencies run concurrently (controlled by `max_parallel_agents`).
4. **Recursive Sub-Managers** — For very large scopes, a sub-manager can be spawned that performs its own decomposition.
5. **Synthesis** — A final balanced model combines everything into one coherent, high-quality deliverable.
6. **Savings Report** — You get exact numbers comparing against the cost of doing it as one giant prompt.

---

## Configuration

Create `.tokensaver/config.toml` (or `config.toml`) in your project:

```toml
[models]
cheap = "gpt-4o-mini"
balanced = "xai/grok-3"
powerful = "xai/grok-4"

savings_share = 0.15
max_parallel_agents = 6
max_recursion_depth = 2
verbose = true
```

Full options are defined in `tokensaver/models.py` → `TokenSaverConfig`.

You can also generate an example with:

```bash
tokenstretcher --init-config
```

---

## Example Agent Roles (Built-in)

- `Python Backend Expert`
- `FastAPI Auth Specialist`
- `Code Reviewer & Optimizer`
- `Research Summarizer`
- `Technical Documentation Writer`
- `Frontend React/TypeScript Specialist`
- `SQL / Database Designer`
- `DevOps & Deployment Expert`

Adding your own roles is trivial — see the extension guide below.

---

## Savings Calculator & Monetization Foundation

Every run produces a `SavingsReport`:

```python
report = result.savings
print(report.percent_saved)           # e.g. 58.3
print(report.cost_saved_usd)          # e.g. 0.074
print(report.summary())
```

The system already calculates:
- Baseline cost of a single powerful-model call with full context
- Actual cost of the hierarchical execution
- Percentage saved

This data is also written to `savings-reports/` as JSON for later analysis or proxy billing.

**Future proxy mode** (not yet implemented): A hosted version can sit between you and the model providers, take a configurable cut of the savings (`savings_share`), and still deliver net savings to the user.

---

## Grok Build / MCP Integration

TokenStretcher was built from day one to be an excellent **tool** for other AI systems (especially Grok Build).

### Recommended Patterns

- **Inside Grok Build**: When a user gives you a complex, multi-part, or expensive-looking task, strongly consider calling:

  ```bash
  tokenstretcher "the user's exact request here"
  ```

- **As a library** inside a running agent (see Python example above).

- **MCP Server** (planned): A future release will ship with `tokenstretcher mcp serve`. Until then, wrapping `Manager.run` with the official MCP Python SDK takes ~30 lines.

See [AGENTS.md](./AGENTS.md) for detailed guidance on using and extending TokenStretcher inside AI coding environments.

---

## Project Structure

```
tokensaver/
├── __init__.py
├── cli.py                 # Rich CLI with interactive + headless modes
├── manager.py             # The orchestrator (decompose → execute → synthesize)
├── agent.py               # Narrow, context-filtered specialist agents
├── models.py              # All Pydantic models + config schema
├── prompts.py             # The secret sauce (especially DECOMPOSER)
├── config.py              # TOML/YAML + env loading
└── utils.py               # Context loading, cost math, pretty printing

examples/
templates/                 # Future: saved role templates & workflows
```

---

## Extending TokenStretcher

### Add a New Role

Edit `tokensaver/prompts.py`:

```python
ROLE_TEMPLATES["Kubernetes Reliability Engineer"] = """You are a senior platform engineer..."""
```

Then reference it in decomposition plans.

### Custom Model Routing

Override `TokenSaverConfig.get_model_for_tier()` or pass a custom config object.

### Better Context Filtering

Improve `utils.filter_context_for_task()` — the current implementation is deliberately simple and cheap.

### New Execution Strategies

Subclass `Manager` and override `_execute_plan` or `_synthesize`.

---

## Development

```bash
pip install -e ".[dev]"
ruff check .
pytest
```

---

## Roadmap

- [ ] Native MCP server (`tokenstretcher mcp serve`)
- [ ] Persistent task graphs + resume
- [ ] Real output token accounting + better cost tracking via LiteLLM
- [ ] Hosted proxy with savings-share billing
- [ ] Evaluation harness with golden tasks
- [ ] VS Code / Cursor extension

---

## License

MIT © TokenStretcher Contributors

---

## Philosophy (TL;DR)

> The best AI system is not the one that uses the biggest model.
> It is the one that uses the *right* model for every piece of the work,
> with the least possible wasted context, and still ships outstanding results.

TokenStretcher is an early, pragmatic step in that direction.

---

## Disclaimer and Limitation of Liability

**IMPORTANT – READ CAREFULLY BEFORE USING THIS SOFTWARE**

This software is provided **"AS IS"** and **"AS AVAILABLE"**, without warranty of any kind, express or implied, including but not limited to the warranties of merchantability, fitness for a particular purpose, and non-infringement.

**IN NO EVENT SHALL THE AUTHORS, CONTRIBUTORS, OR COPYRIGHT HOLDERS BE LIABLE** for any claim, damages, or other liability, whether in an action of contract, tort (including negligence), or otherwise, arising from, out of, or in connection with the software or the use or other dealings in the software, even if advised of the possibility of such damages.

### Special Warning for Paid / Commercial Use

This software contains components (the proxy server, virtual key system, prepaid wallet, and Lemon Squeezy integration) that are specifically designed to support **paid commercial services** that resell or intermediate access to third-party large language models.

**By operating any paid instance of this software you expressly acknowledge and agree that:**

- You assume **all risk and full legal responsibility** for compliance with the Terms of Service of any LLM providers whose models are accessed through this software.
- The authors make **no guarantees whatsoever** regarding billing accuracy, key generation, email delivery, service availability, data integrity, or correctness of any financial tracking.
- You are solely responsible for compliance with all applicable laws, including consumer protection, data protection (GDPR, CCPA, etc.), and financial regulations.
- The authors shall have **no liability** for any financial losses, chargebacks, refunds, regulatory actions, customer disputes, service interruptions, or any other damages arising from your commercial use of this software.

**Strongly Recommended:** If you operate a paid service using this software, you should create and publish your own Terms of Service and Privacy Policy that govern your relationship with your customers.

See the full LICENSE file for the complete legal text.

---

## License & Commercial Use

TokenStretcher is released under the **Business Source License (BSL 1.1)**.

### Summary

- **Free for individuals** and non-commercial / internal use.
- **Commercial use is restricted** until the Change Date (4 years after publication).
- You may **not** resell, host, or embed this as a paid service (including inside commercial AI coding agents) without a separate license.

### Why This License?

The goal is to keep the tool genuinely useful and open for developers and AI agents (especially Grok Build users), while preventing the core IP — particularly the prepaid proxy model and high-quality hierarchical orchestration — from being immediately copied and resold.

If you want to build a commercial product on top of TokenStretcher (hosted proxy, enterprise agent integration, etc.), please reach out for a commercial license.

See the full `LICENSE` file for legal text.

---

## Philosophy (TL;DR)

> The best AI system is not the one that uses the biggest model.
> It is the one that uses the *right* model for every piece of the work,
> with the least possible wasted context, and still ships outstanding results.

TokenStretcher is an early, pragmatic step in that direction.

Contributions that improve quality **or** cost (ideally both) are extremely welcome.
