Metadata-Version: 2.4
Name: omniagent-fleet
Version: 0.1.4
Summary: AI Infrastructure OS — same engine, four facades (CLI, REST, web, MCP). Reduce 80-95% of AI spend by routing tasks to the right model.
Author-email: Sergio Garcia <sgarcia@ubicacuenca.com>
License: MIT
Project-URL: Homepage, https://github.com/landrover1984/omniagent
Project-URL: Source, https://github.com/landrover1984/omniagent
Project-URL: Issues, https://github.com/landrover1984/omniagent/issues
Project-URL: Changelog, https://github.com/landrover1984/omniagent/blob/main/CHANGELOG.md
Keywords: ai,llm,router,cost-reduction,cli,mcp,local-first,ollama,fleet,infrastructure
Classifier: Development Status :: 3 - Alpha
Classifier: Environment :: Console
Classifier: Environment :: Web Environment
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Software Development
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: System :: Distributed Computing
Classifier: Typing :: Typed
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: fastapi>=0.109.0
Requires-Dist: uvicorn[standard]>=0.27.0
Requires-Dist: pydantic>=2.5.0
Requires-Dist: gitpython>=3.1.40
Requires-Dist: psutil>=5.9.8
Requires-Dist: py-cpuinfo>=9.0.0
Requires-Dist: httpx>=0.26.0
Requires-Dist: python-socketio>=5.11.0
Requires-Dist: websockets>=12.0
Requires-Dist: rich>=13.7.0
Requires-Dist: typer>=0.9.0
Requires-Dist: docker>=7.0.0
Requires-Dist: pyyaml>=6.0.1
Requires-Dist: paramiko>=3.4.0
Requires-Dist: boto3>=1.34.0
Dynamic: license-file

<div align="center">

# OmniAgent

## You are overspending on AI.

**OmniAgent routes every AI task to the most efficient model automatically.**
Local first. Cloud only when it pays off. **80–97% savings** on your AI bill.

[Try the AI Cost Calculator](landing/index.html) · [See the 60-second demo](#the-60-second-demo) · [Star on GitHub](https://github.com/landrover1984/omniagent)

[![MIT License](https://img.shields.io/badge/License-MIT-green.svg)](LICENSE)
[![Python 3.10+](https://img.shields.io/badge/python-3.10+-blue.svg)](https://www.python.org/downloads/)
[![Tests](https://img.shields.io/badge/tests-367%20passing-brightgreen.svg)](tests/)
[![Zero Telemetry](https://img.shields.io/badge/zero-telemetry-blueviolet.svg)]()
[![Local First](https://img.shields.io/badge/local-first-blue.svg)]()
[![100% Open Source](https://img.shields.io/badge/100%25-open%20source-orange.svg)]()

</div>

---

## The math most devs don't realize

> *"I just need AI to review my code, write docstrings, and rename things."*
> — every developer with a $500/month Cursor + Claude bill

Here's what you actually need (benchmarked on real hardware, Jun 2026):

| Task | All-Claude reality | OmniAgent | Savings |
|------|-------------------|-----------|---------|
| **Review a function for bugs** | Claude · **$0.30** | qwen2.5-coder:7b (local) · **$0.00** | **100%** |
| **Write a Google-style docstring** | Claude · **$0.28** | qwen2.5-coder:7b (local) · **$0.00** | **100%** |
| **Rename a variable** | Claude · **$0.15** | qwen2.5-coder:7b (local) · **$0.00** | **100%** |
| **Explain TCP vs UDP** | Claude · **$0.10** | qwen2.5-coder:7b (local) · **$0.00** | **100%** |
| **Classify a bug ticket** | Claude · **$0.08** | qwen2.5-coder:7b (local) · **$0.00** | **100%** |

**Fleet benchmark · MSI desktop (GTX 1650, 4GB VRAM, 8 threads) · 5 tasks · 506 tokens: $0.00 total cloud spend.**

OmniAgent uses Claude when Claude is the right tool. It just doesn't use Claude when Qwen can do the same job at 1% of the cost.
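
The savings column is plain arithmetic. Here is a minimal sketch that reproduces it from the per-task prices in the table above; the `savings_pct` helper is illustrative and not part of OmniAgent.

```python
# Per-task prices copied from the table above (USD).
all_claude = {
    "review function": 0.30,
    "write docstring": 0.28,
    "rename variable": 0.15,
    "explain TCP vs UDP": 0.10,
    "classify ticket": 0.08,
}
# In the benchmark every task ran on the local model, so cloud spend is zero.
routed = {task: 0.00 for task in all_claude}


def savings_pct(baseline: float, actual: float) -> float:
    """Percentage saved relative to the baseline cost (hypothetical helper)."""
    if baseline == 0:
        return 0.0
    return round(100 * (baseline - actual) / baseline, 1)


baseline_total = round(sum(all_claude.values()), 2)
routed_total = round(sum(routed.values()), 2)
print(f"All-Claude: ${baseline_total:.2f}, routed: ${routed_total:.2f}, "
      f"saved: {savings_pct(baseline_total, routed_total)}%")
```

The same helper works for mixed workloads where some tasks still go to the cloud: pass the real routed total instead of zero.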

---

## The 60-second demo

Real benchmark run on MSI desktop (GTX 1650, 4GB VRAM, 8 threads):

```
→ msi-node: qwen2.5-coder:7b | $0.00 | 30,123ms  (review function)
→ msi-node: qwen2.5-coder:7b | $0.00 | 50,475ms  (write docstring)
→ msi-node: qwen2.5-coder:7b | $0.00 | 10,181ms  (rename variable)
→ msi-node: qwen2.5-coder:7b | $0.00 | 15,321ms  (explain TCP/UDP)
→ msi-node: qwen2.5-coder:7b | $0.00 |  8,649ms  (classify ticket)

Total: 5 tasks · 506 tokens · $0.00 cloud spend · avg 22.95s/task
```

**Every task ran on local GPU. Zero cloud cost. That's what "AI Infrastructure OS" means.**
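
To check the totals yourself, here is a small sketch that parses the log lines above and recomputes the summary. The regular expression assumes the exact line format shown in this README; the actual CLI output may differ.

```python
import re

# Log lines copied from the benchmark run above.
LOG = """\
→ msi-node: qwen2.5-coder:7b | $0.00 | 30,123ms  (review function)
→ msi-node: qwen2.5-coder:7b | $0.00 | 50,475ms  (write docstring)
→ msi-node: qwen2.5-coder:7b | $0.00 | 10,181ms  (rename variable)
→ msi-node: qwen2.5-coder:7b | $0.00 | 15,321ms  (explain TCP/UDP)
→ msi-node: qwen2.5-coder:7b | $0.00 |  8,649ms  (classify ticket)
"""

# Captures the cost and latency fields: "| $0.00 | 30,123ms".
LINE = re.compile(r"\|\s*\$(?P<cost>[\d.]+)\s*\|\s*(?P<ms>[\d,]+)ms")


def summarize(text: str) -> dict:
    """Return task count, total cost, and average latency in seconds."""
    rows = [m for m in map(LINE.search, text.splitlines()) if m]
    if not rows:
        return {"tasks": 0, "cost_usd": 0.0, "avg_s": 0.0}
    latencies_ms = [int(m["ms"].replace(",", "")) for m in rows]
    return {
        "tasks": len(rows),
        "cost_usd": sum(float(m["cost"]) for m in rows),
        "avg_s": round(sum(latencies_ms) / len(latencies_ms) / 1000, 2),
    }


print(summarize(LOG))
```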

---

## Why OmniAgent exists

The AI industry is in an efficiency crisis:

- **73% of prompts** sent to frontier models could be handled by smaller local models
- Developers burn **$500–$1000/month** on Cursor + Claude + GPT with **no visibility** into what each line costs
- Agents **hallucinate APIs, break production code, leak secrets, forget to commit** — and you find out at 2 AM
- Massive energy waste: frontier-scale inference is spent on tasks a small local model could handle
- Lock-in: one IDE, one provider, one pricing tier
- No coordination between local hardware, cloud APIs, VPS nodes, and the billions of idle GPUs sitting in garages and offices worldwide

**The models will keep changing. The hardware will keep evolving.**
**The only permanent problem is: how do you orchestrate all this intelligence efficiently, securely, and cheaply?**

That's what OmniAgent solves.

---

## What it is (and what it isn't)

OmniAgent is **not** a model. **Not** an agent. **Not** a chatbot.

OmniAgent is the **operating system that coordinates the entire AI ecosystem** — models, agents, hardware, costs, and security — so you stop wasting compute, money, and trust.

Think of it as:

- **Linux** doesn't create every app, but everything runs on it.
- **Kubernetes** doesn't build every container, but it orchestrates them all.
- **Steam** doesn't develop every game, but it hosts them.

**OmniAgent** doesn't compete with OpenAI, Anthropic, DeepSeek, or your favorite open-source model. **It makes all of them work together intelligently.**

---

## The 4 façades: one engine, four ways to use it

| Façade | Audience | What you get |
|--------|----------|--------------|
| **CLI** (`omniagent route "task"`) | Developers, power users | Full control, scriptable, fits in any pipeline |
| **Web app** (`omniagent web`) | Everyone, especially non-devs | 5-tab dashboard on `http://localhost:8765` — visualize routing, hardware, optimize |
| **YAML agents** (`*.yaml` in `~/.omniagent/agents/`) | Agent authors, teams | Declarative, shareable, version-controlled — see [docs/agents.md](docs/agents.md) |
| **MCP tools** (via any MCP client) | Tool integrators | 6 tools: route, classify, decide, audit, deploy, optimize |

Same Python engine. Four ways to use it. You pick the one that fits your workflow.

---

## The 90/9/1 design

- **90% of users** never touch the CLI. They open `http://localhost:8765`, type a task, see the routing, hit **Run it ▶**.
- **9% of users** open the **Optimize** tab, see what they're overspending on, and one-click install a cheaper agent.
- **1% of users** write their own YAML agents, publish them, share them.

The dashboard is the product. The YAML is the protocol. The CLI is the power tool.

---

## How it works (under the hood)

1. **Task arrives** — text in the CLI, the web, or via MCP
2. **TaskClassifier** — 10 categories, 5 complexities, detects vision / function-calling
3. **AgentRegistry** — finds the right agent (project > user > builtin, YAML-defined)
4. **SmartRouter** — picks the right model given the agent's constraints + your budget
5. **AdaptiveRouter** — combines all of the above into a single `RoutingDecision`
6. **LLM call** — local first, cloud only if budget + quality demand it
7. **CostTracker** — logs the spend, feeds back into the next routing decision
8. **Guardian++** — pre / during / post audit on every action (secret scan, command sandbox, commit verification)

**364 unit tests** + **7 integration tests** validate every step.
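
The flow above can be sketched in a few lines of Python. This is a toy illustration, not OmniAgent's code: the keyword rules, the model catalog, the prices, and every function signature here are assumptions made up for the example. Only the names `RoutingDecision` and `CostTracker` and the local-first, budget-aware idea come from the steps listed above.

```python
from dataclasses import dataclass, field


@dataclass
class RoutingDecision:
    category: str
    model: str
    location: str        # "local" or "cloud"
    est_cost_usd: float


@dataclass
class CostTracker:
    """Accumulates spend so later decisions see the remaining budget."""
    spent_usd: float = 0.0
    history: list = field(default_factory=list)

    def record(self, decision: RoutingDecision) -> None:
        self.spent_usd += decision.est_cost_usd
        self.history.append(decision)


# Toy keyword rules standing in for the real 10-category TaskClassifier.
KEYWORDS = {"review": "code_review", "docstring": "documentation", "design": "architecture"}

# Illustrative catalog, ordered local first. Costs and capabilities are invented.
MODELS = [
    ("qwen2.5-coder:7b", "local", 0.00, {"code_review", "documentation", "general"}),
    ("claude", "cloud", 0.30, {"code_review", "documentation", "architecture", "general"}),
]


def classify(task: str) -> str:
    lowered = task.lower()
    for keyword, category in KEYWORDS.items():
        if keyword in lowered:
            return category
    return "general"


def route(task: str, budget_usd: float, tracker: CostTracker) -> RoutingDecision:
    category = classify(task)
    remaining = budget_usd - tracker.spent_usd
    # Local-first: take the first model in catalog order that can handle
    # the category and fits in what is left of the budget.
    for model, location, cost, handles in MODELS:
        if category in handles and cost <= remaining:
            decision = RoutingDecision(category, model, location, cost)
            tracker.record(decision)
            return decision
    raise ValueError(f"no model for {category!r} within ${remaining:.2f}")


tracker = CostTracker()
for task in ("review this code for security", "design a cache"):
    d = route(task, budget_usd=0.50, tracker=tracker)
    print(f"{task!r} -> {d.model} ({d.location}), ${d.est_cost_usd:.2f}")
```

Because the tracker's running total is subtracted from the budget before each decision, earlier cloud calls narrow the options for later ones, which is the feedback loop step 7 describes.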

---

## Quickstart (60 seconds)

```bash
git clone https://github.com/landrover1984/omniagent
cd omniagent
pip install -e .
omniagent web
# open http://localhost:8765
```

Or use the CLI directly:

```bash
omniagent agent-route "review this code for security" --budget 0.10
omniagent agent-list                     # see all available agents
omniagent agent-install ./my-agent.yaml  # add your own
omniagent optimize                       # find cheaper routes
omniagent cost-report                    # what you've spent
omniagent agent-decide "design a cache"  # see the routing (no LLM call)
```

**Zero API keys needed to start.** Local models via Ollama work out of the box.
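
If you want to call the CLI from a script, here is a minimal sketch using `subprocess`. It relies only on the `agent-route` command and `--budget` flag shown above; the function names are hypothetical, and the output is returned as raw text because its format is not specified here.

```python
import shutil
import subprocess


def build_route_command(task: str, budget_usd: float) -> list[str]:
    """Build the argv for `omniagent agent-route` as shown in the quickstart."""
    return ["omniagent", "agent-route", task, "--budget", f"{budget_usd:.2f}"]


def run_route(task: str, budget_usd: float) -> str:
    """Run the CLI and return its stdout unparsed (hypothetical wrapper)."""
    if shutil.which("omniagent") is None:
        raise RuntimeError("omniagent is not on PATH; run `pip install -e .` first")
    result = subprocess.run(
        build_route_command(task, budget_usd),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout
```

Calling `run_route("review this code for security", 0.10)` mirrors the first CLI example above. Passing the arguments as a list avoids shell quoting problems with free-form task text.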

---

## What ships today

| Layer | Status | Tests |
|-------|--------|-------|
| Agent Protocol (YAML agents) | Shipped | 18 |
| Task Classifier (10 categories) | Shipped | 20 |
| AdaptiveRouter (the brain) | Shipped | 8 |
| 5-tab Web UI | Shipped | 13 endpoints |
| Cost Optimizer (the killer feature) | Shipped | 3 |
| Anti-Hallucination Audit (Guardian++) | Shipped | 23 |
| Hybrid Deploy (local / VPS / AWS) | Shipped | 28 |
| MCP Server (6 tools) | Shipped | 18 |
| CLI commands (20+) | Shipped | 50+ |
| **Total** | | **364 passing, 2 skipped** |

---

## Roadmap

| Phase | Theme | Status |
|-------|-------|--------|
| **v0.1.x** | **AI Infrastructure OS** — routing, cost, optimize, local-first | **Shipped** |
| v0.2.x | Optimization Layer — replay mode, "Claude unnecessary" detector, savings reports | Next |
| v0.3.x | Visual Dashboard — real-time cost graphs, agent analytics, team view | Planned |
| v0.4.x | Distributed Compute — idle GPU federation, opt-in mesh | Deferred |
| v0.5.x | Marketplace + Incentives — community YAMLs, reputation, rewards | Deferred |

We are **not** building another "AI wrapper". We are building the **coordination layer** that the entire AI ecosystem needs.

Distributed compute and marketplace are real, but they're not the wedge. The wedge is: **stop overspending on AI**. Get that right first.

---

## License

**MIT — 100% open source, forever.** No paid tier, no "enterprise edition", no bait-and-switch.

---

<div align="center">

**The models will change. The hardware will change. The coordination layer is permanent.**

[Star on GitHub](https://github.com/landrover1984/omniagent) · [Try the Cost Calculator](landing/index.html) · [Write your first agent](docs/agents.md)

</div>
