Same tasks. Different intelligence. Real benchmark on MSI desktop (GTX 1650, 4GB VRAM, 8 threads): 5 tasks, 506 tokens, $0.00 cloud spend. Every task ran locally on qwen2.5-coder:7b.
http://localhost:8765 — dashboard with 5 tabs.omniagent status to see your hardware. Run omniagent node start to join a fleet.
| Task | Model | Latency | Cost |
|---|---|---|---|
| Review Python function | qwen2.5-coder:7b | 30,123ms | $0.00 |
| Write Google-style docstring | qwen2.5-coder:7b | 50,475ms | $0.00 |
| Rename variable | qwen2.5-coder:7b | 10,181ms | $0.00 |
| Explain TCP vs UDP | qwen2.5-coder:7b | 15,321ms | $0.00 |
| Classify bug ticket | qwen2.5-coder:7b | 8,649ms | $0.00 |
Tries local model first, falls back to cloud only when budget or quality demands it.
omniagent route "fix the login bug"
Per agent, per model, per project. Real-time dashboard shows what each task costs.
omniagent status
Scans for secrets, blocks destructive commands, verifies commits. Auto-rollback on failure.
guardian scan .
Run, Optimize, Hardware, Agents, Monitor. Visual routing trace per task. No code needed.
omniagent web
Claude, GPT-4, DeepSeek, Mistral, Ollama (local), vLLM. You control costs.
OLLAMA_BASE_URL=http://192.168.1.50:11434
Runs on your hardware. No telemetry. No vendor lock-in. Zero marginal cost.
no API keys required
omniagent detect to see your hardware. Run omniagent optimize to find cheaper routes.
~/.omniagent/agents/my-agent.yaml · Share as omniagent agent-install ./agent.yaml
omniagent node start --port 8766 · Fleet router picks best node by free VRAM + CPU load.