AI coding tools often see only a slice of your codebase. Entroly gives compatible tools broader selected context — with 70-95% fewer tokens.
See Entroly cut up to 99.5% of tokens, a figure measured on the needle-in-a-haystack benchmark with accuracy fully retained, while still drawing context from 100% of your code.
One command auto-detects your IDE, language, and project structure. No manual prompt engineering required.
A high-performance Rust core (via PyO3) processes your entire codebase in under 10ms. Scale to millions of lines.
Reinforcement learning adjusts context weights based on AI response quality. Gets smarter every time you code.
55 SAST rules catch hardcoded secrets and SQL injection before they reach the AI. Security by design.
Get an A-F grade for your codebase health. Detect god files, dead code, and cross-module clones automatically.
First-class support for multi-agent systems, with Nash bargaining for token budgets between sub-agents.
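As a rough illustration of how Nash bargaining can divide a token budget, the sketch below assumes each sub-agent has a minimum viable allocation (its disagreement point) and a utility that is linear in tokens. Under those assumptions the Nash solution, which maximizes the weighted product of each agent's gain over its minimum, has a closed form: every agent gets its minimum plus a weight-proportional share of the surplus. The function name, the weights, and the linear-utility assumption are illustrative and are not taken from Entroly's source.

```python
def nash_token_split(budget, minimums, weights=None):
    """Split `budget` tokens across sub-agents (hypothetical helper).

    minimums: agent name -> minimum tokens (the disagreement point).
    weights:  agent name -> bargaining weight (defaults to equal weights).
    """
    names = list(minimums)
    if weights is None:
        weights = {name: 1.0 for name in names}
    surplus = budget - sum(minimums.values())
    if surplus < 0:
        raise ValueError("budget cannot cover every agent's minimum")
    total_weight = sum(weights[name] for name in names)
    # Maximizing prod((x_i - d_i) ** w_i) subject to sum(x_i) == budget
    # gives x_i = d_i + surplus * w_i / sum(w).
    return {
        name: minimums[name] + surplus * weights[name] / total_weight
        for name in names
    }


allocation = nash_token_split(
    budget=8000,
    minimums={"planner": 1000, "coder": 3000, "reviewer": 1000},
    weights={"planner": 1.0, "coder": 2.0, "reviewer": 1.0},
)
```

With non-linear utilities there is generally no closed form, and the product would have to be maximized numerically.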
Most context tools optimize a single lever — input compression. Entroly ships 19 distinct mechanisms across input, inference, output, verification, and learning. Most are multiplicative, not additive — and every one reads from a real source file you can open and audit.
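To make "multiplicative, not additive" concrete, here is a small worked calculation. It assumes each lever scales the same cost term independently, which is a simplification, and the two savings figures are hypothetical inputs chosen for illustration.

```python
from math import prod


def remaining_cost_fraction(savings):
    """Fraction of the original cost left after stacking independent savings."""
    return prod(1.0 - s for s in savings)


# Hypothetical levers: 80% fewer input tokens, then a 90% discount
# applied to what is left.
stacked = remaining_cost_fraction([0.80, 0.90])

# Adding the percentages instead would claim 170% savings, which is
# meaningless; clamped to 100% it would imply zero cost.
naive_additive = 1.0 - min(1.0, 0.80 + 0.90)
```

Here `stacked` comes out at roughly 2% of the original cost, because the second saving only applies to the 20% that survived the first.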
Compressors re-rank context on every call, which busts the provider's KV cache. The aligner hashes the injected context and holds the prefix stable so cache hits actually land.
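A minimal sketch of the idea, assuming the aligner keys each fragment by a content hash and keeps previously sent fragments in their original order, so the leading part of the prompt does not change between calls. The class and method names are hypothetical and are not taken from Entroly's source.

```python
import hashlib


class PrefixAligner:
    """Keeps already-sent fragments in a fixed order (hypothetical sketch)."""

    def __init__(self):
        self._order = []  # fragment hashes in the order they were last sent

    @staticmethod
    def _digest(fragment: str) -> str:
        return hashlib.sha256(fragment.encode("utf-8")).hexdigest()

    def align(self, ranked_fragments):
        """Return the fragments with the previously sent prefix preserved."""
        by_hash = {self._digest(f): f for f in ranked_fragments}
        # 1. Keep the longest run of the old order that is still selected,
        #    so the start of the prompt matches the previous call.
        stable = []
        for h in self._order:
            if h not in by_hash:
                break  # the stable prefix ends at the first dropped fragment
            stable.append(h)
        # 2. Append everything else in its new rank order.
        stable_set = set(stable)
        rest = [h for h in by_hash if h not in stable_set]
        self._order = stable + rest
        return [by_hash[h] for h in self._order]


aligner = PrefixAligner()
first = aligner.align(["def a(): ...", "def b(): ..."])
# A re-rank puts a new fragment first; the aligner keeps the old prefix.
second = aligner.align(["def c(): ...", "def b(): ...", "def a(): ..."])
```

The trade-off in this sketch is that a newly top-ranked fragment is placed after the stable prefix rather than first, in exchange for a prefix the provider can serve from cache.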
Deterministic faithfulness verifier — no second LLM, no API call. Statistically ties a modern LLM judge on HaluEval-QA at zero marginal cost.
A Bayesian per-task router sends easy work to cheap models and escalates only when verifier risk says so. Fail-closed: when uncertain, it routes to the strongest model.
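One way to picture a Bayesian, fail-closed router: keep a Beta posterior per task type over how often the cheap model's answer passes verification, and use the cheap model only when the posterior is confidently above a threshold. The class name, the 0.8 threshold, and the mean-minus-two-standard-deviations bound are assumptions made for illustration; the text above does not specify Entroly's actual decision rule.

```python
import math
from collections import defaultdict


class BetaRouter:
    """Per-task router with a Beta posterior (hypothetical sketch)."""

    def __init__(self, threshold=0.8):
        self.threshold = threshold
        # Beta(1, 1) prior stored as [successes + 1, failures + 1].
        self._counts = defaultdict(lambda: [1.0, 1.0])

    def record(self, task_type, cheap_model_ok):
        """Update the posterior with one verified outcome of the cheap model."""
        self._counts[task_type][0 if cheap_model_ok else 1] += 1.0

    def _lower_bound(self, task_type):
        a, b = self._counts[task_type]
        mean = a / (a + b)
        std = math.sqrt(a * b / ((a + b) ** 2 * (a + b + 1)))
        return mean - 2.0 * std  # conservative estimate of the success rate

    def route(self, task_type):
        # Fail-closed: anything short of confident evidence goes to "strong".
        if self._lower_bound(task_type) >= self.threshold:
            return "cheap"
        return "strong"


router = BetaRouter()
cold_start = router.route("rename-variable")  # no evidence yet
for _ in range(50):
    router.record("rename-variable", cheap_model_ok=True)
warmed_up = router.route("rename-variable")
```

With no history the wide posterior keeps the lower bound far below the threshold, so an unseen task type is routed to the strongest model by default.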
Knapsack DP + 9 specialized compressors + a dep-graph pick the most information-dense fragments that fit your budget. Source: proxy_transform.py
Deterministic $0 faithfulness verifier: no second LLM, no API call. Source: witness.py · stave.py
Holds the prefix stable so Anthropic's 90% / OpenAI's 50% cached-read discount actually lands. Source: cache_aligner.py
Cheap model first; escalate only when verifier risk demands it. Bounded regret via split-conformal coverage. Source: escalation.py
Two-verifier cascade with a measured Pareto frontier vs. either verifier alone. Source: conformal_cascade.py
Per-task model routing: cheap when capable, strong when needed. Fail-closed. Source: ravs/router.py
Queries that match a proven crystallized skill short-circuit the whole pipeline, saving 100% of the LLM cost. Source: fast_path.py
Learns the right token budget per query so easy questions don't overspend. Source: adaptive_budget.py
Compresses chat history each turn so long conversations don't bloat the input. Source: proxy_transform.py
Targeted fast paths for git, builds, logs, JSON, and test output, 60–95% smaller. Source: proxy_transform.py
Compresses the model's response before downstream chains consume it. Source: proxy_transform.py
Runs faithfulness NLI fully offline, so ~$0.002/claim drops to $0. Source: witness.py
Drops hallucinated content from responses before it propagates downstream. Source: eicv_suppressor.py
Learns which fragment features matter, with a spectral natural-gradient optimizer. Source: online_learner.py · prism.rs
Anonymized weight + skill sync across instances amortizes cold-start across the user base. Source: federation.py
Universal entropy + SimHash compressor for any tool output, even ones it has never seen. Source: shell_codec.py
Budget-driven file reads: full, signature-only, or diff-only, chosen per block. Source: semantic_resolution.py
Blocks prompt-injection and context poisoning that bypass regex-only scanners. Source: context_firewall.py
Scans agent output for hallucination before passing it to the next agent. Source: verified_handoff.py
Every figure links to a committed JSON with sample counts, 95% confidence intervals, and model provenance. Clone the repo and reproduce them, or run the packaged smoke verifier on your own code in seconds.
| Benchmark | Token savings | Accuracy retained | Samples | Artifact |
|---|---|---|---|---|
| Needle-in-a-haystack | 99.5% | 100% | 20 | needle_accuracy.json |
| LongBench | 85.3% | 103% (↑) | 50 | longbench_accuracy.json |
| BFCL (function calling) | 79.3% | 100% | 50 | bfcl_accuracy.json |
| SQuAD | 43.8% | 90% | 50 | squad_accuracy.json |

| Verifier benchmark | AUROC | Latency · cost | Samples | Artifact |
|---|---|---|---|---|
| WITNESS+STAVE hallucination (HaluEval-QA) | 0.84 | ~3 ms · $0/call | 2,000 | stave_benchmark.json |
SQuAD is shown unfiltered: it's the one benchmark here where a tighter budget trades a little accuracy (0.80 → 0.72, the 90% retained in the table) for savings. We include it because cherry-picking benchmarks is how marketing claims get caught.
Numbers measured with gpt-4o-mini; see each JSON for the full confidence intervals.
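For a feel of how wide a 95% interval is at these sample sizes, the sketch below computes a Wilson score interval for a proportion. It is an illustration only: the counts are hypothetical, and the committed JSON files may use a different interval method.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is about 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1.0 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


# Hypothetical counts, not taken from any of the JSON artifacts:
# the same 80% success rate measured on 50 and on 2,000 samples.
low_50, high_50 = wilson_interval(40, 50)
low_2000, high_2000 = wilson_interval(1600, 2000)
```

The interval at 50 samples is several times wider than the one at 2,000, which is why the sample counts are listed next to each figure.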