Metadata-Version: 2.4
Name: cdn-ai
Version: 0.5.0
Summary: Condensa — hyper-efficient AI-to-AI communication language. 41-66% token reduction, 95.8% zero-shot interpretability. LangChain adapter + contract validation.
Author: Worachet Dee
License: MIT
Project-URL: Homepage, https://github.com/worachetdee/condensa
Project-URL: Repository, https://github.com/worachetdee/condensa
Project-URL: Documentation, https://github.com/worachetdee/condensa/tree/main/docs
Project-URL: Issues, https://github.com/worachetdee/condensa/issues
Keywords: ai,agents,communication,protocol,compression,tokens,multi-agent,llm,condensa,cdn
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Communications
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.11
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: tiktoken>=0.7.0
Provides-Extra: llm
Requires-Dist: anthropic>=0.30.0; extra == "llm"
Provides-Extra: dev
Requires-Dist: pytest>=7.0; extra == "dev"
Requires-Dist: pytest-cov>=4.0; extra == "dev"
Requires-Dist: ruff>=0.4.0; extra == "dev"
Provides-Extra: web
Requires-Dist: fastapi>=0.104.0; extra == "web"
Requires-Dist: uvicorn>=0.24.0; extra == "web"
Provides-Extra: langchain
Requires-Dist: langchain-core>=0.3.0; extra == "langchain"
Provides-Extra: all
Requires-Dist: cdn-ai[dev,langchain,llm,web]; extra == "all"
Dynamic: license-file

# Condensa

> A hyper-efficient language designed exclusively for AI-to-AI communication, optimized for minimal token usage while maximizing semantic density.

[![PyPI version](https://img.shields.io/pypi/v/cdn-ai.svg)](https://pypi.org/project/cdn-ai/)
[![Python 3.11+](https://img.shields.io/badge/python-3.11%2B-blue.svg)](https://pypi.org/project/cdn-ai/)
[![License: MIT](https://img.shields.io/badge/license-MIT-green.svg)](LICENSE)
[![Tests](https://img.shields.io/badge/tests-505%20passed-brightgreen.svg)]()

## Install

```bash
pip install cdn-ai
```

## Three Editions

| Edition | Code | Focus | Best For |
|---------|------|-------|----------|
| **còndensa** | `!:cdn` | Max performance (41-66% compression, 95.8% interpretability) | Agent swarms, pipelines, batch ops |
| **cóndensa** | `~:cdn` | Tone + negotiation (soft/firm/tentative intent) | Collaborative AI teams |
| **cōndensa** | `@:cdn` | Enterprise security (classification, encryption, audit) | Healthcare, finance, defense |

---

## What It Does

Condensa replaces verbose natural language and bloated JSON in AI-to-AI communication with a dense, position-encoded notation that current LLMs already understand zero-shot.

**Before** (76 tokens):
```
AgentC, I need you to perform a thorough code review of the file that AgentB
just wrote at /workspace/src/transaction_processor.py. Please check the code
against the following criteria: code style and PEP 8 compliance, potential bugs
or logic errors, performance issues, security vulnerabilities, and type safety.
Format your review as a structured report with severity and line numbers.
```

**After** (23 tokens — 70% reduction):
```
>:@C review $_.path checks:(style,bugs,perf,security,types) /fmt:report
```

**Condensa Code** — agents share architecture, not implementations:
```
!:fn DashboardPage /props:(programs:Program[] onSelect:fn(id:n)->void) /renders:(stats-grid,cards)
!:wire dashboard.onSelect -> programs.highlight
!:wire programs.onStart -> workout.load
```
Three lines of contract replace the 800 lines of implementation code that would otherwise pass between agents, with 100% zero-shot comprehension across 3 LLMs.
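Contract lines like `!:wire source -> target` are regular enough to parse mechanically. As a minimal sketch (the helper below is hypothetical, not part of the `cdn` package), an assembler could turn them into a wiring table with a single regex:

```python
import re

# Hypothetical helper: parse "!:wire <source> -> <target>" contract lines
# into (source, target) pairs an assembler can act on mechanically.
WIRE_RE = re.compile(r"^!:wire\s+(\S+)\s*->\s*(\S+)$")

def parse_wires(lines):
    wires = []
    for line in lines:
        m = WIRE_RE.match(line.strip())
        if m:
            wires.append((m.group(1), m.group(2)))
    return wires

contracts = [
    "!:wire dashboard.onSelect -> programs.highlight",
    "!:wire programs.onStart -> workout.load",
]
print(parse_wires(contracts))
# → [('dashboard.onSelect', 'programs.highlight'), ('programs.onStart', 'workout.load')]
```

Because the wiring is data rather than code, the receiving agent never has to read an implementation to connect components.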

---

## Production Case Study — Stratophic.dev

Real production data from integrating Condensa into a multi-agent code generation platform:

| Metric | Before | After | Change |
|--------|--------|-------|--------|
| Cost per generation | $0.135 | $0.019 | **-86%** |
| Assembly failure rate | ~50% | ~10% | **-80%** |
| Assembler AI calls | 1 (15K tokens) | 0 (mechanical) | **Eliminated** |

The biggest win was NOT token compression — it was **Condensa Code** (`!:fn`, `!:wire`).
Agents sharing contracts instead of code made assembly mechanical and reliable.

Full case study: [STRATOPHIC-CASE-STUDY.md](STRATOPHIC-CASE-STUDY.md)

---

## Results

| Metric | Value |
|--------|-------|
| Compression (regex encoder) | **41.1%** avg across 31 message types (0 overhead, free) |
| Compression (benchmark suite) | **66.9%** static, **71.7%** live agent (149 scenarios) |
| Zero-shot interpretability | **95.8%** avg across 5 LLMs |
| Cross-model execution | **93.8%** (Claude → Gemini Flash, 8 turns, 100% task completion) |
| Cross-model API tested | **23/23 pass** (Claude, DeepSeek, Grok — comprehension + execution) |
| Cost savings at 1M conversations | **$989-$4,943** (regex encoder, $3/M tokens, by model) |
| Prompt overhead break-even | Message 1 (regex encoder has zero overhead) |
| Package audit | 47/50 inputs handled correctly, 0 crashes, 505/505 tests pass |

---

## Quick Start

```bash
pip install cdn-ai
```

```python
from cdn import encode, decode, encode_with_stats

# Encode natural language to Condensa
encode("Search for SpaceX news, top 5 results")
# → '!:srch SpaceX news top results /n:5'

# Decode Condensa to natural language
decode("!:srch 'SpaceX' /n:5 | sumz /fmt:bullets")
# → 'Pipeline: Search SpaceX. Limit to 5 results. → Summarize. Format as bullets.'

# Get compression stats
stats = encode_with_stats("Error 429 rate limit. Retry 3 times, wait 30 seconds, exponential backoff.")
print(f"Saved {stats['reduction_pct']}% tokens")  # ~57%
```

### CLI

```bash
cdn encode "Search for SpaceX news, top 5"
cdn decode "!:srch 'SpaceX' /n:5 | sumz /fmt:bullets"
cdn stats "Filter active users, group by region"
cdn tokenize "hello world"
cdn version
```

Full guide: [docs/QUICK-START.md](docs/QUICK-START.md)

---

## LangChain Integration

```bash
pip install 'cdn-ai[langchain]'
```

```python
from langchain_core.messages import HumanMessage
from langchain_condensa import CondensaTransformer, CondensaCallbackHandler

# Compress messages between agents
transformer = CondensaTransformer()
msg = HumanMessage(content="Error 429 rate limit. Retry 3 times, wait 30 seconds, exponential backoff. If all fail, return degraded result.")
compressed = transformer.compress_message(msg)
# content: "[CDN] E: 429 wait:30s backoff:exp"  — 43% fewer tokens

# In LCEL chains
chain = transformer.as_compressor() | prompt | llm

# Monitor compression opportunities
handler = CondensaCallbackHandler()
result = chain.invoke(input, config={"callbacks": [handler]})
print(handler.stats)  # {'tokens_saved': 12, 'reduction_pct': 42.9, ...}
```
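Since the handler reports per-message savings, one pattern is to compress only when the measured reduction clears a threshold. A minimal sketch (this gating helper is hypothetical, not part of `langchain_condensa`; it assumes only the stats-dict shape shown above):

```python
# Hypothetical gating helper: keep the compressed message only when the
# measured savings clear a minimum reduction threshold.
def worth_compressing(stats, min_reduction_pct=20.0):
    return stats.get("reduction_pct", 0.0) >= min_reduction_pct

stats = {"tokens_saved": 12, "reduction_pct": 42.9}
print(worth_compressing(stats))  # → True
```

This keeps dense prose (where Condensa saves little) in its original form while still compressing verbose agent traffic.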

---

## Documentation

| Document | Description |
|----------|-------------|
| [Quick Start](docs/QUICK-START.md) | Setup, encode/decode, benchmarks, LLM encoder |
| [Language Reference](docs/LANGUAGE-REFERENCE.md) | Syntax, quick reference card, 6 worked examples |
| [Features](docs/FEATURES.md) | All 11 features (v0.2 + v0.3) + v0.4 tone research |
| [Benchmarks](docs/BENCHMARKS.md) | 149 scenarios, live agent data, cost analysis |
| [Architecture](docs/ARCHITECTURE.md) | Project structure, design, version history, branches |
| [Research Summary](RESEARCH-SUMMARY.md) | Full audit trail of research and testing |
| [Interpretability Tests](experiments/INTERPRETABILITY-TEST.md) | 5-model zero-shot testing + cross-model execution |
| [Transparency](research/transparency_notes.md) | Honest documentation of limitations |
| [Multilingual](research/multilingual_analysis.md) | Cross-lingual analysis |
| [Prompt Overhead](research/prompt_overhead_analysis.md) | Break-even analysis with 4 example cases |

---

## Multilingual

Condensa's structure is 100% language-neutral -- verbs (`srch`, `filt`, `grp`) are code patterns, not English words. Non-English agents benefit MORE because their NL instructions are more expensive under BPE tokenization (Thai: 37.1%, Japanese: 37.5%, Arabic: 31.7%). Cross-lingual agents communicate via Condensa without mutual NL translation -- the protocol is the lingua franca.
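A rough way to see the asymmetry without calling a tokenizer (UTF-8 byte length is only a crude proxy for BPE token count, and the Thai sentence below is an illustrative translation, not a benchmark input):

```python
# Crude illustration: UTF-8 byte length as a rough proxy for BPE cost.
# ASSUMPTION: the Thai sentence is an illustrative translation of
# "Search for SpaceX news, top 5 results", not a benchmark input.
thai_nl = "ค้นหาข่าวเกี่ยวกับ SpaceX แสดงผลลัพธ์ 5 อันดับแรก"
english_nl = "Search for SpaceX news, top 5 results"
condensa = "!:srch 'SpaceX' /n:5"

for label, text in [("Thai NL", thai_nl), ("English NL", english_nl), ("Condensa", condensa)]:
    print(f"{label}: {len(text.encode('utf-8'))} bytes")
```

The Condensa form is identical for every sender, so the larger the NL tokenization penalty, the larger the relative win.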

---

## Transparency

Condensa does NOT compress dense human prose (4.4% savings). It does NOT outperform Chinese NL (-5.6%). The regex encoder handles 94% of inputs correctly but is not perfect. The `!?:` sync command is understood by 80% of models (3/5). Condensa wins where machines talk to machines verbosely -- agent frameworks, JSON exchanges, multi-turn workflows. Full notes: [research/transparency_notes.md](research/transparency_notes.md)

---

## Roadmap

| Phase | Status |
|-------|--------|
| Analysis & Theory | Complete -- token economics, compression survey, 8 design principles |
| Language Specification | Complete -- v0.1 → v0.2 → v0.3 specs, EBNF grammar, primitives |
| Implementation | Complete -- encoder, decoder, 149-scenario benchmarks, validation suite |
| Interpretability Testing | Complete -- 5-model zero-shot (95.8%), v0.3 redesign, cross-model execution (93.8%) |
| PyPI Package | **Complete** -- `pip install cdn-ai` v0.4.1 (stable, 505 tests) |
| Fine-tuning Dataset | Complete -- 522 NL/Condensa pairs for pre-training |
| Tone Research | Complete -- v0.4 experimental (è soft works at 83%, firm/tentative don't) |
| Security Edition | Complete -- classification, encryption, ACL, audit, DLP (on branch) |
| Agent Framework Integration | **LangChain done** -- `langchain_condensa` package. CrewAI planned. |
| Production Pilot | **Done** -- stratophic.dev (86% cost reduction, mechanical assembly) |

---

## License

MIT
