Metadata-Version: 2.4
Name: polari-sdk
Version: 0.2.0
Summary: Official Python SDK for the Polari API
Author-email: Polari Technologies <hello@polariapi.com>
License: MIT
Project-URL: Homepage, https://polariapi.com
Project-URL: Documentation, https://docs.polariapi.com
Project-URL: Repository, https://github.com/polariapi/polari-sdk
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Requires-Python: >=3.10
Description-Content-Type: text/markdown
Requires-Dist: httpx>=0.25.0
Requires-Dist: pydantic>=2.0.0
Requires-Dist: python-dateutil>=2.8.0
Provides-Extra: dev
Requires-Dist: pytest>=7.0.0; extra == "dev"
Requires-Dist: pytest-asyncio>=0.21.0; extra == "dev"
Requires-Dist: pytest-cov>=4.0.0; extra == "dev"
Requires-Dist: black>=23.0.0; extra == "dev"
Requires-Dist: mypy>=1.0.0; extra == "dev"
Requires-Dist: ruff>=0.1.0; extra == "dev"

# Polari Python SDK

Official Python SDK for the [Polari API](https://polariapi.com) — semantic news intelligence.

## Requirements

- Python 3.10+
- A Polari API key ([get one at polariapi.com](https://polariapi.com))

## Installation

```bash
pip install polari-sdk
```

## Quick start

```python
import asyncio
from polari import PolariClient, ArticleInput

async def main():
    async with PolariClient(api_key="pk_live_...") as client:
        article = ArticleInput(
            title="Fed holds rates steady",
            content="The Federal Reserve held interest rates steady on Wednesday...",
            url="https://reuters.com/fed-rates-2026",
            source="Reuters",
        )

        result = await client.layer0.analyze(article)
        print(f"Quality: {result.quality_score:.3f}  ID: {result.article_id}")

asyncio.run(main())
```

## Authentication

All requests require a Bearer API key. Pass it at client initialization:

```python
client = PolariClient(api_key="pk_live_...")
```

Or via environment variable:

```bash
export POLARI_API_KEY="pk_live_..."
```

```python
client = PolariClient.from_env()
```
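The README does not say what `from_env()` does when the variable is missing. If you would rather fail early with an explicit message, one option is to read the key yourself and pass it to the constructor. This is a minimal sketch; `load_api_key` is a hypothetical helper and not part of the SDK.

```python
import os


def load_api_key(env_var: str = "POLARI_API_KEY") -> str:
    """Return the API key from the environment, or raise with a clear message.

    Hypothetical helper for illustration; not part of the SDK.
    """
    # Treat a missing variable and a blank one the same way.
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"{env_var} is not set; export it before creating the client")
    return key
```

The returned value can then be passed as `PolariClient(api_key=load_api_key())`.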

## Client configuration

All layer URLs default to the standard Polari API endpoints, so no configuration is needed for typical use. Override settings such as the timeout or retry count only if needed:

```python
client = PolariClient(
    api_key="pk_live_...",
    timeout=60,
    max_retries=3,
)
```

The client is an async context manager. Use `async with` to ensure the HTTP connection is closed cleanly, or call `await client.close()` yourself when you are done.

---



## Namespaces & Quality Gate

Every API key operates in one of three namespace modes. The mode is set when
your key is issued — no SDK configuration is required.

| Mode | Articles written to | Articles visible | Quality gate |
|------|--------------------|--------------------|--------------|
| `public` | Shared pool | Shared pool | Enforced (default floor: 0.53) |
| `private` | Your namespace only | Your namespace only | Bypassed |
| `shared` | Your namespace only | Shared pool + yours | Bypassed on write |

**Public (default):** Articles enter the shared corpus. During the Layer 0 → Layer 1
handoff, any article whose quality score falls below the gate floor (0.53 by default)
is silently dropped and will not appear in Layer 1 or Layer 2 output. This protects
the shared pool from low-signal content.

**Private:** Articles are fully isolated to your key. The quality gate is bypassed
entirely — all articles proceed to Layer 1 regardless of score. Private namespace
requires a Professional or Enterprise key. To request one, contact
[support@polariapi.com](mailto:support@polariapi.com) or select Private Namespace
during signup.

**Shared:** Write-isolated, read-public. Your ingested articles bypass the quality
gate and land in your own namespace. Read operations (entities, stories, clusters)
span both your namespace and the shared public pool.
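The gate rules for the three modes can be summarized as a small decision function. This is a local model of the behavior described above, not SDK code: `reaches_layer1` and `GATE_FLOOR` are hypothetical names, and the real floor for your key may differ from the default.

```python
GATE_FLOOR = 0.53  # default floor quoted in the table above


def reaches_layer1(mode: str, quality_score: float, floor: float = GATE_FLOOR) -> bool:
    """Model the quality gate: only public-mode keys are filtered by score."""
    if mode not in {"public", "private", "shared"}:
        raise ValueError(f"unknown namespace mode: {mode!r}")
    if mode == "public":
        # Scores below the floor are dropped; a score equal to the floor passes.
        return quality_score >= floor
    # Private and shared keys bypass the gate on write.
    return True
```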

### Diagnosing dropped articles

If an article passes Layer 0 but never appears in Layer 1 output, the most likely
cause is the quality gate. You can check the quality score from the Layer 0 response:

```python
result = await client.layer0.analyze(article)
print(result.quality_score)  # must be >= 0.53 for public namespace keys
```

If you need to ingest content that consistently scores below the gate floor (internal
documents, short-form content, domain-specific text), request a private namespace key.





## Layer 0 — Token Intelligence

Layer 0 processes raw articles into quality-scored, embedded representations.

```python
from polari import ArticleInput

article = ArticleInput(
    title="Fed holds rates steady",
    content="The Federal Reserve held interest rates steady...",
    url="https://reuters.com/fed-rates-2026",  # used as deduplication key
    source="Reuters",
)

result = await client.layer0.analyze(article)

result.article_id      # "art_8f7h2k9s"
result.quality_score   # 0.87  (0.0–1.0; public-namespace articles below 0.53 are filtered)
result.token_count     # 412
result.is_duplicate    # True if this URL was already processed
result.semantic_hash   # content fingerprint for fast similarity matching
```

**Quality score tiers:**

| Score | Tier |
|-------|------|
| 0.8+ | Excellent |
| 0.65–0.8 | Good |
| 0.5–0.65 | Medium |
| 0.3–0.5 | Low |
| <0.3 | Noise |

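For public-namespace keys, articles scoring below `0.53` are not forwarded to Layer 1 or Layer 2. The gate therefore falls inside the Medium tier: an article scoring between 0.50 and 0.53 is Medium but is still dropped. Private and shared keys bypass the gate.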
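If you want to label scores on the client side, the tier table can be expressed as a lookup. This is a sketch only: `quality_tier` is a hypothetical helper, and treating each lower bound as inclusive is an assumption, since the table does not say which tier owns the boundary values.

```python
def quality_tier(score: float) -> str:
    """Map a Layer 0 quality score to the tier names in the table above.

    Assumption: lower bounds are inclusive (0.8 is Excellent, 0.65 is Good).
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("quality score must be between 0.0 and 1.0")
    if score >= 0.8:
        return "Excellent"
    if score >= 0.65:
        return "Good"
    if score >= 0.5:
        return "Medium"
    if score >= 0.3:
        return "Low"
    return "Noise"
```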

### Embeddings

By default, the embedding is not fetched (saving about 1.5 KB per article), and `embedding` holds a zero-filled placeholder. Pass `include_embedding=True` to fetch the 384-dimensional vector from ChromaDB:

```python
result = await client.layer0.analyze(article, include_embedding=True)
result.embedding  # List[float], length 384
```
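Once you have two embeddings, a common way to compare them is cosine similarity. The README does not state which metric Polari uses internally, so treat the following as a generic sketch. `cosine_similarity` is a hypothetical helper that also rejects all-zero vectors, which usually means the embedding was not requested.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # An all-zero vector is likely the placeholder, not a real embedding.
        raise ValueError("zero vector: embedding was probably not requested")
    return dot / (norm_a * norm_b)
```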

### Batch processing

```python
articles = [ArticleInput(...), ArticleInput(...), ArticleInput(...)]
results = await client.layer0.analyze_batch(articles)
# Submits all at once, polls concurrently — significantly faster than sequential calls
```
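The README does not state a maximum batch size. If you are ingesting a large list and want to keep each call bounded, one approach is to split the list into fixed-size groups first. `chunked` is a hypothetical helper, and any group size you choose is your own assumption rather than a documented limit.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def chunked(items: Sequence[T], size: int) -> Iterator[List[T]]:
    """Yield consecutive groups of at most `size` items, preserving order."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])
```

Each group can then be passed to `client.layer0.analyze_batch(group)` in turn.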

---

## Layer 1 — Semantic Analysis

Layer 1 extracts entities, sentence semantics, and sentiment. **Requires Layer 0 first.**

```python
# l0 is the Layer0Result returned by client.layer0.analyze(article)
result = await client.layer1.process(
    article_id=l0.article_id,
    title=article.title,
    content=article.content,
    url=article.url,            # optional
    published_date="2026-05-29T00:00:00",  # optional
)

result.stats.sentence_count    # 12
result.stats.entity_count      # 8
result.entities                # {"PERSON": ["Jerome Powell"], "ORG": ["Federal Reserve"], "GPE": ["Washington"]}
result.locations               # ["Washington"]
result.article_embedding       # 384-dim article-level embedding
result.processed_at            # datetime
```
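Because `result.entities` is shown as a dictionary of entity type to a list of names, it is easy to summarize locally. The sketch below counts distinct names per type; `entity_counts` is a hypothetical helper, and its totals are not guaranteed to match `result.stats.entity_count`, whose exact definition is not given here.

```python
from typing import Dict, List


def entity_counts(entities: Dict[str, List[str]]) -> Dict[str, int]:
    """Count distinct entity names for each entity type."""
    return {label: len(set(names)) for label, names in entities.items()}


# Same shape as the example value shown above.
sample = {"PERSON": ["Jerome Powell"], "ORG": ["Federal Reserve"], "GPE": ["Washington"]}
print(entity_counts(sample))  # {'PERSON': 1, 'ORG': 1, 'GPE': 1}
```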

### Batch processing

```python
results = await client.layer1.process_batch([
    {"article_id": "art_xxx", "title": "...", "content": "..."},
    {"article_id": "art_yyy", "title": "...", "content": "..."},
])
```

---

## Layer 2 — Story Clustering

Layer 2 groups related articles across sources into unified stories. **Requires Layer 0 and Layer 1 first.**

```python
# article_id is the ID returned by Layer 0 (result.article_id)
result = await client.layer2.cluster(article_id)

result.cluster_id         # "clus_9x3k2m8f"
result.confidence         # 0.94
result.is_new             # True if this article formed a new cluster
result.already_clustered  # True if this article was previously clustered
```

### Batch clustering

```python
result = await client.layer2.cluster_batch(
    article_ids=["art_xxx", "art_yyy", "art_zzz"],
    similarity_threshold=0.75,   # default
    time_window_hours=168,       # default (7 days)
)

result.stats.clusters_formed    # 2
result.stats.clustering_rate    # 1.0
result.stats.singleton_articles # 0

for cluster in result.clusters:
    cluster.cluster_id    # "clus_xxx"
    cluster.title         # "Fed holds rates steady"
    cluster.article_count # 3
    cluster.source_count  # 2
    cluster.confidence    # 0.89
```

### Story queries

```python
stories  = await client.layer2.list_stories(limit=20, offset=0)
detail   = await client.layer2.get_story("clus_xxx")
articles = await client.layer2.get_story_articles("clus_xxx")
timeline = await client.layer2.get_story_timeline("clus_xxx")
sources  = await client.layer2.get_story_sources("clus_xxx")
rels     = await client.layer2.get_stories_relationships(["clus_xxx", "clus_yyy"])
```
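`list_stories` takes `limit` and `offset`, which suggests offset-based paging. A generic way to walk all pages is sketched below. It assumes each call returns a plain list and that a page shorter than `limit` marks the end; neither is confirmed by this README. `iter_pages` and `fetch_page` are hypothetical names.

```python
from typing import Any, AsyncIterator, Awaitable, Callable, List

# A coroutine function taking (limit, offset) and returning one page of items.
FetchPage = Callable[[int, int], Awaitable[List[Any]]]


async def iter_pages(fetch_page: FetchPage, limit: int = 20) -> AsyncIterator[Any]:
    """Yield items page by page until a short or empty page is returned."""
    offset = 0
    while True:
        page = await fetch_page(limit, offset)
        for item in page:
            yield item
        if len(page) < limit:
            break  # assumed end-of-results signal
        offset += limit
```

With the SDK, `fetch_page` could be `lambda limit, offset: client.layer2.list_stories(limit=limit, offset=offset)`, adapted if the method returns a wrapper object rather than a list.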

---

## Layer 3 — Intelligence Graph

Layer 3 maps entity relationships and reveals trends across the information landscape. **Requires Professional tier or higher.**

```python
# Graph statistics
stats = await client.layer3.get_stats()
stats["entity_relationships"]   # 5204
stats["cluster_relationships"]  # 48405
stats["narrative_threads"]      # 18
stats["trending_entities"]      # 49

# Trending entities
result = await client.layer3.get_trending_entities(min_velocity=2.0, limit=20)
for trend in result["trends"]:
    trend["entity"]    # "Federal Reserve"
    trend["velocity"]  # 7.0
    trend["mentions"]  # 51

# Cluster relationships
rels = await client.layer3.get_cluster_relationships("clus_xxx")

# Trigger full graph rebuild (admin)
await client.layer3.build_graph()
```
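The trending response is shown as a dictionary with a `trends` list. Whether the API returns that list already sorted is not stated, so if order matters you may want to sort locally. This sketch assumes each entry has `entity`, `velocity`, and `mentions` keys as in the example above; `top_trends` is a hypothetical helper.

```python
from typing import Any, Dict, List


def top_trends(result: Dict[str, Any], min_mentions: int = 0, top_n: int = 5) -> List[Dict[str, Any]]:
    """Return up to top_n trends with enough mentions, fastest velocity first."""
    trends = [t for t in result.get("trends", []) if t.get("mentions", 0) >= min_mentions]
    return sorted(trends, key=lambda t: t["velocity"], reverse=True)[:top_n]
```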

---

## Full pipeline example

```python
import asyncio
from polari import PolariClient, ArticleInput

async def main():
    async with PolariClient(api_key="pk_live_...") as client:

        article = ArticleInput(
            title="Fed holds rates steady",
            content="The Federal Reserve held interest rates steady on Wednesday...",
            url="https://reuters.com/fed-rates-2026",
            source="Reuters",
        )

        # Layer 0 — quality scoring
        l0 = await client.layer0.analyze(article)
        if l0.quality_score < 0.53:
            print("Article below quality threshold, skipping")
            return

        # Layer 1 — entity extraction
        l1 = await client.layer1.process(
            article_id=l0.article_id,
            title=article.title,
            content=article.content,
        )
        print(f"Entities: {l1.entities}")

        # Layer 2 — story clustering
        l2 = await client.layer2.cluster(l0.article_id)
        print(f"Cluster: {l2.cluster_id}  Confidence: {l2.confidence:.3f}")

        # Layer 3 — trending entities
        trends = await client.layer3.get_trending_entities(limit=5)
        for t in trends["trends"]:
            print(f"  {t['entity']}: velocity={t['velocity']}")

asyncio.run(main())
```

---

## Error handling

```python
from polari.exceptions import (
    AuthenticationError,   # 401 — invalid or revoked API key
    ValidationError,       # 400/422 — bad request parameters
    RateLimitError,        # 429 — rate limit exceeded
    ServerError,           # 5xx — service error
    NetworkError,          # connection failure
    TimeoutError,          # request timed out (shadows the built-in TimeoutError when imported this way)
    ProcessingError,       # article processing failed
    RetryExhaustedError,   # all retry attempts failed
)

try:
    result = await client.layer0.analyze(article)
except AuthenticationError:
    print("Invalid API key")
except RateLimitError:
    print("Rate limit hit — back off and retry")
except RetryExhaustedError as e:
    print(f"Failed after retries: {e}")
```

The SDK retries `429`, `5xx`, network errors, and timeouts automatically with exponential backoff (default: 3 retries). `401`, `400`, `422`, and `404` are not retried.
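To make the retry policy concrete, the sketch below models which status codes are retried and what an exponential backoff schedule looks like. It is not the SDK's implementation: the base delay, the cap, and the absence of jitter are assumptions, and `should_retry` and `backoff_delays` are hypothetical names.

```python
from typing import List

NON_RETRYABLE = {400, 401, 404, 422}


def should_retry(status_code: int) -> bool:
    """Mirror the documented policy: retry 429 and 5xx, never 400/401/404/422."""
    if status_code in NON_RETRYABLE:
        return False
    return status_code == 429 or 500 <= status_code <= 599


def backoff_delays(max_retries: int = 3, base: float = 0.5, cap: float = 30.0) -> List[float]:
    """Delay in seconds before each retry: base * 2**attempt, capped at `cap`.

    base and cap are assumed values, not taken from the SDK.
    """
    return [min(cap, base * (2 ** attempt)) for attempt in range(max_retries)]
```

With these assumed defaults, three retries wait 0.5, 1.0, and 2.0 seconds respectively.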

---

## Metrics and cost tracking

```python
metrics = client.get_metrics()
metrics.total_requests      # 42
metrics.successful_requests # 41
metrics.average_latency     # 0.34  (seconds)

cost = client.get_cost_summary()
cost.total_calls   # 42
cost.total_cost    # 0.063
cost.cost_by_layer # {"layer0": 0.01, "layer1": 0.02, "layer2": 0.01, "layer3": 0.03}
```
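The raw counters are often easier to read as ratios. The helpers below derive a success rate and an average cost per call from the fields shown above. They are hypothetical, operate on plain numbers, and make no assumption about how the SDK computes the underlying values.

```python
def success_rate(total_requests: int, successful_requests: int) -> float:
    """Fraction of requests that succeeded; 0.0 when nothing was sent."""
    if total_requests == 0:
        return 0.0
    return successful_requests / total_requests


def average_cost_per_call(total_cost: float, total_calls: int) -> float:
    """Mean cost of one call; 0.0 when no calls were made."""
    return total_cost / total_calls if total_calls else 0.0
```

For example, `success_rate(metrics.total_requests, metrics.successful_requests)` and `average_cost_per_call(cost.total_cost, cost.total_calls)`.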

---

## Tier access

| Tier | Layers | Requests/day | Requests/min |
|------|--------|-------------|-------------|
| Starter | 0, 1, 2 | 10,000 | 10 |
| Professional | 0, 1, 2, 3 | 100,000 | 100 |
| Enterprise | 0, 1, 2, 3 | Custom | 1,000 |

Layer 3 (Intelligence Graph) and Trends endpoints require Professional tier or higher.
Starter tier keys receive `403` on all Layer 3 and Trends requests.
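If you want to check access or pace requests on the client side, the table can be encoded as data. This is a sketch under assumptions: the README does not say how the server measures its rate window, so even spacing is only one conservative way to stay under the per-minute figure. `TIERS`, `layer_allowed`, and `min_interval_seconds` are hypothetical names.

```python
TIERS = {
    "starter": {"layers": {0, 1, 2}, "per_day": 10_000, "per_min": 10},
    "professional": {"layers": {0, 1, 2, 3}, "per_day": 100_000, "per_min": 100},
    # The Enterprise daily limit is "Custom" in the table, so it is left unset.
    "enterprise": {"layers": {0, 1, 2, 3}, "per_day": None, "per_min": 1_000},
}


def layer_allowed(tier: str, layer: int) -> bool:
    """True if the tier includes the given layer number."""
    return layer in TIERS[tier.lower()]["layers"]


def min_interval_seconds(tier: str) -> float:
    """Even spacing between requests that stays within the per-minute limit."""
    return 60.0 / TIERS[tier.lower()]["per_min"]
```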

---

## Known limitations

- `Layer0Result.embedding` returns `[0.0] * 384` unless `include_embedding=True` is passed. Embedding retrieval adds latency due to ChromaDB lookup.
- `Layer1Result.sentences` returns empty for articles already processed by Layer 1 (cached result). Entity counts are still populated.
- Integration tests reuse article URLs — re-running tests returns cached results for already-processed articles. Append a unique suffix to URLs to force fresh processing.
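One way to append a unique suffix to a test URL is to add a throwaway query parameter. The sketch below assumes deduplication compares the full URL string, including the query, which this README does not confirm. `with_unique_suffix` and the `_run` parameter name are hypothetical.

```python
import uuid
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_unique_suffix(url: str, param: str = "_run") -> str:
    """Return the URL with an extra random query parameter appended."""
    parts = urlsplit(url)
    # Keep any existing query parameters and add the random one last.
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append((param, uuid.uuid4().hex))
    return urlunsplit(parts._replace(query=urlencode(query)))
```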

---

## Support

- Docs: [docs.polariapi.com](https://docs.polariapi.com)
- Email: [support@polariapi.com](mailto:support@polariapi.com)
- Homepage: [polariapi.com](https://polariapi.com)
