Metadata-Version: 2.4
Name: hallutok
Version: 0.1.0
Summary: Anti-Hallucination & Token Optimization library for Groq and Gemini APIs
Author-email: Joel Pawar <joelpawarwork@gmail.com>
License-Expression: MIT
Project-URL: Homepage, https://github.com/joelpawar08/hallutok
Project-URL: Issues, https://github.com/joelpawar08/hallutok/issues
Project-URL: Documentation, https://github.com/joelpawar08/hallutok#readme
Keywords: llm,hallucination,token-optimization,groq,gemini,ai
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.10
Description-Content-Type: text/markdown
Provides-Extra: groq
Requires-Dist: groq>=0.9.0; extra == "groq"
Provides-Extra: gemini
Requires-Dist: google-generativeai>=0.7.0; extra == "gemini"
Provides-Extra: all
Requires-Dist: groq>=0.9.0; extra == "all"
Requires-Dist: google-generativeai>=0.7.0; extra == "all"
Provides-Extra: dev
Requires-Dist: pytest>=8.0; extra == "dev"
Requires-Dist: pytest-cov; extra == "dev"
Requires-Dist: build; extra == "dev"
Requires-Dist: twine; extra == "dev"

# 🛡️ Hallutok

**Anti-Hallucination & Token Optimization for Groq and Gemini APIs**

[![PyPI version](https://badge.fury.io/py/hallutok.svg)](https://pypi.org/project/hallutok/)
[![Python 3.10+](https://img.shields.io/badge/python-3.10%2B-blue.svg)](https://www.python.org/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)

Hallutok addresses two problems that waste API quota and undermine trust in LLM output:

| Problem | Hallutok's Solution |
|---|---|
| Long prompts burning through tokens | `TokenOptimizer` compresses prompts before sending |
| LLM making up facts / hedging | `HallucinationValidator` scores and flags sketchy responses |

---

## ✨ Features

- **Token Optimization** — whitespace cleanup, filler-phrase compression, deduplication, smart truncation, in-memory caching
- **Anti-Hallucination** — detects hedging language, ungrounded claims, numeric anomalies, contradictions
- **Groq + Gemini** — works with both APIs via thin, swappable provider adapters
- **Zero hard dependencies** — core library is pure Python; providers are optional extras
- **Savings reporting** — see exactly how many tokens you saved per call

---

## 📦 Installation

```bash
# Quote the extras so shells such as zsh do not expand the brackets

# With Groq support
pip install "hallutok[groq]"

# With Gemini support
pip install "hallutok[gemini]"

# Both
pip install "hallutok[all]"
```

---

## 🚀 Quick Start

### Using Groq

```python
from hallutok import HallutokClient

# Factory shortcut
client = HallutokClient.with_groq(
    api_key="gsk_your_groq_key",
    model="llama3-8b-8192",       # optional, this is the default
    temperature=0.3,              # lower = more deterministic output
)

result = client.chat(
    "Please note that I would like you to explain in order to help me "
    "understand what black holes are and how they work. Can you please "
    "provide a detailed explanation? It is important to note that I am "
    "a beginner."
)

print(result.response)
print(result.token_report)
# {'tokens_before': 48, 'tokens_after': 19, 'tokens_saved': 29, 'percent_saved': 60.4}

if result.validation.is_likely_hallucination:
    print("⚠️  Flags:", result.validation.flags)
```

### Using Gemini

```python
from hallutok import HallutokClient

client = HallutokClient.with_gemini(
    api_key="AIza_your_gemini_key",
    model="gemini-1.5-flash",
)

result = client.chat("Explain quantum entanglement to a 10-year-old.")
print(result.response)
```

### Using providers directly

```python
from hallutok import HallutokClient
from hallutok.providers import GroqProvider, GeminiProvider

# Swap providers without changing anything else
provider = GroqProvider(api_key="gsk_...", model="mixtral-8x7b-32768")
# provider = GeminiProvider(api_key="AIza_...", model="gemini-1.5-pro")

client = HallutokClient(
    provider=provider,
    optimize_tokens=True,          # default: True
    validate_responses=True,       # default: True
    max_prompt_tokens=512,         # hard cap on prompt size
    temperature=0.4,
    max_response_tokens=1024,
    system_prompt="You are a factual assistant. Cite sources when possible.",
)

result = client.chat("What causes inflation?")
```
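
To illustrate why providers can be swapped without touching the rest of the client, here is a minimal sketch of an adapter interface. The `Provider` protocol, its `complete` method, the `EchoProvider` class, and the `ask` helper are all hypothetical names invented for this example; Hallutok's actual base class and method signatures may differ.

```python
from typing import Protocol


class Provider(Protocol):
    """Hypothetical adapter interface; not Hallutok's real base class."""

    name: str

    def complete(self, prompt: str, temperature: float, max_tokens: int) -> str:
        ...


class EchoProvider:
    """Offline stand-in that returns the prompt it was given (no network calls)."""

    name = "echo"

    def complete(self, prompt: str, temperature: float, max_tokens: int) -> str:
        return f"echo: {prompt}"


def ask(provider: Provider, prompt: str) -> str:
    # The caller depends only on the interface, so any conforming
    # provider can be passed in without changing this function.
    return provider.complete(prompt, temperature=0.4, max_tokens=1024)


print(ask(EchoProvider(), "What causes inflation?"))
```

A stand-in like `EchoProvider` is also handy in unit tests, since it exercises the surrounding code without spending API quota.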

---

## 🔧 Components

### TokenOptimizer

Use standalone if you only need compression:

```python
from hallutok.optimizer import TokenOptimizer

opt = TokenOptimizer()

raw = """
Please note that I would like you to, in order to be helpful,
can you please explain, it is important to note that, machine learning
is a subset of AI. Machine learning is a subset of AI. Machine learning is a subset of AI.
"""

compressed = opt.optimize(raw, max_tokens=100)
print(compressed)

report = opt.savings_report(raw, compressed)
# {'tokens_before': 54, 'tokens_after': 12, 'tokens_saved': 42, 'percent_saved': 77.8}
```

What the optimizer does, in order:
1. Normalize whitespace (collapse spaces, trim blank lines)
2. Strip boilerplate ("Please note that", "I would like you to", etc.)
3. Deduplicate repeated sentences
4. Replace verbose phrases ("in order to" → "to", "due to the fact that" → "because", …)
5. Truncate to `max_tokens` at a sentence boundary
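
The five steps above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not Hallutok's implementation: the phrase tables are short hypothetical samples, and tokens are approximated as whitespace-separated words, which a real tokenizer would count differently.

```python
import re
from typing import Optional

# Hypothetical sample phrase tables; the library's real lists may differ.
BOILERPLATE = [
    r"please note that",
    r"i would like you to",
    r"it is important to note that",
    r"can you please",
]
VERBOSE = {
    r"\bin order to\b": "to",
    r"\bdue to the fact that\b": "because",
}
SENTENCE_SPLIT = r"(?<=[.!?])\s+"


def estimate_tokens(text: str) -> int:
    """Rough estimate (assumption): one token per whitespace-separated word."""
    return len(text.split())


def optimize(text: str, max_tokens: Optional[int] = None) -> str:
    # 1. Normalize whitespace.
    text = re.sub(r"\s+", " ", text).strip()
    # 2. Strip boilerplate phrases, plus any trailing comma and spaces.
    for pattern in BOILERPLATE:
        text = re.sub(pattern + r",?\s*", "", text, flags=re.IGNORECASE)
    # 3. Deduplicate repeated sentences, keeping the first occurrence.
    seen, kept = set(), []
    for sentence in re.split(SENTENCE_SPLIT, text):
        key = sentence.lower().strip()
        if key and key not in seen:
            seen.add(key)
            kept.append(sentence.strip())
    text = " ".join(kept)
    # 4. Replace verbose phrases with shorter equivalents.
    for pattern, short in VERBOSE.items():
        text = re.sub(pattern, short, text, flags=re.IGNORECASE)
    # 5. Keep whole sentences until the next one would exceed the budget.
    if max_tokens is not None:
        out, used = [], 0
        for sentence in re.split(SENTENCE_SPLIT, text):
            cost = estimate_tokens(sentence)
            if used + cost > max_tokens:
                break
            out.append(sentence)
            used += cost
        text = " ".join(out)
    return text


print(optimize("Cats purr. Cats purr. Dogs bark in order to warn us."))
# Cats purr. Dogs bark to warn us.
```

Note that in this sketch a first sentence longer than `max_tokens` yields an empty string; a production version would need a fallback such as word-level truncation.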

### HallucinationValidator

Use standalone to audit any text:

```python
from hallutok.antihallucination import HallucinationValidator

validator = HallucinationValidator()

response = "I think maybe studies show that eating chocolate probably cures cancer."
result = validator.validate(response)

print(result.confidence_score)          # e.g. 0.72
print(result.is_likely_hallucination)   # True / False
print(result.flags)                     # list of issues found
print(result.warnings)                  # human-readable descriptions
print(result.suggestions)               # what to do about it
print(result.cleaned_response)          # response + disclaimer if flagged
```

**Detection layers:**

| Layer | What it catches |
|---|---|
| Hedging | "I think", "maybe", "perhaps", "I'm not sure", etc. |
| Ungrounded claims | "Studies show…", "Research suggests…" without citations |
| Numeric anomalies | Percentages over 100%, other implausible numbers |
| Contradictions | "always" + "never", "increases" + "decreases" in the same text |
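
As a rough illustration of how such layers can work, the sketch below implements each row of the table with simple string and regex checks. The function name `find_flags`, the flag format, and the citation heuristic (a URL or a bracketed number) are assumptions made for this example; the phrase lists contain only the samples named in the table, and Hallutok's real detectors and scoring may differ.

```python
import re

# Sample phrases taken from the table above; the real lists are likely longer.
HEDGES = ["i think", "maybe", "perhaps", "i'm not sure"]
UNGROUNDED = ["studies show", "research suggests"]
CONTRADICTION_PAIRS = [("always", "never"), ("increases", "decreases")]


def find_flags(text: str) -> list[str]:
    """Return a list of 'layer:detail' strings (hypothetical flag format)."""
    lowered = text.lower()
    flags = []
    # Hedging: a hedge phrase appears as a whole word or phrase.
    for phrase in HEDGES:
        if re.search(r"\b" + re.escape(phrase) + r"\b", lowered):
            flags.append(f"hedging:{phrase}")
    # Ungrounded claims: an authority phrase with no URL or [n] citation
    # anywhere in the text (assumed heuristic).
    has_citation = bool(re.search(r"https?://|\[\d+\]", text))
    for phrase in UNGROUNDED:
        if phrase in lowered and not has_citation:
            flags.append(f"ungrounded:{phrase}")
    # Numeric anomalies: any percentage above 100.
    for match in re.finditer(r"(\d+(?:\.\d+)?)\s*%", text):
        if float(match.group(1)) > 100:
            flags.append(f"numeric:{match.group(0)}")
    # Contradictions: both words of an opposing pair appear in the text.
    for first, second in CONTRADICTION_PAIRS:
        if re.search(rf"\b{first}\b", lowered) and re.search(rf"\b{second}\b", lowered):
            flags.append(f"contradiction:{first}/{second}")
    return flags


print(find_flags("I think maybe studies show that eating chocolate probably cures cancer."))
# ['hedging:i think', 'hedging:maybe', 'ungrounded:studies show']
```

Keyword checks like these are cheap but blunt: a percentage above 100 can be legitimate (for example, a 150% increase), so flags are best treated as prompts for review, not as verdicts.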

---

## 💡 Tips for Saving Tokens and Reducing Hallucinations

1. **Avoid filler openers**: "Can you please", "I would like you to", "It is important that"
2. **Don't repeat yourself**: Hallutok deduplicates, but it is cheaper not to duplicate in the first place
3. **Use `max_prompt_tokens`**: set a hard cap so you never accidentally send a 4k-token prompt
4. **Lower the temperature**: a low value such as `temperature=0.3` makes output more deterministic, which tends to reduce fabricated detail
5. **Use a system prompt**: instruct the model to cite sources and avoid speculation
6. **Check `token_report` per call**: it shows how many tokens each call saved

---

## 📊 ChatResult Fields

```python
result.response           # final (possibly cleaned) text
result.original_prompt    # your original input
result.optimized_prompt   # what was actually sent to the API
result.token_report       # {tokens_before, tokens_after, tokens_saved, percent_saved}
result.validation         # ValidationResult object
result.provider           # "groq" or "gemini"
result.warnings           # list of human-readable warnings
```
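
The four `token_report` keys are related by simple arithmetic, which the sketch below reproduces from two token counts. The helper name `report_from_counts` is hypothetical (the library's own `savings_report` takes the raw and compressed text instead), and rounding to one decimal place is inferred from the sample outputs in this README.

```python
def report_from_counts(tokens_before: int, tokens_after: int) -> dict:
    """Build a token_report-style dict from two counts (hypothetical helper)."""
    saved = tokens_before - tokens_after
    # Guard against division by zero for an empty prompt.
    percent = round(saved / tokens_before * 100, 1) if tokens_before else 0.0
    return {
        "tokens_before": tokens_before,
        "tokens_after": tokens_after,
        "tokens_saved": saved,
        "percent_saved": percent,
    }


print(report_from_counts(48, 19))
# {'tokens_before': 48, 'tokens_after': 19, 'tokens_saved': 29, 'percent_saved': 60.4}
```

The counts 48 and 19 are the ones shown in the Groq quick-start example, and the result matches the report printed there.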

---

## 🗺️ Roadmap

- [ ] Async support (`achat()`)
- [ ] Streaming responses
- [ ] OpenAI / Together AI provider adapters
- [ ] Per-call token budget enforcement
- [ ] Context window manager for multi-turn conversations
- [ ] More hallucination detection strategies (self-consistency, chain-of-thought verification)

---

## 📄 License

MIT License; see [LICENSE](LICENSE) for details.
