Metadata-Version: 2.4
Name: talkdb-ai
Version: 0.1.0
Summary: Autonomous data analyst: text-to-SQL with conversation, insight, monitoring, and community intelligence
Project-URL: Homepage, https://github.com/Nitin-Gupta1109/TalkDB
Project-URL: Repository, https://github.com/Nitin-Gupta1109/TalkDB
Project-URL: Issues, https://github.com/Nitin-Gupta1109/TalkDB/issues
Author-email: Nitin Gupta <nmittal1109@gmail.com>
License: MIT
License-File: LICENSE
Keywords: analytics,data-analyst,llm,mcp,nl2sql,sql,text-to-sql
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Database
Classifier: Topic :: Scientific/Engineering :: Information Analysis
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.11
Requires-Dist: apscheduler>=3.10
Requires-Dist: chromadb>=0.5
Requires-Dist: click>=8.0
Requires-Dist: fastapi>=0.115
Requires-Dist: httpx>=0.27
Requires-Dist: litellm>=1.50
Requires-Dist: matplotlib>=3.9
Requires-Dist: mcp>=1.0
Requires-Dist: numpy>=2.0
Requires-Dist: pandas>=2.2
Requires-Dist: pydantic-settings>=2.0
Requires-Dist: pydantic>=2.0
Requires-Dist: python-dotenv>=1.0
Requires-Dist: pyyaml>=6.0
Requires-Dist: rank-bm25>=0.2
Requires-Dist: rich>=13.0
Requires-Dist: seaborn>=0.13
Requires-Dist: sqlalchemy>=2.0
Requires-Dist: sqlglot>=26.0
Requires-Dist: uvicorn>=0.34
Provides-Extra: all
Requires-Dist: talkdb[bigquery,mssql,mysql,pgvector,plotly,postgres,redis,snowflake]; extra == 'all'
Provides-Extra: bigquery
Requires-Dist: google-cloud-bigquery>=3.0; extra == 'bigquery'
Provides-Extra: dev
Requires-Dist: pytest-asyncio>=0.24; extra == 'dev'
Requires-Dist: pytest-cov>=5.0; extra == 'dev'
Requires-Dist: pytest>=8.0; extra == 'dev'
Requires-Dist: ruff>=0.8; extra == 'dev'
Provides-Extra: mssql
Requires-Dist: pymssql>=2.3; extra == 'mssql'
Provides-Extra: mysql
Requires-Dist: pymysql>=1.1; extra == 'mysql'
Provides-Extra: pgvector
Requires-Dist: pgvector>=0.3; extra == 'pgvector'
Provides-Extra: plotly
Requires-Dist: kaleido>=0.2; extra == 'plotly'
Requires-Dist: plotly>=5.0; extra == 'plotly'
Provides-Extra: postgres
Requires-Dist: psycopg2-binary>=2.9; extra == 'postgres'
Provides-Extra: redis
Requires-Dist: redis>=5.0; extra == 'redis'
Provides-Extra: snowflake
Requires-Dist: snowflake-connector-python>=3.0; extra == 'snowflake'
Description-Content-Type: text/markdown

# TalkDB

An open-source, MCP-native autonomous data analyst. Converts natural-language questions into validated SQL, then goes further: multi-turn conversations, statistical insight generation, proactive metric monitoring, dual-path result verification, and a learning loop from user corrections.

Not just text-to-SQL — a full analyst loop in a single tool.

## What it does

- **Ask questions in plain English** over Postgres, SQLite, and more. Answers include SQL, results, a narrative summary, and an auto-generated chart.
- **Multi-turn conversations.** "Revenue by month" → "just Q4" → "break that down by region" — follow-ups resolve to the previous turn's context.
- **Semantic layer (YAML).** Define business metrics once (`revenue = SUM(orders.total_amount) WHERE status='completed'`); the LLM uses your definitions instead of guessing.
- **Dual-path verification.** Every novel query is generated two structurally different ways; results are compared. Divergence drops confidence and surfaces a warning — catching semantic errors that schema validation can't.
- **Confidence scoring with graceful refusal.** Queries below threshold aren't silently wrong — they return a refusal explaining what's uncertain.
- **Statistical insight agent.** After results land, a pandas-based analyzer detects trends, anomalies, and concentrations. A chart is auto-generated from data shape. An LLM narrator writes 2–4 sentences using only the analyzer's facts (no hallucinated numbers).
- **Proactive watchdog.** Save any query as a scheduled watch. APScheduler runs it on your cadence, compares to a rolling baseline, and fires a webhook/Slack/stdout alert when conditions trigger.
- **Self-improving via corrections.** `correct_query(question, wrong_sql, correct_sql)` stores the pattern and indexes it into retrieval, so future similar questions benefit.

## Interfaces

- **MCP server (primary).** Works with Claude Desktop, Cursor, VS Code, and any MCP-compatible client. 12 tools: `ask`, `analyze`, `follow_up`, `list_databases`, `describe_database`, `validate_sql`, `correct_query`, `watch`, `list_watches`, `remove_watch`, `run_watch`, `get_session`.
- **CLI.** `talkdb ask`, `talkdb chat`, `talkdb init`, `talkdb index`, `talkdb watchdog add/list/remove/run/start`, `talkdb serve`.

## Tech stack

Python 3.11+ · FastMCP · LiteLLM (Claude / GPT / Gemini / Ollama) · SQLAlchemy 2 · sqlglot · ChromaDB + BM25 hybrid retrieval · pandas + matplotlib + seaborn · APScheduler · Pydantic.

## Quick start

```bash
# Setup
python3 -m venv .venv && source .venv/bin/activate
pip install -e ".[dev]"

cp .env.example .env
# Edit .env: set OPENAI_API_KEY or ANTHROPIC_API_KEY, and TALKDB_DEFAULT_DB

# Seed a demo ecommerce DB (optional — or point at your own)
python scripts/seed_example_db.py

# Index schema + semantic model for retrieval
talkdb index

# Ask a question
talkdb ask "What is our total revenue?"

# Multi-turn chat
talkdb chat

# Start MCP server (for Claude Desktop / Cursor)
talkdb serve --transport stdio
```

## Wiring into Claude Desktop

Add to `~/Library/Application Support/Claude/claude_desktop_config.json`:

```json
{
  "mcpServers": {
    "talkdb": {
      "command": "/absolute/path/to/.venv/bin/talkdb",
      "args": ["serve", "--transport", "stdio"],
      "cwd": "/absolute/path/to/project"
    }
  }
}
```

Restart Claude Desktop. The `ask`, `analyze`, `follow_up`, etc. tools appear in the tool picker.

## Semantic model

Define business meaning once in `semantic_models/<db>.yaml`:

```yaml
metrics:
  - name: revenue
    description: "Total revenue from completed orders."
    calculation: "SUM(orders.total_amount) WHERE orders.status = 'completed'"
    table: orders

tables:
  - name: customers
    columns:
      - name: tier
        valid_values: ["bronze", "silver", "gold", "platinum"]
        aliases: ["loyalty level", "membership tier"]

joins:
  - left: orders
    right: customers
    on: "orders.customer_id = customers.id"
    type: "INNER JOIN"
```

Run `talkdb init --database mydb --output semantic_models/mydb.yaml` to auto-generate a skeleton from your DB.

## Watchdog example

```bash
talkdb watchdog add \
  --name "Revenue monitor" \
  --question "What is today's total revenue?" \
  --schedule "every 1 hour" \
  --alert "drops more than 20% below 7-day average" \
  --slack-webhook "https://hooks.slack.com/services/..."

talkdb watchdog list
talkdb watchdog start   # blocks; runs the scheduler
```

Alerts render like:
> 🔴 Revenue monitor — Current value: $38.2k — 27% below baseline $52.4k (7_day_avg). Suggested follow-up: "Why is today's revenue below baseline?"

## Benchmark

Ships with a 27-case mini benchmark on the seeded DB for regression tracking:

```bash
python -m tests.benchmarks.run_benchmark
```

Current baseline: **23/27 (85%) execution accuracy, 96% structural correctness, 0 silent-wrong refusals**. Per-phase baselines are checked into `tests/benchmarks/`.

## Project layout

```
src/talkdb/
├── core/          # Engine, SQL generator, intent classifier, dialect transpiler
├── conversation/  # Session state, rewriter, reference resolver
├── schema/        # Introspector, data models, semantic model loader
├── retrieval/     # ChromaDB + BM25 hybrid retriever, embeddings
├── validation/    # Schema, execution, shape, dual-path, confidence
├── insight/       # Analyzer (pandas), charter (matplotlib), narrator (LLM)
├── watchdog/      # Scheduler, baseline, alerter, storage
├── learning/      # Pattern store, feedback recorder
├── connectors/    # Postgres, SQLite (more dialects via sqlglot)
├── server/        # FastMCP server
└── config/        # Pydantic settings
```

## Design principles

- **Never dump full schema into prompts.** Hybrid retrieval (BM25 + vector) surfaces only relevant context.
- **Never return results below confidence threshold.** Refuse and explain — silent wrong answers destroy trust.
- **SELECT only.** sqlglot AST walk rejects every mutating statement before execution.
- **Read-only validation.** All validation queries run in `READ ONLY` transactions with `LIMIT 10` and a 10-second timeout.
- **LLM-provider agnostic.** Every LLM call goes through LiteLLM — swap Claude ↔ GPT ↔ Gemini ↔ Ollama with one config change.
- **Vector store abstraction.** ChromaDB for dev, pgvector can plug in behind the same interface.
- **Temperature 0 for SQL generation.** Deterministic output.
- **Insight stats are pandas, not LLM.** Only narration uses the LLM (so the numbers in insights are always correct).
- **Dual-path uses structurally different prompts.** Path A direct, Path B decompose-then-compose. Catches correlated errors that self-correction misses.
- **Conversation rewriting, not SQL mutation.** Follow-ups are rewritten into standalone questions before SQL is regenerated from scratch.

## License

MIT.
