Metadata-Version: 2.4
Name: sqlgraph
Version: 0.1.8
Summary: Gerador de queries PostgreSQL a partir de linguagem natural, alimentado por LangGraph e pgvector.
Project-URL: Homepage, https://github.com/Ilhe8l/sqlgraph
Project-URL: Repository, https://github.com/Ilhe8l/sqlgraph
Project-URL: Issues, https://github.com/Ilhe8l/sqlgraph/issues
Project-URL: Changelog, https://github.com/Ilhe8l/sqlgraph/blob/main/CHANGELOG.md
Project-URL: Documentation, https://github.com/Ilhe8l/sqlgraph#readme
Author-email: Guilherme <gcsz.contato@gmail.com>
License: MIT License
        
        Copyright (c) 2026 SQLGraph Project
        
        Permission is hereby granted, free of charge, to any person obtaining a copy
        of this software and associated documentation files (the "Software"), to deal
        in the Software without restriction, including without limitation the rights
        to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
        copies of the Software, and to permit persons to whom the Software is
        furnished to do so, subject to the following conditions:
        
        The above copyright notice and this permission notice shall be included in all
        copies or substantial portions of the Software.
        
        THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
        IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
        FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
        AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
        LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
        OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
        SOFTWARE.
License-File: LICENSE
Keywords: langgraph,llm,natural-language,nlp,pgvector,postgresql,rag,sql
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Natural Language :: Portuguese (Brazilian)
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Database
Classifier: Topic :: Database :: Database Engines/Servers
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.11
Requires-Dist: langchain-core>=0.3.0
Requires-Dist: langchain-google-genai>=2.0.0
Requires-Dist: langchain-openai>=0.2.0
Requires-Dist: langgraph>=0.2.0
Requires-Dist: pgvector>=0.3.0
Requires-Dist: psycopg2-binary>=2.9.0
Requires-Dist: sqlglot>=25.0.0
Provides-Extra: cli
Requires-Dist: typer>=0.12.0; extra == 'cli'
Provides-Extra: dev
Requires-Dist: mypy>=1.10.0; extra == 'dev'
Requires-Dist: pytest-asyncio>=0.23.0; extra == 'dev'
Requires-Dist: pytest-cov>=5.0.0; extra == 'dev'
Requires-Dist: pytest>=8.0.0; extra == 'dev'
Requires-Dist: ruff>=0.5.0; extra == 'dev'
Requires-Dist: typer>=0.12.0; extra == 'dev'
Description-Content-Type: text/markdown

# sqlgraph

> Uma lib Python open source que combina **LangGraph + pgvector** para geração de SQL através de linguagem natural.

[![CI](https://github.com/Ilhe8l/sqlgraph/actions/workflows/ci.yml/badge.svg)](https://github.com/Ilhe8l/sqlgraph/actions/workflows/ci.yml)
[![PyPI version](https://badge.fury.io/py/sqlgraph.svg)](https://badge.fury.io/py/sqlgraph)
[![Python 3.11+](https://img.shields.io/badge/python-3.11+-blue.svg)](https://www.python.org/downloads/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

O `sqlgraph` é um pipeline **RAG (Retrieval-Augmented Generation)** focado em converter perguntas de negócio em consultas SQL corretas e validadas. Em vez de operar como um chatbot genérico, ele orquestra um agente corporativo que:

1. Classifica a intenção do usuário (SQL, chitchat, ambíguo).
2. Recupera o contexto relevante do schema via embeddings + `pgvector`.
3. Gera a query SQL usando um LLM.
4. Valida a sintaxe com `sqlglot` e a semântica com `EXPLAIN` diretamente no banco.
5. Corrige automaticamente queries inválidas antes de retornar.

---

## Pré-requisitos

- **Python 3.11+**
- **PostgreSQL** com a extensão **pgvector** instalada:
  ```sql
  -- Execute no seu banco PostgreSQL:
  CREATE EXTENSION IF NOT EXISTS vector;
  ```
  > Instalação do pgvector: [https://github.com/pgvector/pgvector](https://github.com/pgvector/pgvector)
- Uma **API key** da [OpenAI](https://platform.openai.com/) **ou** do [Google Gemini](https://ai.google.dev/).

---

## Instalação

```bash
pip install sqlgraph
```

Para usar o CLI de linha de comando:
```bash
pip install sqlgraph[cli]
```

---

## Início Rápido

```python
from sqlgraph import SQLAgent
from sqlgraph.llm import OpenAIProvider

# 1. Configura o agente
agent = SQLAgent(
    dsn="postgresql://user:pass@localhost/db",
    llm=OpenAIProvider(api_key="sk-..."),
)

# 2. Treina com o esquema e regras do seu banco
agent.train(ddl="CREATE TABLE pagamentos (id SERIAL, valor NUMERIC, data DATE)")
agent.train(documentation="O campo 'valor' representa o valor bruto em reais.")
agent.train(
    question="Qual o total pago em 2023?",
    sql="SELECT SUM(valor) FROM pagamentos WHERE EXTRACT(year FROM data) = 2023"
)

# 3. Faz perguntas em linguagem natural
response = agent.ask("Qual o total pago por mês em 2023?")
print(response.sql)         # A query gerada
print(response.explanation) # Explicação em texto
print(response.valid)       # True se passou na validação
```

---

## Configuração via Variáveis de Ambiente

```bash
export OPENAI_API_KEY="sk-..."
export SQLGRAPH_DSN="postgresql://user:pass@localhost/db"
```

---

## API de Referência

### `SQLAgent`

```python
agent = SQLAgent(
    dsn="postgresql://...",           # Connection string do PostgreSQL (obrigatório)
    llm=OpenAIProvider(...),          # Provider de LLM (obrigatório)
    session_store=None,               # Default: InMemorySessionStore()
    max_correction_attempts=2,        # Tentativas de auto-correção de queries inválidas
    validate_sql=True,                # Validar SQL com sqlglot + EXPLAIN
    retrieval_limit=10,               # Chunks recuperados por query do knowledge base
    system_prompt=None,               # System prompt customizado para o gerador de SQL
    dialect="postgres",               # Dialeto SQL (postgres, mysql, sqlite...)
)
```

#### Métodos de Treinamento

| Método | Descrição |
|---|---|
| `train(ddl=..., documentation=..., question=..., sql=...)` | Adiciona 1 item ao knowledge base |
| `train_bulk(items)` | Adiciona múltiplos itens de uma vez |
| `train_from_file("schema.sql")` | Ingere todos os `CREATE TABLE` de um arquivo |
| `list_training()` | Retorna todos os itens como `list[TrainingItem]` |
| `remove_training(item_id)` | Remove um item pelo UUID |
| `clear_training()` | Remove todos os itens |
| `export_training("backup.json")` | Exporta o knowledge base para JSON |
| `import_training("backup.json")` | Importa o knowledge base de um JSON |

#### `agent.ask()`

```python
response = agent.ask(
    question="Qual o total pago?",  # Pergunta em linguagem natural
    session_id="minha-sessao",      # ID de sessão para conversa multi-turn (opcional)
    execute=False,                  # Se True, executa o SQL e retorna os dados
    max_rows=100,                   # Limite de linhas quando execute=True
    explain=False,                  # Se True, inclui EXPLAIN ANALYZE na resposta
)
```

**`Response`:**
```
response.sql          # SQL gerado
response.explanation  # Explicação em linguagem natural
response.intent       # 'sql' | 'chitchat' | 'ambiguous'
response.valid        # True se passou na validação
response.attempts     # Número de tentativas até o SQL válido
response.data         # list[dict] com linhas (se execute=True)
response.error        # Mensagem de erro (se válido=False)
```

### Conversas Multi-Turn

```python
session_id = "sessao-usuario-123"
r1 = agent.ask("Liste os 5 maiores pagamentos.", session_id=session_id)
r2 = agent.ask("Agora filtre só os de 2024.", session_id=session_id)  # Contexto mantido!
```

### Providers de LLM

```python
from sqlgraph.llm import OpenAIProvider, GeminiProvider

# OpenAI
llm = OpenAIProvider(api_key="sk-...", model="gpt-4o")

# Google Gemini
llm = GeminiProvider(api_key="AIza...", model="gemini-2.0-flash")
```

### Session Store Persistente (PostgreSQL)

```python
from sqlgraph.storage import PostgresSessionStore

store = PostgresSessionStore(dsn="postgresql://...")
agent = SQLAgent(dsn="...", llm=llm, session_store=store)
```

---

## CLI

Configure as variáveis de ambiente e use o CLI direto do terminal:

```bash
# Ingerir um schema SQL
sqlgraph train --dsn postgresql://... --file schema.sql

# Adicionar uma regra de negócio
sqlgraph train --dsn postgresql://... --doc "vl_pagamento é o valor bruto em reais"

# Fazer uma pergunta
sqlgraph ask --dsn postgresql://... "Qual o total pago em 2023?"

# Executar a query no banco e ver os dados
sqlgraph ask --dsn postgresql://... --execute "Top 10 clientes por valor total"

# Listar o knowledge base
sqlgraph list --dsn postgresql://...

# Exportar/importar backup
sqlgraph export --dsn postgresql://... --output backup.json
sqlgraph import --dsn postgresql://... --input backup.json
```

---

## Exemplos

Veja exemplos completos na pasta [`examples/`](./examples/):

- [`01_basic_usage.py`](./examples/01_basic_usage.py): uso básico do `SQLAgent`.

---

## Documentação

Documentação completa será disponibilizada na Wiki em breve.

---

## Licença

MIT © 2026 SQLGraph Project
