Metadata-Version: 2.4
Name: databao-context-engine
Version: 0.2.2
Summary: Semantic context for your LLMs — generated automatically
License-Expression: Apache-2.0 AND LicenseRef-Additional-Terms
License-File: LICENSE.md
Requires-Dist: click>=8.3.0
Requires-Dist: duckdb>=1.4.3
Requires-Dist: pyyaml>=6.0.3
Requires-Dist: requests>=2.32.5
Requires-Dist: mcp>=1.23.3
Requires-Dist: pydantic>=2.12.4
Requires-Dist: jinja2>=3.1.6
Requires-Dist: pyathena>=3.25.0 ; extra == 'athena'
Requires-Dist: clickhouse-connect>=0.10.0 ; extra == 'clickhouse'
Requires-Dist: mssql-python>=1.0.0 ; extra == 'mssql'
Requires-Dist: pymysql>=1.1.2 ; extra == 'mysql'
Requires-Dist: docling>=2.70.0 ; extra == 'pdf'
Requires-Dist: asyncpg>=0.31.0 ; extra == 'postgresql'
Requires-Dist: snowflake-connector-python>=4.2.0 ; extra == 'snowflake'
Requires-Python: >=3.12
Project-URL: Homepage, https://databao.app/
Project-URL: Source, https://github.com/JetBrains/databao-context-engine
Provides-Extra: athena
Provides-Extra: clickhouse
Provides-Extra: mssql
Provides-Extra: mysql
Provides-Extra: pdf
Provides-Extra: postgresql
Provides-Extra: snowflake
Description-Content-Type: text/markdown

[![official project](https://jb.gg/badges/official.svg)](https://github.com/JetBrains#jetbrains-on-github)
[![PyPI version](https://img.shields.io/pypi/v/databao-context-engine.svg)](https://pypi.org/project/databao-context-engine)
[![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](https://github.com/JetBrains/databao-context-engine?tab=License-1-ov-file)

[//]: # ([![Python versions]&#40;https://img.shields.io/pypi/pyversions/databao-context-engine.svg&#41;]&#40;https://pypi.org/project/databao-context-engine/&#41;)


<h1 align="center">Databao Context Engine</h1>
<p align="center">
 <b>Semantic context for your LLMs — generated automatically.</b><br/>
 No more copying schemas. No manual documentation. Just accurate answers.
</p>
<p align="center"> 
  <a href="https://databao.app">Website</a> •
  <a href="#quickstart">Quickstart</a> •
  <a href="#supported-data-sources">Data Sources</a> •
  <a href="#contributing">Contributing</a>
</p>

---

## What is Databao Context Engine?

Databao Context Engine is a CLI tool that **automatically generates governed semantic context** from your databases, BI tools, documents, and spreadsheets.

Integrate it with any LLM to deliver **accurate, context-aware answers** — without copying schemas or writing documentation by hand.

```
Your data sources → Context Engine → Unified semantic graph → Any LLM
```

## Why choose Databao Context Engine?

| Feature                    | What it means for you                                          |
|----------------------------|----------------------------------------------------------------|
| **Auto-generated context** | Extracts schemas, relationships, and semantics automatically   |
| **Runs locally**           | Your data never leaves your environment                        |
| **MCP integration**        | Works with Claude Desktop, Cursor, and any MCP-compatible tool |
| **Multiple sources**       | Databases, dbt projects, spreadsheets, documents               |
| **Built-in benchmarks**    | Measure and improve context quality over time                  |
| **LLM agnostic**           | OpenAI, Anthropic, Ollama, Gemini — use any model              |
| **Governed & versioned**   | Track, version, and share context across your team             |
| **Dynamic or static**      | Serve context via MCP server or export as artifact             |

## Installation

Databao Context Engine is [available on PyPI](https://pypi.org/project/databao-context-engine/) and can be installed with uv, pip, or another package manager.

### Using uv

1. Install Databao Context Engine:

   ```bash
   uv tool install databao-context-engine
   ```

1. Add it to your PATH:

   ```bash
   uv tool update-shell
   ```

1. Verify the installation:

   ```bash
   dce --help
   ```

### Using pip

1. Install Databao Context Engine:

   ```bash
   pip install databao-context-engine
   ```

1. Verify the installation:

   ```bash
   dce --help
   ```

## Supported data sources

* <img src="https://cdn.simpleicons.org/postgresql/316192" width="16" height="16" alt=""> PostgreSQL
* <img src="https://cdn.simpleicons.org/mysql/4479A1" width="16" height="16" alt=""> MySQL
* <img src="https://cdn.simpleicons.org/sqlite/003B57" width="16" height="16" alt=""> SQLite
* <img src="https://cdn.simpleicons.org/duckdb/FFF000" width="16" height="16" alt=""> DuckDB
* <img src="https://cdn.simpleicons.org/dbt/FF694B" width="16" height="16" alt=""> dbt projects
* 📄 Documents & spreadsheets *(coming soon)*

## Supported LLMs

| Provider      | Configuration                                |
|---------------|----------------------------------------------|
| **Ollama**    | `languageModel: OLLAMA`: runs locally, free  |
| **OpenAI**    | `languageModel: OPENAI`: requires an API key |
| **Anthropic** | `languageModel: CLAUDE`: requires an API key |
| **Google**    | `languageModel: GEMINI`: requires an API key |

## Quickstart

### 1. Create a project

1. Create a new directory for your project and navigate to it:

   ```bash
   mkdir dce-project && cd dce-project
   ```

1. Initialize a new project:

   ```bash
   dce init
   ```

### 2. Configure data sources

1. When prompted, agree to create a new data source.
   Alternatively, run the `dce datasource add` command.

1. Provide the data source type and its name.

1. Open the config file that was created for you in your editor and fill in the connection details.

1. Repeat these steps for all data sources you want to include in your project.

1. If you have data in Markdown or text files,
   you can add them to the `dce/src/files` directory.
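
The last step can also be scripted. The following is a minimal sketch that uses only the Python standard library and is not part of Databao Context Engine: `add_text_files` is a hypothetical helper, and it assumes that `dce/src/files` is resolved relative to your project directory.

```python
import shutil
from pathlib import Path

# Assumption: `dce/src/files` is located relative to the project directory.
FILES_SUBDIR = Path("dce") / "src" / "files"


def add_text_files(source_dir: Path, project_dir: Path) -> list[Path]:
    """Copy top-level .md and .txt files from source_dir into the project's files directory."""
    target_dir = project_dir / FILES_SUBDIR
    target_dir.mkdir(parents=True, exist_ok=True)

    copied = []
    for path in sorted(source_dir.iterdir()):
        # Skip subdirectories and any file that is not Markdown or plain text
        if path.is_file() and path.suffix.lower() in {".md", ".txt"}:
            copied.append(Path(shutil.copy2(path, target_dir / path.name)))
    return copied
```

For example, `add_text_files(Path("docs"), Path("dce-project"))` would copy the notes from a hypothetical `docs` directory into the project.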

### 3. Build context

1. To build the context, run the following command:

   ```bash
   dce build
   ```

### 4. Use context with your LLM

**Option A: Dynamic via MCP Server**

Databao Context Engine exposes the context through a local MCP Server, so your agent can access the latest context at runtime.

1. In **Claude Desktop**, **Cursor**, or another MCP-compatible agent, add the following configuration to the agent's MCP config file (`claude_desktop_config.json`, `mcp.json`, or similar).
   Replace `dce-project/` with the path to your project directory:

   ```json
   {
     "mcpServers": {
       "dce": {
         "command": "dce",
         "args": ["mcp", "--project-dir", "dce-project/"]
       }
     }
   }
   ```

1. Save the file and restart your agent.

1. Open a new chat, select the `dce` server in the chat window, and ask questions related to your project context.
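
If you prefer to script the configuration step, the following is a minimal sketch that uses only the Python standard library. `add_dce_server` is a hypothetical helper, not part of Databao Context Engine. It assumes that your agent launches `command` as a single executable, so `mcp` is passed as the first argument, and that the config file, if it exists, already contains valid JSON.

```python
import json
from pathlib import Path


def add_dce_server(config_path: Path, project_dir: Path) -> dict:
    """Add or update a `dce` entry under `mcpServers`, keeping other entries intact."""
    # Start from the existing config so other MCP servers are preserved
    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    servers = config.setdefault("mcpServers", {})
    servers["dce"] = {
        "command": "dce",
        # An absolute path avoids depending on the agent's working directory
        "args": ["mcp", "--project-dir", str(project_dir.resolve())],
    }
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

The location of the config file depends on the agent, so pass the path that your agent documents.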

**Option B: Static artifact**

Even if you don’t have Claude or Cursor installed on your local machine,
you can still use the context built by Databao Context Engine by pasting it directly into your chat with an AI assistant.

1. Navigate to `dce-project/output/` and open the directory with the latest run.

1. Attach the `all_results.yaml` file to your chat with the AI assistant or copy and paste its contents into your chat.
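
Locating the file can also be automated. The following is a minimal sketch, not part of Databao Context Engine: `latest_results_file` is a hypothetical helper, and it assumes that each run has its own subdirectory under `output/` and that the most recently modified subdirectory is the latest run.

```python
from pathlib import Path


def latest_results_file(project_dir: Path) -> Path:
    """Return the path to all_results.yaml in the most recently modified run directory."""
    output_dir = project_dir / "output"
    run_dirs = [p for p in output_dir.iterdir() if p.is_dir()]
    if not run_dirs:
        raise FileNotFoundError(f"No run directories found in {output_dir}")

    # Assumption: modification time identifies the latest run
    latest_run = max(run_dirs, key=lambda p: p.stat().st_mtime)
    results_file = latest_run / "all_results.yaml"
    if not results_file.is_file():
        raise FileNotFoundError(f"{results_file} does not exist")
    return results_file
```

You can then print the contents with `print(latest_results_file(Path("dce-project")).read_text())` and paste them into your chat.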

## API Usage

### 1. Create a project

```python
import tempfile
from pathlib import Path

from databao_context_engine import DatabaoContextProjectManager, init_dce_project

# Initialise a new project in an existing directory (a temporary one here)
project_manager = init_dce_project(Path(tempfile.mkdtemp()))

# Or use an existing project
project_manager = DatabaoContextProjectManager(project_dir=Path("path/to/project"))
```

### 2. Configure data sources

```python
from databao_context_engine import (
    DatasourceConnectionStatus,
    DatasourceType,
)

# Create a new datasource
postgres_datasource_id = project_manager.create_datasource_config(
    DatasourceType(full_type="postgres"),
    datasource_name="my_postgres_datasource",
    config_content={
        "connection": {"host": "localhost", "user": "dev", "password": "pass"}
    },
).datasource.id

# Check the connection to the datasource is valid
check_result = project_manager.check_datasource_connection()

assert len(check_result) == 1
assert check_result[0].datasource_id == postgres_datasource_id
assert check_result[0].connection_status == DatasourceConnectionStatus.VALID
```

### 3. Build context

```python
build_result = project_manager.build_context()

assert len(build_result) == 1
assert build_result[0].datasource_id == postgres_datasource_id
assert build_result[0].datasource_type == DatasourceType(full_type="postgres")
assert build_result[0].context_file_path.is_file()
```

### 4. Use the built contexts

#### Create a context engine

```python
from pathlib import Path

from databao_context_engine import DatabaoContextEngine

# Get the engine from a project_manager you are already using
context_engine = project_manager.get_engine_for_project()

# Or create a context engine directly from the path to your DCE project
context_engine = DatabaoContextEngine(project_dir=Path("path/to/project"))
```

#### Get all built contexts

```python
# Retrieve every context built for the project
all_built_contexts = context_engine.get_all_contexts()
assert len(all_built_contexts) == 1
assert all_built_contexts[0].datasource_id == postgres_datasource_id

print(all_built_contexts[0].context)
```

#### Search in built contexts

```python
# Run a vector similarity search
results = context_engine.search_context("my search query")

print(f"Found {len(results)} results for query")
print(
    "\n\n".join(
        [f"{str(result.datasource_id)}\n{result.context_result}" for result in results]
    )
)
```
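
When a project has several data sources, it can help to group the search results by data source before passing them to an LLM. The following is a minimal sketch: `group_by_datasource` is a hypothetical helper, not part of the library, and it relies only on the `datasource_id` and `context_result` attributes used in the example above.

```python
from collections import defaultdict
from typing import Any, Iterable


def group_by_datasource(results: Iterable[Any]) -> dict[str, list[Any]]:
    """Group search results by datasource ID, preserving the original result order."""
    grouped: dict[str, list[Any]] = defaultdict(list)
    for result in results:
        # Convert the ID to a string so it can be used as a plain dictionary key
        grouped[str(result.datasource_id)].append(result.context_result)
    return dict(grouped)
```

For example, `group_by_datasource(context_engine.search_context("my search query"))` returns one list of context results per data source.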

## Contributing

We’d love your help! Here’s how to get involved:

- ⭐ **Star this repo** — it helps others find us!
- 🐛 **Found a bug?** [Open an issue](https://github.com/JetBrains/databao-context-engine/issues)
- 💡 **Have an idea?** We’re all ears — create a feature request
- 👍 **Upvote issues** you care about — helps us prioritize
- 🔧 **Submit a PR**
- 📝 **Improve docs** — typos, examples, tutorials — everything helps!

New to open source? No worries! We're friendly and happy to help you get started. 🌱

For more details, see [CONTRIBUTING](CONTRIBUTING.md).

## 📄 License

Apache 2.0 with additional terms. See the [LICENSE](LICENSE.md) file for details.

---

<p align="center">
 <b>Like Databao Context Engine?</b> Give us a ⭐ — it means a lot!
</p>

<p align="center">
 <a href="https://databao.app">Website</a> •
 <a href="https://discord.gg/hEUqCcWdVh">Discord</a>
</p>
