Metadata-Version: 2.4
Name: giql
Version: 0.4rc0
Summary: Genomic Interval Query Language - SQL dialect for genomic range queries
Author-email: Nezar Abdennur <nabdennur@gmail.com>, Conrad Bzura <conradbzura@gmail.com>
Maintainer-email: Nezar Abdennur <nabdennur@gmail.com>, Conrad Bzura <conradbzura@gmail.com>
Requires-Python: >=3.11
Requires-Dist: sqlglot<30,>=20.0.0
Provides-Extra: mcp
Requires-Dist: fastmcp>=2.0.0; extra == 'mcp'
Description-Content-Type: text/markdown

# GIQL

<p align="center">Genomic Interval Query Language (GIQL)</p>
<p align="center"><i>/JEE-quel/</i></p>

<samp>
  <p align="center">
      <a href="https://giql.readthedocs.io/">docs</a> |
      <a href="https://giql.readthedocs.io/en/latest/dialect/index.html">syntax</a> |
      <a href="https://giql.readthedocs.io/en/latest/transpilation/index.html">transpiler</a>
  </p>
</samp>

GIQL is an extended SQL dialect that allows you to declaratively express genomic interval operations.

The `giql` Python package transpiles GIQL queries into standard SQL syntax for execution on any database or analytics engine.

> **Note:** This project is in active development — APIs, syntax, and behavior may change.

## Installation

To install the transpiler:

```bash
pip install giql
```

## Usage (transpilation)

Pass a GIQL query and the tables it references to `transpile` to get the equivalent standard SQL.

```python
from giql import transpile

sql = transpile(
    "SELECT * FROM peaks WHERE interval INTERSECTS 'chr1:1000-2000'",
    tables=["peaks"],
)
print(sql)
```

```sql
SELECT
  *
FROM peaks
WHERE
  (
    "chrom" = 'chr1' AND "start" < 2000 AND "end" > 1000
  )
```
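To make the generated predicate concrete, here is a minimal plain-Python sketch of the same comparison. The function name `overlaps` and its parameters are hypothetical and not part of the `giql` API; the sketch only mirrors the `"start" < 2000 AND "end" > 1000` condition in the output above.

```python
def overlaps(start: int, end: int, query_start: int, query_end: int) -> bool:
    """Mirror the transpiled predicate: start < query_end AND end > query_start."""
    return start < query_end and end > query_start


# Query range taken from the example above: chr1:1000-2000
print(overlaps(1500, 1800, 1000, 2000))  # interval inside the query range
print(overlaps(1900, 2500, 1000, 2000))  # interval straddling the query end
print(overlaps(2000, 2100, 1000, 2000))  # interval that only touches the query end
```

Because both comparisons are strict, an interval that merely touches the query range at an endpoint does not match. This is a reading of the predicate shown above, not a statement about GIQL's coordinate conventions.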

Each table referenced in a GIQL query exposes a genomic "pseudo-column" that maps to separate logical chromosome, start, end, and strand columns. You can customize the column mappings.

```python
from giql import Table, transpile

sql = transpile(
    "SELECT * FROM variants WHERE position INTERSECTS 'chr1:1000-2000'",
    tables=[
        Table(
            "variants",
            genomic_col="position",
            chrom_col="chromosome",
            start_col="start_pos",
            end_col="end_pos",
        )
    ],
)
print(sql)
```

The transpiled SQL can be executed with fast genome-unaware databases or in-memory analytic engines like DuckDB.

You can also use [oxbow](https://oxbow.readthedocs.io) to efficiently stream specialized genomics formats into DuckDB. 

```python
import duckdb
import oxbow as ox
from giql import transpile

conn = duckdb.connect()

# Load a streaming data source as a DuckDB relation
peaks = ox.from_bed("peaks.bed", bed_schema="bed6+4").to_duckdb(conn)

sql = transpile(
    "SELECT * FROM peaks WHERE interval INTERSECTS 'chr1:1000-2000'",
    tables=["peaks"],
)

# Execute and return the output as a dataframe
df = conn.execute(sql).fetchdf()
```
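Since the transpiled output is plain SQL, it should also run on engines other than DuckDB. The sketch below uses Python's built-in `sqlite3` module as a stand-in. The SQL string is copied from the transpiler output shown earlier rather than generated at runtime, and the table contents are made-up sample rows for illustration only.

```python
import sqlite3

# SQL copied verbatim from the transpiler output shown earlier.
sql = """
SELECT
  *
FROM peaks
WHERE
  (
    "chrom" = 'chr1' AND "start" < 2000 AND "end" > 1000
  )
"""

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE peaks ("chrom" TEXT, "start" INTEGER, "end" INTEGER)')

# Hypothetical sample intervals, not real data.
conn.executemany(
    "INSERT INTO peaks VALUES (?, ?, ?)",
    [
        ("chr1", 500, 900),    # entirely before the query range
        ("chr1", 1500, 1800),  # inside the query range
        ("chr1", 1900, 2500),  # straddles the end of the query range
        ("chr2", 1500, 1800),  # same coordinates, different chromosome
    ],
)

rows = conn.execute(sql).fetchall()
print(rows)
```

Only the two `chr1` intervals that overlap 1000-2000 are returned.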

## MCP Server

GIQL includes an [MCP](https://modelcontextprotocol.io/) server that gives LLM-powered tools access to operator references, syntax guides, and documentation. Install with the `mcp` extra:

```bash
pip install "giql[mcp]"
```

Or spawn a server directly with `uvx`:

```sh
uvx --from "giql[mcp]" giql-mcp
```

To add the GIQL MCP server to a specific project in Claude Code:

```sh
claude mcp add --scope project giql-mcp -- uvx --from "giql[mcp]" giql-mcp
```

See [`src/giql/mcp/README.md`](src/giql/mcp/README.md) for configuration and usage details.

## Development

```bash
git clone https://github.com/abdenlab/giql.git
cd giql
uv sync
```

To build the documentation locally:

```bash
uv run --group docs sphinx-build docs docs/_build
# The built docs will be in docs/_build/
```

To serve the docs locally with automatic rebuild:

```bash
uv run --group docs sphinx-autobuild docs docs/_build
```
