Metadata-Version: 2.4
Name: select-bq
Version: 0.3.0
Summary: Select BigQuery CLI - SELECT-only queries with allowlist and audit logging
Author: select-bq contributors
License-Expression: MIT
Project-URL: Homepage, https://github.com/sqlhabit/select-bq
Project-URL: Repository, https://github.com/sqlhabit/select-bq
Keywords: bigquery,cli,security,gcp,sql,allowlist,audit
Classifier: Development Status :: 4 - Beta
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Database
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: sqlglot>=24.0
Requires-Dist: pyyaml>=6.0
Provides-Extra: dev
Requires-Dist: build>=1.0; extra == "dev"
Requires-Dist: twine>=5.0; extra == "dev"
Dynamic: license-file

# select-bq

A **SELECT-only BigQuery CLI wrapper** for agentic use (e.g. with Cursor). Wraps the official `bq` CLI and enforces:

- **SELECT-only**: Queries are validated via SQL AST parsing—no DML, DDL, scripting, `EXECUTE IMMEDIATE`, or hidden CTEs that could modify data.
- **Allowlist**: Optional config restricts queries to allowed tables (string patterns `project.dataset.table`). Empty or missing allowlist = no restriction.
- **Audit logging**: All queries (including rejections) are logged to a YAML file with timestamps.

## Install

```bash
pip install select-bq
```

Requires the [Google Cloud SDK](https://cloud.google.com/sdk/docs/install) (`bq` CLI) and `gcloud auth login`.

## Setup

Create a config file `.select-bq.yaml` in your project root:

```yaml
# Where to log queries (default: select-bq-queries.yaml in current dir)
log_path: select-bq-queries.yaml

# Optional allowlist. Omit or leave empty to allow all tables.
# When present, only these patterns can be queried (three dot-separated parts).
allowlist:
  - my-gcp-project.analytics.events
  - my-gcp-project.analytics.*           # all tables in this dataset
  - my-gcp-project.*.*                 # entire project
  - my-gcp-project.analytics.onecrm-*  # glob on table name (fnmatch)
```

- **`log_path`** — Path for the query log (default: `select-bq-queries.yaml`). Use an absolute path to log outside the project.
- **`allowlist`** — List of strings `project.dataset.table`. Use `*` for a full segment wildcard; use `prefix*` / `*suffix` style globs in a segment when you need pattern matching. Omit to allow all tables.

To use a different config path: `select-bq query --config ./my-config.yaml "SELECT 1"`.

To use an external allowlist file:

```yaml
log_path: select-bq-queries.yaml
allowlist_path: allowlist.yaml
```

## Usage

```bash
# Run a SELECT query (same as bq query, but validated)
select-bq query "SELECT 1"
select-bq query "SELECT * FROM project.dataset.table LIMIT 10" --format=pretty

# Query from file
select-bq query -f query.sql

# Custom config
select-bq query --config ./my-config.yaml "SELECT * FROM my_table"

# Use Standard SQL (default) or legacy SQL
select-bq query --use_legacy_sql=false "SELECT 1"
select-bq query --use_legacy_sql=true "SELECT 1"

# All bq query flags are passed through (format, project_id, etc.)
select-bq query --format=pretty --project_id=my-project "SELECT 1"
```

## Query Log

Logged to `log_path` (default `select-bq-queries.yaml`):

```yaml
queries:
  - timestamp: "2025-03-09T12:00:00.000000+00:00"
    query: "SELECT 1"
    success: true
  - timestamp: "2025-03-09T12:01:00.000000+00:00"
    query: "INSERT INTO t VALUES (1)"
    success: false
    error: "Statement type 'Insert' is not allowed. Only SELECT queries are permitted."
```

## Security

- **AST parsing**: Uses [sqlglot](https://github.com/tobymao/sqlglot) with BigQuery dialect. Only `SELECT` statements are allowed; `INSERT`, `UPDATE`, `DELETE`, `CREATE`, `DROP`, `EXECUTE IMMEDIATE`, `DECLARE`, `SET`, and similar are rejected.
- **Allowlist**: When configured, only tables matching the allowlist patterns can be queried (e.g. `dataset.*` for all tables in a dataset).
- **No eval/exec**: Validation is purely structural—no dynamic execution of user input.

## Cursor Integration

Add to your project's Cursor rules or AGENTS.md:

```markdown
Use `select-bq query "SELECT ..."` when querying BigQuery. Do not use raw `bq` for queries.
```

## Publishing

To build and publish to PyPI:

```bash
pip install build twine
python -m build
twine upload dist/*
```
