# surql - Code-First Database Toolkit for SurrealDB

# Library Overview

- surql is a code-first database toolkit for building modern applications with SurrealDB
- Provides type-safe schema definitions, code-first migrations, and async-first database operations
- Built on functional composition patterns with immutable data structures
- Integrates Pydantic for runtime validation and type safety
- Supports SurrealDB's graph features including edge relationships and graph traversal
- Supports vector similarity search through HNSW and MTREE indexes for AI/ML embeddings
- Includes query result caching with Redis and in-memory backends
- Provides real-time streaming and live query support via WebSocket
- Supports multiple named database connections with connection registry
- **Advanced features**: Query optimization hints, multi-database orchestration, schema versioning and rollback
- **Requirements**: Python 3.12+ and SurrealDB 1.0+
- **Version**: 1.1.0 (Production/Stable)
- **License**: Apache 2.0

# Installation

## Using pip
- Install via pip: `pip install surql`
- Verify installation: `surql --version`

## Using uv (Recommended)
- Install via uv: `uv add surql`
- Install with dev dependencies: `uv add surql --dev`
- Install with cache backend: `uv add "surql[cache]"` (includes Redis support)
- Sync dependencies: `uv sync`

## From Source
- Clone repository: `git clone https://github.com/Oneiriq/surql-py.git`
- Navigate to directory: `cd surql-py`
- Install with uv: `uv sync`
- Install in editable mode: `pip install -e .`

## From Source with pyproject.toml (Currently Recommended)
- Update the other project's `pyproject.toml` to include surql as a dependency with:
    ```toml
    # Other project's pyproject.toml
    [project]
    dependencies = [
        #...
        "surql @ git+https://github.com/Oneiriq/surql-py.git@v0.1.0",
    ]
    ```
- Then install with uv: `uv sync`

## SurrealDB Installation (If required)
- Download from https://surrealdb.com/
- Run locally: `surreal start --log trace --user root --pass root`
- Run with Docker: `docker run --rm -p 8000:8000 surrealdb/surrealdb:latest start --log trace --user root --pass root`
- Default endpoint: `ws://localhost:8000/rpc`

# Environment Setup

## Environment Variables
- Use `DB_` prefix for all database configuration variables
- `DB_URL`: Database connection URL (default: `ws://localhost:8000/rpc`)
- `DB_NAMESPACE`: Database namespace (default: `development`)
- `DB_DATABASE`: Database name (default: `main`)
- `DB_USERNAME`: Authentication username (optional, default: `None`)
- `DB_PASSWORD`: Authentication password (optional, default: `None`)
- `DB_TIMEOUT`: Connection timeout in seconds (default: `30.0`, min: `1.0`)
- `DB_MAX_CONNECTIONS`: Maximum concurrent connections (default: `10`, range: `1-100`)
- `DB_RETRY_MAX_ATTEMPTS`: Maximum retry attempts (default: `3`, range: `1-10`)
- `DB_RETRY_MIN_WAIT`: Minimum wait between retries in seconds (default: `1.0`, min: `0.1`)
- `DB_RETRY_MAX_WAIT`: Maximum wait between retries in seconds (default: `10.0`, min: `1.0`)
- `DB_RETRY_MULTIPLIER`: Exponential backoff multiplier (default: `2.0`, min: `1.0`)

## Named Connection Environment Variables
- Use `SURQL_{NAME}_` prefix for named connections (also supports legacy `DB_{NAME}_` prefix)
- Example for "analytics" connection:
  * `SURQL_ANALYTICS_URL`: URL for analytics database
  * `SURQL_ANALYTICS_NAMESPACE`: Namespace for analytics
  * `SURQL_ANALYTICS_DATABASE`: Database name for analytics
  * `SURQL_ANALYTICS_USERNAME`: Username for analytics
  * `SURQL_ANALYTICS_PASSWORD`: Password for analytics
- Load via: `NamedConnectionConfig.from_env("analytics")`

## .env File
- Create `.env` file in project root for local development
- Load automatically via `pydantic-settings`
- Example: `DB_URL=ws://localhost:8000/rpc`, `DB_NAMESPACE=myapp`, `DB_DATABASE=mydb`
- Create environment-specific files: `.env.development`, `.env.staging`, `.env.production`
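
A minimal `.env` for local development might look like this (all values are illustrative):

```
# .env — local development settings
DB_URL=ws://localhost:8000/rpc
DB_NAMESPACE=myapp
DB_DATABASE=mydb
DB_USERNAME=root
DB_PASSWORD=root
DB_TIMEOUT=30.0
```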

## Connection Configuration
- Import: `from surql.connection.config import ConnectionConfig`
- Create programmatically: `config = ConnectionConfig(url='ws://localhost:8000/rpc', namespace='test', database='test', username='root', password='root')`
- Load from environment: `config = ConnectionConfig()` (automatically loads from env vars)
- Validates URL format (must start with `ws://`, `wss://`, `http://`, or `https://`)
- Validates namespace and database identifiers (alphanumeric with underscores/hyphens)
- All settings accessible as attributes: `config.url`, `config.namespace`, `config.database`

# Schema Definition

## Field Types Overview
- Import field types: `from surql.schema.fields import FieldType`
- Available types: `STRING`, `INT`, `FLOAT`, `BOOL`, `DATETIME`, `DURATION`, `DECIMAL`, `NUMBER`, `OBJECT`, `ARRAY`, `RECORD`, `GEOMETRY`, `ANY`
- All field functions in: `surql.schema.fields`

## Common Field Parameters
- `name`: Field name (required) - supports dot notation for nested fields (e.g., `'address.city'`)
- `assertion`: SurrealQL assertion to validate field value (optional)
- `default`: SurrealQL expression for default value (optional)
- `readonly`: If `True`, field cannot be modified after creation (optional, default: `False`)
- `permissions`: Dictionary of permission rules with keys `select`, `create`, `update`, `delete` (optional)

## String Fields
- Import: `from surql.schema.fields import string_field`
- Create: `string_field('name', assertion='string::len($value) > 0')`
- Email validation: `string_field('email', assertion='string::is::email($value)')`
- Length validation: `string_field('username', assertion='string::len($value) >= 3 AND string::len($value) <= 20')`
- Pattern matching: `string_field('code', assertion='string::matches($value, "^[A-Z]{2}[0-9]{4}$")')`
- Default value: `string_field('status', default='"active"')`
- Readonly: `string_field('id_number', readonly=True)`

## Integer Fields
- Import: `from surql.schema.fields import int_field`
- Create: `int_field('age', assertion='$value >= 0 AND $value <= 150')`
- Range validation: `int_field('score', assertion='$value >= 0 AND $value <= 100')`
- Default value: `int_field('attempts', default='0')`
- Positive only: `int_field('quantity', assertion='$value > 0')`

## Float Fields
- Import: `from surql.schema.fields import float_field`
- Create: `float_field('price', assertion='$value >= 0.0')`
- Precision: `float_field('latitude', assertion='$value >= -90.0 AND $value <= 90.0')`
- Default value: `float_field('rating', default='0.0')`

## Boolean Fields
- Import: `from surql.schema.fields import bool_field`
- Create: `bool_field('is_active', default='true')`
- Default to false: `bool_field('is_deleted', default='false')`
- Readonly: `bool_field('is_verified', readonly=True)`

## Datetime Fields
- Import: `from surql.schema.fields import datetime_field`
- Create: `datetime_field('created_at', default='time::now()', readonly=True)`
- Auto timestamp: `datetime_field('updated_at', default='time::now()')`
- Future date: `datetime_field('expires_at', assertion='$value > time::now()')`
- Past date: `datetime_field('birth_date', assertion='$value < time::now()')`

## Record Fields (Foreign Keys)
- Import: `from surql.schema.fields import record_field`
- Create: `record_field('author', table='user')`
- Any table: `record_field('related_item')`
- With custom assertion: `record_field('owner', table='user', assertion='$value.id != NONE')`
- The `table` parameter automatically generates assertion: `$value.table = "table_name"`

## Array Fields
- Import: `from surql.schema.fields import array_field`
- Create: `array_field('tags', default='[]')`
- Length validation: `array_field('items', assertion='array::len($value) > 0')`
- Type validation: `array_field('scores', assertion='array::all($value, |$v| type::is::number($v))')`
- Default empty: `array_field('permissions', default='[]')`

## Object Fields
- Import: `from surql.schema.fields import object_field`
- Create: `object_field('metadata', flexible=True)`
- Strict schema: `object_field('settings', flexible=False)`
- Default value: `object_field('config', default='{}')`
- Nested validation: `object_field('address', assertion='$value.city != NONE AND $value.country != NONE')`

## Computed Fields
- Import: `from surql.schema.fields import computed_field`
- Create: `computed_field('full_name', 'string::concat(first_name, " ", last_name)', FieldType.STRING)`
- Count calculation: `computed_field('post_count', 'count(<-authored<-post)', FieldType.INT)`
- Date calculation: `computed_field('age', 'time::year(time::now()) - time::year(birth_date)', FieldType.INT)`
- Always readonly (automatically set)
- Returns computed value on query

## Nested Fields
- Use dot notation in field name: `string_field('address.street')`, `string_field('address.city')`, `string_field('address.country')`
- Parent must be object type: `object_field('address', flexible=False)` with nested fields
- Access: Query returns nested structure automatically

## Generic Field Definition
- Import: `from surql.schema.fields import field, FieldType`
- Create: `field('custom_field', FieldType.STRING, assertion='...', default='...', readonly=False)`
- All parameters: `field(name, field_type, assertion=None, default=None, value=None, permissions=None, readonly=False, flexible=False)`

## Table Schemas

### Basic Table Schema
- Import: `from surql.schema.table import table_schema, TableMode`
- Create: `table_schema('user', mode=TableMode.SCHEMAFULL, fields=[...], indexes=[...], events=[...])`
- Schemaless mode: `table_schema('logs', mode=TableMode.SCHEMALESS)`
- Drop table: `table_schema('old_table', drop=True)`

### Table Modes
- Import: `from surql.schema.table import TableMode`
- `TableMode.SCHEMAFULL`: Enforces strict schema validation (recommended)
- `TableMode.SCHEMALESS`: Allows any fields
- `TableMode.DROP`: Marks table for deletion in migration

### Table Permissions
- Set on table: `table_schema('user', permissions={'select': '$auth.id = id', 'update': '$auth.id = id', 'delete': '$auth.admin = true'})`
- Permission keys: `select`, `create`, `update`, `delete`
- Uses SurrealQL expressions with `$auth` context
- Example public read: `{'select': 'true', 'create': '$auth != NONE'}`

### Functional Table Composition
- Import: `from surql.schema.table import with_fields, with_indexes, with_events, with_permissions, set_mode`
- Add fields: `with_fields(table, string_field('email'), int_field('age'))`
- Add indexes: `with_indexes(table, unique_index('email_idx', ['email']))`
- Add events: `with_events(table, event('audit', '$event = "CREATE"', '...'))`
- Set permissions: `with_permissions(table, {'select': 'true'})`
- Change mode: `set_mode(table, TableMode.SCHEMALESS)`
- Chain operations: `with_permissions(with_indexes(with_fields(table, ...), ...), ...)`

## Indexes

### Index Types
- Import: `from surql.schema.table import IndexType`
- `IndexType.STANDARD`: Regular index for fast lookups
- `IndexType.UNIQUE`: Enforces uniqueness constraint
- `IndexType.SEARCH`: Full-text search index

### Creating Indexes
- Import: `from surql.schema.table import index, unique_index, search_index`
- Generic index: `index('name_idx', ['name'], IndexType.STANDARD)`
- Unique index: `unique_index('email_idx', ['email'])`
- Search index: `search_index('content_search', ['title', 'body'])`
- Multi-column: `unique_index('user_email_idx', ['user_id', 'email'])`
- Nested field: `index('city_idx', ['address.city'])`

## Events (Triggers)

### Event Definition
- Import: `from surql.schema.table import event`
- Create: `event('audit_trail', '$event = "UPDATE"', 'CREATE audit_log SET ...')`
- On insert: `event('new_user', '$event = "CREATE"', 'CREATE notification SET ...')`
- On update: `event('email_changed', '$before.email != $after.email', 'CREATE email_verification SET ...')`
- On delete: `event('cleanup', '$event = "DELETE"', 'DELETE related_data WHERE owner = $before.id')`
- Variables available: `$event`, `$before`, `$after`, `$value`

## Edge Schemas (Graph Relationships)

### Edge Modes
- Import: `from surql.schema.edge import EdgeMode`
- `EdgeMode.RELATION`: Creates `TYPE RELATION` (default, recommended for graph relationships)
- `EdgeMode.SCHEMAFULL`: Creates SCHEMAFULL edge table
- `EdgeMode.SCHEMALESS`: Creates SCHEMALESS edge table

### Basic Edge Schema
- Import: `from surql.schema.edge import edge_schema`
- Create: `edge_schema('likes', from_table='user', to_table='post')`
- With mode: `edge_schema('likes', from_table='user', to_table='post', mode=EdgeMode.RELATION)`
- Unconstrained: `edge_schema('generic_relation')`
- With fields: `edge_schema('follows', from_table='user', to_table='user', fields=[datetime_field('since', default='time::now()')])`

### Convenience Edge Functions
- Import: `from surql.schema.edge import bidirectional_edge, typed_edge`
- Self-referential: `bidirectional_edge('follows', 'user', fields=[datetime_field('since')])`
- Typed edge: `typed_edge('authored', from_table='user', to_table='post', fields=[datetime_field('published_at')])`

### Edge Composition
- Import: `from surql.schema.edge import with_from_table, with_to_table, with_edge_fields, with_edge_indexes, with_edge_events, with_edge_permissions`
- Set source: `with_from_table(edge, 'user')`
- Set target: `with_to_table(edge, 'post')`
- Add fields: `with_edge_fields(edge, datetime_field('created_at'), int_field('weight'))`
- Add indexes: `with_edge_indexes(edge, index('created_idx', ['created_at']))`
- Add events: `with_edge_events(edge, event('on_create', '$event = "CREATE"', '...'))`
- Set permissions: `with_edge_permissions(edge, {'create': '$auth.id = in', 'delete': '$auth.id = in'})`

# Query Operations

## CRUD Operations

### Return Formats
- Import: `from surql.query.crud import ReturnFormat`
- `ReturnFormat.NONE`: Returns nothing (default for delete operations)
- `ReturnFormat.DIFF`: Returns only changed fields
- `ReturnFormat.FULL`: Returns complete record after operation (default for create/update)
- `ReturnFormat.BEFORE`: Returns record state before operation
- `ReturnFormat.AFTER`: Returns record state after operation
- Usage: `await create_record('user', data, return_format=ReturnFormat.AFTER)`

### Create Records
- Import: `from surql.query.crud import create_record, create_records`
- Single record: `created = await create_record('user', User(name='Alice', email='alice@example.com'))`
- With dict: `created = await create_record('user', {'name': 'Bob', 'email': 'bob@example.com'})`
- With return format: `created = await create_record('user', data, return_format=ReturnFormat.AFTER)`
- Multiple records: `created_list = await create_records('user', [User(name='Alice'), User(name='Bob')])`
- Returns created record(s) with ID

### Read Records
- Import: `from surql.query.crud import get_record, query_records, first, last`
- Get by ID: `user = await get_record('user', 'alice', User)`
- With RecordID: `user = await get_record('user', RecordID('user', 'alice'), User)`
- Query with filters: `users = await query_records('user', User, conditions=['age > 18', 'status = "active"'], order_by=('created_at', 'DESC'), limit=10)`
- First match: `user = await first('user', User, 'age > 18', ('created_at', 'DESC'))`
- Last match: `user = await last('user', User, 'age > 18', ('created_at', 'ASC'))`
- Returns `None` if not found

### Update Records
- Import: `from surql.query.crud import update_record, merge_record`
- Full update: `updated = await update_record('user', 'alice', User(name='Alice Updated', email='alice@new.com'))`
- Partial update: `updated = await merge_record('user', 'alice', {'status': 'active'})`
- With return format: `updated = await update_record('user', 'alice', data, return_format=ReturnFormat.DIFF)`
- Update replaces entire record, merge updates only specified fields
- Returns updated record

### Delete Records
- Import: `from surql.query.crud import delete_record, delete_records`
- Single delete: `await delete_record('user', 'alice')`
- Bulk delete: `await delete_records('user', 'status = "inactive"')`
- With operator: `from surql.types.operators import eq; await delete_records('user', eq('status', 'inactive'))`
- Returns nothing (void)

### Check Existence
- Import: `from surql.query.crud import exists`
- Check if exists: `if await exists('user', 'alice'): print('User exists')`
- Returns boolean

### Count Records
- Import: `from surql.query.crud import count_records`
- Total count: `total = await count_records('user')`
- With condition: `active = await count_records('user', 'status = "active"')`
- Returns integer

## Query Result Extraction

### Extract Results from Response
- Import: `from surql.query.results import extract_result`
- From list response: `records = extract_result([{'id': 'user:1', 'name': 'Alice'}])`
- From dict response: `record = extract_result({'result': [{'id': 'user:1'}]})`
- From nested: `records = extract_result({'ok': True, 'result': [{'data': [...]}]})`
- Handles various SurrealDB response formats automatically

### Extract Single Record
- Import: `from surql.query.results import extract_one`
- Without model: `record = extract_one(response)`
- With model: `user = extract_one(response, User)`
- Returns single record or None if empty
- Validates with Pydantic model if provided

### Extract Scalar Value
- Import: `from surql.query.results import extract_scalar`
- Get count: `count = extract_scalar(response)`
- Get single value: `name = extract_scalar(response)`
- Returns first value from first record
- Useful for aggregations and single-value queries

### Check for Results
- Import: `from surql.query.results import has_results`
- Check if not empty: `if has_results(response): print('Has data')`
- Returns boolean
- Works with all response formats

## Batch Operations

### Batch Upsert
- Import: `from surql.query.batch import upsert_many`
- Insert/update multiple: `results = await upsert_many('user', [User(...), User(...)])`
- With dicts: `results = await upsert_many('user', [{'name': 'Alice'}, {'name': 'Bob'}])`
- Conflict resolution: Uses UPSERT to handle existing records
- Returns list of upserted records

### Batch Relate (Create Edges)
- Import: `from surql.query.batch import relate_many`
- Create multiple edges: `edges = await relate_many('likes', [('user:alice', 'post:1'), ('user:bob', 'post:2')])`
- With properties: `edges = await relate_many('follows', [('user:alice', 'user:bob')], {'since': '2024-01-01'})`
- Returns list of created edge records

### Batch Insert
- Import: `from surql.query.batch import insert_many`
- Insert multiple: `results = await insert_many('user', [User(...), User(...)])`
- Fails on conflict: unlike upsert, INSERT errors if a record with the same ID already exists
- Returns list of inserted records

### Batch Delete
- Import: `from surql.query.batch import delete_many`
- Delete by IDs: `await delete_many('user', ['alice', 'bob', 'charlie'])`
- With RecordID: `await delete_many('user', [RecordID('user', 'alice'), ...])`
- Returns deleted records (RETURN BEFORE)

## Expression Builders

### Field References
- Import: `from surql.query.expressions import field`
- Reference field: `field('name')` → `name`
- Nested field: `field('address.city')` → `address.city`
- In queries: `Query().select([field('name'), field('email')])`

### Literal Values
- Import: `from surql.query.expressions import value`
- String value: `value('hello')` → `"hello"`
- Number value: `value(42)` → `42`
- Boolean value: `value(True)` → `true`

### Aggregate Functions
- Import: `from surql.query.expressions import count, sum_, avg, min_, max_`
- Count all: `count()` → `count()`
- Count field: `count('id')` → `count(id)`
- Sum: `sum_('price')` → `sum(price)`
- Average: `avg('rating')` → `avg(rating)`
- Minimum: `min_('age')` → `min(age)`
- Maximum: `max_('score')` → `max(score)`

### String Functions
- Import: `from surql.query.expressions import concat, string_`
- Concatenate: `concat(field('first'), value(' '), field('last'))` → `string::concat(first, " ", last)`
- String operation: `string_('upper', field('name'))` → `string::upper(name)`

### Array Functions
- Import: `from surql.query.expressions import array`
- Create array: `array([value(1), value(2), value(3)])` → `[1, 2, 3]`
- With fields: `array([field('id'), field('name')])` → `[id, name]`

### Utility Functions
- Import: `from surql.query.expressions import time_now, rand_uuid`
- Current time: `time_now()` → `time::now()`
- Random UUID: `rand_uuid()` → `rand::uuid()`
- In defaults: `datetime_field('created_at', default=time_now())`

## Query Builder

### Basic Query Building
- Import: `from surql.query.builder import Query`
- SELECT all: `query = Query().select().from_table('user')`
- SELECT fields: `query = Query().select(['name', 'email']).from_table('user')`
- Build to SurrealQL: `sql = query.to_surql()`

### Filtering
- String condition: `query = Query().select().from_table('user').where('age > 18')`
- Multiple conditions: `query.where('age > 18').where('status = "active"')`
- Combined with AND: All conditions are ANDed together
- Import operators: `from surql.types.operators import eq, gt, lt, gte, lte, neq, contains, in_list`
- Operator usage: `query.where(gt('age', 18)).where(eq('status', 'active'))`

### Ordering
- Single order: `query.order_by('created_at', 'DESC')`
- Multiple orders: `query.order_by('status', 'ASC').order_by('created_at', 'DESC')`
- Ascending: `query.order_by('name', 'ASC')` or `query.order_by('name')`
- Descending: `query.order_by('name', 'DESC')`

### Pagination
- Limit: `query.limit(10)`
- Offset: `query.offset(20)`
- Combined: `query.limit(10).offset(20)` (skip 20, take 10)
- Page calculation: `page = 3; page_size = 10; query.limit(page_size).offset((page - 1) * page_size)`
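
The page calculation above is plain offset arithmetic; a tiny standalone helper (not part of surql) makes the convention explicit:

```python
def page_window(page: int, page_size: int) -> tuple[int, int]:
    """Translate a 1-based page number into (limit, offset) for the query builder."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be >= 1")
    return page_size, (page - 1) * page_size


# Page 3 with 10 records per page skips the first 20 records
limit, offset = page_window(3, 10)
# use as: query.limit(limit).offset(offset)
```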

### Grouping
- Single field: `query.select(['status', 'COUNT(*)']).from_table('user').group_by('status')`
- Multiple fields: `query.group_by('status', 'role')`

### Vector Similarity Search
- Import: `from surql.query.builder import Query, VectorDistanceType`
- Basic search: `query = Query().from_table('documents').vector_search('embedding', [0.1, 0.2, ...], k=10)`
- With distance type: `query.vector_search('embedding', vector, k=5, distance=VectorDistanceType.COSINE)`
- Distance types: `COSINE`, `EUCLIDEAN`, `MANHATTAN`, `HAMMING`, `MINKOWSKI`, `CHEBYSHEV`, `PEARSON`, `JACCARD`, `DOT`
- Helper function: `from surql.query.builder import vector_search_query; query = vector_search_query('documents', 'embedding', vector, k=10)`
- Generates: `SELECT * FROM documents WHERE embedding <|k,DISTANCE|> [vector]`

### INSERT Operations
- Import: `from surql.query.builder import insert`
- Create query: `query = Query().insert('user', {'name': 'Alice', 'email': 'alice@example.com'})`
- Functional: `query = insert('user', data_dict)`
- Execute: `result = await client.execute(query.to_surql())`

### UPDATE Operations
- Import: `from surql.query.builder import update`
- Update record: `query = Query().update('user:alice', {'status': 'active'})`
- Update with condition: `query = Query().update('user', {'status': 'inactive'}).where('last_login < "2024-01-01"')`
- Functional: `query = update('user:alice', {'status': 'active'})`

### DELETE Operations
- Import: `from surql.query.builder import delete`
- Delete record: `query = Query().delete('user:alice')`
- Delete with condition: `query = Query().delete('user').where('deleted_at IS NOT NULL')`
- Functional: `query = delete('user:alice')`

### Query Execution
- Import: `from surql.query.executor import fetch_all, fetch_one`
- Execute with model: `results = await fetch_all(query, User, client)`
- Fetch single: `result = await fetch_one(query, User, client)`
- Raw execution: `result = await client.execute(query.to_surql())`

### Functional Query Helpers
- Import: `from surql.query.builder import select, from_table, where, order_by, limit, offset`
- Compose: `query = limit(order_by(where(from_table(select(['name']), 'user'), 'age > 18'), 'name'), 10)`
- Method chaining (above) is usually more readable than deeply nested functional composition

## Query Operators

### Comparison Operators
- Import: `from surql.types.operators import eq, neq, gt, gte, lt, lte`
- Equal: `eq('status', 'active')` → `status = "active"`
- Not equal: `neq('status', 'deleted')` → `status != "deleted"`
- Greater than: `gt('age', 18)` → `age > 18`
- Greater or equal: `gte('score', 100)` → `score >= 100`
- Less than: `lt('price', 50.0)` → `price < 50.0`
- Less or equal: `lte('quantity', 10)` → `quantity <= 10`

### Collection Operators
- Import: `from surql.types.operators import contains, in_list, not_in`
- Contains: `contains('tags', 'python')` → `"python" IN tags`
- In list: `in_list('status', ['active', 'pending'])` → `status IN ["active", "pending"]`
- Not in: `not_in('role', ['admin', 'moderator'])`

### Logical Operators
- Import: `from surql.types.operators import and_op, or_op, not_op`
- AND: `and_op(gt('age', 18), eq('status', 'active'))`
- OR: `or_op(eq('role', 'admin'), eq('role', 'moderator'))`
- NOT: `not_op(eq('deleted', True))`
- Combine: `and_op(gt('age', 18), or_op(eq('status', 'active'), eq('status', 'pending')))`

### Using Operators in Queries
- With query builder: `query.where(and_op(gt('age', 18), eq('status', 'active')))`
- With CRUD: `await query_records('user', User, conditions=[gt('age', 18), eq('status', 'active')])`
- Multiple conditions are ANDed: `conditions=[cond1, cond2]` → `WHERE (cond1) AND (cond2)`

# Graph Operations

## Graph Query Builder

### GraphQuery Class
- Import: `from surql.query.graph import GraphQuery`
- Create: `query = GraphQuery[User]('user:alice')`
- Generic type support: `GraphQuery[T]` where T is your Pydantic model

### Traversal Methods
- Outgoing: `query.out('likes', 'post')` → Follow outgoing 'likes' edges to 'post' table
- Incoming: `query.in_('follows', 'user')` → Follow incoming 'follows' edges from 'user' table
- Bidirectional: `query.both('friends', 'user')` → Follow edges in both directions
- Edge target: `query.to('user')` → Specify target table type
- Chaining: `query.out('follows', 'user').out('likes', 'post')`

### Filtering and Modifiers
- Filter results: `query.where('age > 18')`
- Select fields: `query.select(['name', 'email'])`
- Limit results: `query.limit(10)`
- Set depth: `query.depth(3)` → Maximum traversal depth

### Execution
- Build query: `sql = query.build()` → Returns SurrealQL string
- Fetch results: `results = await query.fetch(client)` → Returns list of T
- Count results: `count = await query.count(client)` → Returns integer
- Check existence: `exists = await query.exists(client)` → Returns boolean

### Example Usage
```python
from surql.query.graph import GraphQuery
from models import User, Post

# Find all posts liked by users that alice follows
posts = await (
    GraphQuery[User]('user:alice')
    .out('follows', 'user')
    .out('likes', 'post')
    .where('published = true')
    .limit(20)
    .fetch(client)
)
```

## Graph Helper Functions

### Find Mutual Connections
- Import: `from surql.query.graph import find_mutual_connections`
- Find mutual: `mutual = await find_mutual_connections('user:alice', 'user:bob', 'follows', User)`
- Returns list of users that both alice and bob follow
- Bidirectional detection

### Find Shortest Path
- Import: `from surql.query.graph import find_shortest_path`
- Find path: `path = await find_shortest_path('user:alice', 'user:charlie', 'follows', max_depth=5)`
- Returns list of record IDs in the path from source to target
- Path reconstruction included

### Get Neighbors
- Import: `from surql.query.graph import get_neighbors`
- Get N-hop neighbors: `neighbors = await get_neighbors('user:alice', 'follows', User, depth=2)`
- Direction: `neighbors = await get_neighbors('user:alice', 'follows', User, depth=1, direction='out')`
- Returns list of neighbors at specified depth

### Compute Degree
- Import: `from surql.query.graph import compute_degree`
- Total degree: `degree = await compute_degree('user:alice', 'follows')`
- In-degree: `in_deg = await compute_degree('user:alice', 'follows', direction='in')`
- Out-degree: `out_deg = await compute_degree('user:alice', 'follows', direction='out')`
- Returns integer count

## Graph Traversal

### Basic Traversal
- Import: `from surql.query.graph import traverse`
- Outgoing: `posts = await traverse('user:alice', '->likes->post', Post)`
- Incoming: `followers = await traverse('user:alice', '<-follows<-user', User)`
- Multi-hop: `fof = await traverse('user:alice', '<-follows<-user<-follows<-user', User)`
- Pattern: `start_record` + traversal_path + target_type

### Traversal with Depth
- Import: `from surql.query.graph import traverse_with_depth`
- Fixed depth: `posts = await traverse_with_depth('user:alice', 'likes', 'post', 'out', 1, Post)`
- Direction options: `'out'` (outgoing), `'in'` (incoming), `'both'` (bidirectional)
- Unlimited depth: `connected = await traverse_with_depth('user:alice', 'follows', 'user', 'both', None, User)`
- Returns list of target records

### Get Related Records
- Import: `from surql.query.graph import get_related_records`
- Outgoing relations: `posts = await get_related_records('user:alice', 'likes', 'post', 'out', Post)`
- Incoming relations: `likers = await get_related_records('post:123', 'likes', 'user', 'in', User)`
- Without model: `records = await get_related_records('user:alice', 'likes', 'post', 'out')`

### Count Related
- Import: `from surql.query.graph import count_related`
- Outgoing count: `like_count = await count_related('user:alice', 'likes', 'out')`
- Incoming count: `follower_count = await count_related('user:alice', 'follows', 'in')`
- Returns integer

## Edge Operations

### Create Edges
- Import: `from surql.query.graph import relate`
- Simple edge: `edge = await relate('likes', 'user:alice', 'post:123')`
- With properties: `edge = await relate('follows', 'user:alice', 'user:bob', {'since': '2024-01-01', 'weight': 1})`
- Returns created edge record with ID

### Delete Edges
- Import: `from surql.query.graph import unrelate`
- Remove edge: `await unrelate('likes', 'user:alice', 'post:123')`
- Specific direction only: Deletes the exact `from->edge->to` relationship
- Returns nothing (void)

### Get Edges
- Import: `from surql.query.graph import get_outgoing_edges, get_incoming_edges`
- Outgoing: `likes = await get_outgoing_edges('user:alice', 'likes')`
- Incoming: `liked_by = await get_incoming_edges('post:123', 'likes')`
- With model: `likes = await get_outgoing_edges('user:alice', 'likes', LikeEdge)`
- Returns list of edge records (not target records)

### Query Builder Graph Support
- Import: `from surql.query.builder import Query`
- RELATE: `query = Query().relate('likes', 'user:alice', 'post:123', {'created_at': 'time::now()'})`
- Traverse: `query = Query().select().from_table('user:alice').traverse('->likes->post')`
- Execute: `result = await client.execute(query.to_surql())`

# Vector Operations

## HNSW Vector Indexes (Recommended)

### Overview
- HNSW (Hierarchical Navigable Small World) is the recommended vector index type for SurrealDB 2.x+
- Successor to MTREE indexes with better performance characteristics
- Supports 8 distance metrics and 5 vector data types
- Optional EFC and M tuning parameters for quality/speed tradeoff
- Ideal for AI/ML applications with embeddings (OpenAI, sentence transformers, etc.)

### Distance Metrics
- Import: `from surql.schema.table import HnswDistanceType`
- `HnswDistanceType.CHEBYSHEV`: Maximum-dimension deviation
- `HnswDistanceType.COSINE`: Cosine similarity (ideal for normalized vectors like embeddings)
- `HnswDistanceType.EUCLIDEAN`: Euclidean distance (L2 norm, default)
- `HnswDistanceType.HAMMING`: Hamming distance (binary/categorical vectors)
- `HnswDistanceType.JACCARD`: Set similarity
- `HnswDistanceType.MANHATTAN`: Manhattan distance (L1 norm)
- `HnswDistanceType.MINKOWSKI`: Minkowski distance
- `HnswDistanceType.PEARSON`: Correlation-based similarity

### Creating HNSW Indexes
- Import: `from surql.schema.table import hnsw_index, HnswDistanceType, MTreeVectorType`
- Basic: `hnsw_index('embedding_idx', 'embedding', 1536)`
- With distance metric: `hnsw_index('embedding_idx', 'embedding', 1536, distance=HnswDistanceType.COSINE)`
- With vector type: `hnsw_index('embedding_idx', 'embedding', 1536, distance=HnswDistanceType.COSINE, vector_type=MTreeVectorType.F32)`
- With tuning: `hnsw_index('embedding_idx', 'embedding', 1536, distance=HnswDistanceType.COSINE, efc=500, m=16)`
- Parameters:
  * `name`: Index name (required)
  * `column`: Column containing vector data (required)
  * `dimension`: Number of dimensions in the vector (required)
  * `distance`: Distance metric (optional, default: EUCLIDEAN)
  * `vector_type`: Vector component data type (optional, default: F64)
  * `efc`: Exploration factor during construction (optional, SurrealDB default: 150)
  * `m`: Max bidirectional links per node (optional, SurrealDB default: 12)

### HNSW Example
```python
from surql.schema import (
  table_schema,
  with_fields,
  with_indexes,
  hnsw_index,
  array_field,
  string_field,
  HnswDistanceType,
  MTreeVectorType,
)

documents = table_schema('documents')
documents = with_fields(
  documents,
  string_field('title'),
  string_field('content'),
  array_field('embedding'),
)
documents = with_indexes(
  documents,
  hnsw_index(
    'embedding_idx',
    'embedding',
    1536,
    distance=HnswDistanceType.COSINE,
    vector_type=MTreeVectorType.F32,
    efc=500,
    m=16,
  ),
)
```

### Generated SQL
- HNSW indexes generate SurrealDB DEFINE INDEX statements
- Example: `DEFINE INDEX embedding_idx ON TABLE documents COLUMNS embedding HNSW DIMENSION 1536 DIST COSINE TYPE F32 EFC 500 M 16;`
- Migrations automatically handle creation and removal
- Use `surql migrate up` to apply HNSW indexes to database

## MTREE Vector Indexes (Legacy)

### Overview
- MTREE indexes are the legacy vector index type; prefer HNSW for new projects
- MTREE indexes enable efficient nearest-neighbor search for embeddings and vectors
- Supports multiple distance metrics and vector data types
- Ideal for AI/ML applications with embeddings (OpenAI, sentence transformers, etc.)

### Distance Metrics
- Import: `from surql.schema.table import MTreeDistanceType`
- `MTreeDistanceType.COSINE`: Cosine similarity (ideal for normalized vectors like embeddings)
- `MTreeDistanceType.EUCLIDEAN`: Euclidean distance (L2 norm, default)
- `MTreeDistanceType.MANHATTAN`: Manhattan distance (L1 norm)
- `MTreeDistanceType.MINKOWSKI`: Minkowski distance

### Vector Data Types
- Import: `from surql.schema.table import MTreeVectorType`
- `MTreeVectorType.F64`: 64-bit floating point (default, highest precision)
- `MTreeVectorType.F32`: 32-bit floating point (recommended for embeddings)
- `MTreeVectorType.I64`: 64-bit integer
- `MTreeVectorType.I32`: 32-bit integer
- `MTreeVectorType.I16`: 16-bit integer

### Creating MTREE Indexes
- Import: `from surql.schema.table import mtree_index, MTreeDistanceType, MTreeVectorType`
- Basic: `mtree_index('embedding_idx', 'embedding', 1536)`
- With distance metric: `mtree_index('embedding_idx', 'embedding', 1536, distance=MTreeDistanceType.COSINE)`
- With vector type: `mtree_index('embedding_idx', 'embedding', 1536, distance=MTreeDistanceType.COSINE, vector_type=MTreeVectorType.F32)`
- Parameters:
  * `name`: Index name (required)
  * `column`: Column containing vector data (required)
  * `dimension`: Number of dimensions in the vector (required)
  * `distance`: Distance metric (optional, default: EUCLIDEAN)
  * `vector_type`: Vector component data type (optional, default: F64)

### OpenAI Embeddings Example
```python
from surql.schema import (
  table_schema,
  with_fields,
  with_indexes,
  mtree_index,
  array_field,
  string_field,
  MTreeDistanceType,
  MTreeVectorType,
)

documents = table_schema('documents')
documents = with_fields(
  documents,
  string_field('title'),
  string_field('content'),
  array_field('embedding'),  # Store embedding vector as array
)
documents = with_indexes(
  documents,
  mtree_index(
    'embedding_idx',
    'embedding',
    1536,  # OpenAI embedding dimension
    distance=MTreeDistanceType.COSINE,
    vector_type=MTreeVectorType.F32,
  ),
)
```

### Custom Vector Index Example
```python
# Image feature vectors with Euclidean distance
images = table_schema('images')
images = with_fields(
  images,
  string_field('filename'),
  array_field('features'),
)
images = with_indexes(
  images,
  mtree_index(
    'feature_idx',
    'features',
    512,  # Custom vector dimension
    distance=MTreeDistanceType.EUCLIDEAN,
    vector_type=MTreeVectorType.F32,
  ),
)
```

### Multiple Vector Indexes
```python
# Table with multiple vector fields
products = table_schema('products')
products = with_fields(
  products,
  string_field('name'),
  string_field('description'),
  array_field('text_embedding'),
  array_field('image_embedding'),
)
products = with_indexes(
  products,
  mtree_index(
    'text_idx',
    'text_embedding',
    1536,
    distance=MTreeDistanceType.COSINE,
    vector_type=MTreeVectorType.F32,
  ),
  mtree_index(
    'image_idx',
    'image_embedding',
    512,
    distance=MTreeDistanceType.EUCLIDEAN,
    vector_type=MTreeVectorType.F32,
  ),
)
```

### Generated SQL
- MTREE indexes generate SurrealDB DEFINE INDEX statements
- Example: `DEFINE INDEX embedding_idx ON TABLE documents FIELDS embedding MTREE DIMENSION 1536 DIST COSINE TYPE F32;`
- Migrations automatically handle creation and removal
- Use `surql migrate up` to apply MTREE indexes to database

### Vector Similarity Search
- Use query builder's `vector_search()` method: `Query().from_table('documents').vector_search('embedding', vector, k=10, distance=VectorDistanceType.COSINE)`
- Generates SurrealQL: `SELECT * FROM documents WHERE embedding <|10,COSINE|> [0.1, 0.2, ...]`
- Search returns k nearest neighbors based on configured distance metric
- Works with any HNSW or MTREE indexed field
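To make the semantics concrete, here is a brute-force sketch in plain Python of what the `<|k,METRIC|>` operator computes — the HNSW/MTREE index exists precisely to avoid this linear scan. The `knn` helper and sample rows are illustrative, not part of surql:

```python
import math

def knn(query: list[float], rows: list[dict], field: str, k: int) -> list[dict]:
    """Brute-force k-nearest-neighbor by Euclidean distance —
    the computation a vector index accelerates."""
    def dist(row: dict) -> float:
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(query, row[field])))
    return sorted(rows, key=dist)[:k]

docs = [
    {'id': 'documents:a', 'embedding': [0.0, 0.0]},
    {'id': 'documents:b', 'embedding': [1.0, 1.0]},
    {'id': 'documents:c', 'embedding': [5.0, 5.0]},
]
nearest = knn([0.9, 1.1], docs, 'embedding', k=2)
print([d['id'] for d in nearest])  # ['documents:b', 'documents:a']
```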

### Choosing Distance Metrics
- **COSINE**: Use for normalized vectors (embeddings from ML models like OpenAI, Cohere, sentence transformers)
- **EUCLIDEAN**: Use for general-purpose vectors, image features, unnormalized data
- **MANHATTAN**: Use for high-dimensional sparse vectors, grid-like data
- **MINKOWSKI**: Use for specialized distance calculations

### Choosing Vector Types
- **F32**: Recommended for ML embeddings (balances precision and memory)
- **F64**: Use when maximum precision is required
- **I16/I32/I64**: Use for quantized or integer-based vectors

### Performance Considerations
- HNSW and MTREE indexes improve query performance for similarity search
- Index creation time increases with vector dimension and dataset size
- F32 uses less memory than F64 while maintaining good precision for embeddings
- COSINE distance is optimized for normalized vectors

### Migration Example
```python
"""Migration: Add vector search to documents"""
from surql.schema import (
  table_schema,
  with_fields,
  with_indexes,
  mtree_index,
  array_field,
  string_field,
  MTreeDistanceType,
  MTreeVectorType,
)

def upgrade():
  """Add MTREE index for embeddings."""
  return [
    table_schema(
      'documents',
      fields=[
        string_field('title'),
        string_field('content'),
        array_field('embedding'),
      ],
      indexes=[
        mtree_index(
          'embedding_idx',
          'embedding',
          1536,
          distance=MTreeDistanceType.COSINE,
          vector_type=MTreeVectorType.F32,
        ),
      ],
    )
  ]

def downgrade():
  """Remove MTREE index."""
  return [
    table_schema(
      'documents',
      fields=[
        string_field('title'),
        string_field('content'),
        array_field('embedding'),
      ],
      indexes=[],  # Remove all indexes
    )
  ]
```

# Query Optimization Hints

## Overview
- Query hints guide SurrealDB's query planner to optimize query execution
- Rendered as SQL comments in SurrealQL for compatibility
- Five hint types: IndexHint, ParallelHint, TimeoutHint, FetchHint, ExplainHint
- Applied via fluent API methods on Query builder
- Later hints override earlier hints of the same type

## Hint Types

### IndexHint
- Import: `from surql.query.hints import IndexHint`
- Force/suggest specific index usage
- Create: `IndexHint(table='user', index='email_idx', force=False)`
- Query methods: `query.use_index('email_idx')` or `query.force_index('email_idx')`
- Output: `/* USE INDEX user.email_idx */` or `/* FORCE INDEX user.email_idx */`
- Use when: You know optimal index, query planner chooses suboptimal index

### ParallelHint
- Import: `from surql.query.hints import ParallelHint`
- Enable/configure parallel query execution
- Create: `ParallelHint(enabled=True, max_workers=4)`
- Query method: `query.parallel()` or `query.parallel(max_workers=8)`
- Output: `/* PARALLEL ON */`, `/* PARALLEL 8 */`, or `/* PARALLEL OFF */`
- Use when: Processing large datasets, complex aggregations
- Sweet spot: 4-8 workers typically optimal

### TimeoutHint
- Import: `from surql.query.hints import TimeoutHint`
- Set custom query timeout
- Create: `TimeoutHint(seconds=30.0)`
- Query method: `query.with_timeout(30.0)`
- Output: `/* TIMEOUT 30.0s */`
- Use when: Prevent long-running queries, set stricter timeouts for user queries
- Common values: Interactive (5-10s), Reports (30-60s), Batch (300+s)

### FetchHint
- Import: `from surql.query.hints import FetchHint`
- Control how records are fetched from database
- Strategies: `eager` (load all), `lazy` (on demand), `batch` (in chunks)
- Create: `FetchHint(strategy='batch', batch_size=1000)`
- Query method: `query.with_fetch('batch', batch_size=1000)`
- Output: `/* FETCH BATCH 1000 */`, `/* FETCH EAGER */`, `/* FETCH LAZY */`
- Eager: Best for small result sets (<1000 rows), high memory, low latency
- Batch: Best for medium/large datasets, medium memory/latency
- Lazy: Best for very large datasets, low memory, high latency

### ExplainHint
- Import: `from surql.query.hints import ExplainHint`
- Request query execution plan for performance analysis
- Create: `ExplainHint()` or `ExplainHint(full=True)`
- Query method: `query.explain()` or `query.explain(full=True)`
- Output: `/* EXPLAIN */` or `/* EXPLAIN FULL */`
- Use when: Debugging slow queries, understanding index usage, verifying optimizations

## Using Hints with Query Builder

### Single Hint
```python
from surql.query import Query

query = (
  Query()
    .select(['name', 'email'])
    .from_table('user')
    .where('age >= 18')
    .use_index('age_idx')
)
# Generates: /* USE INDEX user.age_idx */ SELECT name, email FROM user WHERE age >= 18
```

### Multiple Hints
```python
query = (
  Query()
    .select()
    .from_table('large_table')
    .where('status = "active"')
    .use_index('status_idx')
    .with_timeout(30.0)
    .parallel(max_workers=4)
    .with_fetch('batch', batch_size=1000)
)
# Generates all hints as comments before query
```

### with_hints() Method
```python
from surql.query.hints import IndexHint, TimeoutHint, ParallelHint

query = (
  Query()
    .select()
    .from_table('analytics_data')
    .where('date >= $start_date')
    .with_hints(
      IndexHint(table='analytics_data', index='date_idx'),
      TimeoutHint(seconds=60.0),
      ParallelHint(enabled=True, max_workers=6),
    )
)
```

## Hint Validation and Merging

### validate_hint()
- Import: `from surql.query.hints import validate_hint`
- Validate hint applicable to query context
- Usage: `errors = validate_hint(hint, table='user')`
- Returns: List of validation error messages (empty if valid)
- Checks: Index hint table matches query table

### merge_hints()
- Import: `from surql.query.hints import merge_hints`
- Merge multiple hints, resolving conflicts
- Usage: `merged = merge_hints([hint1, hint2, hint3])`
- Behavior: Later hints override earlier hints of same type
- Returns: Merged list with duplicates resolved

### render_hints()
- Import: `from surql.query.hints import render_hints`
- Render hints to SurrealQL comment string
- Usage: `sql = render_hints([TimeoutHint(seconds=30), ParallelHint(enabled=True)])`
- Returns: Space-separated hint comments

## Best Practices

### Profile Before Optimizing
- Always measure performance before adding hints
- Use `explain()` to understand current query behavior
- Only add hints if performance is unacceptable

### Test Hint Combinations
- Different combinations have varying effects
- Test worker counts: 2, 4, 8 with different batch sizes
- Measure and compare durations

### Conservative Timeouts
- Set timeouts that allow legitimate queries to complete
- Too strict may fail valid queries
- Be generous for batch operations

### Document Why Hints Are Used
- Add comments explaining optimization choices
- Example: "This query scans millions of records, so we force date index, enable parallel, use batch fetch"

### Hint Override Behavior
- Later hints override earlier hints of same type
- Final query uses last specified timeout, not first

## Usage Examples

### User Search Optimization
```python
async def search_users_optimized(search_term: str):
  query = (
    Query()
      .select(['id', 'username', 'email'])
      .from_table('user')
      .where('username CONTAINS $search OR email CONTAINS $search')
      .force_index('username_email_idx')
      .with_timeout(5.0)
      .with_fetch('batch', batch_size=100)
      .limit(100)
  )
  async with get_client(config) as client:
    return await query.execute(client, {'search': search_term})
```

### Large Dataset Processing
```python
async def process_event_logs(start_date: str):
  query = (
    Query()
      .select(['event_type', 'count() as total'])
      .from_table('events')
      .where('created_at >= $start_date')
      .group_by(['event_type'])
      .use_index('created_at_idx')
      .parallel(max_workers=8)
      .with_fetch('batch', batch_size=10000)
      .with_timeout(600.0)
  )
  async with get_client(config) as client:
    return await query.execute(client, {'start_date': start_date})
```

### Query Performance Analysis
```python
async def analyze_slow_query():
  # Step 1: Run with EXPLAIN
  analysis = (
    Query()
      .select()
      .from_table('orders')
      .where('total > 1000 AND status = "pending"')
      .explain(full=True)
  )
  
  async with get_client(config) as client:
    plan = await analysis.execute(client)
    
    # Step 2: Apply optimizations based on plan
    optimized = (
      Query()
        .select()
        .from_table('orders')
        .where('total > 1000 AND status = "pending"')
        .use_index('total_status_idx')
        .with_timeout(15.0)
        .parallel(max_workers=4)
    )
    
    result = await optimized.execute(client)
    return result
```

# Query Result Caching

## Cache Overview
- surql provides a flexible caching system for query results
- Supports multiple backends: in-memory LRU cache and Redis
- Automatic cache invalidation by table name
- TTL (time-to-live) support with customizable expiration
- Cache statistics and monitoring

## Cache Backends

### Memory Cache (Default)
- Import: `from surql.cache.backends import MemoryCache`
- In-memory LRU cache using `cachetools`
- Fast, no external dependencies
- Limited to single process
- Auto-eviction when max_size reached
- Example: `cache = MemoryCache(max_size=1000)`
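The auto-eviction behavior can be illustrated with a minimal LRU sketch — a hypothetical `TinyLRU` built on `OrderedDict`, not the `MemoryCache` internals:

```python
from collections import OrderedDict

class TinyLRU:
    """Minimal LRU cache: get/set refresh recency; set evicts the
    least-recently-used entry once max_size is exceeded."""
    def __init__(self, max_size: int):
        self.max_size = max_size
        self._data: OrderedDict = OrderedDict()

    def get(self, key):
        if key not in self._data:
            return None
        self._data.move_to_end(key)  # mark as most recently used
        return self._data[key]

    def set(self, key, value) -> None:
        self._data[key] = value
        self._data.move_to_end(key)
        if len(self._data) > self.max_size:
            self._data.popitem(last=False)  # evict least recently used

cache = TinyLRU(max_size=2)
cache.set('a', 1)
cache.set('b', 2)
cache.get('a')         # refresh 'a'
cache.set('c', 3)      # evicts 'b', the least recently used
print(cache.get('b'))  # None
print(cache.get('a'))  # 1
```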

### Redis Cache (Optional)
- Import: `from surql.cache.backends import RedisCache`
- Requires: `pip install "surql[cache]"` or `pip install redis>=5.0.0`
- Distributed caching across processes/servers
- Persistent storage option
- Better for production deployments
- Example: `cache = RedisCache(redis_url='redis://localhost:6379/0')`

## Cache Configuration

### CacheConfig
- Import: `from surql.cache.config import CacheConfig, CacheBackend`
- Backend selection: `CacheConfig(backend=CacheBackend.MEMORY)` or `CacheBackend.REDIS`
- TTL setting: `CacheConfig(default_ttl=300)` (seconds)
- Max size: `CacheConfig(max_size=1000)` (memory backend only)
- Redis URL: `CacheConfig(backend=CacheBackend.REDIS, redis_url='redis://localhost:6379/0')`
- Key prefix: `CacheConfig(key_prefix='myapp')` (for namespacing)

### Configure Global Cache
- Import: `from surql.cache import configure_cache`
- Set up: `configure_cache(CacheConfig(backend=CacheBackend.MEMORY, default_ttl=600))`
- Call once at application startup
- All cached queries use this configuration

## Cache Decorator

### @cache_query Decorator
- Import: `from surql.cache import cache_query`
- Basic usage: `@cache_query(ttl=60) async def get_users(): ...`
- Custom key: `@cache_query(key='user_list', ttl=300) async def get_users(): ...`
- No TTL: `@cache_query() async def get_static_data(): ...` (uses default)
- Parameter-based key: Key automatically includes function args/kwargs
- Works with async functions only

### Decorator Options
- `ttl`: Time-to-live in seconds (optional, uses default_ttl if not specified)
- `key`: Custom cache key (optional, auto-generated from function name and args)
- `table`: Table name for invalidation tracking (optional)
- Example: `@cache_query(ttl=120, key='active_users', table='user')`

## Manual Cache Management

### CacheManager
- Import: `from surql.cache import get_cache_manager`
- Get instance: `manager = get_cache_manager()`
- Get or set: `value = await manager.get_or_set('key', async_fn, ttl=300)`
- Set value: `await manager.set('key', value, ttl=300, table='user')`
- Get value: `value = await manager.get('key')`
- Delete key: `await manager.delete('key')`

### Cache Invalidation

#### By Key
- Import: `from surql.cache import invalidate`
- Invalidate key: `await invalidate('user_list')`
- Pattern matching: `await invalidate('user_*')` (all keys starting with 'user_')
- Explicit: `await get_cache_manager().delete('specific_key')`

#### By Table
- Invalidate all cached queries for a table: `await get_cache_manager().invalidate_table('user')`
- Automatically clears all cache entries associated with the table
- Useful after bulk updates or schema changes

#### Clear All
- Import: `from surql.cache import clear_cache`
- Clear everything: `await clear_cache()`
- Reset all cache entries
- Use with caution in production

## Cache Statistics

### CacheStats
- Import: `from surql.cache.config import CacheStats`
- Get stats: `stats = await get_cache_manager().get_stats()`
- Access metrics:
  * `stats.hits`: Number of cache hits
  * `stats.misses`: Number of cache misses
  * `stats.size`: Current cache size (number of entries)
  * `stats.evictions`: Number of evicted entries
  * `stats.hit_ratio`: Calculate hit/(hit+miss) ratio
- Monitor cache effectiveness

## Usage Examples

### Basic Setup
```python
from surql.cache import configure_cache, cache_query
from surql.cache.config import CacheConfig, CacheBackend

# Configure at startup
configure_cache(CacheConfig(
    backend=CacheBackend.MEMORY,
    default_ttl=300,
    max_size=1000,
))

# Cache a query function
@cache_query(ttl=60)
async def get_active_users():
    return await query_records('user', User, conditions=['is_active = true'])
```

### Redis Backend
```python
configure_cache(CacheConfig(
    backend=CacheBackend.REDIS,
    redis_url='redis://localhost:6379/0',
    key_prefix='myapp',
    default_ttl=600,
))
```

### Manual Caching
```python
from surql.cache import get_cache_manager

manager = get_cache_manager()

# Cache manually
async def get_user_stats(user_id: str):
    key = f'user_stats:{user_id}'
    
    # Try cache first
    cached = await manager.get(key)
    if cached is not None:  # explicit None check: falsy cached values still count as hits
        return cached
    
    # Fetch and cache
    stats = await compute_stats(user_id)
    await manager.set(key, stats, ttl=300, table='user')
    return stats
```

### Table-Based Invalidation
```python
# After updating users, invalidate all user-related caches
await update_record('user', 'alice', updated_data)
await get_cache_manager().invalidate_table('user')
```

### Performance Monitoring
```python
stats = await get_cache_manager().get_stats()
print(f"Cache hit ratio: {stats.hit_ratio:.2%}")
print(f"Total hits: {stats.hits}, misses: {stats.misses}")
```

# Real-Time Streaming

## Streaming Overview
- surql supports SurrealDB's LIVE queries for real-time updates
- Receive notifications when records are created, updated, or deleted
- WebSocket-based streaming (requires `ws://` or `wss://` connection)
- Both callback-based and async iteration patterns supported

## StreamingManager

### Creating Streaming Manager
- Import: `from surql.connection.streaming import StreamingManager`
- Create: `manager = StreamingManager(client)`
- Pass DatabaseClient instance
- Manager handles query lifecycle

### Create Live Query
- Import: `from surql.connection.streaming import LiveQuery`
- Create: `live_query = await manager.live('user', callback=on_change)`
- With conditions: `live_query = await manager.live('user', diff=True, callback=on_change)`
- Returns LiveQuery object with query UUID

### Stop Live Query
- Stop streaming: `await manager.kill(live_query.query_id)`
- Cleanup: `await manager.cleanup()`
- Automatically called on manager disposal

## LiveQuery

### Callback Pattern
```python
from surql.connection.streaming import StreamingNotification

async def on_user_change(notification: StreamingNotification):
    print(f"Action: {notification.action}")
    print(f"Record: {notification.result}")

live_query = await manager.live('user', callback=on_user_change)
# Keep application running to receive notifications
```

### Async Iterator Pattern
```python
live_query = await manager.live('user')

async for notification in live_query:
    if notification.action == 'CREATE':
        print(f"New user: {notification.result}")
    elif notification.action == 'UPDATE':
        print(f"Updated user: {notification.result}")
    elif notification.action == 'DELETE':
        print(f"Deleted user: {notification.result}")
```

### Diff Mode
- Enable: `live_query = await manager.live('user', diff=True)`
- Returns JSON Patch operations instead of full records
- Useful for bandwidth optimization
- Notification.result contains RFC 6902 JSON Patch array
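A minimal sketch of consuming a diff-mode payload — `apply_patch` is a hypothetical helper that handles only flat, top-level RFC 6902 operations (real patches may target nested paths):

```python
def apply_patch(record: dict, patch: list[dict]) -> dict:
    """Apply a flat RFC 6902 patch (replace/add/remove on top-level keys)."""
    result = dict(record)
    for op in patch:
        key = op['path'].lstrip('/')
        if op['op'] in ('replace', 'add'):
            result[key] = op['value']
        elif op['op'] == 'remove':
            result.pop(key, None)
    return result

# Shape of a diff-mode notification payload for an UPDATE:
user = {'name': 'Alice', 'status': 'offline'}
patch = [{'op': 'replace', 'path': '/status', 'value': 'online'}]
print(apply_patch(user, patch))  # {'name': 'Alice', 'status': 'online'}
```

Since only the changed fields travel over the WebSocket, this is considerably cheaper than shipping full records for large documents.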

## StreamingNotification

### Notification Properties
- `action`: Event type (`'CREATE'`, `'UPDATE'`, `'DELETE'`)
- `result`: Record data or diff (depends on diff mode)
- `query_id`: UUID of the live query (optional)

### Processing Notifications
```python
async def process_notification(notif: StreamingNotification):
    match notif.action:
        case 'CREATE':
            await handle_create(notif.result)
        case 'UPDATE':
            await handle_update(notif.result)
        case 'DELETE':
            await handle_delete(notif.result)
```

## Usage Examples

### Monitor Table Changes
```python
import asyncio

from surql.connection.streaming import StreamingManager

async with get_client(config) as client:
    manager = StreamingManager(client)
    
    async def on_change(notification):
        print(f"{notification.action}: {notification.result}")
    
    live_query = await manager.live('user', callback=on_change)
    
    # Application continues running
    await asyncio.sleep(3600)  # Run for 1 hour
    
    await manager.kill(live_query.query_id)
```

### Real-Time Dashboard
```python
from collections import defaultdict

stats = defaultdict(int)

async def update_stats(notification):
    stats[notification.action] += 1
    print(f"Creates: {stats['CREATE']}, Updates: {stats['UPDATE']}, Deletes: {stats['DELETE']}")

live_query = await manager.live('user', callback=update_stats)
```

### Conditional Streaming
```python
# Only stream active users (requires SurrealDB 2.0+)
live_query = await manager.live(
    'user',
    callback=on_change,
    # Note: Condition support may vary by SurrealDB version
)
```

# Authentication

## Authentication Overview
- surql supports all SurrealDB authentication types
- Root, Namespace, Database, and Scope authentication
- JWT token management with automatic refresh
- Credential validation and type safety

## AuthType Enum
- Import: `from surql.connection.auth import AuthType`
- `AuthType.ROOT`: Root-level access (highest privileges)
- `AuthType.NAMESPACE`: Namespace-level access
- `AuthType.DATABASE`: Database-level access
- `AuthType.SCOPE`: Scope-based authentication (custom user scopes)

## Credential Types

### RootCredentials
- Import: `from surql.connection.auth import RootCredentials`
- Create: `creds = RootCredentials(username='root', password='root')`
- Full system access
- Use for administrative operations

### NamespaceCredentials
- Import: `from surql.connection.auth import NamespaceCredentials`
- Create: `creds = NamespaceCredentials(namespace='myapp', username='admin', password='secret')`
- Access to specific namespace
- Cannot access other namespaces

### DatabaseCredentials
- Import: `from surql.connection.auth import DatabaseCredentials`
- Create: `creds = DatabaseCredentials(namespace='myapp', database='prod', username='user', password='pass')`
- Access to specific database within namespace
- Most common for application auth

### ScopeCredentials
- Import: `from surql.connection.auth import ScopeCredentials`
- Create: `creds = ScopeCredentials(namespace='myapp', database='prod', scope='user', params={'email': 'user@example.com', 'pass': 'password'})`
- Custom authentication scopes
- Flexible parameter passing
- Define scopes in SurrealDB schema

## AuthManager

### Creating AuthManager
- Import: `from surql.connection.auth import AuthManager`
- Create: `auth = AuthManager(client)`
- Pass DatabaseClient instance
- Manages authentication state

### Sign In
- Root: `token = await auth.signin(RootCredentials(username='root', password='root'))`
- Namespace: `token = await auth.signin(NamespaceCredentials(...))`
- Database: `token = await auth.signin(DatabaseCredentials(...))`
- Scope: `token = await auth.signin(ScopeCredentials(...))`
- Returns JWT token string

### Sign Up (Scope Only)
- Create user: `token = await auth.signup(ScopeCredentials(namespace='myapp', database='prod', scope='user', params={'email': '...', 'pass': '...'}))`
- Only works with scope authentication
- Creates new user in scope
- Returns JWT token

### Authenticate with Token
- Use token: `await auth.authenticate(token)`
- Validates and applies JWT token
- Use for session restoration
- Token must be valid and not expired

### Invalidate Session
- Sign out: `await auth.invalidate()`
- Clears authentication state
- Revokes current token
- Client becomes unauthenticated

### Check Authentication
- Get token: `token = auth.token`
- Check if authenticated: `if auth.token: print('Authenticated')`
- Token property returns current JWT or None

## Usage Examples

### Database Authentication
```python
from surql.connection.auth import AuthManager, DatabaseCredentials

async with get_client(config) as client:
    auth = AuthManager(client)
    
    creds = DatabaseCredentials(
        namespace='myapp',
        database='prod',
        username='app_user',
        password='secure_password'
    )
    
    token = await auth.signin(creds)
    print(f"Authenticated with token: {token[:20]}...")
    
    # Perform operations as authenticated user
    users = await query_records('user', User)
    
    # Sign out
    await auth.invalidate()
```

### Scope-Based User Registration
```python
from surql.connection.auth import ScopeCredentials

# Sign up new user
creds = ScopeCredentials(
    namespace='myapp',
    database='prod',
    scope='user',
    params={
        'email': 'newuser@example.com',
        'password': 'secure123',
        'name': 'New User'
    }
)

token = await auth.signup(creds)

# User is now authenticated
```

### Token Storage and Restoration
```python
# Save token
token = await auth.signin(creds)
save_to_storage(token)  # Your storage mechanism

# Later, restore session
saved_token = load_from_storage()
await auth.authenticate(saved_token)
```

# Connection Management

## Named Connections and Registry

### ConnectionRegistry
- Import: `from surql.connection.registry import ConnectionRegistry`
- Singleton pattern: `registry = ConnectionRegistry()`
- Manages multiple named database connections
- Thread-safe with async locks

### Register Named Connection
- Register: `await registry.register('analytics', analytics_config)`
- Get connection: `client = registry.get('analytics')`
- Set default: `await registry.set_default('analytics')`
- List all: `names = registry.list_connections()`
- Unregister: `await registry.unregister('analytics')`

### NamedConnectionConfig
- Import: `from surql.connection.config import NamedConnectionConfig`
- From environment: `config = NamedConnectionConfig.from_env('analytics')`
- Loads from `SURQL_ANALYTICS_*` or `DB_ANALYTICS_*` env vars
- Supports all ConnectionConfig options

### Multiple Connection Example
```python
from surql.connection.registry import ConnectionRegistry
from surql.connection.config import NamedConnectionConfig

# Load connections from environment
primary_config = NamedConnectionConfig.from_env('primary')
analytics_config = NamedConnectionConfig.from_env('analytics')
cache_config = NamedConnectionConfig.from_env('cache')

# Register connections
registry = ConnectionRegistry()
await registry.register('primary', primary_config)
await registry.register('analytics', analytics_config)
await registry.register('cache', cache_config)
await registry.set_default('primary')

# Use specific connection
analytics_client = registry.get('analytics')
stats = await analytics_client.execute('SELECT count() FROM events GROUP BY type')

# Default connection
default_client = registry.get()  # Returns 'primary'
```

### Named Connection Use Cases
- **Read Replicas**: Separate connections for read and write operations
- **Analytics Database**: Dedicated connection for analytics queries
- **Multi-Tenant**: Different connections per tenant
- **Microservices**: Service-specific database connections

## Connection Override Context

### connection_override Context Manager
- Import: `from surql.connection.context import connection_override`
- Override connection: `async with connection_override(test_client): result = await execute_query()`
- Temporary override for specific block
- Automatically restores previous connection
- Useful for testing and multi-database operations

### Override Example
```python
from surql.connection.context import connection_override, get_db

# Normal execution uses default client
result1 = await query_records('user', User)

# Override for specific operations
async with connection_override(analytics_client):
    # All queries in this block use analytics_client
    result2 = await query_records('events', Event)
    stats = await get_db().execute('SELECT count() FROM events')

# Back to default client
result3 = await query_records('user', User)
```

### Testing with Override
```python
import pytest
from surql.connection.context import connection_override

@pytest.mark.asyncio
async def test_query_logic():
    mock_client = create_mock_client()
    
    async with connection_override(mock_client):
        # Test code uses mock_client
        result = await my_query_function()
        assert result is not None
```

## Database Client

### Creating Client
- Import: `from surql.connection.client import DatabaseClient, get_client`
- From config: `client = DatabaseClient(config)`
- Context manager: `async with get_client(config) as client: ...`
- Manual: `client = DatabaseClient(config); await client.connect()`

### Connection Lifecycle
- Connect: `await client.connect()`
- Check status: `if client.is_connected: ...`
- Disconnect: `await client.disconnect()`
- Context manager handles automatically: `async with get_client(config) as client:`

### Client Operations
- Execute query: `result = await client.execute('SELECT * FROM user')`
- With parameters: `result = await client.execute('SELECT * FROM user WHERE age > $age', {'age': 18})`
- Select: `result = await client.select('user')`
- Create: `result = await client.create('user', data_dict)`
- Update: `result = await client.update('user:alice', data_dict)`
- Merge: `result = await client.merge('user:alice', partial_data)`
- Delete: `result = await client.delete('user:alice')`

## Connection Pooling
- Configured via `max_connections` in ConnectionConfig
- Default: 10 concurrent connections
- Automatic semaphore-based pooling
- Thread-safe operations
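
The pooling behavior described above can be sketched with a plain `asyncio.Semaphore`; the cap mirrors the documented `max_connections` default of 10 (all helper names here are illustrative, not part of surql's API):

```python
import asyncio

MAX_CONNECTIONS = 10  # mirrors the documented max_connections default

async def run_pooled(coros, limit=MAX_CONNECTIONS):
    """Run coroutines while never exceeding `limit` concurrent executions."""
    sem = asyncio.Semaphore(limit)
    active = 0
    peak = 0

    async def guarded(coro):
        nonlocal active, peak
        async with sem:  # acquire one pool slot; released on exit
            active += 1
            peak = max(peak, active)
            try:
                return await coro
            finally:
                active -= 1

    results = await asyncio.gather(*(guarded(c) for c in coros))
    return results, peak

async def fake_query(i):
    await asyncio.sleep(0.01)  # stand-in for a database round trip
    return i

# 25 queries queued, but at most 3 run at once
results, peak = asyncio.run(run_pooled([fake_query(i) for i in range(25)], limit=3))
```

Even with far more queued queries than slots, `peak` never exceeds the configured limit.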

## Retry Logic
- Automatic retry on connection failures
- Exponential backoff strategy
- Configure via: `retry_max_attempts`, `retry_min_wait`, `retry_max_wait`, `retry_multiplier`
- Default: 3 attempts with 1s-10s backoff, 2x multiplier
- Applies to connection and query operations
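
The documented defaults (3 attempts, 1s–10s waits, 2x multiplier) imply a capped exponential backoff schedule. A small sketch of how those settings typically map onto wait times (the exact schedule inside surql may differ):

```python
def backoff_schedule(max_attempts=3, min_wait=1.0, max_wait=10.0, multiplier=2.0):
    """Wait times between retry attempts: exponential growth, capped at max_wait.

    N attempts means N - 1 waits between them.
    """
    return [min(max_wait, min_wait * multiplier ** i) for i in range(max_attempts - 1)]

print(backoff_schedule())   # defaults: [1.0, 2.0]
print(backoff_schedule(6))  # the max_wait cap kicks in: [1.0, 2.0, 4.0, 8.0, 10.0]
```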

## Context Management

### Using get_db()
- Import: `from surql.connection.context import get_db`
- Get current client: `client = get_db()`
- Used internally by CRUD operations
- Requires client to be set in context first

### Setting Context Client
- Import: `from surql.connection.context import set_db`
- Set client: `set_db(client)`
- Use in application startup
- Scope: Per async context

## Error Handling
- Import: `from surql.connection.client import DatabaseError, ConnectionError, QueryError`
- `ConnectionError`: Connection failures
- `QueryError`: Query execution failures
- `DatabaseError`: Base exception for all database errors
- Catch and handle: `try: ... except QueryError as e: ...`
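
Because `DatabaseError` is the base class, specific exceptions must be caught before it. A minimal sketch mirroring the documented hierarchy (these are local stand-ins, not the actual surql classes):

```python
class DatabaseError(Exception):
    """Base exception for all database errors (sketch of the documented hierarchy)."""

class ConnectionError(DatabaseError):
    """Raised on connection failures."""

class QueryError(DatabaseError):
    """Raised on query execution failures."""

def classify(exc: DatabaseError) -> str:
    # Check the most specific types first; the base class catches everything else.
    if isinstance(exc, QueryError):
        return "query"
    if isinstance(exc, ConnectionError):
        return "connection"
    return "database"

print(classify(QueryError("bad SurrealQL")))  # query
```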

## Transactions
- Import: `from surql.connection.transaction import Transaction, transaction`
- Create transaction: `tx = Transaction(client)`
- Begin: `await tx.begin()`
- Commit: `await tx.commit()`
- Cancel/Rollback: `await tx.cancel()`
- Context manager: `async with transaction(client) as tx: ...`
- Auto-commit on success, auto-rollback on exception
- Supports nested operations within transaction scope
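
The auto-commit/auto-rollback behavior follows the standard async context manager pattern. A self-contained sketch with a fake client (the `FakeClient` and this `transaction` body are illustrative, not surql's implementation):

```python
import asyncio
import contextlib

class FakeClient:
    """Stand-in for DatabaseClient that records executed statements."""
    def __init__(self):
        self.log = []
    async def execute(self, sql):
        self.log.append(sql)

@contextlib.asynccontextmanager
async def transaction(client):
    """Commit on success, cancel on exception (sketch of the documented pattern)."""
    await client.execute("BEGIN TRANSACTION")
    try:
        yield client
        await client.execute("COMMIT TRANSACTION")
    except Exception:
        await client.execute("CANCEL TRANSACTION")
        raise

async def demo():
    client = FakeClient()
    async with transaction(client) as tx:
        await tx.execute("CREATE user SET name = 'alice'")
    return client.log

log = asyncio.run(demo())
print(log)
```

If the body raises, `CANCEL TRANSACTION` is issued instead of the commit and the exception propagates.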

# Schema Validation

## Overview
- surql provides schema validation to compare Python schemas against the database
- Detects schema drift between code and deployed database
- Validates tables, fields, indexes, edges, and constraints
- Generates detailed validation reports with severity levels
- Integrates with CI/CD pipelines via CLI

## ValidationSeverity
- Import: `from surql.schema.validator import ValidationSeverity`
- `ValidationSeverity.ERROR`: Critical issues requiring immediate action
- `ValidationSeverity.WARNING`: Potential problems or inconsistencies
- `ValidationSeverity.INFO`: Informational messages about schema state

## ValidationResult
- Import: `from surql.schema.validator import ValidationResult`
- Properties:
  * `severity`: ValidationSeverity enum value
  * `message`: Human-readable description
  * `table`: Table name (optional)
  * `field`: Field name (optional)
  * `details`: Additional context dictionary (optional)

## validate_schema()
- Import: `from surql.schema.validator import validate_schema`
- Usage: `results = await validate_schema(code_tables, client)`
- Parameters:
  * `code_tables`: Dict of table schemas from Python code
  * `client`: DatabaseClient instance
- Returns: List of ValidationResult objects
- Validates:
  * Missing tables in database
  * Extra tables not in code
  * Field mismatches (type, assertion, default, readonly)
  * Missing or extra fields
  * Index differences
  * Edge constraint validation

## format_validation_report()
- Import: `from surql.schema.validator import format_validation_report`
- Usage: `report = format_validation_report(results)`
- Generates human-readable text report
- Groups results by severity
- Color-coded output for terminals

## CLI Integration
- Command: `surql schema validate`
- Options:
  * `--schema PATH`: Path to schema module
  * `--format FORMAT`: Output format (table/json)
  * `--fail-on-error`: Exit with code 1 if errors found
- Exit codes:
  * 0: No validation issues
  * 1: Validation errors found
  * 2: Command execution error

## Usage Example
```python
from surql.schema.validator import validate_schema, format_validation_report
from surql.schema.registry import get_all_tables

# Get schemas from registry
code_tables = get_all_tables()

# Validate against database
results = await validate_schema(code_tables, client)

# Generate report
if results:
    report = format_validation_report(results)
    print(report)
    
    # Check for errors
    errors = [r for r in results if r.severity == ValidationSeverity.ERROR]
    if errors:
        raise ValueError(f"Schema validation failed with {len(errors)} errors")
```

# Schema Visualization

## Overview
- surql provides schema visualization to generate diagrams of database structure
- Supports multiple output formats: Mermaid, GraphViz, ASCII
- Visualizes tables, fields, indexes, edges, and relationships
- Useful for documentation and understanding complex schemas

## OutputFormat
- Import: `from surql.schema.visualize import OutputFormat`
- `OutputFormat.MERMAID`: Mermaid.js ER diagram syntax
- `OutputFormat.GRAPHVIZ`: GraphViz DOT format
- `OutputFormat.ASCII`: Terminal-friendly ASCII art

## visualize_schema()
- Import: `from surql.schema.visualize import visualize_schema`
- Usage: `diagram = visualize_schema(tables, edges, format=OutputFormat.MERMAID)`
- Parameters:
  * `tables`: Dict of table schemas
  * `edges`: Dict of edge schemas (optional)
  * `format`: OutputFormat enum value
  * `show_fields`: Include field details (default: True)
  * `show_edges`: Include edge relationships (default: True)
- Returns: String containing diagram in specified format

## visualize_from_registry()
- Import: `from surql.schema.visualize import visualize_from_registry`
- Usage: `diagram = visualize_from_registry(format=OutputFormat.MERMAID)`
- Automatically retrieves schemas from global SchemaRegistry
- Convenient for visualizing entire schema

## CLI Integration
- Command: `surql schema visualize`
- Options:
  * `--schema PATH`: Path to schema module
  * `--format FORMAT`: Output format (mermaid/graphviz/ascii)
  * `--output FILE`: Write to file instead of stdout
  * `--no-fields`: Hide field details
  * `--no-edges`: Hide edge relationships
  * `--tables TABLE1,TABLE2`: Visualize specific tables only

## Mermaid Output
- Generates `erDiagram` syntax
- Shows tables, fields, types, constraints
- Displays relationships between tables
- Can be rendered in GitHub, GitLab, documentation sites

## GraphViz Output
- Generates DOT format
- Renders with `dot` command: `dot -Tpng schema.dot -o schema.png`
- Record-shaped nodes for tables
- Labeled edges for relationships

## ASCII Output
- Terminal-friendly box drawing
- Shows table structure and relationships
- No external tools required
- Useful for quick inspection

## Usage Example
```python
from surql.schema.visualize import visualize_from_registry, OutputFormat

# Generate Mermaid diagram
mermaid = visualize_from_registry(format=OutputFormat.MERMAID)
with open('schema.mmd', 'w') as f:
    f.write(mermaid)

# Generate ASCII diagram for terminal
ascii_diagram = visualize_from_registry(format=OutputFormat.ASCII)
print(ascii_diagram)
```

# Schema File Watcher

## Overview
- surql provides automatic schema change detection via file watching
- Monitors Python schema files for modifications
- Detects drift between code and database schemas
- Can automatically generate migrations on changes
- Integrates with development workflows

## SchemaWatcher
- Import: `from surql.migration.watcher import SchemaWatcher`
- Create: `watcher = SchemaWatcher(schema_path, migrations_path, debounce=1.0)`
- Parameters:
  * `schema_path`: Path to schema module or directory
  * `migrations_path`: Path to migrations directory
  * `debounce`: Delay in seconds before processing changes (default: 1.0)
  * `callback`: Optional async function called on schema changes

## Start Watching
- Start: `await watcher.start()`
- Monitors file system for changes to `.py` files
- Debounces rapid successive changes
- Calls callback when changes detected

## Stop Watching
- Stop: `await watcher.stop()`
- Cleanup: `await watcher.cleanup()`
- Automatically stops file monitoring

## SchemaChange
- Import: `from surql.migration.watcher import SchemaChange`
- Properties:
  * `type`: Change type (modified, created, deleted)
  * `table`: Affected table name (optional)
  * `timestamp`: When change was detected
  * `diff_summary`: Summary of changes

## CLI Integration
- Command: `surql schema watch`
- Options:
  * `--schema PATH`: Path to schema module (required)
  * `--migrations PATH`: Path to migrations directory (default: migrations/)
  * `--debounce SECONDS`: Debounce delay (default: 1.0)
  * `--auto-generate`: Automatically generate migrations on changes
  * `--no-prompt`: Don't prompt for confirmation (report only)
- Interactive mode prompts to generate migrations
- Reports changes and drift statistics

## Usage Example
```python
from surql.migration.watcher import SchemaWatcher

async def on_schema_change(change: SchemaChange):
    print(f"Schema changed: {change.table}")
    # Optionally auto-generate migration
    # await generate_migration(change)

watcher = SchemaWatcher(
    schema_path='myapp/schemas',
    migrations_path='migrations',
    debounce=1.0,
    callback=on_schema_change
)

await watcher.start()
# Keep application running...
await watcher.stop()
```

# Git Hooks Integration

## Overview
- surql provides Git pre-commit hooks for schema drift detection
- Prevents commits with unmigrated schema changes
- Generates pre-commit configuration files
- Integrates with CI/CD pipelines

## check_schema_drift()
- Import: `from surql.migration.hooks import check_schema_drift`
- Usage: `result = check_schema_drift(schema_path, migrations_path)`
- Parameters:
  * `schema_path`: Path to schema module
  * `migrations_path`: Path to migrations directory
  * `db_client`: Optional DatabaseClient (if None, compares timestamps)
- Returns: HookCheckResult with drift information
- Works without database connection by comparing file timestamps

## HookCheckResult
- Import: `from surql.migration.hooks import HookCheckResult`
- Properties:
  * `has_drift`: Boolean indicating if drift detected
  * `message`: Human-readable description
  * `modified_files`: List of modified schema files
  * `latest_migration_time`: Timestamp of latest migration
  * `details`: Additional context

## generate_precommit_config()
- Import: `from surql.migration.hooks import generate_precommit_config`
- Usage: `config = generate_precommit_config(schema_path, migrations_path)`
- Generates `.pre-commit-config.yaml` content
- Compatible with pre-commit framework
- Includes surql drift check hook

## CLI Integration
- Command: `surql schema check`
- Options:
  * `--schema PATH`: Path to schema module
  * `--migrations PATH`: Path to migrations directory
  * `--fail-on-drift`: Exit with code 1 if drift detected
  * `--show-diff`: Display detailed drift information
  * `--format FORMAT`: Output format (table/json)
- Exit codes:
  * 0: No drift detected
  * 1: Drift detected (with `--fail-on-drift`)
  * 2: Command execution error

- Command: `surql schema hook-config`
- Generates pre-commit configuration
- Options:
  * `--schema PATH`: Path to schema module
  * `--migrations PATH`: Path to migrations directory
  * `--output FILE`: Write to file (default: stdout)

## Pre-Commit Integration
```yaml
# .pre-commit-config.yaml
repos:
  - repo: local
    hooks:
      - id: surql-schema-check
        name: Check for schema drift
        entry: surql schema check --fail-on-drift
        language: system
        pass_filenames: false
        always_run: true
```

## Usage Example
```python
from surql.migration.hooks import check_schema_drift, generate_precommit_config

# Check for drift
result = check_schema_drift('myapp/schemas', 'migrations')
if result.has_drift:
    print(f"Schema drift detected: {result.message}")
    for file in result.modified_files:
        print(f"  - {file}")
    exit(1)

# Generate pre-commit config
config = generate_precommit_config('myapp/schemas', 'migrations')
with open('.pre-commit-config.yaml', 'w') as f:
    f.write(config)
```

# Migration Squashing

## Overview
- surql provides migration squashing to combine multiple migrations
- Reduces migration file count for better performance
- Optimizes redundant operations
- Validates safety before squashing

## squash_migrations()
- Import: `from surql.migration.squash import squash_migrations`
- Usage: `await squash_migrations(directory, from_version, to_version, output_path)`
- Parameters:
  * `directory`: Path to migrations directory
  * `from_version`: Starting migration version (optional)
  * `to_version`: Ending migration version (optional)
  * `output_path`: Path for squashed migration file (optional)
  * `optimize`: Enable statement optimization (default: True)
  * `keep_originals`: Preserve original files (default: False)
- Combines multiple up/down migrations into single file
- Removes redundant operations
- Preserves logical order

## optimize_statements()
- Import: `from surql.migration.squash import optimize_statements`
- Usage: `optimized = optimize_statements(statements)`
- Removes redundant operations:
  * Drop table followed by create table → single create
  * Add field then remove field → removed
  * Multiple modifications to same field → latest only
- Returns optimized list of statements
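
The optimizations above amount to computing the net effect per field. A toy model over `(action, table, field)` tuples (this is an illustration of the idea, not surql's implementation):

```python
def optimize_field_ops(ops):
    """Collapse per-field operations to their net effect, in first-seen order.

    ops: list of (action, table, field) where action is 'add', 'remove', or 'modify'.
    """
    net = {}    # (table, field) -> surviving action
    order = []  # first-seen order of keys
    for action, table, field in ops:
        key = (table, field)
        if key not in net:
            net[key] = action
            order.append(key)
        elif net[key] == "add" and action == "remove":
            # add then remove cancels out entirely
            del net[key]
            order.remove(key)
        else:
            # later modifications to the same field win ('latest only')
            net[key] = action
    return [(net[k], k[0], k[1]) for k in order if k in net]

ops = [
    ("add", "user", "nickname"),
    ("modify", "user", "email"),
    ("remove", "user", "nickname"),  # cancels the add above
    ("modify", "user", "email"),     # duplicate modify collapses
]
print(optimize_field_ops(ops))  # [('modify', 'user', 'email')]
```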

## validate_squash_safety()
- Import: `from surql.migration.squash import validate_squash_safety`
- Usage: `warnings = validate_squash_safety(migrations)`
- Checks for:
  * Data migrations (cannot be safely squashed)
  * Complex SurrealQL statements
  * Manual interventions required
- Returns list of warnings with severity levels

## CLI Integration
- Command: `surql migrate squash`
- Options:
  * `--migrations PATH`: Migrations directory (default: migrations/)
  * `--from VERSION`: Starting version (optional)
  * `--to VERSION`: Ending version (optional)
  * `--output FILE`: Output file path (optional)
  * `--dry-run`: Preview without writing
  * `--no-optimize`: Disable optimization
  * `--keep-originals`: Preserve original files
  * `--force`: Bypass safety warnings
- Exit codes:
  * 0: Success
  * 1: Squash failed
  * 2: User cancelled
  * 3: High severity warnings without `--force`

## Safety Considerations
- Data migrations should not be squashed
- Review squashed output before applying
- Test squashed migrations in development
- Keep backups of original migrations
- Use `--dry-run` to preview changes

## Usage Example
```python
from surql.migration.squash import squash_migrations, validate_squash_safety

# Validate safety first
from surql.migration.discovery import discover_migrations
migrations = discover_migrations('migrations')
warnings = validate_squash_safety(migrations[5:10])

if warnings:
    for warning in warnings:
        print(f"{warning.severity}: {warning.message}")

# Squash migrations 001-010
await squash_migrations(
    directory='migrations',
    from_version='001',
    to_version='010',
    output_path='migrations/squashed_001_010.py',
    optimize=True,
    keep_originals=True
)
```

# Migration System

## Migration File Structure

### File Naming Convention
- Format: `YYYYMMDD_HHMMSS_description.py`
- Example: `20260102_120000_create_user_table.py`
- Timestamp must be unique and sequential
- Description: lowercase with underscores
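
Deriving a valid filename from a free-text description can be sketched as follows (the `migration_filename` helper is illustrative; surql's CLI does this for you):

```python
import re
from datetime import datetime

def migration_filename(description, now=None):
    """Build a YYYYMMDD_HHMMSS_description.py name from a free-text description."""
    now = now or datetime.now()
    # lowercase, replace runs of non-alphanumerics with underscores
    slug = re.sub(r"[^a-z0-9]+", "_", description.lower()).strip("_")
    return f"{now:%Y%m%d_%H%M%S}_{slug}.py"

name = migration_filename("Create user table", datetime(2026, 1, 2, 12, 0, 0))
print(name)  # 20260102_120000_create_user_table.py
```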

### Migration File Template
```python
"""Migration: Create user table"""
from surql.schema.fields import string_field, int_field, datetime_field
from surql.schema.table import table_schema, unique_index, TableMode

def upgrade():
  """Apply migration."""
  return [
    table_schema(
      'user',
      mode=TableMode.SCHEMAFULL,
      fields=[
        string_field('name'),
        string_field('email'),
        int_field('age'),
        datetime_field('created_at', default='time::now()'),
      ],
      indexes=[
        unique_index('email_idx', ['email']),
      ],
    )
  ]

def downgrade():
  """Rollback migration."""
  return [
    table_schema('user', drop=True)
  ]
```

### Migration Functions
- `upgrade()`: Returns list of schema definitions to apply
- `downgrade()`: Returns list of schema definitions to rollback
- Both must be present in migration file
- Return empty list `[]` if no operations needed

## Creating Migrations

### Via CLI
- Create migration: `surql migrate create "Add user table"`
- Creates file: `migrations/YYYYMMDD_HHMMSS_add_user_table.py`
- Edit file to add schema definitions
- Auto-generates timestamp

### Programmatically
- Import: `from surql.migration.generator import create_migration_file`
- Create: `path = create_migration_file('Add user table', 'migrations/')`
- Returns path to created file
- Edit file manually after creation

### Migration Directory
- Default location: `migrations/` in project root
- Create directory: `mkdir migrations`
- Custom location: Use `--directory` flag with CLI commands
- Include in version control: Commit migration files to Git

## Applying Migrations

### Apply All Pending
- Command: `surql migrate up`
- Applies all pending migrations in order
- Tracks applied migrations in `_migration_history` table
- Idempotent: Safe to run multiple times

### Apply Specific Number
- Command: `surql migrate up --steps 1`
- Apply next migration only: `--steps 1`
- Apply next 3: `--steps 3`
- Useful for gradual rollout

### Dry Run
- Command: `surql migrate up --dry-run`
- Preview what will be applied
- No database changes
- Shows SQL that would be executed

### With Custom Directory
- Command: `surql migrate up --directory ./db/migrations`
- Specify migration directory location
- Useful for multi-environment setups

## Rolling Back Migrations

### Rollback Last Migration
- Command: `surql migrate down`
- Rolls back most recent migration
- Executes downgrade function
- Requires confirmation by default
- Use `--yes` flag to skip confirmation

### Rollback Multiple
- Command: `surql migrate down --steps 3`
- Rollback last 3 migrations
- Applied in reverse order
- Use with caution

### Dry Run Rollback
- Command: `surql migrate down --dry-run`
- Preview rollback operations
- No database changes
- Recommended before actual rollback

## Migration Status and History

### Check Status
- Command: `surql migrate status`
- Shows applied vs pending migrations
- Table format by default
- JSON format: `surql migrate status --format json`

### View History
- Command: `surql migrate history`
- Shows all applied migrations
- Includes timestamps and execution time
- Stored in `_migration_history` table
- Supports `--directory` option for consistency

### Validate Migrations
- Command: `surql migrate validate`
- Checks migration files for errors
- Validates file naming
- Checks for duplicate versions

# Multi-Database Migration Orchestration

## Overview
- Deploy migrations across multiple database instances in a controlled, automated way
- Essential for multi-environment deployments, horizontal scaling, multi-tenant architectures
- Four deployment strategies: Sequential, Parallel, Rolling, Canary
- Health checking system validates environment readiness
- Auto-rollback on failures with safety features

## Core Concepts

### Environment
- Represents a single database instance or cluster
- Import: `from surql.orchestration import EnvironmentConfig`
- Create: `EnvironmentConfig(name='production', connection=conn_config, priority=1, tags={'production'}, require_approval=True, allow_destructive=False)`
- Properties: `name`, `connection`, `priority` (lower = higher priority), `tags`, `require_approval`, `allow_destructive`

### Environment Registry
- Import: `from surql.orchestration import EnvironmentRegistry`
- Singleton pattern manages all configured environments
- Register: `registry.register_environment(name='staging', connection=config, priority=50)`
- Get: `client = registry.get('staging')`
- List: `names = registry.list_connections()`
- Set default: `await registry.set_default('staging')`

### Migration Coordinator
- Import: `from surql.orchestration import MigrationCoordinator`
- Orchestrates deployments across environments
- Create: `coordinator = MigrationCoordinator(registry)`
- Deploy: `results = await coordinator.deploy_to_environments(environments=['staging', 'production'], migrations=migrations, strategy='sequential')`

## Environment Configuration

### Configuration File Format
- Create `environments.json`:
```json
{
  "environments": [
    {
      "name": "production",
      "connection": {
        "db_url": "ws://prod.example.com:8000/rpc",
        "db_ns": "production",
        "db": "main",
        "password": "${PROD_PASSWORD}"
      },
      "priority": 1,
      "tags": ["production", "critical"],
      "require_approval": true,
      "allow_destructive": false
    }
  ]
}
```

### Loading Configuration
- From file: `from surql.orchestration import configure_environments; configure_environments(Path('environments.json'))`
- Programmatically: `from surql.orchestration import register_environment; register_environment(name='production', connection=ConnectionConfig(...), priority=1)`
- Use environment variables for sensitive data: `"password": "${PROD_PASSWORD}"`
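
The `${VAR}` placeholders can be expanded with `string.Template` before parsing; this sketch assumes that style of substitution (surql's actual loader may differ):

```python
import json
import os
from string import Template

def load_environments(raw_json, env=None):
    """Parse an environments.json string, expanding ${VAR} placeholders."""
    env = env if env is not None else dict(os.environ)
    # safe_substitute leaves unknown ${VAR} placeholders untouched
    expanded = Template(raw_json).safe_substitute(env)
    return json.loads(expanded)

raw = '{"connection": {"password": "${PROD_PASSWORD}"}}'
cfg = load_environments(raw, env={"PROD_PASSWORD": "s3cret"})
print(cfg["connection"]["password"])  # s3cret
```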

## Deployment Strategies

### Sequential Strategy
- Deploy one at a time in priority order
- Use cases: Multi-stage deployments (dev → staging → production), maximum safety
- Behavior: Deploys in priority order (lowest first), waits for each, stops on first failure
- CLI: `surql orchestrate deploy -e dev,staging,production --strategy sequential`

### Parallel Strategy
- Deploy to all environments simultaneously
- Use cases: Independent environments (different tenants, regions), fast deployments
- Behavior: Deploys to all at once, configurable concurrency limit
- CLI: `surql orchestrate deploy -e tenant1,tenant2,tenant3 --strategy parallel --max-concurrent 3`

### Rolling Strategy
- Deploy in batches, waiting for each batch to complete
- Use cases: Database replicas, maintaining service availability
- Behavior: Divides into batches, deploys one batch at a time
- CLI: `surql orchestrate deploy -e db1,db2,db3,db4 --strategy rolling --batch-size 2`

### Canary Strategy
- Deploy to small percentage first, then to rest
- Use cases: High-risk deployments, testing in production with limited impact
- Behavior: Deploys to specified percentage first, waits for verification
- CLI: `surql orchestrate deploy -e prod1,prod2,prod3,prod4,prod5 --strategy canary --canary-percent 20`

### Strategy Comparison
- Sequential: Slowest speed, highest safety, multi-stage deployments
- Parallel: Fastest speed, lowest safety, independent environments
- Rolling: Medium speed, high safety, replica sets
- Canary: Medium speed, highest safety, high-risk production deploys
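
The rolling and canary strategies differ only in how they partition the environment list. A sketch of the two partitioning schemes (illustrative helpers, not surql's internals):

```python
import math

def rolling_batches(environments, batch_size):
    """Split environments into consecutive batches for the rolling strategy."""
    return [environments[i:i + batch_size]
            for i in range(0, len(environments), batch_size)]

def canary_split(environments, canary_percent):
    """Pick the canary subset first (at least one environment), then the remainder."""
    n = max(1, math.ceil(len(environments) * canary_percent / 100))
    return environments[:n], environments[n:]

envs = ["db1", "db2", "db3", "db4", "db5"]
print(rolling_batches(envs, 2))  # [['db1', 'db2'], ['db3', 'db4'], ['db5']]
print(canary_split(envs, 20))    # (['db1'], ['db2', 'db3', 'db4', 'db5'])
```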

## Health Checking
- Checks: Connectivity, authentication, migration table existence, database state
- CLI validate: `surql orchestrate validate`
- CLI deploy: `surql orchestrate deploy -e production --verify-health`
- Programmatic: `from surql.orchestration import HealthCheck; status = await HealthCheck().check_environment(env_config)`

## Best Practices
- Use configuration files stored in version control
- Always test with dry run: `surql orchestrate deploy -e production --dry-run`
- Use environment priorities to control deployment order
- Enable health checks in production
- Secure sensitive credentials with environment variables
- Keep production separate in isolated config files

# Schema Versioning and Rollback

## Overview
- Track schema evolution through snapshots at each migration
- Compare versions to identify differences
- Plan rollbacks with safety analysis (SAFE, DATA_LOSS, UNSAFE)
- Execute safe rollbacks to previous schema versions
- Auto-snapshot functionality for automatic snapshot creation

## Core Concepts

### Schema Snapshot
- Import: `from surql.migration.versioning import SchemaSnapshot`
- Captures complete database schema at specific version
- Properties: `version`, `created_at`, `tables`, `edges`, `indexes`, `checksum`, `migration_count`
- Create: `snapshot = await create_snapshot(client, version='20260109_120000', migration_count=5)`

### Rollback Plan
- Import: `from surql.migration.rollback import create_rollback_plan`
- Plan for safely rolling back to previous version
- Create: `plan = await create_rollback_plan(client, migrations, target_version='20260108_120000')`
- Properties: `from_version`, `to_version`, `migration_count`, `overall_safety`, `issues`, `requires_approval`

## Schema Snapshots

### Creating and Storing
- Create: `snapshot = await create_snapshot(client, version, migration_count)`
- Store: `await store_snapshot(client, snapshot)`
- Load: `snapshot = await load_snapshot(client, version)`
- List: `snapshots = await list_snapshots(client)`
- Compare: `diff = compare_snapshots(snapshot1, snapshot2)`

## Safety Levels

### SAFE
- No data loss expected
- Operations: Add/remove indexes, add fields with defaults, add tables

### DATA_LOSS
- Some data may be lost
- Operations: Remove fields (data lost), change field types, modify constraints

### UNSAFE
- Significant data loss likely
- Operations: Remove entire tables, drop multiple critical fields

### Safety Level Matrix
- Add table: Up=SAFE, Down=UNSAFE (table drop)
- Add field: Up=SAFE, Down=DATA_LOSS (field data lost)
- Add index: Up=SAFE, Down=SAFE (index drop)
- Modify field type: Up=DATA_LOSS, Down=DATA_LOSS
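
The matrix above can be read as a lookup table, with the overall rollback safety being the worst down-safety among the operations involved. A toy restatement (this dict and helper are illustrative, not surql's API):

```python
# operation -> (up_safety, down_safety), restating the documented matrix
SAFETY_MATRIX = {
    "add_table": ("SAFE", "UNSAFE"),        # rolling back drops the table
    "add_field": ("SAFE", "DATA_LOSS"),     # rolling back loses field data
    "add_index": ("SAFE", "SAFE"),          # index drop is harmless
    "modify_field_type": ("DATA_LOSS", "DATA_LOSS"),
}

def rollback_safety(operations):
    """Overall rollback safety = the worst down-safety across all operations."""
    rank = {"SAFE": 0, "DATA_LOSS": 1, "UNSAFE": 2}
    return max((SAFETY_MATRIX[op][1] for op in operations), key=rank.__getitem__)

print(rollback_safety(["add_index"]))               # SAFE
print(rollback_safety(["add_field", "add_table"]))  # UNSAFE
```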

## Rollback Planning and Execution

### Creating Rollback Plan
- Plan: `plan = await create_rollback_plan(client, migrations, target_version)`
- Analyzes: Migrations to rollback, safety issues, data loss risks
- Check safety: `if plan.overall_safety == RollbackSafety.SAFE: ...`

### Executing Rollbacks
- Execute: `result = await execute_rollback(client, plan)`
- Force unsafe: `result = await execute_rollback(client, plan, force=True)`
- Result properties: `success`, `rolled_back_count`, `actual_duration_ms`, `errors`

## Auto-Snapshots
- Enable: `from surql.migration.history import enable_auto_snapshots; enable_auto_snapshots()`
- Disable: `disable_auto_snapshots()`
- Behavior: Automatically creates snapshot after each successful migration
- Benefits: No manual steps, complete history, easy rollback

## Best Practices
- Create snapshots before major changes
- Always review rollback plans before executing
- Backup data before unsafe rollbacks
- Test rollbacks in development first
- Use auto-snapshots in production
- Document rollback procedures in migration docstrings
- Monitor rollback execution
- Keep snapshot history limited (periodic cleanup)

# CLI Commands

## Global Options
- `--help`: Show help message
- `--verbose, -v`: Enable verbose logging
- `--version`: Show version information

## Migration Commands

### surql migrate up
- Apply pending migrations to database
- Options: `--directory PATH`, `--steps N`, `--dry-run`, `--verbose`
- Example: `surql migrate up --steps 3 --dry-run`

### surql migrate down
- Rollback last applied migration(s)
- Options: `--directory PATH`, `--steps N` (default: 1), `--dry-run`, `--yes`, `--verbose`
- Example: `surql migrate down --steps 2 --yes`

### surql migrate status
- Show migration status (applied vs pending)
- Options: `--directory PATH`, `--format FORMAT` (table/json/yaml), `--verbose`
- Example: `surql migrate status --format json`

### surql migrate history
- Show applied migrations from database
- Options: `--directory PATH`, `--format FORMAT` (table/json/yaml), `--verbose`
- Example: `surql migrate history`

### surql migrate create
- Create new migration file
- Usage: `surql migrate create "Description"`
- Options: `--directory PATH`, `--verbose`
- Example: `surql migrate create "Add indexes"`

### surql migrate validate
- Validate migration files for errors
- Options: `--directory PATH`, `--verbose`
- Example: `surql migrate validate`

### surql migrate generate
- Generate migration from schema changes (auto-migration)
- Usage: `surql migrate generate "Description"`
- Options: `--schema PATH`, `--directory PATH`, `--verbose`
- Compares code schemas to database and generates migration

### surql migrate squash
- Combine multiple migrations into one
- Options: `--migrations PATH`, `--from VERSION`, `--to VERSION`, `--output FILE`, `--dry-run`, `--no-optimize`, `--keep-originals`, `--force`
- Example: `surql migrate squash --from 001 --to 010`

### surql migrate snapshot create
- Create schema snapshot at current version
- Options: `--directory PATH`, `--verbose`
- Example: `surql migrate snapshot create`

### surql migrate snapshot list
- List all stored schema snapshots
- Options: `--format FORMAT` (table/json), `--verbose`
- Example: `surql migrate snapshot list`

### surql migrate rollback plan
- Plan rollback to a target version
- Options: `--to VERSION` (required), `--directory PATH`, `--verbose`
- Example: `surql migrate rollback plan --to 20260108_120000`

### surql migrate rollback execute
- Execute rollback to a target version
- Options: `--to VERSION` (required), `--directory PATH`, `--force`, `--verbose`
- Example: `surql migrate rollback execute --to 20260108_120000`
- Force unsafe: `surql migrate rollback execute --to 20260107_120000 --force`

### surql migrate version show
- Show current schema version
- Options: `--directory PATH`, `--verbose`
- Example: `surql migrate version show`

### surql migrate version compare
- Compare two schema versions
- Usage: `surql migrate version compare VERSION1 VERSION2`
- Example: `surql migrate version compare 20260108_120000 20260109_120000`

## Orchestration Commands

### surql orchestrate deploy
- Deploy migrations to multiple environments
- Usage: `surql orchestrate deploy -e ENVIRONMENTS [OPTIONS]`
- Options:
  * `--environments, -e`: Comma-separated environment names (required)
  * `--strategy`: Deployment strategy (sequential/parallel/rolling/canary, default: sequential)
  * `--batch-size`: Batch size for rolling strategy (default: 1)
  * `--canary-percent`: Canary percentage for canary strategy (default: 10.0)
  * `--max-concurrent`: Max concurrent for parallel strategy (default: 5)
  * `--dry-run`: Simulate deployment without executing
  * `--skip-health-check`: Skip health verification
  * `--no-rollback`: Disable auto-rollback on failure
  * `--config`: Config file path (default: environments.json)
  * `--migrations-dir, -m`: Migrations directory (default: migrations)
- Examples:
  * Basic: `surql orchestrate deploy -e staging,production`
  * Rolling: `surql orchestrate deploy -e db1,db2,db3,db4 --strategy rolling --batch-size 2`
  * Canary: `surql orchestrate deploy -e prod1,prod2,prod3 --strategy canary --canary-percent 33`
  * Dry run: `surql orchestrate deploy -e production --dry-run`

### surql orchestrate status
- Check deployment status of environments
- Usage: `surql orchestrate status -e ENVIRONMENTS [OPTIONS]`
- Options: `--environments, -e` (required), `--config`
- Example: `surql orchestrate status -e staging,production`

### surql orchestrate validate
- Validate environment configuration and connectivity
- Usage: `surql orchestrate validate [OPTIONS]`
- Options: `--config` (default: environments.json)
- Example: `surql orchestrate validate`

## Schema Commands

### surql schema show
- Display database or table schema
- Usage: `surql schema show [TABLE]`
- Show all: `surql schema show`
- Show table: `surql schema show user`

### surql schema diff
- Show differences between code and database schemas
- Options: `--schema PATH`, `--table TABLE`, `--format FORMAT`
- Example: `surql schema diff --table user`

### surql schema generate
- Generate schema definitions from database
- Options: `--output PATH`, `--tables TABLE1,TABLE2`
- Example: `surql schema generate --output schemas/`

### surql schema tables
- List all tables in database
- Options: `--format FORMAT` (table/json/yaml)
- Example: `surql schema tables --format json`

### surql schema inspect
- Inspect specific table schema
- Usage: `surql schema inspect TABLE`
- Example: `surql schema inspect user`

### surql schema export
- Export schema to file
- Options: `--output FILE`, `--format FORMAT` (surql/json)
- Example: `surql schema export --output schema.surql`

### surql schema validate
- Validate code schemas against database
- Options: `--schema PATH`, `--format FORMAT`, `--fail-on-error`
- Example: `surql schema validate --fail-on-error`

### surql schema check
- Check for schema drift (for CI/CD)
- Options: `--schema PATH`, `--migrations PATH`, `--fail-on-drift`, `--show-diff`, `--format FORMAT`
- Example: `surql schema check --fail-on-drift`

### surql schema hook-config
- Generate Git pre-commit hook configuration
- Options: `--schema PATH`, `--migrations PATH`, `--output FILE`
- Example: `surql schema hook-config --output .pre-commit-config.yaml`

### surql schema watch
- Watch schema files and detect changes
- Options: `--schema PATH`, `--migrations PATH`, `--debounce SECONDS`, `--auto-generate`, `--no-prompt`
- Example: `surql schema watch --schema myapp/schemas --auto-generate`

### surql schema visualize
- Generate visual diagram of schema
- Options: `--schema PATH`, `--format FORMAT` (mermaid/graphviz/ascii), `--output FILE`, `--no-fields`, `--no-edges`, `--tables TABLE1,TABLE2`
- Example: `surql schema visualize --format mermaid --output schema.mmd`

## Database Commands

### surql db init
- Initialize database with required tables
- Options: `--drop-existing`
- Example: `surql db init`

### surql db ping
- Check database connection
- Usage: `surql db ping`
- Returns connection status

### surql db info
- Show database information
- Usage: `surql db info`
- Shows connection details and statistics

### surql db reset
- Reset database (drop all tables)
- Options: `--yes` (skip confirmation)
- Example: `surql db reset --yes`
- WARNING: Destructive operation

### surql db query
- Execute raw SurrealQL query
- Usage: `surql db query "SELECT * FROM user"`
- Options: `--format FORMAT` (table/json/yaml)
- Example: `surql db query "SELECT count() FROM user"`

### surql db version
- Show SurrealDB server version
- Usage: `surql db version`
- Example: `surql db version`

## Output Formats
- `table`: Human-readable table (default)
- `json`: JSON format for scripting
- `yaml`: YAML format
- Use with: `--format json`

# Best Practices

## Schema Design

### Field Assertions
- Always validate string lengths: `assertion='string::len($value) > 0'`
- Use email validation: `assertion='string::is::email($value)'`
- Validate numeric ranges: `assertion='$value >= 0 AND $value <= 100'`
- Validate dates: `assertion='$value > time::now()'` for future dates
- Keep assertions simple and focused

### Default Values
- Use `time::now()` for timestamp fields
- Provide sensible defaults: `default='0'`, `default='[]'`, `default='true'`
- Make creation easier with defaults
- Use computed fields for derived values

### Readonly Fields
- Timestamps: `datetime_field('created_at', default='time::now()', readonly=True)`
- IDs and immutable data should be readonly
- Prevents accidental modification

### Indexes
- Add unique indexes for natural keys: `unique_index('email_idx', ['email'])`
- Index frequently queried fields
- Use search indexes for full-text search
- Don't over-index (overhead on writes)
- Add HNSW indexes (or MTREE) for vector fields used in similarity search
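
A rough sketch of the DDL an index builder emits; the `Index`/`render_index` names are illustrative stand-ins, and HNSW/MTREE definitions add vector-specific clauses on top of this basic shape:

```python
from dataclasses import dataclass

# Hypothetical stand-in for surql's index builders, showing the DDL they imply.
@dataclass(frozen=True)
class Index:
    name: str
    fields: tuple[str, ...]
    unique: bool = False

def render_index(idx: Index, table: str) -> str:
    ddl = f"DEFINE INDEX {idx.name} ON TABLE {table} FIELDS {', '.join(idx.fields)}"
    if idx.unique:
        ddl += " UNIQUE"
    return ddl + ";"

print(render_index(Index("email_idx", ("email",), unique=True), "user"))
# → DEFINE INDEX email_idx ON TABLE user FIELDS email UNIQUE;
```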

### Table Mode
- Use SCHEMAFULL for production: Enforces data integrity
- Use SCHEMALESS only when truly needed: Flexibility vs safety tradeoff
- Prefer typed schemas with validation

### Edge Modes
- Use EdgeMode.RELATION for graph relationships (recommended)
- Use SCHEMAFULL for edges with complex properties
- Use SCHEMALESS for flexible edge attributes

## Query Optimization

### Use Type-Safe Queries
- Define Pydantic models for all tables
- Use models with CRUD operations: `await query_records('user', User, ...)`
- Benefit from validation and autocomplete

### Limit Results
- Always use pagination for large datasets: `limit(100)`
- Implement offset-based pagination: `limit(page_size).offset((page - 1) * page_size)`
- Consider cursor-based pagination for large tables
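
Offset-based pagination reduces to simple arithmetic over `LIMIT` and SurrealQL's `START` clause; a hypothetical helper:

```python
# Offset-based pagination: SurrealQL expresses the offset with START.
# page is 1-based; this helper is illustrative, not part of surql's API.
def page_clause(page: int, page_size: int) -> str:
    offset = (page - 1) * page_size
    return f"LIMIT {page_size} START {offset}"

print(page_clause(3, 100))  # → LIMIT 100 START 200
```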

### Select Specific Fields
- Don't SELECT * when you need specific fields: `Query().select(['id', 'name', 'email'])`
- Reduces data transfer
- Faster queries

### Filter Early
- Add WHERE conditions to reduce dataset: `where('status = "active"')`
- Combine conditions with AND/OR operators
- Use indexes for filter fields

### Order and Group Efficiently
- Order by indexed fields when possible
- Use GROUP BY for aggregations: `select(['status', 'count()']).group_by('status')`
- Combine with HAVING for filtered aggregations
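
These points compose naturally in an immutable builder. The `MiniQuery` class below is an illustrative sketch of the pattern, not surql's actual `Query` API:

```python
from dataclasses import dataclass, replace

# Illustrative immutable builder; surql's real Query class follows the same
# shape (every method returns a new instance) with a much richer API.
@dataclass(frozen=True)
class MiniQuery:
    table: str
    fields: tuple[str, ...] = ("*",)
    conditions: tuple[str, ...] = ()
    groups: tuple[str, ...] = ()

    def select(self, *fields: str) -> "MiniQuery":
        return replace(self, fields=fields)

    def where(self, cond: str) -> "MiniQuery":
        return replace(self, conditions=self.conditions + (cond,))

    def group_by(self, *groups: str) -> "MiniQuery":
        return replace(self, groups=groups)

    def to_surql(self) -> str:
        q = f"SELECT {', '.join(self.fields)} FROM {self.table}"
        if self.conditions:
            q += " WHERE " + " AND ".join(self.conditions)
        if self.groups:
            q += " GROUP BY " + ", ".join(self.groups)
        return q + ";"

q = MiniQuery("user").select("status", "count()").where('status != "banned"').group_by("status")
print(q.to_surql())
# → SELECT status, count() FROM user WHERE status != "banned" GROUP BY status;
```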

### Use Caching
- Cache expensive queries with `@cache_query` decorator
- Invalidate cache when data changes
- Monitor cache hit ratio
- Use Redis for distributed caching

## Migration Strategies

### One Change Per Migration
- Keep migrations focused: Single logical unit of change
- Easier to understand and rollback
- Example: Don't mix table creation and data migration

### Always Write Downgrade
- Every migration needs rollback logic: `downgrade()` function
- Test rollbacks in development
- Ensure data safety with reversible operations
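
A reversible migration pairs every `upgrade()` statement with a counterpart in `downgrade()`. The function-based module shape below is an assumption about the migration file format; the pairing discipline is the point:

```python
# Sketch of a reversible migration: every statement created in upgrade()
# has a counterpart removal in downgrade().
def upgrade() -> list[str]:
    return [
        "DEFINE TABLE post SCHEMAFULL;",
        "DEFINE FIELD title ON TABLE post TYPE string;",
        "DEFINE INDEX title_idx ON TABLE post FIELDS title;",
    ]

def downgrade() -> list[str]:
    # Reverse order: drop dependents (the index) before the table itself.
    return [
        "REMOVE INDEX title_idx ON TABLE post;",
        "REMOVE TABLE post;",
    ]
```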

### Test Migrations Locally
- Run `migrate up` in development first
- Test `migrate down` to verify rollback
- Use `--dry-run` before production apply

### Version Control
- Commit migrations to git
- Never modify applied migrations: Create new migration instead
- Keep migration history linear

### Data Migrations
- Separate from schema migrations when possible
- Handle NULL values during transitions
- Break large data migrations into multiple steps
- Test with production-like data volume

### Migration Order
- Create tables before edges
- Add fields before indexes on those fields
- Create tables before foreign key references
- Drop dependents before dropping tables

### Squash Old Migrations
- Periodically squash old migrations
- Improves migration performance
- Reduces migration count
- Test squashed migrations thoroughly

### Schema Drift Detection
- Use `surql schema check` in CI/CD
- Add pre-commit hooks for drift detection
- Watch schema files during development
- Regularly validate schemas

## Error Handling

### Catch Specific Exceptions
- Use specific exception types: `ConnectionError`, `QueryError`
- Handle connection failures gracefully
- Retry or log appropriately
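
Separating transient connection failures from permanent query errors keeps retry logic honest. The exception classes below stand in for surql's own `ConnectionError`/`QueryError`; `run_with_handling` is a hypothetical helper:

```python
# Stand-ins for surql's exception hierarchy (src.connection.client).
class DatabaseError(Exception): ...
class ConnectionError(DatabaseError): ...
class QueryError(DatabaseError): ...

def run_with_handling(execute, surql: str):
    try:
        return execute(surql)
    except ConnectionError:
        # Transient: safe to retry upstream or surface as "service unavailable".
        raise
    except QueryError as exc:
        # Permanent: retrying won't help; log the statement and fail fast.
        raise ValueError(f"bad query {surql!r}: {exc}") from exc
```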

### Validate Input
- Use Pydantic models for validation
- Validate before database operations
- Return clear error messages

### Log Operations
- Use structured logging (structlog included)
- Log important operations: Creates, updates, deletes
- Include context: User ID, table, operation type

### Transaction Safety
- Use transactions for multi-step operations
- Ensure atomicity with `async with transaction(client):`
- Rollback on any failure
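
SurrealDB transactions bracket statements with `BEGIN`/`COMMIT` and roll back with `CANCEL`. A minimal sketch of the context-manager semantics, using a fake client in place of surql's `DatabaseClient`:

```python
import asyncio
from contextlib import asynccontextmanager

# Fake client that records every statement instead of sending it.
class FakeClient:
    def __init__(self) -> None:
        self.sent: list[str] = []

    async def query(self, surql: str) -> None:
        self.sent.append(surql)

@asynccontextmanager
async def transaction(client: FakeClient):
    await client.query("BEGIN TRANSACTION;")
    try:
        yield client
        await client.query("COMMIT TRANSACTION;")
    except Exception:
        # Any failure inside the block cancels the whole transaction.
        await client.query("CANCEL TRANSACTION;")
        raise

async def main() -> list[str]:
    client = FakeClient()
    async with transaction(client) as tx:
        await tx.query("CREATE user SET name = 'ada';")
    return client.sent

print(asyncio.run(main()))
# → ['BEGIN TRANSACTION;', "CREATE user SET name = 'ada';", 'COMMIT TRANSACTION;']
```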

### Monitor Performance
- Track query execution time
- Monitor cache effectiveness
- Log slow queries
- Use database statistics

# Architecture

## Project Structure
- `src/surql/schema/`: Schema definition layer with `fields.py`, `table.py`, `edge.py`, `validator.py`, `visualize.py`, `utils.py`
- `src/surql/migration/`: Migration system with `generator.py`, `executor.py`, `discovery.py`, `history.py`, `diff.py`, `squash.py`, `hooks.py`, `watcher.py`
- `src/surql/query/`: Query builder and ORM with `builder.py`, `crud.py`, `executor.py`, `graph.py`, `batch.py`, `results.py`, `expressions.py`
- `src/surql/connection/`: Database connection with `client.py`, `config.py`, `context.py`, `registry.py`, `auth.py`, `streaming.py`, `transaction.py`
- `src/surql/cache/`: Caching system with `backends.py`, `config.py`, `decorator.py`, `manager.py`
- `src/surql/types/`: Type definitions with `record_id.py`, `operators.py`
- `src/surql/cli/`: CLI commands with `migrate.py`, `schema.py`, `db.py`, `common.py`
- `tests/`: Test suite with unit and integration tests
- `docs/`: Documentation with guides and examples
- `migrations/`: User migration files (created by user)

## Key Modules

### src.schema.fields
- Field type definitions and builders
- Exports: `FieldType`, `FieldDefinition`, `string_field`, `int_field`, etc.
- Pure functions returning immutable `FieldDefinition` objects

### src.schema.table
- Table schema composition
- Exports: `TableMode`, `TableDefinition`, `IndexDefinition`, `EventDefinition`, `table_schema`, `index`, `unique_index`, `search_index`, `mtree_index`, `hnsw_index`, `HnswDistanceType`, `event`
- Functional composition helpers: `with_fields`, `with_indexes`, etc.

### src.schema.edge
- Edge/relationship schemas for graph
- Exports: `EdgeMode`, `EdgeDefinition`, `edge_schema`, `bidirectional_edge`, `typed_edge`
- Composition helpers: `with_edge_fields`, etc.

### src.schema.validator
- Schema validation against database
- Exports: `ValidationSeverity`, `ValidationResult`, `validate_schema`, `format_validation_report`
- Compares code schemas to deployed database

### src.schema.visualize
- Schema visualization and diagramming
- Exports: `OutputFormat`, `visualize_schema`, `visualize_from_registry`
- Supports Mermaid, GraphViz, ASCII formats

### src.schema.utils
- Shared schema utilities
- Exports: `fetch_db_tables`, `parse_table_info`
- Used by validator and watcher modules

### src.query.builder
- Immutable query builder with fluent API
- Exports: `Query` class with methods `select()`, `from_table()`, `where()`, `order_by()`, `limit()`, `vector_search()`, etc.
- Generates SurrealQL via `to_surql()`

### src.query.crud
- High-level CRUD operations
- Exports: `ReturnFormat`, `create_record`, `get_record`, `update_record`, `delete_record`, `query_records`, etc.
- Type-safe with Pydantic models

### src.query.graph
- Graph traversal and edge operations
- Exports: `GraphQuery`, `traverse`, `relate`, `unrelate`, `get_related_records`, `count_related`, `find_mutual_connections`, `find_shortest_path`, `get_neighbors`, `compute_degree`, etc.
- SurrealDB graph features wrapper

### src.query.batch
- Batch operations for bulk processing
- Exports: `upsert_many`, `relate_many`, `insert_many`, `delete_many`
- Efficient bulk operations

### src.query.results
- Query result extraction utilities
- Exports: `extract_result`, `extract_one`, `extract_scalar`, `has_results`
- Handles various SurrealDB response formats

### src.query.expressions
- Expression builders for queries
- Exports: `field`, `value`, `count`, `sum_`, `avg`, `min_`, `max_`, `concat`, `string_`, `array`, `time_now`, `rand_uuid`
- Type-safe expression construction

### src.connection.client
- Async database client wrapper
- Exports: `DatabaseClient`, `get_client` context manager
- Connection pooling and retry logic
- Exceptions: `ConnectionError`, `QueryError`, `DatabaseError`

### src.connection.config
- Connection configuration
- Exports: `ConnectionConfig`, `NamedConnectionConfig` Pydantic settings classes
- Environment variable loading with `DB_` and `SURQL_` prefixes

### src.connection.registry
- Named connection management
- Exports: `ConnectionRegistry` singleton
- Manages multiple database connections

### src.connection.auth
- Authentication management
- Exports: `AuthType`, `AuthManager`, `RootCredentials`, `NamespaceCredentials`, `DatabaseCredentials`, `ScopeCredentials`
- JWT token handling

### src.connection.streaming
- Real-time streaming support
- Exports: `StreamingManager`, `LiveQuery`, `StreamingNotification`
- WebSocket-based live queries

### src.connection.transaction
- Transaction management
- Exports: `Transaction`, `transaction` context manager
- BEGIN/COMMIT/CANCEL support

### src.cache.backends
- Cache backend implementations
- Exports: `CacheBackend`, `MemoryCache`, `RedisCache`
- LRU and distributed caching

### src.cache.config
- Cache configuration
- Exports: `CacheConfig`, `CacheBackend`, `CacheStats`
- TTL and size limits

### src.cache.decorator
- Cache decorator
- Exports: `cache_query` decorator
- Automatic query result caching

### src.cache.manager
- Cache management
- Exports: `CacheManager`
- Cache operations and invalidation

### src.migration.executor
- Migration execution engine
- Applies and rolls back migrations
- Tracks history in `_migration_history` table

### src.migration.discovery
- Migration file discovery
- Scans directory for migration files
- Validates naming and ordering

### src.migration.generator
- Migration file generation
- Auto-generates migrations from schema changes
- Creates timestamped migration files

### src.migration.squash
- Migration squashing
- Exports: `squash_migrations`, `optimize_statements`, `validate_squash_safety`
- Combines and optimizes migrations

### src.migration.hooks
- Git hooks integration
- Exports: `check_schema_drift`, `get_staged_schema_files`, `generate_precommit_config`
- Pre-commit drift detection

### src.migration.watcher
- Schema file watching
- Exports: `SchemaWatcher`, `SchemaChange`, `is_schema_file`, `detect_schema_drift`
- Automatic change detection

## Design Principles

### Functional Composition
- Prefer pure functions over classes
- Immutable data structures (Pydantic frozen models)
- Function composition over inheritance
- Example: `with_indexes(with_fields(table, ...), ...)`
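
The pattern, sketched with plain frozen dataclasses standing in for surql's `TableDefinition` and composition helpers (real signatures differ):

```python
from dataclasses import dataclass, replace

# Pure functions return new frozen instances; nothing is mutated in place.
@dataclass(frozen=True)
class Table:
    name: str
    fields: tuple[str, ...] = ()
    indexes: tuple[str, ...] = ()

def with_fields(table: Table, *fields: str) -> Table:
    return replace(table, fields=table.fields + fields)

def with_indexes(table: Table, *indexes: str) -> Table:
    return replace(table, indexes=table.indexes + indexes)

base = Table("user")
user = with_indexes(with_fields(base, "email", "name"), "email_idx")
print(base.fields, user.fields)  # base is untouched: () ('email', 'name')
```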

### Type Safety
- Strict typing with mypy
- Runtime validation with Pydantic
- Generic type support: `Query[T]`, `async def query[T: BaseModel](...)`
- Type hints on all public functions

### Immutability
- All schema objects frozen (Pydantic `frozen=True`)
- Query builder returns new instances
- No mutation of state
- Predictable behavior

### Async-First
- All database operations async: `async def`, `await`
- Built with asyncio
- Context managers: `async with get_client(config):`
- Compatible with FastAPI, async frameworks

### Modular Design
- Small focused modules with single responsibility
- Clear separation of concerns
- Minimal coupling between layers
- Easy to test and extend

### Code-First Approach
- Schema defined in Python code
- Type safety from schema to query
- Migrations generated from schema changes
- Single source of truth in code

## Dependencies
- `pydantic>=2.12.0`: Data validation and settings
- `surrealdb>=1.0.7,<2.0.0`: Official SurrealDB Python SDK (pinned to the 1.x series)
- `tenacity>=9.1.2`: Retry logic with exponential backoff
- `typer>=0.21.0`: CLI framework
- `structlog>=25.5.0`: Structured logging
- `python-ulid>=3.1.0`: ULID generation
- `cachetools>=5.0.0`: LRU in-memory cache (required)
- `watchdog>=4.0.0`: File system watching for schema changes (required)
- `redis>=5.0.0`: Redis cache backend (optional, via [cache] extra)
- Dev: `mypy`, `pytest`, `ruff`, `pytest-cov`, `pytest-asyncio`, `types-cachetools`

## Testing
- Unit tests: `tests/test_*.py`
- Run tests: `pytest`
- With coverage: `pytest --cov=src`
- Test structure: Mirrors `src/` directory
- Uses pytest fixtures for database setup
- Total: 800+ tests covering all modules

## Development Workflow
- Format code: `ruff format src/`
- Lint code: `ruff check src/`
- Type check: `mypy src/`
- Run tests: `pytest`
- Coverage report: `pytest --cov=src --cov-report=html`
- Or use PowerShell helpers: `rvfmt`, `rvlint`, `rvtype`, `rvtest`, `rvcov`

## Extension Points
- Custom field types: Extend `FieldDefinition` and create builder functions
- Custom operators: Implement `Operator` class with `to_surql()` method
- Custom migration generators: Extend `src.migration.generator`
- Custom CLI commands: Add to `src.cli` and register in `__main__.py`
- Custom cache backends: Implement `CacheBackend` abstract class
- Custom visualizers: Implement diagram generator with `generate()` method
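
For example, a custom operator only needs the `to_surql()` contract; `Between` here is a hypothetical operator, not one shipped with surql:

```python
# Anything exposing to_surql() can slot into query composition; the
# single-method contract of the Operator base class is assumed here.
class Between:
    def __init__(self, field: str, low: int, high: int) -> None:
        self.field, self.low, self.high = field, low, high

    def to_surql(self) -> str:
        return f"{self.field} >= {self.low} AND {self.field} <= {self.high}"

print(Between("age", 18, 65).to_surql())  # → age >= 18 AND age <= 65
```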