Metadata-Version: 2.4
Name: pg2ch
Version: 0.1.1
Summary: Convert PostgreSQL DDL to ClickHouse DDL
Author-email: Guja Lomsadze <lomsadze.guja@gmail.com>
License: Apache-2.0
Project-URL: Homepage, https://github.com/GujaLomsadze/pg2ch
Project-URL: Repository, https://github.com/GujaLomsadze/pg2ch
Project-URL: Documentation, https://github.com/GujaLomsadze
Project-URL: Bug Reports, https://github.com/GujaLomsadze/pg2ch/issues
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Database
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: sqlparse>=0.4.4
Requires-Dist: click>=8.0.0
Requires-Dist: pydantic>=2.0.0
Provides-Extra: dev
Requires-Dist: pytest>=7.0.0; extra == "dev"
Requires-Dist: pytest-cov>=4.0.0; extra == "dev"
Requires-Dist: black>=23.0.0; extra == "dev"
Requires-Dist: isort>=5.12.0; extra == "dev"
Requires-Dist: mypy>=1.0.0; extra == "dev"
Requires-Dist: pre-commit>=3.0.0; extra == "dev"
Requires-Dist: twine>=4.0.0; extra == "dev"
Requires-Dist: build>=0.10.0; extra == "dev"
Dynamic: license-file
Dynamic: requires-python

# pg2ch - PostgreSQL to ClickHouse DDL Converter

[![PyPI version](https://badge.fury.io/py/pg2ch.svg)](https://badge.fury.io/py/pg2ch)
[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)

> **Effortlessly migrate your PostgreSQL schemas to ClickHouse** 🔄

Convert PostgreSQL DDL statements to ClickHouse format with intelligent type mapping, constraint handling, and schema
optimization. Perfect for data migrations, analytics workflows, and multi-database architectures.

## Features

- 🎯 **Smart Type Mapping** - Automatic PostgreSQL → ClickHouse type conversion
- 🔑 **Primary Key Detection** - Generates optimal `ORDER BY` clauses
- 🛡️ **Constraint Handling** - Preserves `NOT NULL`, `UNIQUE`, `DEFAULT` values
- 📝 **Schema Support** - Handles `public.table` notation seamlessly
- 🔧 **CLI Interface** - Batch process multiple DDL files
- ✅ **DDL Validation** - Optional syntax validation with ClickHouse Local
- 📊 **Metadata Extraction** - Detailed table and column information

## 🚀 Quick Start

### Installation

```bash
pip install pg2ch
```

### Basic Usage

```python
from pg2ch import convert_ddl

# Define the PostgreSQL DDL as a string
postgres_ddl = """
CREATE TABLE IF NOT EXISTS public.transactions (
    transaction_id    BIGSERIAL PRIMARY KEY,
    user_id           INTEGER NOT NULL,
    account_id        INTEGER NOT NULL,
    transaction_type  VARCHAR(20) NOT NULL,
    amount            DECIMAL(15,2) NOT NULL,
    currency          CHAR(3) DEFAULT 'USD',
    description       TEXT,
    created_at        TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP
);
"""

# Convert the PostgreSQL DDL to ClickHouse DDL
clickhouse_ddl = convert_ddl(postgres_ddl)

# Print the resulting ClickHouse DDL
print("--- PostgreSQL DDL ---")
print(postgres_ddl)
print("\n--- Converted ClickHouse DDL ---")
print(clickhouse_ddl)
```

**Output:**

```sql
--- PostgreSQL DDL ---
CREATE TABLE IF NOT EXISTS public.transactions
(
    transaction_id   BIGSERIAL PRIMARY KEY,
    user_id          INTEGER NOT NULL,
    account_id       INTEGER NOT NULL,
    transaction_type VARCHAR(20) NOT NULL,
    amount           DECIMAL(15, 2) NOT NULL,
    currency         CHAR(3) DEFAULT 'USD',
    description      TEXT,
    created_at       TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP
);

--- Converted ClickHouse DDL ---
CREATE TABLE IF NOT EXISTS transactions
(
    transaction_id   Int64,
    user_id          Int32,
    account_id       Int32,
    transaction_type String,
    amount           Decimal64(4),
    currency         Nullable(String)   DEFAULT 'USD',
    description      Nullable(String),
    created_at       Nullable(DateTime) DEFAULT NOW()
)
    ENGINE = MergeTree()
        ORDER BY (transaction_id);
```

## 🎯 Advanced Usage

### Parse and Inspect Metadata

```python
...

parser = PostgreSQLParser()
tables = parser.parse_ddl(postgres_ddl)

for table in tables:
    # table.to_json()
    # table.to_dict()
    # table.print_json()
    # table.get_column()

    # Access individual table properties
    print(f"Table: {table.name}")
    print(f"Columns: {len(table.columns)}")
    print(f"Primary Keys: {table.primary_keys}")
```

### CLI Usage

```bash
# Convert a single file
pg2ch schema.sql

# Save to output file
pg2ch schema.sql --output clickhouse_schema.sql

# With validation (requires clickhouse-local)
pg2ch schema.sql --validate
```

### DDL Validation

```python
from pg2ch import convert_ddl, validate_clickhouse_ddl_with_local

postgres_ddl = """
CREATE TABLE IF NOT EXISTS public.transactions (
    transaction_id    BIGSERIAL PRIMARY KEY,
    user_id           INTEGER NOT NULL,
    account_id        INTEGER NOT NULL,
    transaction_type  VARCHAR(20) NOT NULL,
    amount            DECIMAL(15,2) NOT NULL,
    currency          CHAR(3) DEFAULT 'USD',
    description       TEXT,
    created_at        TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP
);

"""

""" Run with 'validate=True' """

clickhouse_ddl = convert_ddl(postgres_ddl, validate=True)

""" or, use function """

is_valid, message = validate_clickhouse_ddl_with_local(clickhouse_ddl)
print(message)  # ✅ DDL is valid

```

## 📋 Type Mapping Reference

| PostgreSQL                 | ClickHouse     | Notes                     |
|----------------------------|----------------|---------------------------|
| `SERIAL`                   | `Int32`        | Auto-increment            |
| `BIGSERIAL`                | `Int64`        | Large auto-increment      |
| `VARCHAR(n)`               | `String`       | Variable length text      |
| `TEXT`                     | `String`       | Unlimited text            |
| `INTEGER`                  | `Int32`        | 32-bit integer            |
| `BIGINT`                   | `Int64`        | 64-bit integer            |
| `BOOLEAN`                  | `Bool`         | True/false values         |
| `DECIMAL(p,s)`             | `Decimal64(4)` | Fixed precision           |
| `TIMESTAMP`                | `DateTime`     | Date and time             |
| `TIMESTAMP WITH TIME ZONE` | `DateTime`     | Timezone-aware timestamps |
| `JSONB`                    | `String`       | JSON data as string       |
| `UUID`                     | `String`       | UUID as string            |

## 🧪 Testing

```bash
# Run all tests
pytest

# Run with coverage
pytest --cov=pg2ch

# Run specific test
pytest tests/test_transactions_table.py -v
```

## 📚 Documentation

- **[API Reference](docs/api.md)** - Detailed API documentation
- **[Usage Guide](docs/usage.md)** - Advanced usage patterns
- **[Contributing](docs/contributing.md)** - Development guidelines
- **[Changelog](CHANGELOG.md)** - Version history

## 📈 Roadmap

- [ ] **Foreign Key Support** - Convert foreign key constraints
- [ ] **Index Migration** - Transform PostgreSQL indexes
- [ ] **Partitioning Support** - Migrate table partitioning schemes
- [ ] **GUI Interface** - Web-based conversion tool

## 🐛 Known Limitations

- Complex constraints (CHECK with subqueries) are simplified
- PostgreSQL-specific functions may need manual adjustment
- Some advanced PostgreSQL types require custom mapping

## 📄 License

This project is licensed under the Apache License 2.0 - see the [LICENSE](LICENSE) file for details.

---

<div align="center">

**Made with ❤️ for the data engineering community**

[⭐ Star on GitHub](https://github.com/GujaLomsadze/pg2ch) • [🐛 Report Bug](https://github.com/GujaLomsadze/pg2ch/issues)

</div>
