Metadata-Version: 2.4
Name: dbt-bridge
Version: 0.1.0
Summary: A dbt-native Reverse ETL tool powered by dlt to move data between databases and APIs.
Author-email: Antigravity <antigravity@example.com>
License: MIT
Keywords: dbt,dlt,etl,reverse-etl,data-engineering,duckdb
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Topic :: Database
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: dbt-core>=1.5.0
Requires-Dist: dbt-duckdb>=1.5.0
Requires-Dist: dlt[parquet]>=0.4.0
Requires-Dist: pandas
Requires-Dist: sqlalchemy>=1.4.0
Provides-Extra: postgres
Requires-Dist: dlt[postgres]; extra == "postgres"
Requires-Dist: psycopg2-binary; extra == "postgres"
Provides-Extra: snowflake
Requires-Dist: dlt[snowflake]; extra == "snowflake"
Requires-Dist: snowflake-sqlalchemy; extra == "snowflake"
Provides-Extra: redshift
Requires-Dist: dlt[redshift]; extra == "redshift"
Requires-Dist: psycopg2-binary; extra == "redshift"
Provides-Extra: bigquery
Requires-Dist: dlt[bigquery]; extra == "bigquery"
Requires-Dist: db-dtypes; extra == "bigquery"
Provides-Extra: duckdb
Requires-Dist: dlt[duckdb]; extra == "duckdb"
Provides-Extra: athena
Requires-Dist: dlt[athena]; extra == "athena"
Provides-Extra: databricks
Requires-Dist: dlt[databricks]; extra == "databricks"
Provides-Extra: mssql
Requires-Dist: dlt[mssql]; extra == "mssql"
Requires-Dist: pyodbc; extra == "mssql"
Provides-Extra: synapse
Requires-Dist: dlt[synapse]; extra == "synapse"
Requires-Dist: pyodbc; extra == "synapse"
Provides-Extra: fabric
Requires-Dist: dlt[mssql]; extra == "fabric"
Requires-Dist: pyodbc; extra == "fabric"
Provides-Extra: trino
Requires-Dist: dlt[trino]; extra == "trino"
Provides-Extra: s3
Requires-Dist: dlt[s3]; extra == "s3"
Provides-Extra: gcs
Requires-Dist: dlt[gs]; extra == "gcs"
Provides-Extra: azure
Requires-Dist: dlt[az]; extra == "azure"
Provides-Extra: filesystem
Requires-Dist: dlt[filesystem]; extra == "filesystem"
Provides-Extra: all
Requires-Dist: dlt[postgres]; extra == "all"
Requires-Dist: psycopg2-binary; extra == "all"
Requires-Dist: dlt[snowflake]; extra == "all"
Requires-Dist: dlt[redshift]; extra == "all"
Requires-Dist: dlt[bigquery]; extra == "all"
Requires-Dist: dlt[duckdb]; extra == "all"
Requires-Dist: dlt[athena]; extra == "all"
Requires-Dist: dlt[databricks]; extra == "all"
Requires-Dist: dlt[mssql]; extra == "all"
Requires-Dist: pyodbc; extra == "all"
Requires-Dist: dlt[synapse]; extra == "all"
Requires-Dist: dlt[trino]; extra == "all"
Requires-Dist: dlt[s3]; extra == "all"
Requires-Dist: dlt[gs]; extra == "all"
Requires-Dist: dlt[az]; extra == "all"
Dynamic: license-file

# dbt-bridge

**A dbt-native Reverse ETL and Cross-Database Movement tool powered by `dlt`.**

`dbt-bridge` allows you to move data between databases, APIs, and warehouses directly within your dbt Python models. It leverages `dlt` (Data Load Tool) for robust, schema-aware data loading.

## Features

-   **Cross-Database Movement**: Move data from Postgres to Snowflake, S3 to BigQuery, etc.
-   **Reverse ETL**: Push your modeled dbt data to external destinations (Salesforce, HubSpot, Postgres, etc.).
-   **API Ingestion**: Fetch data from APIs, transform it with Pandas in-memory, and load it to your warehouse.
-   **The "Bridge Pattern"**: Extract -> Transform (Local) -> Load (Remote).
-   **Lineage**: Registers "Ghost Sources" in dbt so your lineage graph remains complete.

## Installation

Install `dbt-bridge` in your dbt environment, including the extras that match your specific source and destination.

```bash
# Example: Snowflake destination, Postgres source
pip install "dbt-bridge[snowflake,postgres]"

# Example: All supported extras
pip install "dbt-bridge[all]"
```

### Supported Extras
-   **Warehouses**: `snowflake`, `bigquery`, `redshift`, `databricks`, `synapse`, `fabric`
-   **Databases**: `postgres`, `mssql`, `duckdb`, `trino`, `athena`
-   **Storage**: `s3`, `gcs`, `azure`, `filesystem`

## Usage Patterns

### 1. Database-to-Database Transfer

Move a table from a source database (e.g., Postgres) to your destination (e.g., Snowflake).

```python
import dbt_bridge
import dlt
from dlt.sources.sql_database import sql_database

def model(dbt, session):
    dbt.config(materialized='table', packages=['dbt-bridge', 'dlt', 'psycopg2-binary'])

    # 1. Define Source (e.g., Postgres)
    # Credentials loaded from secrets.toml
    source = sql_database(schema="public", table_names=["users"])
    
    # 2. Explicit Lineage (Important!)
    dbt.source("postgres_prod", "users")

    # 3. Define Destination (e.g., Snowflake)
    destination = dlt.destinations.snowflake()

    # 4. Transfer
    return dbt_bridge.transfer(
        dbt=dbt,
        source_data=source,
        target_destination=destination,
        dataset_name="raw_postgres",
        table_name="users_synced"
    )
```

### 2. API-to-Database (with Transformation)

Fetch data from an API, transform it with Pandas, and load it.

```python
import dbt_bridge
import dlt
import pandas as pd
from dlt.sources.helpers.rest_client import RESTClient

def model(dbt, session):
    dbt.config(materialized='table', packages=['dbt-bridge', 'dlt', 'pandas'])

    # 1. Fetch Data
    client = RESTClient(base_url="https://api.example.com")
    data_generator = client.paginate("/users")
    
    # 2. Convert to DataFrame & Transform
    # Use the helper to flatten/convert dlt generators to Pandas
    df = dbt_bridge.api_to_df(data_generator)
    df['email'] = df['email'].str.lower()

    # 3. Load
    destination = dlt.destinations.snowflake()
    
    return dbt_bridge.transfer(
        dbt=dbt,
        source_data=df,
        target_destination=destination,
        dataset_name="raw_api",
        table_name="users"
    )
```

### 3. The "Bridge Pattern" (Cross-Database Transformation)

Extract from Source A -> Model Locally (DuckDB) -> Push to Destination B.

1.  **Ingest Model**: Python model fetches data from Source A and returns a DataFrame. dbt saves this as a local table.
2.  **Transform Model**: SQL model reads the local table and applies business logic.
3.  **Push Model**: Python model reads the final SQL model and pushes it to Destination B.
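**Ingest Model Example:**

A minimal sketch of stage 1. The API call is stubbed with inline records so the shape of the model is clear; in a real project you would fetch with the REST client or `sql_database` source shown above. Stage 2 is then an ordinary dbt SQL model that selects from this table.

```python
import pandas as pd

def model(dbt, session):
    # Stage 1 ("Ingest Model"): fetch from Source A and return a DataFrame.
    # dbt materializes the result as a local DuckDB table that downstream
    # SQL models can read with ref().
    dbt.config(materialized='table', packages=['pandas'])

    # Stubbed payload; in practice this comes from an API or a sql_database source.
    records = [
        {"id": 1, "email": "A@EXAMPLE.COM"},
        {"id": 2, "email": "B@EXAMPLE.COM"},
    ]
    return pd.DataFrame(records)
```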

**Push Model Example:**
```python
import dbt_bridge
import dlt

def model(dbt, session):
    dbt.config(materialized='table')

    # 1. Read Final Model
    # Convert to Arrow/Pandas for dlt compatibility
    final_df = dbt.ref("final_users").arrow() 

    # 2. Push to Destination
    destination = dlt.destinations.snowflake()
    
    return dbt_bridge.transfer(
        dbt=dbt,
        source_data=final_df,
        target_destination=destination,
        dataset_name="analytics_prod",
        table_name="final_report"
    )
```

## Configuration

`dlt` uses a `.dlt/secrets.toml` file (or environment variables) for credentials. Place this in your dbt project root.

```toml
[destination.snowflake.credentials]
database = "ANALYTICS"
password = "password"
username = "user"
host = "account_id" # Do not include .snowflakecomputing.com
warehouse = "COMPUTE_WH"

[sources.sql_database.credentials]
drivername = "postgresql"
database = "db_name"
password = "password"
username = "user"
host = "host"
port = 5432
```
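Alternatively, the same credentials can be supplied as environment variables using `dlt`'s naming convention: section names upper-cased and joined with double underscores. A sketch mirroring the Snowflake block above (values are placeholders):

```shell
# Mirrors [destination.snowflake.credentials] in secrets.toml
export DESTINATION__SNOWFLAKE__CREDENTIALS__DATABASE="ANALYTICS"
export DESTINATION__SNOWFLAKE__CREDENTIALS__PASSWORD="password"
export DESTINATION__SNOWFLAKE__CREDENTIALS__USERNAME="user"
export DESTINATION__SNOWFLAKE__CREDENTIALS__HOST="account_id"
export DESTINATION__SNOWFLAKE__CREDENTIALS__WAREHOUSE="COMPUTE_WH"
```

Environment variables take precedence over `secrets.toml`, which makes them convenient for CI and production deployments.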

## License

MIT
