Metadata-Version: 2.4
Name: dbt-bridge
Version: 0.1.3
Summary: A dbt-native Reverse ETL tool powered by dlt to move data between databases and APIs.
Author-email: Januka Peiris <jaypeiris91@gmail.com>
License: MIT
Keywords: dbt,dlt,etl,reverse-etl,data-engineering,duckdb
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Topic :: Database
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: dbt-core>=1.5.0
Requires-Dist: dbt-duckdb>=1.5.0
Requires-Dist: dlt[parquet]>=0.4.0
Requires-Dist: pandas
Requires-Dist: sqlalchemy>=1.4.0
Provides-Extra: postgres
Requires-Dist: dlt[postgres]; extra == "postgres"
Requires-Dist: psycopg2-binary; extra == "postgres"
Provides-Extra: snowflake
Requires-Dist: dlt[snowflake]; extra == "snowflake"
Requires-Dist: snowflake-sqlalchemy; extra == "snowflake"
Provides-Extra: redshift
Requires-Dist: dlt[redshift]; extra == "redshift"
Requires-Dist: psycopg2-binary; extra == "redshift"
Provides-Extra: bigquery
Requires-Dist: dlt[bigquery]; extra == "bigquery"
Requires-Dist: db-dtypes; extra == "bigquery"
Provides-Extra: duckdb
Requires-Dist: dlt[duckdb]; extra == "duckdb"
Provides-Extra: athena
Requires-Dist: dlt[athena]; extra == "athena"
Provides-Extra: databricks
Requires-Dist: dlt[databricks]; extra == "databricks"
Provides-Extra: mssql
Requires-Dist: dlt[mssql]; extra == "mssql"
Requires-Dist: pyodbc; extra == "mssql"
Provides-Extra: synapse
Requires-Dist: dlt[synapse]; extra == "synapse"
Requires-Dist: pyodbc; extra == "synapse"
Provides-Extra: fabric
Requires-Dist: dlt[mssql]; extra == "fabric"
Requires-Dist: pyodbc; extra == "fabric"
Provides-Extra: trino
Requires-Dist: dlt[trino]; extra == "trino"
Provides-Extra: s3
Requires-Dist: dlt[s3]; extra == "s3"
Provides-Extra: gcs
Requires-Dist: dlt[gs]; extra == "gcs"
Provides-Extra: azure
Requires-Dist: dlt[az]; extra == "azure"
Provides-Extra: filesystem
Requires-Dist: dlt[filesystem]; extra == "filesystem"
Provides-Extra: all
Requires-Dist: dlt[postgres]; extra == "all"
Requires-Dist: psycopg2-binary; extra == "all"
Requires-Dist: dlt[snowflake]; extra == "all"
Requires-Dist: dlt[redshift]; extra == "all"
Requires-Dist: dlt[bigquery]; extra == "all"
Requires-Dist: dlt[duckdb]; extra == "all"
Requires-Dist: dlt[athena]; extra == "all"
Requires-Dist: dlt[databricks]; extra == "all"
Requires-Dist: dlt[mssql]; extra == "all"
Requires-Dist: pyodbc; extra == "all"
Requires-Dist: dlt[synapse]; extra == "all"
Requires-Dist: dlt[trino]; extra == "all"
Requires-Dist: dlt[s3]; extra == "all"
Requires-Dist: dlt[gs]; extra == "all"
Requires-Dist: dlt[az]; extra == "all"
Dynamic: license-file

dbt-bridge

A dbt-native data movement layer powered by dlt — for cross-database sync, API ingestion, and (yes) Reverse ETL.
Do everything inside dbt Python models, with full lineage in your DAG.

dbt-bridge lets you extract, transform, and load data between sources and destinations, all inside dbt. It uses dlt for schema-aware loading and dbt “Ghost Sources” to keep your lineage complete.

It’s basically:
Move data anywhere → keep everything in one DAG.

🚀 Features

- **Cross-Database Movement**: Move data from Postgres → Snowflake, MySQL → BigQuery, DuckDB → S3, and more.
- **Reverse ETL** (optional, but supported): Push your modeled dbt tables into operational systems or external databases.
- **API Ingestion**: Pull data from REST APIs, transform it with Pandas, and load it into your warehouse.
- **The “Bridge Pattern”**: Extract → model locally (DuckDB) → push to another destination.
- **Lineage Support**: Registers “Ghost Sources” so all upstream dependencies appear in dbt docs (see the sketch after this list).
- **dbt Native**: Runs as part of `dbt run`, not as a separate process.
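
For lineage, the `dbt.source()` calls in the examples below resolve against a dbt source declaration. A minimal sketch of such a declaration, with names assumed to match Example 1 below (dbt-bridge may register its “Ghost Sources” differently under the hood):

```yaml
# models/sources.yml -- source and table names assumed to match Example 1
version: 2

sources:
  - name: postgres_prod
    tables:
      - name: users
```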

📦 Installation

Install the package with only the connectors you need:

```bash
pip install "dbt-bridge[snowflake,postgres]"
```

Or install everything:

```bash
pip install "dbt-bridge[all]"
```

Supported Extras

- **Warehouses**: snowflake, bigquery, redshift, databricks, synapse, fabric
- **Databases**: postgres, mssql, duckdb, trino, athena
- **Storage / Filesystems**: s3, gcs, azure, filesystem

🧪 Usage Examples
**1. Database → Database Transfer (Postgres → Snowflake)**
```python
import dbt_bridge
import dlt
from dlt.sources.sql_database import sql_database

def model(dbt, session):
    dbt.config(materialized='table')

    source = sql_database(schema="public", table_names=["users"])
    dbt.source("postgres_prod", "users")  # lineage

    destination = dlt.destinations.snowflake()

    return dbt_bridge.transfer(
        dbt=dbt,
        source_data=source,
        target_destination=destination,
        dataset_name="raw_postgres",
        table_name="users_synced",
    )
```

**2. API → Warehouse (with Pandas Transform)**
```python
import dbt_bridge
import dlt
from dlt.sources.helpers.rest_client import RESTClient

def model(dbt, session):
    dbt.config(materialized='table')

    client = RESTClient(base_url="https://api.example.com")
    raw = client.paginate("/users")

    df = dbt_bridge.api_to_df(raw)
    df["email"] = df["email"].str.lower()

    destination = dlt.destinations.snowflake()

    return dbt_bridge.transfer(
        dbt=dbt,
        source_data=df,
        target_destination=destination,
        dataset_name="raw_api",
        table_name="users",
    )
```

**3. The Bridge Pattern (Extract → SQL Transform → Push)**

**Step 1: Ingest (Python Model)**
Saved locally via DuckDB.
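
A minimal sketch of the ingest step, reusing the API helpers from Example 2; the model name and endpoint are placeholders:

```python
# models/staging/stg_raw_users.py  (placeholder model name)
import dbt_bridge
from dlt.sources.helpers.rest_client import RESTClient

def model(dbt, session):
    dbt.config(materialized='table')

    # Pull raw records from the upstream API
    client = RESTClient(base_url="https://api.example.com")
    raw = client.paginate("/users")

    # Returning the DataFrame materializes it as a local DuckDB table
    return dbt_bridge.api_to_df(raw)
```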

**Step 2: Transform (SQL Model)**
Standard dbt SQL logic.
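
A sketch of the SQL step feeding the `int_active_users` model referenced in Step 3; the column names are assumptions:

```sql
-- models/intermediate/int_active_users.sql
-- Plain dbt SQL over the locally ingested table
select
    id,
    lower(email) as email
from {{ ref('stg_raw_users') }}
where is_active
```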

**Step 3: Push (Python Model)**
Send the final result to another destination.

```python
import dbt_bridge
import dlt

def model(dbt, session):
    dbt.config(materialized='table')

    final_df = dbt.ref("int_active_users").arrow()

    destination = dlt.destinations.snowflake()

    return dbt_bridge.transfer(
        dbt=dbt,
        source_data=final_df,
        target_destination=destination,
        dataset_name="analytics_prod",
        table_name="active_users",
    )
```
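
Because every step is an ordinary dbt model, the whole bridge runs in dependency order with a single command (the model name here is a placeholder for the Step 3 model above):

```bash
# Runs the ingest, transform, and push models in DAG order
dbt run --select +push_active_users
```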

🔧 Configuration

dlt reads credentials from .dlt/secrets.toml in your dbt project root:

```toml
[destination.snowflake.credentials]
username = "user"
password = "password"
database = "ANALYTICS"
host = "account_id"
warehouse = "COMPUTE_WH"

[sources.sql_database.credentials]
drivername = "postgresql"
host = "localhost"
port = 5432
database = "source_db"
username = "user"
password = "password"
```

Incremental Loading

dbt-bridge supports three loading strategies:

**1. Replace (Full Refresh)**
```python
return dbt_bridge.transfer(
    dbt=dbt,
    source_data=df,
    target_destination=destination,
    dataset_name="raw_api",
    table_name="users",
    write_disposition="replace"  # Default
)
```

**2. Append (Add Only)**
```python
return dbt_bridge.transfer(
    dbt=dbt,
    source_data=df,
    target_destination=destination,
    dataset_name="raw_api",
    table_name="users",
    write_disposition="append"  # Never deletes
)
```

**3. Merge (Upsert)**
```python
return dbt_bridge.transfer(
    dbt=dbt,
    source_data=df,
    target_destination=destination,
    dataset_name="raw_api",
    table_name="users",
    write_disposition="merge",
    primary_key="id"  # Required for merge
)
```

For composite keys, pass a list:
```python
primary_key=["user_id", "date"]
```

📘 License

MIT
