datablade Documentation
This folder contains the human-facing documentation for datablade.
Quick start
Install:
pip install git+https://github.com/brentwc/data-prep.git
Basic usage:
import pandas as pd
from datablade.dataframes import read_file_smart, clean_dataframe_columns
from datablade.sql import Dialect, generate_create_table, generate_create_table_from_parquet
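# Read the file into a DataFrame, then clean its columns
df = read_file_smart("data.csv", verbose=True)
df = clean_dataframe_columns(df)
# Generate a CREATE TABLE statement for the PostgreSQL dialect
ddl = generate_create_table(df, table="my_table", dialect=Dialect.POSTGRES)
print(ddl)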
# Or: generate DDL from a Parquet file schema without materializing rows
# Note: nested Parquet types (struct/list/map/union) are dropped with a warning.
ddl2 = generate_create_table_from_parquet(
    "events.parquet",
    table="events",
    dialect=Dialect.POSTGRES,
)
print(ddl2)
Guides
- See docs/USAGE.md for the main usage guide (file reading, streaming, SQL, IO, logging).
- See docs/TESTING.md for running tests locally.