API surface

Public modules, classes, and functions.

Module index

Browse modules and jump into details.

Class index

Click a class to open its definition.

Function index

Click a function to open its definition.

Function · Module · Docstring
try_cast_string_columns_to_numeric · datablade.dataframes.frames · Yes
clean_dataframe_columns · datablade.dataframes.frames · Yes
generate_parquet_schema · datablade.dataframes.frames · Yes
pandas_to_parquet_table · datablade.dataframes.frames · Yes
generate_sql_server_create_table_string · datablade.dataframes.frames · Yes
write_to_file_and_sql · datablade.dataframes.frames · Yes
read_file_chunked · datablade.dataframes.readers · Yes
read_file_iter · datablade.dataframes.readers · Yes
read_file_to_parquets · datablade.dataframes.readers · Yes
stream_to_parquets · datablade.dataframes.readers · Yes
read_file_smart · datablade.dataframes.readers · Yes
list_docs · datablade.docs · Yes
read_doc · datablade.docs · Yes
write_docs · datablade.docs · Yes
write_site · datablade.docs · Yes
main · datablade.docs · Yes
get · datablade.io.json · Yes
get · datablade.io.zip · Yes
bulk_load_sqlserver · datablade.sql.bulk_load · Yes
bulk_load_postgres · datablade.sql.bulk_load · Yes
bulk_load_mysql · datablade.sql.bulk_load · Yes
bulk_load_duckdb · datablade.sql.bulk_load · Yes
bulk_load · datablade.sql.bulk_load · Yes
write_dataframe_and_load · datablade.sql.bulk_load · Yes
generate_create_table · datablade.sql.ddl · Yes
generate_create_table_from_parquet · datablade.sql.ddl_pyarrow · Yes
quote_identifier · datablade.sql.quoting · Yes
flatten · datablade.utils.lists · Yes
get_logger · datablade.utils.logging · Yes
configure_logging · datablade.utils.logging · Yes
log · datablade.utils.logging · Yes
log_debug · datablade.utils.logging · Yes
log_info · datablade.utils.logging · Yes
log_warning · datablade.utils.logging · Yes
log_error · datablade.utils.logging · Yes
print_verbose · datablade.utils.logging · Yes
sql_quotename · datablade.utils.strings · Yes
pathing · datablade.utils.strings · Yes

Module details

Expanded views of public classes and functions.

datablade.blade

src/datablade/blade.py · 1 class · 0 functions

Optional facade class for datablade. The canonical API is module-level functions (e.g., datablade.dataframes.read_file_iter). This module provides a small co...

Classes

Blade · 8 methods
class Blade:

Convenience facade for common datablade workflows. Stores default options that are threaded through to the underlying functions.

Methods

  • read(self, file_path, **read_kwargs)
  • iter(self, file_path, *, chunksize=None, **read_kwargs)
  • partition_to_parquets(self, file_path, output_dir, *, output_prefix='part', rows_per_file=None, convert_types=None, **read_kwargs)
  • stream_to_parquets(self, file_path, output_dir, *, output_prefix='part', rows_per_file=None, convert_types=None, **read_kwargs)
  • clean(self, df)
  • cast_numeric(self, df)
  • create_table_sql(self, df, *, catalog=None, schema=None, table='table', drop_existing=True, dialect=Dialect.SQLSERVER)
  • create_table_sql_from_parquet(self, parquet_path, *, catalog=None, schema=None, table='table', drop_existing=True, dialect=Dialect.SQLSERVER)

Source: src/datablade/blade.py:30
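The facade pattern described above — a small class that captures default options once and threads them into module-level functions — can be sketched as follows. The `Facade` class and `read_file` helper here are hypothetical stand-ins, not the actual `Blade` internals:

```python
def read_file(path, *, verbose=False, memory_fraction=0.5):
    """Stand-in for a module-level reader function."""
    return {"path": path, "verbose": verbose, "memory_fraction": memory_fraction}

class Facade:
    def __init__(self, *, verbose=False, memory_fraction=0.5):
        # Defaults captured once, reused by every method.
        self._defaults = {"verbose": verbose, "memory_fraction": memory_fraction}

    def read(self, path, **overrides):
        # Per-call overrides win over stored defaults.
        return read_file(path, **{**self._defaults, **overrides})

blade = Facade(verbose=True)
result = blade.read("data.csv")   # uses the stored defaults
print(result["verbose"])          # True
```

The design choice is that per-call keyword arguments always take precedence over the defaults set at construction time.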

datablade.dataframes.frames

src/datablade/dataframes/frames.py · 0 classes · 6 functions

Utilities for dataframes.frames.

Functions

try_cast_string_columns_to_numeric(df=None, convert_partial=False, verbose=False)
def try_cast_string_columns_to_numeric(df=None, convert_partial=False, verbose=False):

Attempt to cast DataFrame string columns to numeric values where possible. Args: df: The DataFrame to process. If None, returns None. convert_partial: If Tru...

Source: src/datablade/dataframes/frames.py:71
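The core idea — convert a string column to numeric only when its values parse, optionally tolerating partial failures — can be illustrated without pandas. This `try_cast_column` helper is an illustrative sketch, not the library's implementation:

```python
def try_cast_column(values, convert_partial=False):
    """Cast a list of strings to floats; leave it untouched on any failure
    unless convert_partial is True, in which case unparseable values are kept."""
    out = []
    for v in values:
        try:
            out.append(float(v))
        except (TypeError, ValueError):
            if not convert_partial:
                return values  # strict mode: leave the whole column as-is
            out.append(v)      # partial mode: keep the unparseable value
    return out

print(try_cast_column(["1", "2.5", "3"]))   # [1.0, 2.5, 3.0]
print(try_cast_column(["1", "x"]))          # unchanged: ['1', 'x']
```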

clean_dataframe_columns(df=None, verbose=False)
def clean_dataframe_columns(df=None, verbose=False):

Clean the DataFrame columns by flattening MultiIndex, converting to strings, and removing duplicates. Args: df: The DataFrame to clean. If None, returns None...

Source: src/datablade/dataframes/frames.py:128

generate_parquet_schema(df=None, verbose=False)
def generate_parquet_schema(df=None, verbose=False):

Generate a PyArrow Schema from a pandas DataFrame with optimized data types. Args: df: The DataFrame to generate the schema from. If None, returns None. verb...

Source: src/datablade/dataframes/frames.py:171

pandas_to_parquet_table(df=None, convert=True, partial=False, preserve_index=False, verbose=False)
def pandas_to_parquet_table(df=None, convert=True, partial=False, preserve_index=False, verbose=False):

Generate a PyArrow Table from a pandas DataFrame with automatic type conversion. Args: df: The DataFrame to convert. If None, returns None. convert: If True,...

Source: src/datablade/dataframes/frames.py:264

generate_sql_server_create_table_string(df=None, catalog='database', schema='dbo', table='table', dropexisting=True, verbose=False)
def generate_sql_server_create_table_string(df=None, catalog='database', schema='dbo', table='table', dropexisting=True, verbose=False):

Generate a SQL Server CREATE TABLE statement from a pandas DataFrame. Args: df: The DataFrame to generate the schema from. If None, returns None. catalog: Th...

Source: src/datablade/dataframes/frames.py:350

write_to_file_and_sql(df, file_path, table_name, sql_server, database, username, password, verbose=False)
def write_to_file_and_sql(df, file_path, table_name, sql_server, database, username, password, verbose=False):

Write a DataFrame to a CSV file and import it to SQL Server using BCP. Args: df: The DataFrame to write. file_path: Path where the CSV file will be saved. ta...

Source: src/datablade/dataframes/frames.py:402

datablade.dataframes.readers

src/datablade/dataframes/readers.py · 0 classes · 5 functions

Memory-aware file reading utilities with Polars support. This module provides intelligent file reading that: - Estimates memory requirements before loading -...

Functions

read_file_chunked(file_path, chunksize=None, memory_fraction=0.5, verbose=False, **read_kwargs)
def read_file_chunked(file_path, chunksize=None, memory_fraction=0.5, verbose=False, **read_kwargs):

Read a file in chunks, automatically determining chunk size based on available memory. Args: file_path: Path to the file to read. chunksize: Optional explici...

Source: src/datablade/dataframes/readers.py:138

read_file_iter(file_path, chunksize=None, memory_fraction=0.5, verbose=False, **read_kwargs)
def read_file_iter(file_path, chunksize=None, memory_fraction=0.5, verbose=False, **read_kwargs):

Stream a file as an iterator of DataFrame chunks. This is the "never materialize" API: unlike read_file_smart(), this function does not concatenate chunks in...

Source: src/datablade/dataframes/readers.py:219
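The "never materialize" streaming pattern can be sketched with the stdlib `csv` module: yield fixed-size batches of rows instead of loading the whole file. Memory-based chunk sizing, which `read_file_iter` handles automatically, is omitted here for brevity:

```python
import csv
import io
import itertools

def iter_csv_chunks(fobj, chunksize):
    """Yield lists of at most `chunksize` rows; never holds the full file."""
    reader = csv.DictReader(fobj)
    while True:
        chunk = list(itertools.islice(reader, chunksize))
        if not chunk:
            break
        yield chunk

data = io.StringIO("a,b\n1,2\n3,4\n5,6\n")
chunks = list(iter_csv_chunks(data, chunksize=2))
print(len(chunks))   # 2 chunks: two rows, then one
```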

read_file_to_parquets(file_path, output_dir, output_prefix='part', rows_per_file=None, memory_fraction=0.5, convert_types=True, verbose=False, **read_kwargs)
def read_file_to_parquets(file_path, output_dir, output_prefix='part', rows_per_file=None, memory_fraction=0.5, convert_types=True, verbose=False, **read_kwargs):

Read a large file and write it to multiple Parquet files if it doesn't fit in memory. Args: file_path: Path to the input file. output_dir: Directory where Pa...

Source: src/datablade/dataframes/readers.py:315

stream_to_parquets(file_path, output_dir, output_prefix='part', rows_per_file=None, memory_fraction=0.5, convert_types=True, verbose=False, **read_kwargs)
def stream_to_parquets(file_path, output_dir, output_prefix='part', rows_per_file=None, memory_fraction=0.5, convert_types=True, verbose=False, **read_kwargs):

Stream a file and write it to Parquet partitions without materializing. This helper is the "no concat" companion to read_file_to_parquets(). It uses read_fil...

Source: src/datablade/dataframes/readers.py:376

read_file_smart(file_path, use_polars=True, memory_fraction=0.5, verbose=False, **read_kwargs)
def read_file_smart(file_path, use_polars=True, memory_fraction=0.5, verbose=False, **read_kwargs):

Intelligently read a file, using Polars for large files if available. For files that fit in memory, reads directly. For large files, uses Polars lazy evaluat...

Source: src/datablade/dataframes/readers.py:441

datablade.docs

src/datablade/docs/__init__.py · 0 classes · 5 functions

Public exports for docs.

Functions

list_docs()
def list_docs():

List docs.

Source: src/datablade/docs/__init__.py:17

read_doc(name)
def read_doc(name):

Read doc.

Source: src/datablade/docs/__init__.py:22

write_docs(target_dir)
def write_docs(target_dir):

Write docs.

Source: src/datablade/docs/__init__.py:34

write_site(target_dir)
def write_site(target_dir):

Write site.

Source: src/datablade/docs/__init__.py:47

main(argv=None)
def main(argv=None):

Main.

Source: src/datablade/docs/__init__.py:100

datablade.io.json

src/datablade/io/json.py · 0 classes · 1 function

Utilities for io.json.

Functions

get(url, verbose=False, **kwargs)
def get(url, verbose=False, **kwargs):

Get JSON data from a URL using HTTP GET request. Args: url: The URL to fetch JSON data from (must be non-empty string). verbose: If True, prints error messag...

Source: src/datablade/io/json.py:10

datablade.io.zip

src/datablade/io/zip.py · 0 classes · 1 function

Utilities for io.zip.

Functions

get(url, path=None, verbose=False, **kwargs)
def get(url, path=None, verbose=False, **kwargs):

Download a ZIP file from a URL and either extract it to a path or return as BytesIO. Args: url: The URL of the ZIP file to download. path: Optional path wher...

Source: src/datablade/io/zip.py:13

datablade.sql.bulk_load

src/datablade/sql/bulk_load.py · 0 classes · 6 functions

Bulk loading utilities for SQL databases. Provides dialect-aware bulk loading from files to database tables. Supports SQL Server (BCP), PostgreSQL (COPY), My...

Functions

bulk_load_sqlserver(file_path, table_name, database, server, username, password, schema='dbo', delimiter=',', verbose=False)
def bulk_load_sqlserver(file_path, table_name, database, server, username, password, schema='dbo', delimiter=',', verbose=False):

Bulk load a file into SQL Server using BCP. Args: file_path: Path to the data file. table_name: Target table name. database: Database name. server: SQL Serve...

Source: src/datablade/sql/bulk_load.py:52

bulk_load_postgres(file_path, table_name, database, schema='public', connection_string=None, delimiter=',', header=True, verbose=False)
def bulk_load_postgres(file_path, table_name, database, schema='public', connection_string=None, delimiter=',', header=True, verbose=False):

Generate a PostgreSQL COPY command for bulk loading. Args: file_path: Path to the data file. table_name: Target table name. database: Database name. schema:...

Source: src/datablade/sql/bulk_load.py:125

bulk_load_mysql(file_path, table_name, database, delimiter=',', enclosed_by='"', lines_terminated_by='\\n', ignore_lines=1, verbose=False)
def bulk_load_mysql(file_path, table_name, database, delimiter=',', enclosed_by='"', lines_terminated_by='\\n', ignore_lines=1, verbose=False):

Generate a MySQL LOAD DATA command for bulk loading. Args: file_path: Path to the data file. table_name: Target table name. database: Database name. delimite...

Source: src/datablade/sql/bulk_load.py:170

bulk_load_duckdb(file_path, table_name, database='memory', schema='main', verbose=False)
def bulk_load_duckdb(file_path, table_name, database='memory', schema='main', verbose=False):

Generate a DuckDB COPY command for bulk loading. Args: file_path: Path to the data file. table_name: Target table name. database: Database name (default: mem...

Source: src/datablade/sql/bulk_load.py:218

bulk_load(file_path, table_name, database, dialect=Dialect.SQLSERVER, schema=None, server=None, username=None, password=None, delimiter=',', verbose=False)
def bulk_load(file_path, table_name, database, dialect=Dialect.SQLSERVER, schema=None, server=None, username=None, password=None, delimiter=',', verbose=False):

Bulk load a file to a database table using the appropriate dialect method. Args: file_path: Path to the data file. table_name: Target table name. database: D...

Source: src/datablade/sql/bulk_load.py:260
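The dialect dispatch behind `bulk_load` can be sketched as a function that selects a command template per dialect. The command strings below are simplified illustrations (real COPY/LOAD DATA commands take many more options) and are not datablade's exact output:

```python
def bulk_load_command(dialect, file_path, table):
    """Return a simplified bulk-load command string for the given dialect."""
    if dialect == "postgres":
        return f"COPY {table} FROM '{file_path}' WITH (FORMAT csv, HEADER true)"
    if dialect == "mysql":
        return f"LOAD DATA INFILE '{file_path}' INTO TABLE {table}"
    if dialect == "duckdb":
        return f"COPY {table} FROM '{file_path}' (AUTO_DETECT true)"
    raise ValueError(f"unsupported dialect: {dialect}")

cmd = bulk_load_command("postgres", "/tmp/data.csv", "events")
print(cmd)
```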

write_dataframe_and_load(df, file_path, table_name, database, dialect=Dialect.SQLSERVER, schema=None, server=None, username=None, password=None, delimiter=',', verbose=False)
def write_dataframe_and_load(df, file_path, table_name, database, dialect=Dialect.SQLSERVER, schema=None, server=None, username=None, password=None, delimiter=',', verbose=False):

Write a DataFrame to a file and bulk load it to a database. Args: df: The DataFrame to write. file_path: Path where the file will be saved. table_name: Targe...

Source: src/datablade/sql/bulk_load.py:340

datablade.sql.ddl

src/datablade/sql/ddl.py · 0 classes · 1 function

Utilities for sql.ddl.

Functions

generate_create_table(df, catalog=None, schema=None, table='table', drop_existing=True, dialect=Dialect.SQLSERVER, verbose=False)
def generate_create_table(df, catalog=None, schema=None, table='table', drop_existing=True, dialect=Dialect.SQLSERVER, verbose=False):

Generate a CREATE TABLE statement for the given dialect. Args: df: Source DataFrame. catalog: Optional catalog/database name. schema: Optional schema name (d...

Source: src/datablade/sql/ddl.py:162
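A CREATE TABLE generator of this kind works by mapping DataFrame dtypes to SQL column types. The sketch below uses a SQL Server-flavored mapping over plain dtype strings; both the type map and the output format are illustrative, not the library's exact DDL:

```python
# Hypothetical dtype -> SQL Server type mapping, for illustration only.
TYPE_MAP = {"int64": "BIGINT", "float64": "FLOAT", "object": "NVARCHAR(MAX)", "bool": "BIT"}

def create_table_sql(columns, table="table", schema="dbo"):
    """columns: mapping of column name -> pandas-style dtype string."""
    cols = ",\n  ".join(
        f"[{name}] {TYPE_MAP.get(dtype, 'NVARCHAR(MAX)')}"
        for name, dtype in columns.items()
    )
    return f"CREATE TABLE [{schema}].[{table}] (\n  {cols}\n);"

ddl = create_table_sql({"id": "int64", "name": "object"}, table="users")
print(ddl)
```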

datablade.sql.ddl_pyarrow

src/datablade/sql/ddl_pyarrow.py · 0 classes · 1 function

Utilities for sql.ddl_pyarrow.

Functions

generate_create_table_from_parquet(parquet_path, catalog=None, schema=None, table='table', drop_existing=True, dialect=Dialect.SQLSERVER, verbose=False)
def generate_create_table_from_parquet(parquet_path, catalog=None, schema=None, table='table', drop_existing=True, dialect=Dialect.SQLSERVER, verbose=False):

Generate a CREATE TABLE statement from a Parquet file schema. This reads the Parquet schema only (via PyArrow) and does not materialize data. Columns whose P...

Source: src/datablade/sql/ddl_pyarrow.py:209

datablade.sql.dialects

src/datablade/sql/dialects.py · 1 class · 0 functions

Utilities for sql.dialects.

Classes

Dialect · 0 methods
class Dialect(str, Enum):

Supported SQL dialects for datablade DDL helpers.

No public methods listed.

Source: src/datablade/sql/dialects.py:6

datablade.sql.quoting

src/datablade/sql/quoting.py · 0 classes · 1 function

Utilities for sql.quoting.

Functions

quote_identifier(name, dialect=Dialect.SQLSERVER)
def quote_identifier(name, dialect=Dialect.SQLSERVER):

Quote an identifier for the given SQL dialect. Args: name: Identifier to quote; must be non-empty string. dialect: Target SQL dialect. Returns: Quoted identi...

Source: src/datablade/sql/quoting.py:8
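Dialect-aware identifier quoting can be sketched as a lookup of the opening and closing quote characters per dialect. The quoting rules shown (brackets for SQL Server, backticks for MySQL, double quotes for PostgreSQL and DuckDB, with embedded closers doubled) are standard for those dialects, but this `quote` helper is a sketch, not datablade's implementation:

```python
QUOTES = {
    "sqlserver": ("[", "]"),
    "mysql": ("`", "`"),
    "postgres": ('"', '"'),
    "duckdb": ('"', '"'),
}

def quote(name, dialect="sqlserver"):
    if not name:
        raise ValueError("identifier must be a non-empty string")
    open_q, close_q = QUOTES[dialect]
    # Escape embedded closing quotes by doubling them.
    return f"{open_q}{name.replace(close_q, close_q * 2)}{close_q}"

print(quote("my table"))            # [my table]
print(quote("order", "postgres"))   # "order"
```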

datablade.utils.lists

src/datablade/utils/lists.py · 0 classes · 1 function

Utilities for utils.lists.

Functions

flatten(nest)
def flatten(nest):

Flatten a nested list recursively to a single-level list. Args: nest: A potentially nested list structure. Returns: A flat list containing all elements from...

Source: src/datablade/utils/lists.py:6
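Recursive flattening as described above can be written in a few lines. This stand-alone version illustrates the behavior and is not necessarily identical to datablade's `flatten`:

```python
def flatten(nest):
    """Recursively flatten nested lists into a single flat list."""
    flat = []
    for item in nest:
        if isinstance(item, list):
            flat.extend(flatten(item))  # recurse into sublists
        else:
            flat.append(item)
    return flat

print(flatten([1, [2, [3, 4]], 5]))   # [1, 2, 3, 4, 5]
```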

datablade.utils.logging

src/datablade/utils/logging.py · 0 classes · 8 functions

Logging utilities for datablade. Provides a configurable logger that can be used across all modules. By default, logs to console at INFO level. Users can con...

Functions

get_logger()
def get_logger():

Get the datablade logger instance. Returns: The configured datablade logger.

Source: src/datablade/utils/logging.py:34

configure_logging(level=logging.INFO, handler=None, format_string=None, *, log_file=None, format=None)
def configure_logging(level=logging.INFO, handler=None, format_string=None, *, log_file=None, format=None):

Configure the datablade logger. Args: level: Logging level (e.g., logging.DEBUG, logging.INFO). handler: Optional custom handler. If None, uses StreamHandler...

Source: src/datablade/utils/logging.py:45

log(message, level=logging.INFO, verbose=True)
def log(message, level=logging.INFO, verbose=True):

Log a message at the specified level if verbose is True. Args: message: The message to log (converted to string). level: Logging level (default: INFO). verbo...

Source: src/datablade/utils/logging.py:102
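A verbose-gated logging helper of this shape can be built directly on the stdlib `logging` module. The logger name and list-collecting handler below are illustrative choices for demonstration, not datablade's configuration:

```python
import logging

logger = logging.getLogger("example")
logger.setLevel(logging.DEBUG)
records = []

class ListHandler(logging.Handler):
    """Collects formatted messages into a list so output is inspectable."""
    def emit(self, record):
        records.append(record.getMessage())

logger.addHandler(ListHandler())

def log(message, level=logging.INFO, verbose=True):
    # Messages are dropped entirely when verbose is False.
    if verbose:
        logger.log(level, str(message))

log("loaded 10 rows")
log("skipped", verbose=False)
print(records)   # ['loaded 10 rows']
```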

log_debug(message, verbose=True)
def log_debug(message, verbose=True):

Log a DEBUG level message.

Source: src/datablade/utils/logging.py:125

log_info(message, verbose=True)
def log_info(message, verbose=True):

Log an INFO level message.

Source: src/datablade/utils/logging.py:130

log_warning(message, verbose=True)
def log_warning(message, verbose=True):

Log a WARNING level message.

Source: src/datablade/utils/logging.py:135

log_error(message, verbose=True)
def log_error(message, verbose=True):

Log an ERROR level message.

Source: src/datablade/utils/logging.py:140

print_verbose(message, verbose=True)
def print_verbose(message, verbose=True):

Print a message if verbose is True. This is a backward-compatible alias for log_info. Args: message: The message to print (converted to string). verbose: If...

Source: src/datablade/utils/logging.py:146

datablade.utils.strings

src/datablade/utils/strings.py · 0 classes · 2 functions

Utilities for utils.strings.

Functions

sql_quotename(name=None, brackets=True, ticks=False, verbose=False)
def sql_quotename(name=None, brackets=True, ticks=False, verbose=False):

Quote a SQL Server name string with brackets or ticks. Args: name: The name to quote. Must be a non-empty string. brackets: If True, wraps the name in square...

Source: src/datablade/utils/strings.py:9

pathing(input, verbose=False)
def pathing(input, verbose=False):

Standardize and validate a path string or Path object. Args: input: The path to standardize (string or pathlib.Path). Must not be None. verbose: If True, pri...

Source: src/datablade/utils/strings.py:55