dran.storage package#

Submodules#

dran.storage.db_introspection module#

dran.storage.db_introspection.record_exists(conn, table, key_field, key_value)[source]#

Fast existence check using an indexed lookup (intended for a UNIQUE field).

Returns True if a record exists, else False.

Parameters:
Return type:

bool

dran.storage.db_introspection.ensure_processed_files_table(conn)[source]#

Ensure a small registry table exists for processed files.

This enables fast de-duplication across path changes and symlinks.

Parameters:

conn (Connection)

Return type:

None
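The registry can be created idempotently with `CREATE TABLE IF NOT EXISTS`. A sketch, assuming a column set matching the `insert_processed_file` signature below (the actual schema may differ):

```python
import sqlite3

def ensure_processed_files_table(conn):
    # Hypothetical schema: columns mirror the insert_processed_file
    # keyword arguments; file_hash is UNIQUE to support de-duplication.
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS processed_files (
            file_hash  TEXT UNIQUE,
            file_size  INTEGER,
            file_mtime REAL,
            filepath   TEXT,
            filename   TEXT
        )
        """
    )

conn = sqlite3.connect(":memory:")
ensure_processed_files_table(conn)
ensure_processed_files_table(conn)  # repeat calls are safe: IF NOT EXISTS is a no-op
```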

dran.storage.db_introspection.processed_file_exists_by_path(conn, filepath)[source]#

Return True if a processed-file record exists for the given file path.

Parameters:
Return type:

bool

dran.storage.db_introspection.processed_file_hashes_by_size(conn, file_size)[source]#

Return the stored hashes of all processed files with the given size.

Parameters:
Return type:

List[str]

dran.storage.db_introspection.insert_processed_file(conn, *, file_hash, file_size, file_mtime, filepath, filename)[source]#

Insert a record into the processed-files registry.

Parameters:
Return type:

None
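Together, these functions support hash-based de-duplication: check cheap metadata first (path, size), and only compare content hashes when the size matches. A sketch of how a caller might combine them, with hypothetical re-implementations over the registry table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE IF NOT EXISTS processed_files "
    "(file_hash TEXT UNIQUE, file_size INTEGER, file_mtime REAL, "
    "filepath TEXT, filename TEXT)"
)

def insert_processed_file(conn, *, file_hash, file_size, file_mtime,
                          filepath, filename):
    # OR IGNORE: re-registering the same hash is a silent no-op.
    conn.execute(
        "INSERT OR IGNORE INTO processed_files VALUES (?, ?, ?, ?, ?)",
        (file_hash, file_size, file_mtime, filepath, filename),
    )

def processed_file_exists_by_path(conn, filepath):
    cur = conn.execute(
        "SELECT 1 FROM processed_files WHERE filepath = ? LIMIT 1", (filepath,)
    )
    return cur.fetchone() is not None

def processed_file_hashes_by_size(conn, file_size):
    # Candidate hashes for one file size; the caller hashes the new file
    # only if this list is non-empty.
    cur = conn.execute(
        "SELECT file_hash FROM processed_files WHERE file_size = ?", (file_size,)
    )
    return [row[0] for row in cur.fetchall()]

insert_processed_file(conn, file_hash="abc123", file_size=2048,
                      file_mtime=1700000000.0,
                      filepath="/data/obs1.fits", filename="obs1.fits")
```

Keying on the hash rather than the path is what makes de-duplication robust across renames and symlinks.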

dran.storage.sqlite_connection module#

dran.storage.sqlite_connection.get_connection(db_path, log=None)[source]#

Open a SQLite connection with pragmatic defaults for local workloads.

Settings applied:

- WAL mode for better concurrent reads/writes
- synchronous NORMAL for balanced durability and speed
- busy_timeout to reduce “database is locked” failures

Parameters:
Return type:

Connection
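The three settings from the docstring correspond to standard SQLite PRAGMAs. A sketch of the documented behaviour; the real function may differ in detail (timeout value, logging):

```python
import sqlite3

def get_connection(db_path, log=None):
    conn = sqlite3.connect(db_path)
    conn.execute("PRAGMA journal_mode=WAL")    # readers don't block the writer
    conn.execute("PRAGMA synchronous=NORMAL")  # fewer fsyncs; safe with WAL
    conn.execute("PRAGMA busy_timeout=5000")   # wait up to 5 s instead of
                                               # failing with "database is locked"
    if log is not None:
        log.debug("Opened SQLite connection to %s", db_path)
    return conn

conn = get_connection(":memory:")
```

Note that in-memory databases cannot use WAL; SQLite reports their journal mode as `memory` and the PRAGMA is harmless.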

dran.storage.sqlite_repository module#

dran.storage.sqlite_repository.insert_dict(conn, table, item)[source]#

Insert a dict into table and return inserted row id.

Parameters:
Return type:

int

dran.storage.sqlite_repository.fetch_row(conn, table, row_id)[source]#

Fetch a row and reconstruct arrays from BLOBs where possible.

Parameters:
Return type:

dict[str, Any]
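insert_dict and fetch_row are mirror images: one builds an INSERT from a dict's keys, the other rebuilds a dict from a row's columns. A simplified sketch (BLOB-to-array reconstruction omitted; not the package's actual code):

```python
import sqlite3

def insert_dict(conn, table, item):
    # Column list comes from the dict's own keys; values are bound
    # through placeholders, which keeps the insert injection-safe.
    cols = ", ".join(item.keys())
    marks = ", ".join("?" for _ in item)
    cur = conn.execute(
        f"INSERT INTO {table} ({cols}) VALUES ({marks})", tuple(item.values())
    )
    return cur.lastrowid

def fetch_row(conn, table, row_id):
    # cursor.description carries the column names, so the dict can be
    # reassembled without knowing the schema in advance.
    cur = conn.execute(f"SELECT * FROM {table} WHERE rowid = ?", (row_id,))
    row = cur.fetchone()
    names = [d[0] for d in cur.description]
    return dict(zip(names, row))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scans (FILENAME TEXT, TSYS REAL)")
rid = insert_dict(conn, "scans", {"FILENAME": "obs1.fits", "TSYS": 35.2})
```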

dran.storage.sqlite_repository.get_existing_keys(conn, table, key)[source]#

Load all existing values of a key into a set for fast membership checks.

Parameters:
Return type:

set[Any]
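This trades one up-front query for O(1) membership tests afterwards, which pays off when filtering many candidate files against the database. A sketch:

```python
import sqlite3

def get_existing_keys(conn, table, key):
    # One full-column scan now, then constant-time "in" checks in Python.
    cur = conn.execute(f"SELECT {key} FROM {table}")
    return {row[0] for row in cur.fetchall()}

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scans (FILENAME TEXT)")
conn.executemany("INSERT INTO scans VALUES (?)", [("a.fits",), ("b.fits",)])
keys = get_existing_keys(conn, "scans", "FILENAME")
```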

dran.storage.sqlite_repository.save_record(conn, table, item, *, create_table_fn=None)[source]#

Insert one record. Returns row id.

create_table_fn is optional and lets callers ensure schema before insert.

Parameters:
Return type:

int
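The create_table_fn hook lets schema creation stay lazy: the table is only ensured when the first record arrives. A sketch with a hypothetical callback (`make_table` is illustrative, not part of dran; in practice the callback would typically wrap ensure_table_from_dict):

```python
import sqlite3

def save_record(conn, table, item, *, create_table_fn=None):
    # Give the caller a chance to create the table before inserting.
    if create_table_fn is not None:
        create_table_fn(conn, table, item)
    cols = ", ".join(item.keys())
    marks = ", ".join("?" for _ in item)
    cur = conn.execute(
        f"INSERT INTO {table} ({cols}) VALUES ({marks})", tuple(item.values())
    )
    return cur.lastrowid

def make_table(conn, table, sample):
    # Crude illustrative schema: every column as TEXT.
    cols = ", ".join(f"{k} TEXT" for k in sample)
    conn.execute(f"CREATE TABLE IF NOT EXISTS {table} ({cols})")

conn = sqlite3.connect(":memory:")
row_id = save_record(conn, "scans", {"FILENAME": "obs1.fits"},
                     create_table_fn=make_table)
```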

dran.storage.sqlite_schema module#

dran.storage.sqlite_schema.infer_sqlite_type(value)[source]#

Infer an SQLite column type from a sample value.

Uses:

- BLOB for non-scalar NumPy arrays
- REAL for int/float scalars
- TEXT for everything else

Parameters:

value (Any)

Return type:

str
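The documented mapping can be sketched as follows. Note that, per the docstring, integer scalars map to REAL as well; edge cases in the real implementation may differ:

```python
import numpy as np

def infer_sqlite_type(value):
    # Non-scalar arrays are stored as encoded bytes elsewhere in the package.
    if isinstance(value, np.ndarray) and value.shape != ():
        return "BLOB"
    # Numeric scalars: both int and float map to REAL per the docstring.
    if isinstance(value, (int, float)):
        return "REAL"
    # Strings and anything unrecognized.
    return "TEXT"
```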

dran.storage.sqlite_schema.ensure_table_from_dict(conn, table, sample, unique_field='FILENAME')[source]#

Create a table if it does not exist.

Column names are taken from sample keys. Each column type is inferred from sample values.

unique_field is used as a UNIQUE constraint if it exists in sample.

Parameters:
Return type:

None
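The DDL generation might look like this sketch, which uses a simplified type-inference stand-in (`infer_type` is illustrative; the real code uses infer_sqlite_type):

```python
import sqlite3

def infer_type(value):
    # Simplified stand-in: numeric scalars -> REAL, everything else -> TEXT.
    return "REAL" if isinstance(value, (int, float)) else "TEXT"

def ensure_table_from_dict(conn, table, sample, unique_field="FILENAME"):
    cols = []
    for name, value in sample.items():
        col = f"{name} {infer_type(value)}"
        if name == unique_field:
            col += " UNIQUE"  # the de-duplication hook on the key column
        cols.append(col)
    conn.execute(f"CREATE TABLE IF NOT EXISTS {table} ({', '.join(cols)})")

conn = sqlite3.connect(":memory:")
ensure_table_from_dict(conn, "scans", {"FILENAME": "obs1.fits", "TSYS": 35.2})
```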

dran.storage.sqlite_types module#

dran.storage.sqlite_types.array_to_blob(arr)[source]#

Encode a NumPy array as bytes for SQLite storage.

Uses np.save into an in-memory buffer, preserving dtype and shape.

Parameters:

arr (ndarray)

Return type:

bytes

dran.storage.sqlite_types.blob_to_array(blob)[source]#

Decode stored bytes back into a NumPy array.

Parameters:

blob (bytes)

Return type:

ndarray
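The round trip relies on the .npy container written by np.save, which records dtype and shape in its header. A sketch of the pair (a plausible implementation of the documented behaviour, not necessarily the package's exact code):

```python
import io
import numpy as np

def array_to_blob(arr):
    # np.save writes the full .npy header, so dtype and shape survive the trip.
    buf = io.BytesIO()
    np.save(buf, arr)
    return buf.getvalue()

def blob_to_array(blob):
    return np.load(io.BytesIO(blob))

original = np.arange(6, dtype=np.float32).reshape(2, 3)
restored = blob_to_array(array_to_blob(original))
```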

dran.storage.sqlite_types.normalize_for_schema(value)[source]#

Normalize values for schema inference.

SQLite column typing is coarse. This converts NumPy scalars and 0-D arrays into Python scalars so the type checks behave as expected.

Parameters:

value (Any)

Return type:

Any

dran.storage.sqlite_types.normalize_for_storage(value)[source]#

Prepare a value for SQLite insertion.

Rules:

- 0-D NumPy arrays -> Python scalar
- NumPy scalars -> Python scalar
- N-D NumPy arrays (shape != ()) -> BLOB
- Everything else unchanged

Parameters:

value (Any)

Return type:

Any
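Both normalizers can be sketched together; `_to_blob` below stands in for array_to_blob (hypothetical helper name, same np.save-based encoding):

```python
import io
import numpy as np

def _to_blob(arr):
    buf = io.BytesIO()
    np.save(buf, arr)
    return buf.getvalue()

def normalize_for_schema(value):
    # Collapse NumPy scalars and 0-D arrays into plain Python scalars so
    # isinstance-based type inference behaves as expected.
    if isinstance(value, np.generic):
        return value.item()
    if isinstance(value, np.ndarray) and value.shape == ():
        return value.item()
    return value

def normalize_for_storage(value):
    # The rules from the docstring: scalars unwrap, N-D arrays become BLOBs.
    if isinstance(value, np.ndarray):
        return value.item() if value.shape == () else _to_blob(value)
    if isinstance(value, np.generic):
        return value.item()
    return value
```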

Module contents#

dran.storage.get_connection(db_path, log=None)[source]#

Open a SQLite connection with pragmatic defaults for local workloads.

Settings applied:

- WAL mode for better concurrent reads/writes
- synchronous NORMAL for balanced durability and speed
- busy_timeout to reduce “database is locked” failures

Parameters:
Return type:

Connection

dran.storage.ensure_table_from_dict(conn, table, sample, unique_field='FILENAME')[source]#

Create a table if it does not exist.

Column names are taken from sample keys. Each column type is inferred from sample values.

unique_field is used as a UNIQUE constraint if it exists in sample.

Parameters:
Return type:

None

dran.storage.insert_dict(conn, table, item)[source]#

Insert a dict into table and return inserted row id.

Parameters:
Return type:

int

dran.storage.fetch_row(conn, table, row_id)[source]#

Fetch a row and reconstruct arrays from BLOBs where possible.

Parameters:
Return type:

dict[str, Any]

dran.storage.get_existing_keys(conn, table, key)[source]#

Load all existing values of a key into a set for fast membership checks.

Parameters:
Return type:

set[Any]

dran.storage.save_record(conn, table, item, *, create_table_fn=None)[source]#

Insert one record. Returns row id.

create_table_fn is optional and lets callers ensure schema before insert.

Parameters:
Return type:

int

dran.storage.array_to_blob(arr)[source]#

Encode a NumPy array as bytes for SQLite storage.

Uses np.save into an in-memory buffer, preserving dtype and shape.

Parameters:

arr (ndarray)

Return type:

bytes

dran.storage.blob_to_array(blob)[source]#

Decode stored bytes back into a NumPy array.

Parameters:

blob (bytes)

Return type:

ndarray

dran.storage.normalize_for_schema(value)[source]#

Normalize values for schema inference.

SQLite column typing is coarse. This converts NumPy scalars and 0-D arrays into Python scalars so the type checks behave as expected.

Parameters:

value (Any)

Return type:

Any

dran.storage.normalize_for_storage(value)[source]#

Prepare a value for SQLite insertion.

Rules:

- 0-D NumPy arrays -> Python scalar
- NumPy scalars -> Python scalar
- N-D NumPy arrays (shape != ()) -> BLOB
- Everything else unchanged

Parameters:

value (Any)

Return type:

Any