Documentation
Timefence v0.9.1
Timefence is a temporal correctness layer for ML training data. It guarantees that no future data leaks into your training set by enforcing point-in-time correctness at the moment of dataset construction.
It is not a feature store, data orchestrator, data quality framework, or ML pipeline.
Timefence does one thing: ensures feature_time < label_time for every row.
Installation
Timefence requires Python 3.9+ and pip.
pip install timefence
Optional extras:
# Jupyter notebook support (quoted so shells such as zsh do not expand the brackets)
pip install "timefence[notebook]"
# Development dependencies
pip install "timefence[dev]"
Dependencies: duckdb>=1.0.0, click>=8.0.0, rich>=13.0.0, pandas>=1.5.0. No Spark, no cloud infrastructure.
Quickstart
Generate a complete example project with synthetic data and planted leakage scenarios:
timefence quickstart churn-example
cd churn-example
This creates a self-contained directory with:
- timefence.yaml: Project configuration
- features.py: 4 feature definitions
- data/: Synthetic parquet files (users, transactions, labels)
- data/train_LEAKY.parquet: Pre-built dataset with planted leakage
- README.md: Next-step instructions
Now audit the leaky dataset:
timefence audit data/train_LEAKY.parquet
Then build a clean one:
timefence build -o train_CLEAN.parquet
Temporal Correctness
The core invariant Timefence enforces is:
feature_time < label_time - embargo
For every row in your training set, the feature value used must have been available strictly before the label event time (minus any embargo buffer). This prevents future data leakage — the most common and hardest-to-detect source of inflated ML metrics.
In inclusive join mode, the condition relaxes to:
feature_time <= label_time - embargo
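The two conditions above can be written as a single predicate. This is a minimal sketch in plain Python, not Timefence's implementation; the function name `is_temporally_valid` and its parameters are hypothetical and chosen only for illustration.

```python
from datetime import datetime, timedelta


def is_temporally_valid(feature_time, label_time, embargo=timedelta(0), inclusive=False):
    """Return True if a feature row may be used for a label row.

    Hypothetical helper: strict mode requires feature_time < label_time - embargo,
    inclusive mode relaxes the comparison to <=.
    """
    cutoff = label_time - embargo
    return feature_time <= cutoff if inclusive else feature_time < cutoff


feature = datetime(2024, 3, 15, 10, 0)
label = datetime(2024, 3, 16, 10, 0)
one_day = timedelta(days=1)

print(is_temporally_valid(feature, label, one_day))                  # False
print(is_temporally_valid(feature, label, one_day, inclusive=True))  # True
```

The only difference between the modes is whether a feature that lands exactly on the cutoff is accepted.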
Embargo
Real-world ML pipelines have latency: data arrives late, ETL jobs run on schedules, and features take time to compute. The embargo parameter models this lag.
rolling_spend = timefence.Feature(
    source=transactions,
    sql="SELECT ...",
    name="rolling_spend_30d",  # A name is required for inline SQL
    embargo="1d",  # Feature available 1 day after event
)
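With embargo="1d" and the default strict join, a feature recorded at 2024-03-15 10:00 is only eligible for labels strictly after 2024-03-16 10:00. In inclusive join mode, a label at exactly 2024-03-16 10:00 also qualifies. This prevents optimistic leakage from features that wouldn't actually be available in production.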
Accepted formats: "30d", "1d12h", "6h", "30m", "15s". See Duration Format.
Join Logic
Timefence performs an as-of join (also called a point-in-time join) for each feature. For each label row, it finds the most recent feature value that satisfies the temporal constraint.
Two modes are available:
| Mode | Condition | Use When |
|---|---|---|
| "strict" | feature_time < label_time | Default. No same-timestamp leakage. |
| "inclusive" | feature_time <= label_time | When same-timestamp is safe (e.g., static attributes). |
Additional constraints:
- max_lookback: Maximum age of a feature value (default: "365d"). Features older than this are treated as missing.
- max_staleness: If set, features older than this threshold are treated as missing even if within the lookback window. Must satisfy: max_lookback >= max_staleness > embargo.
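To make the join rule concrete, here is a minimal pure-Python sketch of an as-of lookup for a single label row. It is not how Timefence runs the join (that happens in DuckDB SQL); the function `as_of_lookup` is hypothetical, it assumes feature age is measured back from label_time, and it leaves out max_staleness for brevity.

```python
from datetime import datetime, timedelta


def as_of_lookup(feature_rows, label_time, *, embargo=timedelta(0),
                 max_lookback=timedelta(days=365), inclusive=False):
    """Pick the most recent feature value usable for one label row.

    feature_rows: iterable of (feature_time, value) pairs for one entity key.
    Returns the value, or None when no row qualifies (treated as missing).
    Assumption: the lookback window is measured back from label_time.
    """
    cutoff = label_time - embargo
    oldest_allowed = label_time - max_lookback
    best = None
    for feature_time, value in feature_rows:
        in_time = feature_time <= cutoff if inclusive else feature_time < cutoff
        if not in_time or feature_time < oldest_allowed:
            continue  # too new (leaky) or too old (outside the lookback window)
        if best is None or feature_time > best[0]:
            best = (feature_time, value)
    return None if best is None else best[1]


rows = [
    (datetime(2024, 1, 1), 10.0),
    (datetime(2024, 3, 1), 20.0),
    (datetime(2024, 3, 15), 30.0),
]
label_time = datetime(2024, 3, 15)

print(as_of_lookup(rows, label_time))                  # 20.0
print(as_of_lookup(rows, label_time, inclusive=True))  # 30.0
```

In strict mode the row stamped at the label time itself is skipped, so the lookup falls back to the previous value.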
Time-Based Splits
Build separate train/validation/test files split by label time:
result = timefence.build(
    labels=labels,
    features=[rolling_spend, user_country],
    output="dataset.parquet",
    splits={
        "train": ("2023-01-01", "2024-01-01"),
        "valid": ("2024-01-01", "2024-07-01"),
        "test": ("2024-07-01", "2025-01-01"),
    },
)
# result.splits = {"train": Path(...), "valid": Path(...), "test": Path(...)}
Each split file contains only labels whose label_time falls within the given range. All temporal correctness guarantees still apply per-row.
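The sketch below shows one way a label row could be routed to a split by its label_time. It is illustrative only: the helper `assign_split` is hypothetical, and it assumes each range is half-open (start inclusive, end exclusive), which this documentation does not state. Under that assumption, a label at exactly 2024-01-01 lands in "valid" rather than in both "train" and "valid".

```python
from datetime import datetime


def assign_split(label_time, splits):
    """Return the name of the first split whose range contains label_time.

    Assumption: ranges are half-open, [start, end). Returns None when the
    label falls outside every range.
    """
    for name, (start, end) in splits.items():
        if datetime.fromisoformat(start) <= label_time < datetime.fromisoformat(end):
            return name
    return None


splits = {
    "train": ("2023-01-01", "2024-01-01"),
    "valid": ("2024-01-01", "2024-07-01"),
    "test": ("2024-07-01", "2025-01-01"),
}

print(assign_split(datetime(2023, 6, 1), splits))  # train
print(assign_split(datetime(2024, 1, 1), splits))  # valid
```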
Caching & Store
Timefence can track builds and cache intermediate results using a local Store:
store = timefence.Store(".timefence")
result = timefence.build(
    labels=labels,
    features=[rolling_spend],
    output="train.parquet",
    store=store,
)
Cache keys are computed from content hashes of source files, feature definitions, and build parameters. If nothing has changed, subsequent builds return cached results in milliseconds.
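As a rough illustration of content-addressed caching, the sketch below derives a key from the three inputs named above. The function `cache_key` and the way the parts are combined are assumptions made for this example; Timefence's actual key layout is not documented here.

```python
import hashlib
import json


def cache_key(source_bytes, feature_definition, build_params):
    """Combine source content, feature definition, and parameters into one key.

    Hypothetical sketch: each part is hashed separately, then the digests are
    hashed together, so a change in any part produces a different key.
    """
    parts = [
        source_bytes,
        feature_definition.encode("utf-8"),
        # sort_keys makes the key independent of dict insertion order
        json.dumps(build_params, sort_keys=True).encode("utf-8"),
    ]
    hasher = hashlib.sha256()
    for part in parts:
        hasher.update(hashlib.sha256(part).digest())
    return hasher.hexdigest()


params = {"join": "strict", "max_lookback": "365d"}
key = cache_key(b"file contents", "SELECT user_id FROM {source}", params)
print(len(key))  # 64
```

If the source file, the feature SQL, or any build parameter changes, the key changes and the cached result is no longer matched.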
Guide: Audit Existing Data
Already have a training dataset? Audit it for temporal leakage without rebuilding.
Step 1: Inspect your data
timefence inspect data/train.parquet
This shows column types and suggests which columns are likely keys and timestamps.
Step 2: Run the audit
import timefence
transactions = timefence.Source(
    path="data/transactions.parquet",
    keys=["user_id"],
    timestamp="created_at",
)
rolling_spend = timefence.Feature(
    source=transactions,
    columns=["amount"],
    embargo="1d",
)
report = timefence.audit(
    data="data/train.parquet",
    features=[rolling_spend],
    keys=["user_id"],
    label_time="label_time",
)
print(report)
# Per-feature breakdown: clean/leaky row counts, severity, max leakage
Step 3: Inspect results
# Check overall status (illustrative values from an audit covering rolling_spend_30d and user_country)
report.has_leakage # True/False
report.leaky_features # ["rolling_spend_30d"]
report.clean_features # ["user_country"]
# Per-feature details
detail = report["rolling_spend_30d"]
detail.leaky_row_count # 1520
detail.leaky_row_pct # 0.304
detail.severity # "HIGH"
detail.max_leakage # timedelta(days=15)
# Export
report.to_html("audit_report.html")
report.to_json("audit_report.json")
Guide: Build from Scratch
Build a point-in-time correct training dataset from raw data.
import timefence
# 1. Define sources
transactions = timefence.Source(
    path="data/transactions.parquet",
    keys=["user_id"],
    timestamp="created_at",
)
users = timefence.Source(
    path="data/users.parquet",
    keys=["user_id"],
    timestamp="updated_at",
)
# 2. Define features
rolling_spend = timefence.Feature(
    source=transactions,
    sql="""
        SELECT user_id, created_at,
            SUM(amount) OVER (
                PARTITION BY user_id
                ORDER BY created_at
                RANGE INTERVAL 30 DAYS PRECEDING
            ) AS spend_30d
        FROM {source}
    """,
    name="rolling_spend_30d",
    embargo="1d",
)
user_country = timefence.Feature(
    source=users,
    columns=["country"],
)
# 3. Define labels
labels = timefence.Labels(
    path="data/labels.parquet",
    keys=["user_id"],
    label_time="label_time",
    target="churned",
)
# 4. Build
result = timefence.build(
    labels=labels,
    features=[rolling_spend, user_country],
    output="train_CLEAN.parquet",
)
print(result)
# rows: 5000, columns: 4, duration: 0.8s
Guide: CI/CD Integration
Add temporal correctness checks to your CI pipeline.
Using the CLI
# Exits with code 1 if leakage is found
timefence audit data/train.parquet --strict
Using Python
report = timefence.audit(
    data="data/train.parquet",
    features=[rolling_spend, user_country],
    keys=["user_id"],
    label_time="label_time",
)
# Raises TimefenceLeakageError if any leakage detected
report.assert_clean()
GitHub Actions example
- name: Audit training data
  run: |
    pip install timefence
    timefence audit data/train.parquet \
      --features features.py \
      --keys user_id \
      --label-time label_time \
      --strict
Python API Reference
class Source
Defines a historical data source (Parquet, CSV, or DataFrame).
timefence.Source(
path=None,
*,
keys,
timestamp,
name=None,
format=None,
delimiter=",",
timestamp_format=None,
df=None,
)
| Parameter | Type | Description |
|---|---|---|
| path | str \| Path \| None | Path to the data file. Mutually exclusive with df. |
| keys | str \| list[str] | Column name(s) representing the entity (e.g., "user_id"). |
| timestamp | str | Column name containing the valid-at timestamp. |
| name | str \| None | Human-readable name. Defaults to filename stem. |
| format | str \| None | "parquet" or "csv". Auto-detected from file extension. |
| delimiter | str | CSV delimiter. Default: ",". |
| timestamp_format | str \| None | Optional strftime format for parsing timestamps (CSV only). |
| df | DataFrame \| None | Pass a DataFrame directly instead of a file path. Mutually exclusive with path. |
Convenience aliases
# Explicit format selection
transactions = timefence.ParquetSource("data/tx.parquet", keys="user_id", timestamp="ts")
events = timefence.CSVSource("data/events.csv", keys="user_id", timestamp="ts")
SQLSource
Define a source via a SQL query against DuckDB:
source = timefence.SQLSource(
query="SELECT * FROM read_parquet('data/*.parquet')",
keys="user_id",
timestamp="created_at",
name="all_transactions",
connection=None, # Optional: DuckDB database path
)
class Feature
A named signal derived from a Source. Exactly one of columns, sql, or transform must be provided.
timefence.Feature(
source,
*,
columns=None,
sql=None,
transform=None,
name=None,
embargo=None,
key_mapping=None,
on_duplicate="error",
)
| Parameter | Type | Description |
|---|---|---|
| source | Source \| SQLSource | The data source object. |
| columns | str \| list \| dict \| None | Mode 1: Select columns directly. Pass a dict to rename: {"source_col": "feature_col"}. |
| sql | str \| Path \| None | Mode 2: SQL query or path to .sql file. Use {source} placeholder for the source table. |
| transform | Callable \| None | Mode 3: Python function (conn, source_table) -> DataFrame. |
| name | str \| None | Feature name. Auto-derived from columns, filename, or function name when possible. Required for inline SQL strings. |
| embargo | str \| timedelta \| None | Computation lag buffer (e.g., "1d", "12h"). See Embargo. |
| key_mapping | dict[str, str] \| None | Map label key names to source key names: {"user_id": "customer_id"}. |
| on_duplicate | str | "error" (default, raises) or "keep_any" (silently keeps one row) when duplicate (key, feature_time) pairs exist. |
Feature modes
from pathlib import Path

# Mode 1: Column selection
country = timefence.Feature(source=users, columns=["country"])
# Mode 2: SQL (inline)
spend = timefence.Feature(
    source=transactions,
    sql="""
        SELECT user_id, created_at,
            SUM(amount) OVER (
                PARTITION BY user_id ORDER BY created_at
                RANGE INTERVAL 30 DAYS PRECEDING
            ) AS spend_30d
        FROM {source}
    """,
    name="rolling_spend_30d",
    embargo="1d",
)
# Mode 2: SQL (file); the feature name is derived from the filename
spend = timefence.Feature(
    source=transactions,
    sql=Path("features/rolling_spend.sql"),
)
# Mode 3: Python transform
def compute_score(conn, source_table):
    return conn.sql(f"""
        SELECT user_id, created_at AS feature_time,
            raw_score * 2.5 AS adjusted_score
        FROM {source_table}
    """)

score = timefence.Feature(source=transactions, transform=compute_score)
class Labels
Defines the prediction target: which entities, at what times, and what outcome.
timefence.Labels(
*,
path=None,
df=None,
keys,
label_time,
target,
)
| Parameter | Type | Description |
|---|---|---|
| path | str \| Path \| None | Path to labels file (Parquet). Mutually exclusive with df. |
| df | DataFrame \| None | DataFrame with labels. Mutually exclusive with path. |
| keys | str \| list[str] | Entity key column(s). Must match the keys used in features. |
| label_time | str | Column name for the label event timestamp. |
| target | str \| list[str] | Prediction target column(s) (e.g., "churned"). |
labels = timefence.Labels(
path="data/labels.parquet",
keys=["user_id"],
label_time="label_time",
target="churned",
)
class FeatureSet
Group features for reuse across builds and audits.
base_features = timefence.FeatureSet(
name="churn_features_v1",
features=[rolling_spend, user_country, login_count],
)
# Use in build/audit like individual features
result = timefence.build(
labels=labels,
features=[base_features, new_experimental_feature],
output="train.parquet",
)
Supports iteration (for f in feature_set) and len().
function build()
Constructs a point-in-time correct training dataset.
timefence.build(
labels,
features,
output=None,
*,
max_lookback="365d",
max_staleness=None,
join="strict",
on_missing="null",
splits=None,
store=None,
flatten_columns=False,
progress=None,
) -> BuildResult
| Parameter | Type | Description |
|---|---|---|
| labels | Labels | Label definition. |
| features | list[Feature \| FeatureSet] | Features to join. |
| output | str \| Path \| None | Output file path. If None, no file is written (result available in memory). |
| max_lookback | str \| timedelta | Maximum feature age. Default: "365d". |
| max_staleness | str \| timedelta \| None | Max feature age before treating as missing. Must satisfy max_lookback >= max_staleness > embargo. |
| join | str | "strict" (<) or "inclusive" (<=). Default: "strict". |
| on_missing | str | "null" (keep row with NULLs) or "skip" (drop row). Default: "null". |
| splits | dict \| None | Time-based splits: {"train": ("start", "end"), ...}. See Splits. |
| store | Store \| None | Build tracking and caching. Pass a Store instance. See Store. |
| flatten_columns | bool | Strip feature name prefix from output columns. Default: False. |
| progress | Callable \| None | Callback invoked with a status message at each step. Useful for progress bars. Default: None. |
Returns: BuildResult
| Attribute / Method | Type | Description |
|---|---|---|
| .output_path | str \| None | Path to the output file. |
| .stats | BuildStats | .row_count, .column_count, .feature_stats (matched/missing/cached counts), .duration_seconds. |
| .sql | str | The exact SQL query executed. |
| .splits | dict[str, Path] \| None | Split output file paths (when splits was provided). |
| .manifest | dict | Full build manifest (JSON-serializable). |
| .validate() | bool | Re-run audit on the output to verify correctness. |
| .explain() | str | Human-readable join logic summary. |
function audit()
Scan a dataset for temporal leakage. Two modes are available.
Mode 1: Rebuild-and-compare (full audit)
report = timefence.audit(
data="data/train.parquet",
features=[rolling_spend, user_country],
keys=["user_id"],
label_time="label_time",
max_lookback="365d",
max_staleness=None,
join="strict",
)
Mode 2: Temporal check (lightweight)
report = timefence.audit.temporal(
data="data/train.parquet",
feature_time_columns={
"spend_30d": "spend_computed_at",
"country": "country_updated_at",
},
label_time="label_time",
)
| Parameter | Type | Description |
|---|---|---|
| data | str \| Path \| Any | Dataset to audit (file path or DataFrame). |
| features | list[Feature \| FeatureSet] \| None | Feature definitions (Mode 1 only). |
| keys | str \| list[str] \| None | Entity key column(s) (Mode 1 only). |
| label_time | str \| None | Label time column name. |
| feature_time_columns | dict[str, str] \| None | Mode 2 only: {feature_name: time_column}. |
| max_lookback | str \| timedelta | Maximum feature age. Default: "365d". |
| max_staleness | str \| timedelta \| None | Max feature age before treating as missing. Default: None. |
| join | str | "strict" (<) or "inclusive" (<=). Default: "strict". |
Returns: AuditReport
| Attribute / Method | Type | Description |
|---|---|---|
| .has_leakage | bool | True if any feature is leaky. |
| .clean_features | list[str] | Feature names with no leakage. |
| .leaky_features | list[str] | Feature names with leakage. |
| .total_rows | int | Total rows scanned. |
| .mode | str | "rebuild" (Mode 1) or "temporal" (Mode 2). |
| [name] | FeatureAuditDetail | Per-feature detail via report["feature_name"]. |
| .assert_clean() | None | Raises TimefenceLeakageError if leakage found. |
| .to_html(path) | None | Export interactive HTML report. |
| .to_json(path) | None | Export JSON report. |
FeatureAuditDetail fields
| Attribute | Type | Description |
|---|---|---|
| .clean | bool | No leakage detected. |
| .leaky_row_count | int | Number of leaky rows. |
| .leaky_row_pct | float | Fraction of rows with leakage (0.0 – 1.0). |
| .severity | str | "HIGH", "MEDIUM", "LOW", or "OK". |
| .max_leakage | timedelta \| None | Largest time violation. |
| .median_leakage | timedelta \| None | Median time violation. |
| .total_rows | int | Total rows examined for this feature. |
| .null_rows | int | Rows where the feature value was NULL. |
| .leaky_rows | DataFrame \| None | Sample of leaky rows (up to 1,000 rows). |
Severity levels: HIGH = >5% leaky rows or max leakage >7 days. MEDIUM = 1–5% or 1–7 days. LOW = <1% and <1 day. OK = no leakage.
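The thresholds above can be read as a small decision function. The sketch below is one interpretation, not Timefence's code: the name `classify_severity` is hypothetical, and the handling of values that sit exactly on a boundary (for example exactly 5% or exactly 7 days, which fall into MEDIUM here) is an assumption.

```python
from datetime import timedelta


def classify_severity(leaky_row_pct, max_leakage):
    """Map a leaky-row fraction (0.0 to 1.0) and max leakage to a severity label.

    Hypothetical reading of the documented thresholds; exact boundary
    behaviour is an assumption.
    """
    if leaky_row_pct <= 0:
        return "OK"
    leak = max_leakage or timedelta(0)
    if leaky_row_pct > 0.05 or leak > timedelta(days=7):
        return "HIGH"
    if leaky_row_pct < 0.01 and leak < timedelta(days=1):
        return "LOW"
    return "MEDIUM"


# Values from the audit guide: 30.4% leaky rows, 15 days max leakage
print(classify_severity(0.304, timedelta(days=15)))  # HIGH
```

Note that HIGH needs only one of its two conditions, while LOW needs both of its conditions to hold.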
function explain()
Preview the join logic that build() will use, without executing any queries.
result = timefence.explain(
labels=labels,
features=[rolling_spend, user_country],
max_lookback="365d",
max_staleness=None,
join="strict",
)
print(result) # Human-readable join plan
Returns: ExplainResult
| Attribute | Type | Description |
|---|---|---|
| .label_count | int | Number of label rows. |
| .plan | list[dict] | Per-feature join plan with name, source, join_condition, window, embargo_str, strategy, sql. |
function diff()
Compare two training datasets for schema drift and value changes.
result = timefence.diff(
old="train_v1.parquet",
new="train_v2.parquet",
keys=["user_id"],
label_time="label_time",
atol=1e-10, # Absolute tolerance for numeric comparison
rtol=1e-7, # Relative tolerance for numeric comparison
)
Returns: DiffResult
| Attribute | Type | Description |
|---|---|---|
| .old_rows | int | Row count in old dataset. |
| .new_rows | int | Row count in new dataset. |
| .schema_changes | list[dict] | Schema changes: type (+/-/~), column, detail. |
| .value_changes | dict[str, dict] | Per-column: changed_count, changed_pct, mean_delta, max_delta. |
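To show how an absolute and a relative tolerance can work together, here is a minimal sketch of a numeric comparison. The combined rule `|new - old| > atol + rtol * |old|` follows the convention used by NumPy's `isclose`; whether Timefence applies the same rule is an assumption, and `values_differ` and `summarize_changes` are hypothetical helpers. The sketch reports changed_pct as a fraction, which is also an assumption.

```python
def values_differ(old, new, atol=1e-10, rtol=1e-7):
    """True when two numbers differ by more than the combined tolerance."""
    return abs(new - old) > atol + rtol * abs(old)


def summarize_changes(old_values, new_values, atol=1e-10, rtol=1e-7):
    """Count changed values in two aligned columns and report the largest delta."""
    deltas = [
        abs(new - old)
        for old, new in zip(old_values, new_values)
        if values_differ(old, new, atol, rtol)
    ]
    total = len(old_values)
    return {
        "changed_count": len(deltas),
        "changed_pct": len(deltas) / total if total else 0.0,
        "max_delta": max(deltas, default=0.0),
    }


print(summarize_changes([1.0, 2.0, 3.0, 4.0], [1.0, 2.5, 3.0, 4.0]))
# {'changed_count': 1, 'changed_pct': 0.25, 'max_delta': 0.5}
```

A looser atol, such as the 0.01 used in the CLI example, hides small absolute differences that float round-off or recomputation can introduce.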
class Store
Local directory for build tracking and caching.
store = timefence.Store(".timefence")
# Pass to build()
result = timefence.build(labels=labels, features=features, store=store)
# List past builds
builds = store.list_builds() # Newest first
# Get a specific build
manifest = store.get_build("abc123")
| Parameter | Type | Description |
|---|---|---|
| path | str \| Path | Store directory. Default: ".timefence". |
Methods
| Method | Returns | Description |
|---|---|---|
| .save_build(manifest) | Path | Save build manifest, return manifest path. |
| .list_builds() | list[dict] | List all builds (newest first). |
| .get_build(build_id) | dict \| None | Get a specific build manifest by ID. |
| .content_hash(path) | str | Compute SHA-256 hash of file. |
CLI Reference
Global Options
These flags apply to all commands when placed before the subcommand name.
timefence -v <command> # Verbose: show generated SQL and build details
timefence --debug <command> # Debug: full output including DuckDB internals
| Flag | Description |
|---|---|
| -v, --verbose | Show generated SQL, cache status, and label/feature details |
| --debug | Full debug output: ASOF fallback reasons, hash details, DuckDB internals |
timefence audit
Scan a dataset for temporal leakage.
# Basic audit
timefence audit data/train.parquet
# With explicit options
timefence audit data/train.parquet \
--features features.py \
--keys user_id \
--label-time label_time
# CI mode: exit code 1 if leakage found
timefence audit data/train.parquet --strict
# Export reports
timefence audit data/train.parquet --html report.html
timefence audit data/train.parquet --json
| Argument / Flag | Description |
|---|---|
| data | Positional. Path to the dataset file. |
| --features | Path to Python file with feature definitions. |
| --keys | Key column(s), comma-separated. |
| --label-time | Label time column name. |
| --strict | Exit with code 1 if leakage found (for CI/CD). |
| --html FILE | Export interactive HTML report. |
| --json | Output as JSON. |
timefence build
Build a point-in-time correct training set. Uses timefence.yaml defaults if available.
# Basic build
timefence build \
--labels data/labels.parquet \
--features features.py \
-o train.parquet
# With all options
timefence build \
--labels data/labels.parquet \
--features features.py \
-o train.parquet \
--max-lookback 365d \
--max-staleness 30d \
--on-missing null \
--join-mode strict
# Time-based splits
timefence build \
--labels data/labels.parquet \
--features features.py \
-o train.parquet \
--split train:2023-01-01:2024-01-01 \
--split test:2024-01-01:2025-01-01
# Dry run (preview plan without executing)
timefence build \
--labels data/labels.parquet \
--features features.py \
-o train.parquet \
--dry-run
| Flag | Description |
|---|---|
| --labels | Path to labels file (required unless in config). |
| --features | Path to features Python file (required unless in config). |
| -o, --output | Output path for training set (required). |
| --max-lookback | Maximum lookback window (e.g., "365d"). |
| --max-staleness | Maximum feature staleness. |
| --on-missing | null or skip. |
| --join-mode | strict or inclusive. |
| --split | Time split: name:start:end (repeatable). |
| --dry-run | Show plan without executing. |
| --flatten | Strip feature name prefix from output columns. |
| --json | Output build manifest as JSON. |
timefence explain
Preview join logic without executing any queries.
# Full explain
timefence explain \
--labels data/labels.parquet \
--features features.py
# Single feature
timefence explain --features features.py:rolling_spend_30d
# JSON output
timefence explain --features features.py --json
| Flag | Description |
|---|---|
| --labels | Path to labels file. |
| --features | Path to features Python file. Append :name for single feature. |
| --max-lookback | Maximum lookback window. |
| --join-mode | strict or inclusive. |
| --json | Output as JSON. |
timefence diff
Compare two datasets for value changes or schema drift.
timefence diff train_v1.parquet train_v2.parquet \
--keys user_id \
--label-time label_time
# Custom tolerance
timefence diff v1.parquet v2.parquet \
--keys user_id \
--label-time label_time \
--atol 0.01 \
--rtol 0.001
# JSON output
timefence diff v1.parquet v2.parquet \
--keys user_id \
--label-time label_time \
--json
| Argument / Flag | Description |
|---|---|
| old_path | Positional. Path to first dataset. |
| new_path | Positional. Path to second dataset. |
| --keys | Key column(s), comma-separated (required). |
| --label-time | Label time column (required). |
| --atol | Absolute tolerance for numeric comparison (default: 1e-10). |
| --rtol | Relative tolerance for numeric comparison (default: 1e-7). |
| --json | Output as JSON. |
timefence inspect
Analyze a data file and suggest which columns are keys and timestamps.
timefence inspect data/transactions.parquet
timefence inspect data/events.csv --json
| Argument / Flag | Description |
|---|---|
| path | Positional. Path to data file (Parquet or CSV). |
| --json | Output as JSON. |
Output includes column names, types, uniqueness percentage, and auto-detected key/timestamp suggestions.
timefence quickstart
Generate a self-contained example project with synthetic data and leakage scenarios.
# Default churn example
timefence quickstart churn-example
# Minimal version
timefence quickstart myproject --minimal
| Argument / Flag | Description |
|---|---|
| project_name | Positional. Directory name to create. |
| --template | Example template. Default: "churn". |
| --minimal | Generate a smaller example. |
timefence catalog
List all features defined in a project.
timefence catalog --features features.py
timefence catalog --features features.py --json
| Flag | Description |
|---|---|
| --features | Path to features Python file. |
| --json | Output as JSON. |
Output shows Name, Source, Keys, Embargo, and Mode for each feature.
timefence doctor
Diagnose project setup and common issues.
timefence doctor
timefence doctor --json
| Flag | Description |
|---|---|
| --json | Output as JSON. |
Checks: config file, DuckDB installation, feature file validity, source file accessibility, label schema compatibility, duplicate keys, and column name conflicts.
timefence init
Initialize a project with a timefence.yaml config file.
# Current directory
timefence init
# Specific directory
timefence init my-project/
| Argument | Description |
|---|---|
| path | Positional (optional). Directory to create config in. Default: current directory. |
Configuration (timefence.yaml)
Timefence looks for a timefence.yaml in the current directory. All fields are optional — CLI flags and Python API arguments take precedence.
name: churn-model
version: "1.0"
# Feature file(s)
features:
  - features.py
# Label configuration
labels:
  path: data/labels.parquet
  keys: [user_id]
  label_time: label_time
  target: [churned]
# Default parameters
defaults:
  max_lookback: 365d
  join: strict  # "strict" or "inclusive"
  on_missing: "null"  # "null" or "skip"
# Store directory for build tracking
store: .timefence/
# Output directory (relative paths in build resolve against this)
output:
  dir: artifacts/
Precedence: CLI flags > Python API arguments > timefence.yaml > built-in defaults.
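The precedence chain can be pictured as a layered lookup in which the first layer that defines a setting wins. This is a minimal sketch, assuming settings are plain dictionaries; `resolve_settings` is a hypothetical helper, and the built-in values are the defaults listed for build() in the Python API reference.

```python
from collections import ChainMap

# Built-in defaults as documented for build()
BUILT_IN_DEFAULTS = {"max_lookback": "365d", "join": "strict", "on_missing": "null"}


def resolve_settings(explicit, yaml_defaults):
    """Merge settings so explicit values beat timefence.yaml, which beats built-ins.

    explicit: values from CLI flags or Python arguments; None means "not given".
    yaml_defaults: the "defaults" mapping loaded from timefence.yaml.
    """
    given = {key: value for key, value in explicit.items() if value is not None}
    return dict(ChainMap(given, yaml_defaults, BUILT_IN_DEFAULTS))


settings = resolve_settings(
    explicit={"join": "inclusive", "max_lookback": None},
    yaml_defaults={"max_lookback": "90d"},
)
print(settings["join"], settings["max_lookback"], settings["on_missing"])
# inclusive 90d null
```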
Errors
All Timefence errors follow a consistent format: WHAT happened, WHY it happened, WHERE in the data, and HOW TO FIX it.
TimefenceSchemaError
Raised when expected columns are missing or types are mismatched between source, labels, and feature definitions.
TimefenceDuplicateError
Raised when duplicate (key, timestamp) pairs exist in a source and on_duplicate="error" (default). Includes example rows.
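The duplicate check can be sketched in a few lines. This example is illustrative and does not use Timefence's classes: `deduplicate` is a hypothetical helper, it raises a plain ValueError rather than TimefenceDuplicateError, and "keep_any" is shown as keeping the first row seen, which is only one possible choice.

```python
def deduplicate(rows, on_duplicate="error"):
    """Enforce one row per (key, feature_time) pair.

    rows: iterable of (key, feature_time, value) tuples.
    on_duplicate: "error" raises on the first duplicate pair,
    "keep_any" silently keeps one row per pair (here: the first).
    """
    if on_duplicate not in ("error", "keep_any"):
        raise ValueError(f"unknown on_duplicate value: {on_duplicate!r}")
    seen = {}
    for key, feature_time, value in rows:
        pair = (key, feature_time)
        if pair in seen:
            if on_duplicate == "error":
                raise ValueError(f"duplicate (key, feature_time) pair: {pair}")
            continue
        seen[pair] = value
    return [(key, ts, value) for (key, ts), value in seen.items()]


rows = [
    ("u1", "2024-01-01", 5),
    ("u1", "2024-01-01", 7),
    ("u2", "2024-01-01", 3),
]
print(deduplicate(rows, on_duplicate="keep_any"))
# [('u1', '2024-01-01', 5), ('u2', '2024-01-01', 3)]
```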
TimefenceTimezoneError
Raised when mixing timezone-aware and timezone-naive timestamps across sources and labels.
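Mixing the two kinds of timestamp is a problem because Python itself refuses to compare them. The sketch below shows a simple pre-check using only the standard library; `has_mixed_timezones` is a hypothetical helper, not part of Timefence.

```python
from datetime import datetime, timezone


def is_aware(ts):
    """A datetime is timezone-aware when it carries a usable UTC offset."""
    return ts.tzinfo is not None and ts.utcoffset() is not None


def has_mixed_timezones(timestamps):
    """True when the collection contains both aware and naive datetimes."""
    return len({is_aware(ts) for ts in timestamps}) > 1


naive = datetime(2024, 1, 1, 12, 0)
aware = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)

print(has_mixed_timezones([naive, naive]))  # False
print(has_mixed_timezones([naive, aware]))  # True
```

Converting every source and the labels to one convention, for example UTC-aware timestamps, before building avoids the error.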
TimefenceConfigError
Raised for invalid parameter combinations (e.g., max_staleness > max_lookback).
TimefenceLeakageError
Raised by report.assert_clean() when temporal leakage is detected. Includes a summary of leaky features.
TimefenceValidationError
Raised for general input validation failures (e.g., providing both path and df).
Duration Format
Timefence accepts human-readable duration strings wherever a time duration is expected (embargo, max_lookback, max_staleness).
| Format | Example | Meaning |
|---|---|---|
| Nd | "30d" | 30 days |
| Nh | "6h" | 6 hours |
| Nm | "30m" | 30 minutes |
| Ns | "15s" | 15 seconds |
| Combined | "1d12h" | 1 day and 12 hours |
| Zero | "0d" or "0" | No duration |
You can also pass a Python timedelta object directly in the Python API.
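For readers who want to see how such strings map to timedelta values, here is a minimal parser sketch covering the formats in the table above. It is not Timefence's parser: `parse_duration` is a hypothetical function, and rejecting anything outside the listed formats is an assumption about how strict the real parser is.

```python
import re
from datetime import timedelta

_UNIT_SECONDS = {"d": 86400, "h": 3600, "m": 60, "s": 1}
_TOKEN = re.compile(r"(\d+)([dhms])")


def parse_duration(text):
    """Convert strings such as "30d" or "1d12h" to a timedelta."""
    if isinstance(text, timedelta):
        return text
    if text == "0":
        return timedelta(0)
    tokens = _TOKEN.findall(text)
    # Rebuilding the string from its tokens rejects stray characters
    if not tokens or "".join(number + unit for number, unit in tokens) != text:
        raise ValueError(f"invalid duration: {text!r}")
    seconds = sum(int(number) * _UNIT_SECONDS[unit] for number, unit in tokens)
    return timedelta(seconds=seconds)


print(parse_duration("1d12h"))  # 1 day, 12:00:00
```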