Documentation

Timefence v0.9.1

Timefence is a temporal correctness layer for ML training data. It guarantees that no future data leaks into your training set by enforcing point-in-time correctness at the moment of dataset construction.

It is not a feature store, data orchestrator, data quality framework, or ML pipeline. Timefence does one thing: it ensures feature_time < label_time for every row.

Installation

Timefence requires Python 3.9+ and pip.

pip install timefence

Optional extras:

# Jupyter notebook support
pip install timefence[notebook]

# Development dependencies
pip install timefence[dev]

Dependencies: duckdb>=1.0.0, click>=8.0.0, rich>=13.0.0, pandas>=1.5.0. No Spark, no cloud infrastructure.

Quickstart

Generate a complete example project with synthetic data and planted leakage scenarios:

timefence quickstart churn-example
cd churn-example

This creates a self-contained directory with:

  • timefence.yaml — Project configuration
  • features.py — 4 feature definitions
  • data/ — Synthetic parquet files (users, transactions, labels)
  • data/train_LEAKY.parquet — Pre-built dataset with planted leakage
  • README.md — Next-step instructions

Now audit the leaky dataset:

timefence audit data/train_LEAKY.parquet

Then build a clean one:

timefence build -o train_CLEAN.parquet

Temporal Correctness

The core invariant Timefence enforces is:

feature_time < label_time - embargo

For every row in your training set, the feature value used must have been available strictly before the label event time (minus any embargo buffer). This prevents future data leakage — the most common and hardest-to-detect source of inflated ML metrics.

In inclusive join mode, the condition relaxes to:

feature_time <= label_time - embargo
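As a rough illustration of the two conditions above, the check can be written as a small predicate. This is a sketch in plain Python, not Timefence's implementation; the function name is_point_in_time_valid and its parameters are hypothetical.

```python
from datetime import datetime, timedelta


def is_point_in_time_valid(feature_time, label_time,
                           embargo=timedelta(0), inclusive=False):
    """Return True when a feature row may be used for a label row."""
    cutoff = label_time - embargo
    # Strict mode rejects equality with the cutoff; inclusive mode accepts it.
    return feature_time <= cutoff if inclusive else feature_time < cutoff


label_time = datetime(2024, 3, 16, 10, 0)
feature_time = datetime(2024, 3, 15, 10, 0)
one_day = timedelta(days=1)

# The feature sits exactly on the cutoff (label_time - embargo).
strict_ok = is_point_in_time_valid(feature_time, label_time, one_day)
inclusive_ok = is_point_in_time_valid(feature_time, label_time, one_day,
                                      inclusive=True)
print(strict_ok, inclusive_ok)
```

The only difference between the modes is how a feature that lands exactly on the cutoff is treated.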

Embargo

Real-world ML pipelines have latency: data arrives late, ETL jobs run on schedules, and features take time to compute. The embargo parameter models this lag.

rolling_spend = timefence.Feature(
    source=transactions,
    sql="SELECT ...",
    embargo="1d"  # Feature available 1 day after event
)

With embargo="1d", a feature recorded at 2024-03-15 10:00 is only eligible for labels strictly after 2024-03-16 10:00 in the default strict mode (inclusive mode also admits a label at exactly 2024-03-16 10:00). This prevents optimistic leakage from features that wouldn't actually be available in production.

Accepted formats: "30d", "1d12h", "6h", "30m", "15s". See Duration Format.

Join Logic

Timefence performs an as-of join (also called a point-in-time join) for each feature. For each label row, it finds the most recent feature value that satisfies the temporal constraint.

Two modes are available:

Mode         Condition                   Use When
"strict"     feature_time < label_time   Default. No same-timestamp leakage.
"inclusive"  feature_time <= label_time  When same-timestamp is safe (e.g., static attributes).

Additional constraints:

  • max_lookback — Maximum age of a feature value (default: "365d"). Features older than this are treated as missing.
  • max_staleness — If set, features older than this threshold are treated as missing even if within the lookback window. Must satisfy: max_lookback >= max_staleness > embargo.
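The lookup described above can be sketched for a single entity in plain Python. This is an illustration under stated assumptions, not the SQL Timefence generates: as_of_lookup is a hypothetical helper, the feature rows are assumed to be pre-sorted, and feature age is assumed to be measured against the label time.

```python
from bisect import bisect_left, bisect_right
from datetime import datetime, timedelta


def as_of_lookup(feature_rows, label_time, embargo=timedelta(0),
                 max_lookback=timedelta(days=365), inclusive=False):
    """Return the most recent eligible value for one label row, or None.

    feature_rows is a list of (feature_time, value) tuples for a single
    entity, sorted by feature_time in ascending order.
    """
    times = [t for t, _ in feature_rows]
    cutoff = label_time - embargo
    # Count the rows that satisfy the temporal constraint:
    # strict keeps feature_time < cutoff, inclusive keeps feature_time <= cutoff.
    end = bisect_right(times, cutoff) if inclusive else bisect_left(times, cutoff)
    if end == 0:
        return None  # no feature value is early enough
    feature_time, value = feature_rows[end - 1]
    # Assumption: feature age is measured against label_time.
    if label_time - feature_time > max_lookback:
        return None  # too old, treated as missing
    return value


rows = [
    (datetime(2024, 1, 1), 10.0),
    (datetime(2024, 3, 15, 10, 0), 25.0),
    (datetime(2024, 3, 20), 40.0),
]
label = datetime(2024, 3, 16, 10, 0)
day = timedelta(days=1)

strict_value = as_of_lookup(rows, label, embargo=day)
inclusive_value = as_of_lookup(rows, label, embargo=day, inclusive=True)
stale_value = as_of_lookup(rows, label, embargo=day,
                           max_lookback=timedelta(days=30))
print(strict_value, inclusive_value, stale_value)
```

With a one-day embargo the cutoff is 2024-03-15 10:00, so strict mode falls back to the January value, inclusive mode picks up the row that sits exactly on the cutoff, and a 30-day lookback turns the January value into a missing one.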

Time-Based Splits

Build separate train/validation/test files split by label time:

result = timefence.build(
    labels=labels,
    features=[rolling_spend, user_country],
    output="dataset.parquet",
    splits={
        "train": ("2023-01-01", "2024-01-01"),
        "valid": ("2024-01-01", "2024-07-01"),
        "test":  ("2024-07-01", "2025-01-01"),
    },
)

# result.splits = {"train": Path(...), "valid": Path(...), "test": Path(...)}

Each split file contains only labels whose label_time falls within the given range. All temporal correctness guarantees still apply per-row.
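To make the range check concrete, here is a minimal sketch of assigning a label row to a split. The text above does not say how boundary timestamps are handled, so this sketch assumes half-open ranges (start inclusive, end exclusive); assign_split is a hypothetical helper, not part of the Timefence API.

```python
from datetime import datetime


def assign_split(label_time, splits):
    """Return the first split whose [start, end) range contains label_time."""
    for name, (start, end) in splits.items():
        # Assumption: start is inclusive and end is exclusive.
        if datetime.fromisoformat(start) <= label_time < datetime.fromisoformat(end):
            return name
    return None  # outside every range


splits = {
    "train": ("2023-01-01", "2024-01-01"),
    "valid": ("2024-01-01", "2024-07-01"),
    "test": ("2024-07-01", "2025-01-01"),
}

boundary_split = assign_split(datetime(2024, 1, 1), splits)
late_split = assign_split(datetime(2025, 6, 1), splits)
print(boundary_split, late_split)
```

Under the half-open assumption, a label stamped exactly 2024-01-01 00:00 belongs to "valid" only, so adjacent splits never share a row.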

Caching & Store

Timefence can track builds and cache intermediate results using a local Store:

store = timefence.Store(".timefence")

result = timefence.build(
    labels=labels,
    features=[rolling_spend],
    output="train.parquet",
    store=store,
)

Cache keys are computed from content hashes of source files, feature definitions, and build parameters. If nothing has changed, subsequent builds return cached results in milliseconds.
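The idea of a content-derived cache key can be sketched with the standard library. How Timefence actually combines the three inputs is not documented here, so the payload layout and the helper names file_hash and cache_key are assumptions made for illustration.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def file_hash(path):
    """SHA-256 of a file's bytes, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def cache_key(source_paths, feature_definitions, build_params):
    """Combine the three inputs into one deterministic key (assumed layout)."""
    payload = {
        "sources": [file_hash(p) for p in source_paths],
        "features": feature_definitions,
        "params": build_params,
    }
    # sort_keys makes the JSON text independent of dict insertion order.
    encoded = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


with tempfile.TemporaryDirectory() as tmp:
    # Placeholder bytes stand in for a real parquet file.
    source = Path(tmp) / "transactions.parquet"
    source.write_bytes(b"version 1")
    params = {"max_lookback": "365d", "join": "strict"}
    features = ["SELECT user_id, created_at, amount FROM {source}"]

    first_key = cache_key([source], features, params)
    repeat_key = cache_key([source], features, params)

    source.write_bytes(b"version 2")  # simulate a changed source file
    changed_key = cache_key([source], features, params)

print(first_key == repeat_key, first_key == changed_key)
```

Identical inputs give the same key, and any change to a source file, a definition, or a parameter gives a different one, which is what allows a cached result to be reused safely.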


Guide: Audit Existing Data

Already have a training dataset? Audit it for temporal leakage without rebuilding.

Step 1: Inspect your data

timefence inspect data/train.parquet

This shows column types and suggests which columns are likely keys and timestamps.

Step 2: Run the audit

import timefence

transactions = timefence.Source(
    path="data/transactions.parquet",
    keys=["user_id"],
    timestamp="created_at",
)

rolling_spend = timefence.Feature(
    source=transactions,
    columns=["amount"],
    embargo="1d",
)

report = timefence.audit(
    data="data/train.parquet",
    features=[rolling_spend],
    keys=["user_id"],
    label_time="label_time",
)

print(report)
# Per-feature breakdown: clean/leaky row counts, severity, max leakage

Step 3: Inspect results

# Check overall status
report.has_leakage        # True/False
report.leaky_features     # ["rolling_spend_30d"]
report.clean_features     # ["user_country"]

# Per-feature details
detail = report["rolling_spend_30d"]
detail.leaky_row_count    # 1520
detail.leaky_row_pct      # 0.304
detail.severity           # "HIGH"
detail.max_leakage        # timedelta(days=15)

# Export
report.to_html("audit_report.html")
report.to_json("audit_report.json")

Guide: Build from Scratch

Build a point-in-time correct training dataset from raw data.

import timefence

# 1. Define sources
transactions = timefence.Source(
    path="data/transactions.parquet",
    keys=["user_id"],
    timestamp="created_at",
)

users = timefence.Source(
    path="data/users.parquet",
    keys=["user_id"],
    timestamp="updated_at",
)

# 2. Define features
rolling_spend = timefence.Feature(
    source=transactions,
    sql="""
        SELECT user_id, created_at,
        SUM(amount) OVER (
            PARTITION BY user_id
            ORDER BY created_at
            RANGE INTERVAL 30 DAYS PRECEDING
        ) AS spend_30d
        FROM {source}
    """,
    name="rolling_spend_30d",
    embargo="1d",
)

user_country = timefence.Feature(
    source=users,
    columns=["country"],
)

# 3. Define labels
labels = timefence.Labels(
    path="data/labels.parquet",
    keys=["user_id"],
    label_time="label_time",
    target="churned",
)

# 4. Build
result = timefence.build(
    labels=labels,
    features=[rolling_spend, user_country],
    output="train_CLEAN.parquet",
)

print(result)
# rows: 5000, columns: 4, duration: 0.8s

Guide: CI/CD Integration

Add temporal correctness checks to your CI pipeline.

Using the CLI

# Exits with code 1 if leakage is found
timefence audit data/train.parquet --strict

Using Python

report = timefence.audit(
    data="data/train.parquet",
    features=[rolling_spend, user_country],
    keys=["user_id"],
    label_time="label_time",
)

# Raises TimefenceLeakageError if any leakage detected
report.assert_clean()

GitHub Actions example

- name: Audit training data
  run: |
    pip install timefence
    timefence audit data/train.parquet \
      --features features.py \
      --keys user_id \
      --label-time label_time \
      --strict

Python API Reference

class Source

Defines a historical data source (Parquet, CSV, or DataFrame).

timefence.Source(
    path=None,
    *,
    keys,
    timestamp,
    name=None,
    format=None,
    delimiter=",",
    timestamp_format=None,
    df=None,
)

path              str | Path | None  Path to the data file. Mutually exclusive with df.
keys              str | list[str]    Column name(s) representing the entity (e.g., "user_id").
timestamp         str                Column name containing the valid-at timestamp.
name              str | None         Human-readable name. Defaults to filename stem.
format            str | None         "parquet" or "csv". Auto-detected from file extension.
delimiter         str                CSV delimiter. Default: ",".
timestamp_format  str | None         Optional strftime format for parsing timestamps (CSV only).
df                DataFrame | None   Pass a DataFrame directly instead of a file path. Mutually exclusive with path.

Convenience aliases

# Explicit format selection
transactions = timefence.ParquetSource("data/tx.parquet", keys="user_id", timestamp="ts")
events = timefence.CSVSource("data/events.csv", keys="user_id", timestamp="ts")

SQLSource

Define a source via a SQL query against DuckDB:

source = timefence.SQLSource(
    query="SELECT * FROM read_parquet('data/*.parquet')",
    keys="user_id",
    timestamp="created_at",
    name="all_transactions",
    connection=None,  # Optional: DuckDB database path
)

class Feature

A named signal derived from a Source. Exactly one of columns, sql, or transform must be provided.

timefence.Feature(
    source,
    *,
    columns=None,
    sql=None,
    transform=None,
    name=None,
    embargo=None,
    key_mapping=None,
    on_duplicate="error",
)

source        Source | SQLSource        The data source object.
columns       str | list | dict | None  Mode 1: Select columns directly. Pass a dict to rename: {"source_col": "feature_col"}.
sql           str | Path | None         Mode 2: SQL query or path to .sql file. Use {source} placeholder for the source table.
transform     Callable | None           Mode 3: Python function (conn, source_table) -> DataFrame.
name          str | None                Feature name. Auto-derived from columns, filename, or function name when possible. Required for inline SQL strings.
embargo       str | timedelta | None    Computation lag buffer (e.g., "1d", "12h"). See Embargo.
key_mapping   dict[str, str] | None     Map label key names to source key names: {"user_id": "customer_id"}.
on_duplicate  str                       "error" (default, raises) or "keep_any" (silently keeps one row) when duplicate (key, feature_time) pairs exist.

Feature modes

from pathlib import Path  # used by the file-based SQL example

# Mode 1: Column selection
country = timefence.Feature(source=users, columns=["country"])

# Mode 2: SQL (inline)
spend = timefence.Feature(
    source=transactions,
    sql="""
        SELECT user_id, created_at,
        SUM(amount) OVER (
            PARTITION BY user_id ORDER BY created_at
            RANGE INTERVAL 30 DAYS PRECEDING
        ) AS spend_30d
        FROM {source}
    """,
    name="rolling_spend_30d",
    embargo="1d",
)

# Mode 2: SQL (file)
spend = timefence.Feature(
    source=transactions,
    sql=Path("features/rolling_spend.sql"),
)

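# Mode 3: Python transform
def compute_score(conn, source_table):
    # .df() materializes the DuckDB relation as a pandas DataFrame,
    # matching the documented (conn, source_table) -> DataFrame signature
    return conn.sql(f"""
        SELECT user_id, created_at AS feature_time,
               raw_score * 2.5 AS adjusted_score
        FROM {source_table}
    """).df()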

score = timefence.Feature(source=transactions, transform=compute_score)

class Labels

Defines the prediction target: which entities, at what times, and what outcome.

timefence.Labels(
    *,
    path=None,
    df=None,
    keys,
    label_time,
    target,
)

path        str | Path | None  Path to labels file (Parquet). Mutually exclusive with df.
df          DataFrame | None   DataFrame with labels. Mutually exclusive with path.
keys        str | list[str]    Entity key column(s). Must match the keys used in features.
label_time  str                Column name for the label event timestamp.
target      str | list[str]    Prediction target column(s) (e.g., "churned").

labels = timefence.Labels(
    path="data/labels.parquet",
    keys=["user_id"],
    label_time="label_time",
    target="churned",
)

class FeatureSet

Group features for reuse across builds and audits.

base_features = timefence.FeatureSet(
    name="churn_features_v1",
    features=[rolling_spend, user_country, login_count],
)

# Use in build/audit like individual features
result = timefence.build(
    labels=labels,
    features=[base_features, new_experimental_feature],
    output="train.parquet",
)

Supports iteration (for f in feature_set) and len().

function build()

Constructs a point-in-time correct training dataset.

timefence.build(
    labels,
    features,
    output=None,
    *,
    max_lookback="365d",
    max_staleness=None,
    join="strict",
    on_missing="null",
    splits=None,
    store=None,
    flatten_columns=False,
    progress=None,
) -> BuildResult

labels           Labels                      Label definition.
features         list[Feature | FeatureSet]  Features to join.
output           str | Path | None           Output file path. If None, no file is written (result available in memory).
max_lookback     str | timedelta             Maximum feature age. Default: "365d".
max_staleness    str | timedelta | None      Max feature age before treating as missing. Must satisfy max_lookback >= max_staleness > embargo.
join             str                         "strict" (<) or "inclusive" (<=). Default: "strict".
on_missing       str                         "null" (keep row with NULLs) or "skip" (drop row). Default: "null".
splits           dict | None                 Time-based splits: {"train": ("start", "end"), ...}. See Splits.
store            Store | None                Build tracking and caching. Pass a Store instance. See Store.
flatten_columns  bool                        Strip feature name prefix from output columns. Default: False.
progress         Callable | None             Callback invoked with a status message at each step. Useful for progress bars. Default: None.

Returns: BuildResult

.output_path  str | None              Path to the output file.
.stats        BuildStats              .row_count, .column_count, .feature_stats (matched/missing/cached counts), .duration_seconds.
.sql          str                     The exact SQL query executed.
.splits       dict[str, Path] | None  Split output file paths (when splits was provided).
.manifest     dict                    Full build manifest (JSON-serializable).
.validate()   bool                    Re-run audit on the output to verify correctness.
.explain()    str                     Human-readable join logic summary.

function audit()

Scan a dataset for temporal leakage. Two modes are available.

Mode 1: Rebuild-and-compare (full audit)

report = timefence.audit(
    data="data/train.parquet",
    features=[rolling_spend, user_country],
    keys=["user_id"],
    label_time="label_time",
    max_lookback="365d",
    max_staleness=None,
    join="strict",
)

Mode 2: Temporal check (lightweight)

report = timefence.audit.temporal(
    data="data/train.parquet",
    feature_time_columns={
        "spend_30d": "spend_computed_at",
        "country": "country_updated_at",
    },
    label_time="label_time",
)

data                  str | Path | Any                   Dataset to audit (file path or DataFrame).
features              list[Feature | FeatureSet] | None  Feature definitions (Mode 1 only).
keys                  str | list[str] | None             Entity key column(s) (Mode 1 only).
label_time            str | None                         Label time column name.
feature_time_columns  dict[str, str] | None              Mode 2 only: {feature_name: time_column}.
max_lookback          str | timedelta                    Maximum feature age. Default: "365d".
max_staleness         str | timedelta | None             Max feature age before treating as missing. Default: None.
join                  str                                "strict" (<) or "inclusive" (<=). Default: "strict".

Returns: AuditReport

.has_leakage     bool                True if any feature is leaky.
.clean_features  list[str]           Feature names with no leakage.
.leaky_features  list[str]           Feature names with leakage.
.total_rows      int                 Total rows scanned.
.mode            str                 "rebuild" (Mode 1) or "temporal" (Mode 2).
[name]           FeatureAuditDetail  Per-feature detail via report["feature_name"].
.assert_clean()  None                Raises TimefenceLeakageError if leakage found.
.to_html(path)   None                Export interactive HTML report.
.to_json(path)   None                Export JSON report.

FeatureAuditDetail fields

.clean            bool              No leakage detected.
.leaky_row_count  int               Number of leaky rows.
.leaky_row_pct    float             Fraction of rows with leakage (0.0 – 1.0).
.severity         str               "HIGH", "MEDIUM", "LOW", or "OK".
.max_leakage      timedelta | None  Largest time violation.
.median_leakage   timedelta | None  Median time violation.
.total_rows       int               Total rows examined for this feature.
.null_rows        int               Rows where the feature value was NULL.
.leaky_rows       DataFrame | None  Sample of leaky rows (up to 1,000 rows).

Severity levels: HIGH = >5% leaky rows or max leakage >7 days. MEDIUM = 1–5% or 1–7 days. LOW = <1% and <1 day. OK = no leakage.
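The thresholds above can be read as a small decision function. The sketch below is one interpretation, not Timefence's code: classify_severity is a hypothetical name, and the treatment of values that fall exactly on 1%, 5%, 1 day, or 7 days is an assumption, since the text does not pin those boundaries down.

```python
from datetime import timedelta


def classify_severity(leaky_row_pct, max_leakage):
    """Map a leaky-row fraction and the largest violation to a severity label."""
    if leaky_row_pct == 0 or max_leakage is None:
        return "OK"
    if leaky_row_pct > 0.05 or max_leakage > timedelta(days=7):
        return "HIGH"
    # Assumption: exactly 1% or exactly 1 day counts as MEDIUM.
    if leaky_row_pct >= 0.01 or max_leakage >= timedelta(days=1):
        return "MEDIUM"
    return "LOW"


high = classify_severity(0.304, timedelta(days=15))
medium = classify_severity(0.02, timedelta(hours=3))
low = classify_severity(0.001, timedelta(hours=2))
ok = classify_severity(0.0, None)
print(high, medium, low, ok)
```

The first call uses the figures from the audit guide (30.4% leaky rows, 15 days of maximum leakage), which land in HIGH on both criteria.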

function explain()

Preview the join logic that build() will use, without executing any queries.

result = timefence.explain(
    labels=labels,
    features=[rolling_spend, user_country],
    max_lookback="365d",
    max_staleness=None,
    join="strict",
)

print(result)  # Human-readable join plan

Returns: ExplainResult

.label_count  int         Number of label rows.
.plan         list[dict]  Per-feature join plan with name, source, join_condition, window, embargo_str, strategy, sql.

function diff()

Compare two training datasets for schema drift and value changes.

result = timefence.diff(
    old="train_v1.parquet",
    new="train_v2.parquet",
    keys=["user_id"],
    label_time="label_time",
    atol=1e-10,  # Absolute tolerance for numeric comparison
    rtol=1e-7,   # Relative tolerance for numeric comparison
)

Returns: DiffResult

.old_rows        int              Row count in old dataset.
.new_rows        int              Row count in new dataset.
.schema_changes  list[dict]       Schema changes: type (+/-/~), column, detail.
.value_changes   dict[str, dict]  Per-column: changed_count, changed_pct, mean_delta, max_delta.
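To show how atol and rtol might interact, here is a sketch of a tolerance-based comparison. It assumes the NumPy-style rule |new - old| <= atol + rtol * |old|, which the documentation does not confirm; values_differ and summarize_changes are hypothetical helpers, and changed_pct is assumed to be a fraction.

```python
def values_differ(old, new, atol=1e-10, rtol=1e-7):
    """True when two numbers differ by more than the combined tolerance."""
    # Assumption: NumPy-style combined absolute and relative tolerance.
    return abs(new - old) > atol + rtol * abs(old)


def summarize_changes(old_values, new_values, atol=1e-10, rtol=1e-7):
    """Summarize which paired values changed beyond tolerance."""
    changed = [(o, n) for o, n in zip(old_values, new_values)
               if values_differ(o, n, atol, rtol)]
    count = len(changed)
    return {
        "changed_count": count,
        "changed_pct": count / len(old_values) if old_values else 0.0,
        "max_delta": max((abs(n - o) for o, n in changed), default=0.0),
    }


old_column = [1.0, 2.0, 3.0, 4.0]
new_column = [1.0, 2.0000000001, 3.5, 4.0]
summary = summarize_changes(old_column, new_column)
print(summary)
```

Only the 3.0 to 3.5 change exceeds the default tolerances; the 1e-10 drift on the second value is absorbed by the relative term.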

class Store

Local directory for build tracking and caching.

store = timefence.Store(".timefence")

# Pass to build()
result = timefence.build(labels=labels, features=features, store=store)

# List past builds
builds = store.list_builds()  # Newest first

# Get a specific build
manifest = store.get_build("abc123")

path  str | Path  Store directory. Default: ".timefence".

Methods

.save_build(manifest)  Path         Save build manifest, return manifest path.
.list_builds()         list[dict]   List all builds (newest first).
.get_build(build_id)   dict | None  Get a specific build manifest by ID.
.content_hash(path)    str          Compute SHA-256 hash of file.

CLI Reference

Global Options

These flags apply to all commands when placed before the subcommand name.

timefence -v <command>       # Verbose: show generated SQL and build details
timefence --debug <command>  # Debug: full output including DuckDB internals

Flag           Description
-v, --verbose  Show generated SQL, cache status, and label/feature details
--debug        Full debug output: ASOF fallback reasons, hash details, DuckDB internals

timefence audit

Scan a dataset for temporal leakage.

# Basic audit
timefence audit data/train.parquet

# With explicit options
timefence audit data/train.parquet \
  --features features.py \
  --keys user_id \
  --label-time label_time

# CI mode: exit code 1 if leakage found
timefence audit data/train.parquet --strict

# Export reports
timefence audit data/train.parquet --html report.html
timefence audit data/train.parquet --json

data          Positional. Path to the dataset file.
--features    Path to Python file with feature definitions.
--keys        Key column(s), comma-separated.
--label-time  Label time column name.
--strict      Exit with code 1 if leakage found (for CI/CD).
--html FILE   Export interactive HTML report.
--json        Output as JSON.

timefence build

Build a point-in-time correct training set. Uses timefence.yaml defaults if available.

# Basic build
timefence build \
  --labels data/labels.parquet \
  --features features.py \
  -o train.parquet

# With all options
timefence build \
  --labels data/labels.parquet \
  --features features.py \
  -o train.parquet \
  --max-lookback 365d \
  --max-staleness 30d \
  --on-missing null \
  --join-mode strict

# Time-based splits
timefence build \
  --labels data/labels.parquet \
  --features features.py \
  -o train.parquet \
  --split train:2023-01-01:2024-01-01 \
  --split test:2024-01-01:2025-01-01

# Dry run (preview plan without executing)
timefence build \
  --labels data/labels.parquet \
  --features features.py \
  -o train.parquet \
  --dry-run

--labels         Path to labels file (required unless in config).
--features       Path to features Python file (required unless in config).
-o, --output     Output path for training set (required).
--max-lookback   Maximum lookback window (e.g., "365d").
--max-staleness  Maximum feature staleness.
--on-missing     null or skip.
--join-mode      strict or inclusive.
--split          Time split: name:start:end (repeatable).
--dry-run        Show plan without executing.
--flatten        Strip feature name prefix from output columns.
--json           Output build manifest as JSON.

timefence explain

Preview join logic without executing any queries.

# Full explain
timefence explain \
  --labels data/labels.parquet \
  --features features.py

# Single feature
timefence explain --features features.py:rolling_spend_30d

# JSON output
timefence explain --features features.py --json

--labels        Path to labels file.
--features      Path to features Python file. Append :name for single feature.
--max-lookback  Maximum lookback window.
--join-mode     strict or inclusive.
--json          Output as JSON.

timefence diff

Compare two datasets for value changes or schema drift.

timefence diff train_v1.parquet train_v2.parquet \
  --keys user_id \
  --label-time label_time

# Custom tolerance
timefence diff v1.parquet v2.parquet \
  --keys user_id \
  --label-time label_time \
  --atol 0.01 \
  --rtol 0.001

# JSON output
timefence diff v1.parquet v2.parquet \
  --keys user_id \
  --label-time label_time \
  --json

old_path      Positional. Path to first dataset.
new_path      Positional. Path to second dataset.
--keys        Key column(s), comma-separated (required).
--label-time  Label time column (required).
--atol        Absolute tolerance for numeric comparison (default: 1e-10).
--rtol        Relative tolerance for numeric comparison (default: 1e-7).
--json        Output as JSON.

timefence inspect

Analyze a data file and suggest which columns are keys and timestamps.

timefence inspect data/transactions.parquet
timefence inspect data/events.csv --json

path    Positional. Path to data file (Parquet or CSV).
--json  Output as JSON.

Output includes column names, types, uniqueness percentage, and auto-detected key/timestamp suggestions.

timefence quickstart

Generate a self-contained example project with synthetic data and leakage scenarios.

# Default churn example
timefence quickstart churn-example

# Minimal version
timefence quickstart myproject --minimal

project_name  Positional. Directory name to create.
--template    Example template. Default: "churn".
--minimal     Generate a smaller example.

timefence catalog

List all features defined in a project.

timefence catalog --features features.py
timefence catalog --features features.py --json

--features  Path to features Python file.
--json      Output as JSON.

Output shows Name, Source, Keys, Embargo, and Mode for each feature.

timefence doctor

Diagnose project setup and common issues.

timefence doctor
timefence doctor --json

--json  Output as JSON.

Checks: config file, DuckDB installation, feature file validity, source file accessibility, label schema compatibility, duplicate keys, and column name conflicts.

timefence init

Initialize a project with a timefence.yaml config file.

# Current directory
timefence init

# Specific directory
timefence init my-project/

path  Positional (optional). Directory to create config in. Default: current directory.

Configuration (timefence.yaml)

Timefence looks for a timefence.yaml in the current directory. All fields are optional — CLI flags and Python API arguments take precedence.

name: churn-model
version: "1.0"

# Feature file(s)
features:
  - features.py

# Label configuration
labels:
  path: data/labels.parquet
  keys: [user_id]
  label_time: label_time
  target: [churned]

# Default parameters
defaults:
  max_lookback: 365d
  join: strict          # "strict" or "inclusive"
  on_missing: "null"    # "null" or "skip"

# Store directory for build tracking
store: .timefence/

# Output directory (relative paths in build resolve against this)
output:
  dir: artifacts/

Precedence: CLI flags > Python API arguments > timefence.yaml > built-in defaults.
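The precedence chain can be pictured as a first-match lookup across layers. This is a sketch of the idea only: resolve_setting is a hypothetical helper, each layer is modeled as a plain dict, and the built-in defaults are the ones listed in this documentation.

```python
BUILT_IN_DEFAULTS = {
    "max_lookback": "365d",
    "join": "strict",
    "on_missing": "null",
}


def resolve_setting(name, cli_flags, api_args, yaml_config):
    """Return the value from the highest-precedence layer that sets it."""
    for layer in (cli_flags, api_args, yaml_config, BUILT_IN_DEFAULTS):
        value = layer.get(name)
        if value is not None:  # an unset option falls through to the next layer
            return value
    return None


yaml_config = {"max_lookback": "90d", "join": "strict"}

from_cli = resolve_setting("join", {"join": "inclusive"}, {}, yaml_config)
from_yaml = resolve_setting("max_lookback", {}, {}, yaml_config)
from_default = resolve_setting("on_missing", {}, {}, yaml_config)
print(from_cli, from_yaml, from_default)
```

A value set on the command line wins over timefence.yaml, and an option set nowhere falls back to the built-in default.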

Errors

All Timefence errors follow a consistent format: WHAT happened, WHY it happened, WHERE in the data, and HOW TO FIX it.

TimefenceSchemaError

Raised when expected columns are missing or types are mismatched between source, labels, and feature definitions.

TimefenceDuplicateError

Raised when duplicate (key, timestamp) pairs exist in a source and on_duplicate="error" (default). Includes example rows.
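To check a source for this condition before building, duplicate pairs can be counted directly. The sketch below uses plain dictionaries as rows; find_duplicate_pairs is a hypothetical helper and not part of Timefence.

```python
from collections import Counter


def find_duplicate_pairs(rows, keys, timestamp):
    """Return the (key tuple, timestamp) pairs that occur more than once."""
    counts = Counter(
        (tuple(row[k] for k in keys), row[timestamp]) for row in rows
    )
    return [pair for pair, n in counts.items() if n > 1]


rows = [
    {"user_id": 1, "created_at": "2024-01-01", "amount": 5},
    {"user_id": 1, "created_at": "2024-01-01", "amount": 7},
    {"user_id": 2, "created_at": "2024-01-01", "amount": 3},
]

duplicates = find_duplicate_pairs(rows, keys=["user_id"], timestamp="created_at")
print(duplicates)
```

Here user 1 has two rows with the same timestamp, so an as-of join would have no principled way to choose between the amounts 5 and 7.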

TimefenceTimezoneError

Raised when mixing timezone-aware and timezone-naive timestamps across sources and labels.
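The distinction between aware and naive timestamps follows the rule in Python's datetime documentation. A minimal sketch of detecting a mix, with the hypothetical helpers is_aware and mixes_aware_and_naive:

```python
from datetime import datetime, timezone


def is_aware(ts):
    """True when the datetime carries a usable UTC offset."""
    return ts.tzinfo is not None and ts.utcoffset() is not None


def mixes_aware_and_naive(timestamps):
    """True when the collection contains both aware and naive datetimes."""
    return len({is_aware(ts) for ts in timestamps}) > 1


naive = datetime(2024, 3, 15, 10, 0)
aware = datetime(2024, 3, 15, 10, 0, tzinfo=timezone.utc)

mixed = mixes_aware_and_naive([naive, aware])
consistent = mixes_aware_and_naive([naive, naive])
print(mixed, consistent)
```

Python itself refuses to order a naive datetime against an aware one, so comparing feature and label times across such a mix cannot give a meaningful answer.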

TimefenceConfigError

Raised for invalid parameter combinations (e.g., max_staleness > max_lookback).

TimefenceLeakageError

Raised by report.assert_clean() when temporal leakage is detected. Includes a summary of leaky features.

TimefenceValidationError

Raised for general input validation failures (e.g., providing both path and df).

Duration Format

Timefence accepts human-readable duration strings wherever a time duration is expected (embargo, max_lookback, max_staleness).

Format    Example      Meaning
Nd        "30d"        30 days
Nh        "6h"         6 hours
Nm        "30m"        30 minutes
Ns        "15s"        15 seconds
Combined  "1d12h"      1 day and 12 hours
Zero      "0d" or "0"  No duration

You can also pass a Python timedelta object directly in the Python API.
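For readers who want to see how such strings map onto timedelta values, here is a small parser covering the formats listed above. It is an illustrative sketch, not Timefence's parser: parse_duration is a hypothetical name, and rejecting anything outside the listed formats is an assumption.

```python
import re
from datetime import timedelta

_UNITS = {"d": "days", "h": "hours", "m": "minutes", "s": "seconds"}
_TOKEN = re.compile(r"(\d+)([dhms])")


def parse_duration(text):
    """Convert strings such as "30d" or "1d12h" into a timedelta."""
    if text == "0":
        return timedelta(0)
    # Require the whole string to be a run of <number><unit> tokens.
    if not re.fullmatch(r"(\d+[dhms])+", text):
        raise ValueError(f"unrecognized duration: {text!r}")
    total = timedelta(0)
    for amount, unit in _TOKEN.findall(text):
        total += timedelta(**{_UNITS[unit]: int(amount)})
    return total


combined = parse_duration("1d12h")
minutes = parse_duration("30m")
zero = parse_duration("0d")
print(combined, minutes, zero)
```

Each token contributes its own timedelta, so "1d12h" is simply one day plus twelve hours.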