# How Pilz Works

This chapter explains Pilz's algorithm in detail. The key innovation is multi-dimensional feature cuts that naturally capture correlations.

## Core Innovation: Multi-Dimensional Cuts

Unlike traditional trees that split on one feature at a time, Pilz can split on feature combinations directly:

```mermaid
flowchart TB
    subgraph "Traditional Tree"
        A[Data] --> B{X split?}
        B -->|Yes| C[X is high]
        B -->|No| D{X is low}
        D --> E{Y split?}
        E -->|Yes| F[Y is high]
        E -->|No| G[Y is low]
    end

    subgraph "Pilz Multi-Dimensional"
        P[Data] --> Q{X AND Y combination?}
        Q --> R["X=0,Y=0"]
        Q --> S["X=0,Y=1"]
        Q --> T["X=1,Y=0"]
        Q --> U["X=1,Y=1"]
    end

    style Q fill:#ccffcc
```

![Diagram](images/how_pilz_works_1.svg)
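The intuition can be sketched with a toy example (hypothetical data, not the Pilz API): with XOR-correlated features, no single-feature split separates the classes, while the (X, Y) combination does.

```python
from collections import Counter

# XOR-style toy data (hypothetical): the target depends on the
# X/Y combination, not on either feature alone.
rows = [(0, 0, "non-target"), (0, 1, "target"),
        (1, 0, "target"), (1, 1, "non-target")] * 25

# Split on X alone: each side is a 50/50 mix -> no discrimination.
x_split = Counter(label for x, y, label in rows if x == 0)

# "Split" on the (X, Y) combination: every cell is pure.
xy_split = Counter(((x, y), label) for x, y, label in rows)

print(x_split)   # 25 target vs 25 non-target: a useless split
print(xy_split)  # each (x, y) cell holds exactly one class
```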

## The Algorithm Step by Step

### Step 1: Feature Binning (Categorization)

Every feature is binned into `n_cat` categories:

```mermaid
flowchart TB
    subgraph Input
        I1[Categorical: job, city, status]
        I2[Numerical: age, balance, score]
    end

    subgraph Process
        B1[Categorize]
    end

    subgraph Output
        O1["Bin 0, Bin 1, ..., Bin (n_cat-1)"]
    end

    I1 --> B1
    I2 --> B1
    B1 --> O1

    style B1 fill:#e0f0ff
```

![Diagram](images/how_pilz_works_2.svg)


**Numerical features:**

- Sort values and calculate quantile boundaries

- With `n_cat=2`, use the median as the single cut point
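The binning step can be sketched as follows. This is a minimal sketch assuming unit weights; the real implementation works on weighted, sorted columns.

```python
def quantile_cuts(values, n_cat):
    """Return the n_cat - 1 quantile cut points for n_cat bins.

    Sketch only: assumes every row has weight 1.
    """
    ordered = sorted(values)
    cuts = []
    for i in range(1, n_cat):
        # Value sitting at the i/n_cat quantile of the data.
        idx = int(len(ordered) * i / n_cat)
        cuts.append(ordered[idx])
    return cuts

ages = [41, 22, 60, 31, 18, 52, 25, 35]
print(quantile_cuts(ages, 2))  # [35] -- the single median-region cut
print(quantile_cuts(ages, 4))  # [25, 35, 52] -- three cuts, four bins
```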

### Step 2: Build Correlation Tables

For each split, Pilz builds tables for feature combinations:

```mermaid
flowchart LR
    subgraph "With n_dims=2"
        F1[Feature X] --> T1["X=0, Y=0: target=15, non-target=85"]
        F2[Feature Y] --> T2["X=0, Y=1: target=45, non-target=55"]

        T1 --> C[Correlation Table]
        T2 --> C
    end
```

![Diagram](images/how_pilz_works_3.svg)

Example for features X, Y with binary bins:

| Combination | Target Count | Non-Target Count | Target Rate |
|-------------|--------------|------------------|-------------|
| X=0, Y=0    | 15           | 85               | 15%         |
| X=0, Y=1    | 45           | 55               | 45%         |
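Building such a table can be sketched as follows; the input tuples here are hypothetical, not the real Pilz data format.

```python
from collections import defaultdict

def correlation_table(rows):
    """Count target / non-target per (X, Y) bin combination.

    Sketch of the correlation-table step; `rows` is hypothetical
    (x_bin, y_bin, is_target) tuples.
    """
    counts = defaultdict(lambda: [0, 0])  # combo -> [target, non_target]
    for x, y, is_target in rows:
        counts[(x, y)][0 if is_target else 1] += 1
    return {combo: {"target": t, "non_target": n,
                    "target_rate": t / (t + n)}
            for combo, (t, n) in counts.items()}

rows = [(0, 0, True)] * 15 + [(0, 0, False)] * 85 \
     + [(0, 1, True)] * 45 + [(0, 1, False)] * 55
table = correlation_table(rows)
print(table[(0, 0)])  # {'target': 15, 'non_target': 85, 'target_rate': 0.15}
```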


### Step 3: Determine Branches

Each feature combination is scored:

```mermaid
flowchart TB
    subgraph Scoring
        S["Target Rate = target / (target + non-target)"]
    end

    subgraph Classification
        H["High rate > 0.5 + neutral_faktor"] --> R[Right: Likely target]
        L["Low rate < 0.5 - neutral_faktor"] --> L2[Left: Likely non-target]
        M[Medium rate] --> N[Neutral: Uncertain]
    end

    S --> H
    S --> L
    S --> M

    style R fill:#ccffcc
    style L fill:#ffcccc
    style N fill:#ffff99
```

![Diagram](images/how_pilz_works_5.svg)

**Example:**

- Target rate 80% → **Right** (clearly target)

- Target rate 15% → **Left** (clearly non-target)
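The branching rule can be sketched as follows; the 0.5-centered thresholds follow the scoring diagram, while the default value for `neutral_faktor` is an assumption for illustration.

```python
def classify_branch(target_rate, neutral_faktor=0.1):
    """Assign a feature combination to a branch by target rate.

    Sketch of the branching rule; the 0.1 default for
    `neutral_faktor` is an assumed value, not the real default.
    """
    if target_rate > 0.5 + neutral_faktor:
        return "right"    # likely target
    if target_rate < 0.5 - neutral_faktor:
        return "left"     # likely non-target
    return "neutral"      # uncertain -> split again

print(classify_branch(0.80))  # right
print(classify_branch(0.15))  # left
print(classify_branch(0.55))  # neutral
```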


### Step 4: Recursion

Splitting repeats until `min_eval_fit` or `max_depth` is reached, so that every leaf has statistically significant data.

### Step 5: Downsampling

Pilz always uses a **balanced downsampled** subset:

```mermaid
flowchart LR
    subgraph Original
        O1["100K rows: Target 10K, Non-target 90K"]
    end

    subgraph Downsampled
        D1["10K rows: Target 5K, Non-target 5K"]
    end

    O1 -->|"Balance target"| D1
    O1 -->|"Balance non-target"| D1

    style D1 fill:#ffff99
```

![Diagram](images/how_pilz_works_7.svg)
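Balanced downsampling can be sketched as follows (the row format is hypothetical):

```python
import random

def balanced_downsample(rows, per_class, seed=0):
    """Draw an equal number of target and non-target rows.

    Sketch of the balanced-downsampling step; `rows` are
    hypothetical (features, is_target) pairs.
    """
    rng = random.Random(seed)
    targets = [r for r in rows if r[-1]]
    non_targets = [r for r in rows if not r[-1]]
    return (rng.sample(targets, per_class)
            + rng.sample(non_targets, per_class))

rows = [("row", True)] * 10_000 + [("row", False)] * 90_000
sample = balanced_downsample(rows, per_class=5_000)
print(len(sample))  # 10000 rows, half target / half non-target
```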

The recursion checks the stopping criterion at every node:

```mermaid
flowchart TB
    A[Node] --> B{Minimum events?}
    B -->|No| C[Split again]
    B -->|Yes| D[Create leaf node]

    C --> E[Apply Left filter]
    C --> F[Apply Neutral filter]
    C --> G[Apply Right filter]

    E --> A
    F --> A
    G --> A

    style B fill:#e0f0ff
    style D fill:#ccffcc
```
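The control flow of this loop can be sketched as follows; the data, the threshold value, and `split_three_ways` are placeholders, since the real algorithm splits using the left/neutral/right filters from the best feature combination.

```python
MIN_EVAL_FIT = 50  # assumed stopping threshold for illustration

def split_three_ways(rows):
    # Placeholder split into thirds; the real split applies the
    # left / neutral / right filters from the best combination.
    k = len(rows) // 3
    return rows[:k], rows[k:2 * k], rows[2 * k:]

def train_node(rows, depth=""):
    """Recurse until a node has fewer than MIN_EVAL_FIT events."""
    if len(rows) < MIN_EVAL_FIT:
        target_rate = sum(r[-1] for r in rows) / max(len(rows), 1)
        return [{"depth": depth, "score": target_rate}]
    left, neutral, right = split_three_ways(rows)
    return (train_node(left, depth + "l")
            + train_node(neutral, depth + "n")
            + train_node(right, depth + "r"))

spores = train_node([(i, i % 2) for i in range(100)])
print([s["depth"] for s in spores])  # ['l', 'n', 'r']
```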



The training procedure in pseudocode (one Pilz per target class):

```
FOR tree_idx, target_class IN ENUMERATE(target_classes):
    target_filter = FILTER(target = target_class)
    spores = train_pilz(target_filter, [], "", settings)
    SAVE(Pilz(spores, target_class), tree_idx)

FUNCTION TRAIN_PILZ(target_filter, path_filters, depth, settings):
    # 1. Read downsampled data
    train_df = READ_DATA(target_filter, path_filters, settings)

    # 2. Stop if minimum events reached
    IF train_df.size < settings.min_eval_fit:
        RETURN [CREATE_LEAF(path_filters, depth, train_df)]
```

```mermaid
flowchart TD
    A[Data] --> B{"Feature combination: clear discrimination?"}

    B -->|"Yes - Clear"| R["Right Branch: High target rate"]
    B -->|"No - Unclear"| N["Neutral Branch: Continue splitting"]
    B -->|"Yes - Clear"| L["Left Branch: Low target rate"]

    R --> R2["Target Rate > 0.8"]
    N --> N2["Target Rate ~0.5"]
    L --> L2["Target Rate < 0.2"]

    N2 --> R3[Split Again]
    N2 --> L3[Split Again]

    style N fill:#ffff99
    style N2 fill:#ffff99
```

![Diagram](images/how_pilz_works_9.svg)

```
    # 3.-5. Find the best split and derive the branch filters
    best = FIND_BEST_SPLIT(train_df, settings)
    left_filter, neutral_filter, right_filter = best.get_filters()

    IF left_filter IS NULL AND right_filter IS NULL:
        RETURN [CREATE_LEAF(path_filters, depth, train_df)]

    # 6. Recurse on three branches
    left_spores = train_pilz(target_filter, path_filters + [left_filter], depth + "l", settings)
    neutral_spores = []
    right_spores = []

    IF neutral_filter:
        neutral_spores = train_pilz(target_filter, path_filters + [neutral_filter], depth + "n", settings)
    IF right_filter:
        right_spores = train_pilz(target_filter, path_filters + [right_filter], depth + "r", settings)

    RETURN left_spores + neutral_spores + right_spores
```

```
FUNCTION CATEGORIZE(feature, train_df, n_cat):
    IF feature.statistical == "categorial":
        RETURN CATEGORIZE_CATEGORICAL(feature, train_df, n_cat)
    ELSE:
        RETURN CATEGORIZE_NUMERICAL(feature, train_df, n_cat)

FUNCTION CATEGORIZE_NUMERICAL(feature, train_df, n_cat):
    sorted = SORT_BY(train_df, feature.name)
    cum_weights = CUMULATIVE_SUM(sorted.weights)

    cuts = []
    FOR i IN 1..(n_cat - 1):    # n_cat - 1 cut points give n_cat bins
        quantile = i / n_cat
        cut_point = FIND_QUANTILE(sorted, cum_weights, quantile)
        cuts.append(cut_point)

    RETURN CategorizedFeature(feature, cuts)
```

```
FUNCTION FIND_BEST_SPLIT(train_df, settings):
    scored_features = []

    # Score individual features
    FOR feature IN train_df.features:
        score = CALCULATE_DISCRIMINATION(feature, train_df)
        scored_features.append((feature, score))

    scored_features = SORT_BY(scored_features, DESC)
    best = scored_features[0]

    # Try combinations if n_dims > 1
    FOR dim IN 2..settings.n_dims:
        FOR combination IN COMBINATIONS(scored_features, dim):
            combined = CREATE_COMBINED_FEATURE(combination)

            # Build correlation table
            table = BUILD_CORRELATION_TABLE(combined, train_df)

            # Calculate discrimination score
            score = CALCULATE_DISCRIMINATION(table)

            IF score > best.score:
                best = (combined, score)

    RETURN best
```
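The combination search can be sketched as follows. `discrimination` here is a simple stand-in score (weighted distance of each cell's target rate from 0.5), not the scoring function the pseudocode's `CALCULATE_DISCRIMINATION` actually uses.

```python
from itertools import combinations

def discrimination(table):
    """Stand-in score: weighted distance of target rates from 0.5."""
    total = sum(t + n for t, n in table.values())
    return sum((t + n) / total * abs(t / (t + n) - 0.5)
               for t, n in table.values())

def build_table(rows, dims):
    """Count (target, non_target) per combination of bin columns."""
    table = {}
    for row in rows:
        key = tuple(row[d] for d in dims)
        t, n = table.get(key, (0, 0))
        table[key] = (t + row[-1], n + (1 - row[-1]))
    return table

def find_best_split(rows, features, n_dims):
    """Try single features, then combinations up to n_dims."""
    best = None
    for dim in range(1, n_dims + 1):
        for combo in combinations(range(len(features)), dim):
            score = discrimination(build_table(rows, combo))
            if best is None or score > best[1]:
                best = (combo, score)
    return best

# XOR-style toy data: rows are (x_bin, y_bin, is_target_as_int).
rows = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)] * 25
print(find_best_split(rows, ["X", "Y"], n_dims=2))  # ((0, 1), 0.5)
```

On this XOR-like data each single feature scores 0, while the (X, Y) combination scores best, mirroring the chapter's core claim.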

## Key Distinctions from Traditional Trees

| Aspect | Traditional Trees | Pilz |
|--------|------------------|------|
| Feature selection | Single feature per split | Feature combinations |
| Correlation handling | Multiple shallow splits | Single deep cut |
| Correlation table | N/A | Built at each node |
| Downsampling | Sometimes | Always, balanced |
| Neutral branch | No | Yes |

## Model Structure

A Pilz model consists of spores (leaf nodes):

```mermaid
classDiagram
    class Pilz {
        +list~Spore~ spores
        +str target
        +get_sql()
    }

    class Spore {
        +list~str~ cut
        +float score
        +str depth
    }

    Pilz "1" --> "*" Spore
```


Each spore represents a leaf with:

- `cut`: List of SQL WHERE conditions that lead here
- `score`: Confidence (target rate)
- `depth`: Path notation (l=left, n=neutral, r=right)
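The structure can be sketched as Python dataclasses; how `get_sql` actually composes conditions is an assumption made for illustration.

```python
from dataclasses import dataclass

@dataclass
class Spore:
    cut: list        # SQL WHERE conditions leading to this leaf
    score: float     # target rate at this leaf
    depth: str       # path notation, e.g. "lnr"

@dataclass
class Pilz:
    spores: list
    target: str

    def get_sql(self, min_score=0.5):
        # OR together the cuts of high-score spores (assumed
        # composition; the real get_sql may differ).
        clauses = [" AND ".join(s.cut)
                   for s in self.spores if s.score > min_score]
        return " OR ".join(f"({c})" for c in clauses)

model = Pilz(spores=[Spore(["age > 35", "balance > 1000"], 0.82, "rr"),
                     Spore(["age <= 35"], 0.15, "l")],
             target="yes")
print(model.get_sql())  # (age > 35 AND balance > 1000)
```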

## Summary

| Step | What Happens |
|------|--------------|
| 1. Bin features | Categorical: group by target rate; Numerical: quantile bins |
| 2. Build correlation table | For each feature/combination, count target vs non-target |
| 3. Determine branches | High rate → Right, Low rate → Left, Medium → Neutral |
| 4. Recurse | Repeat until min_eval_fit or max_depth |
| 5. Downsample | Always use balanced subset |

## Next Steps
