# Introduction to Pilz

## What is Pilz?

Pilz (German for "mushroom" or "fungus") is a machine learning library for classification tasks. The name reflects its nature: just as mushrooms are neither plants nor animals, Pilz is neither a traditional decision tree nor a neural network. It sits somewhere in between and aims to combine the strengths of both.

```mermaid
flowchart TB
    subgraph Nature
        P[Plants] --> M[Mushrooms/Fungi]
        A[Animals] --> M
    end

    subgraph ML
        NN[Neural Networks] --> Pilz[Pilz]
        DT[Decision Trees] --> Pilz
    end

    style M fill:#ccffcc
    style Pilz fill:#e0f0ff
```

![Diagram](images/introduction_1.svg)

Unlike traditional ML tools that create "black box" models, Pilz generates **readable SQL rules** that you can directly deploy to your database.

The core innovation of Pilz is its ability to handle **feature correlations naturally** through multi-dimensional cuts.

## The Core Idea: Multi-Dimensional Feature Cuts

### The Problem with Traditional Approaches

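A traditional decision tree splits on one feature at a time, so a pattern that only shows up when X and Y are considered together has to be pieced together from several consecutive splits: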

```mermaid
flowchart TD
    subgraph "Traditional Tree"
        A[Data] --> B{Split on X?}
        B -->|Yes| C[X is high]
        B -->|No| D{X is low}
        D --> E{Y split?}
    end

    A -->|"Problem"| P["Correlation: X + Y together<br/>predict better than alone"]

    style P fill:#ffcccc
```

![Diagram](images/introduction_2.svg)

### Pilz's Solution: Cut in Multiple Dimensions

Pilz can cut on **feature combinations** directly:

```mermaid
flowchart TD
    subgraph "Pilz Multi-Dimensional Cut"
        A[Data] --> B{"Split on X AND Y<br/>combination?"}

        B -->|"X=low, Y=low"| C[Group 1]
        B -->|"X=low, Y=high"| D[Group 2]
        B -->|"X=high, Y=low"| E[Group 3]
        B -->|"X=high, Y=high"| F[Group 4]
    end

    style B fill:#ccffcc
```

![Diagram](images/introduction_3.svg)
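To see why a joint cut can matter, consider a tiny synthetic data set in which neither feature predicts the target on its own, but the combination does. The sketch below is plain Python and does not use Pilz's API; the rows and the helper `target_rate_by` are invented for illustration.

```python
from collections import defaultdict

# Synthetic rows (x_bin, y_bin, is_target), invented for illustration.
# The target is 1 exactly when the two bins differ (an XOR-like pattern).
rows = [
    ("low", "low", 0), ("low", "low", 0),
    ("low", "high", 1), ("low", "high", 1),
    ("high", "low", 1), ("high", "low", 1),
    ("high", "high", 0), ("high", "high", 0),
]


def target_rate_by(rows, key):
    """Return {group: share of target rows} for groups defined by key(row)."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for row in rows:
        group = key(row)
        totals[group] += 1
        hits[group] += row[2]
    return {group: hits[group] / totals[group] for group in totals}


# One-dimensional views: every group sits at the overall rate of 0.5.
rate_by_x = target_rate_by(rows, key=lambda r: r[0])
rate_by_y = target_rate_by(rows, key=lambda r: r[1])

# Two-dimensional view: each (X, Y) combination is cleanly separated.
rate_by_xy = target_rate_by(rows, key=lambda r: (r[0], r[1]))

print(rate_by_x)
print(rate_by_y)
print(rate_by_xy)
```

In this constructed case a split on X alone or Y alone gains nothing, while the four-way cut on the combination separates the classes completely.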

This naturally captures correlations between features.



- **Categorical**: Groups similar categories together until only `n_cat` bins remain
- **Numerical**: Creates quantile bins; with `n_cat=2`, the single cut point is the median
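The two binning rules can be sketched in plain Python. This is not Pilz's implementation: the function names are hypothetical, and treating "similar" categories as those with the closest target rate is an assumption made only for this example, as are the rates themselves.

```python
import bisect
import statistics


def numeric_bin_edges(values, n_cat):
    """Cut points that split values into n_cat equal-frequency (quantile) bins."""
    return statistics.quantiles(values, n=n_cat)


def numeric_bin(value, edges):
    """Index of the bin a value falls into, given sorted cut points."""
    return bisect.bisect_left(edges, value)


def group_categories(rates, n_cat):
    """Greedily merge the categories with the closest target rates.

    rates maps category -> target rate (assumed similarity measure).
    Merging stops once n_cat groups remain.
    """
    groups = [([cat], rate) for cat, rate in sorted(rates.items(), key=lambda kv: kv[1])]
    while len(groups) > n_cat:
        # Pick the adjacent pair of groups with the smallest rate gap.
        i = min(range(len(groups) - 1), key=lambda k: groups[k + 1][1] - groups[k][1])
        merged_cats = groups[i][0] + groups[i + 1][0]
        merged_rate = (groups[i][1] + groups[i + 1][1]) / 2  # unweighted, for simplicity
        groups[i:i + 2] = [(merged_cats, merged_rate)]
    return [cats for cats, _ in groups]


# Numerical: values 1..100 with n_cat=2 are cut at the median.
edges = numeric_bin_edges(list(range(1, 101)), n_cat=2)
low_bin = numeric_bin(50, edges)
high_bin = numeric_bin(51, edges)

# Categorical: five categories with invented target rates, reduced to three bins.
category_bins = group_categories(
    {"A": 0.10, "B": 0.12, "C": 0.40, "D": 0.70, "E": 0.73}, n_cat=3
)
print(edges, low_bin, high_bin, category_bins)
```

With these inputs the numerical feature splits into 1-50 and 51-100, and the categories collapse into three groups.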

### Step 2: Build Correlation Tables

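For `n_dims=2`, Pilz builds a correlation table over pairs of binned features, counting target and non-target rows for each combination of bins.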

```mermaid
flowchart TB
    subgraph Categorical_Features
        C1[Categories: A, B, C, D, E] --> C2[Group until n_cat]
        C2 --> C3["Bin 0: A, B<br/>Bin 1: C<br/>Bin 2: D, E"]
    end

    subgraph Numerical_Features
        N1[Values: 1, 2, ..., 100] --> N2[Equal-size bins]
        N2 --> N3["Bin 0: 1-50<br/>Bin 1: 51-100"]
    end
```

![Diagram](images/introduction_5.svg)


### Step 3: Find Best Split

Pilz evaluates which combinations best distinguish target from non-target:

```mermaid
flowchart TB
    subgraph "Score Each Combination"
        S1["Target Rate = target / (target + non-target)"]
    end

    S1 --> C[Compare all combinations]
    C -->|"Best discrimination"| B[Best split found]
    C -->|"Unclear"| N[Go to neutral]
```

![Diagram](images/introduction_6.svg)
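The scoring step can be illustrated with the target-rate formula from the diagram. The counts reuse the four cells shown in this document's correlation-table diagram. How Pilz decides that a cell is "unclear" is not stated here, so the `MARGIN` threshold and the `label` helper are assumptions for this sketch only.

```python
# Counts per (x_bin, y_bin) cell as (target, non_target).
table = {
    (0, 0): (15, 85),
    (0, 1): (45, 55),
    (1, 0): (60, 40),
    (1, 1): (80, 20),
}


def target_rate(target, non_target):
    """Target Rate = target / (target + non-target)."""
    return target / (target + non_target)


rates = {cell: target_rate(*counts) for cell, counts in table.items()}

# Overall target rate across all cells, used as the reference point.
total_target = sum(t for t, _ in table.values())
total_rows = sum(t + n for t, n in table.values())
base_rate = total_target / total_rows

MARGIN = 0.2  # hypothetical: minimum distance from base_rate to count as clear


def label(rate):
    """Classify a cell as target, non-target, or neutral (unclear)."""
    if rate - base_rate >= MARGIN:
        return "target"
    if base_rate - rate >= MARGIN:
        return "non-target"
    return "neutral"


labels = {cell: label(rate) for cell, rate in rates.items()}
print(rates)
print(labels)
```

Under this assumed threshold the two extreme cells are decided immediately, and the two middle cells go to neutral.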


```mermaid
flowchart LR
    subgraph "Feature X: [0,1]"
    end

    subgraph "Feature Y: [0,1]"
    end

    subgraph "Correlation Table"
        T1["X=0, Y=0: target=15, non-target=85"]
        T2["X=0, Y=1: target=45, non-target=55"]
        T3["X=1, Y=0: target=60, non-target=40"]
        T4["X=1, Y=1: target=80, non-target=20"]
    end
```

![Diagram](images/introduction_7.svg)
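A table like the one in the diagram can be produced by counting target and non-target rows per bin combination. The sketch below assumes rows are dictionaries of already-binned features plus a boolean `target` field, and that one table is built per combination of `n_dims` features. Both are assumptions for illustration, and `correlation_tables` is a hypothetical name rather than part of Pilz.

```python
from collections import defaultdict
from itertools import combinations


def correlation_tables(rows, features, n_dims=2):
    """Return {feature_combo: {bin_combo: [target_count, non_target_count]}}."""
    tables = {}
    for combo in combinations(features, n_dims):
        counts = defaultdict(lambda: [0, 0])
        for row in rows:
            cell = tuple(row[name] for name in combo)
            # Index 0 counts target rows, index 1 counts non-target rows.
            counts[cell][0 if row["target"] else 1] += 1
        tables[combo] = dict(counts)
    return tables


# Four invented rows with three binned features.
rows = [
    {"x": 0, "y": 0, "z": 1, "target": False},
    {"x": 0, "y": 1, "z": 1, "target": True},
    {"x": 1, "y": 1, "z": 0, "target": True},
    {"x": 1, "y": 1, "z": 0, "target": False},
]

tables = correlation_tables(rows, ["x", "y", "z"], n_dims=2)
for combo, cells in tables.items():
    print(combo, cells)
```

With three features and `n_dims=2` this yields three tables, one per feature pair.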


### Step 5: Downsampling

Pilz always works on a **downsampled subset** of the original data, sampling target and non-target rows separately:

```mermaid
flowchart LR
    A[Start] --> B[Process]
    B --> C[End]

    style A fill:#e0f0ff
    style C fill:#ccffcc
```

![Diagram](images/introduction_8.svg)
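A minimal sketch of per-class downsampling, using only the standard library. The sizes (100,000 rows reduced to 5,000 target and 5,000 non-target rows) mirror the downsampling diagram in this document. The `downsample` helper, the fixed seed, and the synthetic rows are assumptions, and Pilz's own sampling strategy may differ.

```python
import random


def downsample(rows, is_target, n_per_class, seed=0):
    """Draw up to n_per_class target rows and n_per_class non-target rows."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    targets = [row for row in rows if is_target(row)]
    non_targets = [row for row in rows if not is_target(row)]
    sampled_targets = rng.sample(targets, min(n_per_class, len(targets)))
    sampled_non_targets = rng.sample(non_targets, min(n_per_class, len(non_targets)))
    return sampled_targets + sampled_non_targets


# 100,000 synthetic rows, of which every tenth row is a target.
rows = [{"id": i, "target": i % 10 == 0} for i in range(100_000)]

subset = downsample(rows, lambda row: row["target"], n_per_class=5_000)
n_target = sum(row["target"] for row in subset)
print(len(subset), n_target)
```

Sampling each class separately keeps the rarer class from being swamped, whatever its share in the original data.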


## Why This Matters

### Natural Correlation Handling

```mermaid
flowchart LR
    subgraph "Example: Churn"
        C1["Contract=Monthly<br/>TechSupport=No"]
        C2["Both together<br/>churn=85%"]

        C1 --> C2
    end

    C1 -->|"Traditional tree<br/>needs 2 splits"| T[Split 1 + Split 2]

    style C2 fill:#ccffcc
```

![Diagram](images/introduction_9.svg)

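When features correlate, Pilz can capture their combined effect in a single cut, where a traditional tree needs two separate splits.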

```mermaid
flowchart TB
    A[Split 1] --> B{"Minimum events<br/>reached?"}
    B -->|No| C[Split again]
    B -->|Yes| D[Create leaf]
    C --> E[Refine Neutral]
    E --> A

    style B fill:#e0f0ff
    style D fill:#ccffcc
```

![Diagram](images/introduction_10.svg)

```sql
    WHEN contract = 'Two year' THEN 0.12
    ELSE 0.35
END
```
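To show how a set of learned rules could end up as a SQL expression of this shape, here is a small Python sketch that renders `(condition, score)` pairs into a `CASE` statement. The `render_case` helper, the `tech_support` column name, and the `score` alias are hypothetical. The values 0.12, 0.35, and 0.85 are taken from this document's examples. This is not Pilz's exporter.

```python
def render_case(rules, default, column_alias="score"):
    """Render (condition, score) pairs as a SQL CASE expression.

    Conditions are inserted verbatim, so they must come from trusted,
    generated rules rather than from user input.
    """
    lines = ["CASE"]
    for condition, score in rules:
        lines.append(f"    WHEN {condition} THEN {score}")
    lines.append(f"    ELSE {default}")
    lines.append(f"END AS {column_alias}")
    return "\n".join(lines)


# Rules are checked top to bottom, as in any SQL CASE expression.
rules = [
    ("contract = 'Monthly' AND tech_support = 'No'", 0.85),
    ("contract = 'Two year'", 0.12),
]

sql = render_case(rules, default=0.35)
print(sql)
```

Because the result is ordinary SQL, it can be reviewed line by line and pasted into a `SELECT` list.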

## Comparison with Other Approaches

| Aspect | Neural Networks | Traditional Trees | Pilz |
|--------|----------------|-------------------|----------|
| Feature correlations | Learned implicitly | Multiple splits | Single cut |
| Interpretability | Low | Medium

```mermaid
flowchart LR
    subgraph Original_Data
        O1[100,000 rows]
    end

    subgraph Downsampled
        D1[Target: 5,000]
        D2[Non-target: 5,000]
    end

    O1 -->|"Sample target"| D1
    O1 -->|"Sample non-target"| D2

    style D1 fill:#ffff99
    style D2 fill:#ffff99
```

n in database

- Tabular data - Customer data, transactions

### ✗ Not Ideal For

- Image/audio/video - Use neural networks
- Very large datasets - May need tuning
- No correlations - Simpler models may suffice

## Core Parameters

| Parameter | What it controls |
|-----------|-----------------|
| `n_dims` | D


2. First Project - Try the Iris example
3. Feature Categorization - Deep dive into binning
4. Multi-Dimensional Splits - Deep dive into `n_dims`


Pilz: Capturing feature correlations naturally through multi-dimensional cuts.