Metadata-Version: 2.3
Name: reflex-mui-datagrid
Version: 0.2.3
Summary: Reflex wrapper for the MUI X DataGrid (v8) React component with polars LazyFrame support
Author: antonkulaga
Author-email: antonkulaga <antonkulaga@gmail.com>
Requires-Dist: polars>=1.39.2
Requires-Dist: reflex>=0.8.28
Requires-Dist: typer>=0.24.1
Requires-Dist: polars-bio>=0.26.0 ; python_full_version < '3.15' and extra == 'bio'
Requires-Python: >=3.12
Provides-Extra: bio
Description-Content-Type: text/markdown

# reflex-mui-datagrid

Reflex wrapper for the [MUI X DataGrid](https://mui.com/x/react-data-grid/) (v8) React component, with built-in [polars](https://pola.rs/) LazyFrame support and optional genomic data visualization via [polars-bio](https://biodatageeks.org/polars-bio/).

[![PyPI](https://img.shields.io/pypi/v/reflex-mui-datagrid?logo=pypi&logoColor=white)](https://pypi.org/project/reflex-mui-datagrid/)
[![Python](https://img.shields.io/pypi/pyversions/reflex-mui-datagrid?logo=python&logoColor=white)](https://pypi.org/project/reflex-mui-datagrid/)
[![Downloads](https://img.shields.io/pypi/dm/reflex-mui-datagrid?logo=pypi&logoColor=white)](https://pypi.org/project/reflex-mui-datagrid/)
[![GitHub](https://img.shields.io/badge/GitHub-dna--seq%2Freflex--mui--datagrid-181717?logo=github)](https://github.com/dna-seq/reflex-mui-datagrid)

![Genomic Variants DataGrid](docs/screenshot_genomic_variants.jpg)

## Installation

```bash
uv add reflex-mui-datagrid
```

For CLI usage, you can run the tool as `biogrid` (see CLI section below).

For genomic data support (VCF/BAM files), install with the `[bio]` extra:

```bash
uv add "reflex-mui-datagrid[bio]"
```

Requires Python >= 3.12, Reflex >= 0.8.27, and polars >= 1.0.

## CLI VCF Viewer (No Boilerplate)

The package includes a CLI entrypoint that can launch an interactive viewer
for VCF and other tabular formats. This is the fastest way to explore a VCF
without writing any app code.

Install as a global `uv` tool (with genomic support):

```bash
uv tool install "reflex-mui-datagrid[bio]"
```

This installs both commands:
- `reflex-mui-datagrid` (full name)
- `biogrid` (bio-focused alias)

Open a VCF in your browser:

```bash
reflex-mui-datagrid path/to/variants.vcf
# bio-focused alias
biogrid path/to/variants.vcf
```

Useful options:

```bash
reflex-mui-datagrid path/to/variants.vcf --limit 5000 --port 3005 --title "Tumor Cohort VCF"
# bio-focused alias
biogrid path/to/variants.vcf --limit 5000 --port 3005 --title "Tumor Cohort VCF"
```

The CLI auto-detects file formats by extension and currently supports:
- Genomics (via `polars-bio`): `vcf`, `bam`, `gff`, `bed`, `fasta`, `fastq`
- Tabular: `csv`, `tsv`, `parquet`, `json`, `ndjson`, `ipc`/`arrow`/`feather`

## Quick Start

The fastest way to visualize a polars DataFrame or LazyFrame is the `show_dataframe` helper:

```python
import polars as pl
import reflex as rx
from reflex_mui_datagrid import show_dataframe

df = pl.read_csv("my_data.csv")

def index() -> rx.Component:
    return show_dataframe(df, height="500px")

app = rx.App()
app.add_page(index)
```

That single call handles column type detection, dropdown filters for low-cardinality columns, row IDs, JSON serialization, and the MUI toolbar -- all automatically.

### With State (for reactive updates)

For grids that update in response to user actions, use `lazyframe_to_datagrid` inside a `rx.State` event handler:

```python
import polars as pl
import reflex as rx
from reflex_mui_datagrid import data_grid, lazyframe_to_datagrid

class State(rx.State):
    rows: list[dict] = []
    columns: list[dict] = []

    def load_data(self) -> None:
        lf = pl.LazyFrame({
            "id": [1, 2, 3],
            "name": ["Alice", "Bob", "Charlie"],
            "score": [95, 82, 91],
        })
        self.rows, col_defs = lazyframe_to_datagrid(lf)
        self.columns = [c.dict() for c in col_defs]

def index() -> rx.Component:
    return data_grid(
        rows=State.rows,
        columns=State.columns,
        show_toolbar=True,
        height="400px",
    )

app = rx.App()
app.add_page(index, on_load=State.load_data)
```

## Core Features

- **MUI X DataGrid v8** (Community edition, MIT) with `@mui/material` v7
- **No pagination by default** -- all rows are scrollable; MUI's built-in row virtualisation only renders visible DOM rows, keeping scrolling smooth for large datasets
- **No 100-row limit** -- the Community edition's artificial page-size cap is removed via a small JS patch; pass `pagination=True` to re-enable pagination with any page size
- **`show_dataframe()` helper** -- one-liner to turn any polars DataFrame or LazyFrame into a fully-featured interactive grid
- **Polars LazyFrame integration** -- `lazyframe_to_datagrid()` converts any LazyFrame to DataGrid-ready rows and column definitions in one call
- **Automatic column type detection** -- polars dtypes map to DataGrid types (`number`, `boolean`, `date`, `dateTime`, `string`) with sensible default widths per dtype (e.g. boolean 80px, numeric 110px, dates 140px, strings flex)
- **Automatic dropdown filters** -- low-cardinality string columns and `Categorical`/`Enum` dtypes become `singleSelect` columns with dropdown filters
- **JSON-safe serialization** -- temporal columns become ISO strings, `List` columns become comma-joined strings, `Struct` columns become strings
- **`ColumnDef` model** with snake_case Python attrs that auto-convert to camelCase JS props
- **Expandable row detail panels** -- click a chevron to reveal additional fields below any row, with configurable badge rendering and custom colors (uses MUI X virtualizer's `setPanels` API)
- **Event handlers** for row click, cell click, sorting, filtering, pagination, and row selection
- **Auto-sized container** -- `WrappedDataGrid` wraps the grid in a `<div>` with configurable `width`/`height`
- **Row identification** -- `row_id_field` parameter for custom row ID, auto-generated `__row_id__` column when no `id` column exists

## The `show_dataframe` Helper

`show_dataframe` is designed for polars users who want to quickly visualize a DataFrame without wiring up Reflex state. It accepts a `pl.DataFrame` or `pl.LazyFrame` and returns a ready-to-render component:

```python
from reflex_mui_datagrid import show_dataframe

# Basic usage -- just pass a DataFrame
grid = show_dataframe(df)

# With options
grid = show_dataframe(
    df,
    height="600px",
    density="compact",
    show_toolbar=True,
    limit=1000,                          # collect at most 1000 rows
    column_descriptions={"score": "Final exam score (0-100)"},
    show_description_in_header=True,     # show descriptions as subtitles
    column_header_height=70,             # taller headers for subtitles
)
```

**Parameters:**

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `data` | `LazyFrame \| DataFrame` | *required* | The polars data to visualize |
| `height` | `str` | `"600px"` | CSS height of the grid container |
| `width` | `str` | `"100%"` | CSS width of the grid container |
| `show_toolbar` | `bool` | `True` | Show MUI toolbar (columns, filters, density, export) |
| `density` | `str \| None` | `None` | `"comfortable"`, `"compact"`, or `"standard"` |
| `limit` | `int \| None` | `None` | Max rows to collect from LazyFrame |
| `column_descriptions` | `dict \| None` | `None` | `{column: description}` for header tooltips |
| `show_description_in_header` | `bool` | `False` | Show descriptions as subtitles in headers |
| `column_header_height` | `int \| None` | `None` | Header height in px (useful with description subtitles) |
| `checkbox_selection` | `bool` | `False` | Show checkbox column for row selection |
| `on_row_click` | `EventHandler \| None` | `None` | Handler called when a row is clicked |

**When to use `show_dataframe` vs `lazyframe_to_datagrid`:**

- Use `show_dataframe` for quick prototyping, static dashboards, or when you just want to see your data.
- Use `lazyframe_to_datagrid` inside `rx.State` when the grid data needs to change in response to user actions (filtering server-side, loading different files, etc.).

## Genomic Data Visualization

[polars-bio](https://biodatageeks.org/polars-bio/) is a bioinformatics library that reads genomic file formats (VCF, BAM, GFF, FASTA, FASTQ, and more) as native polars LazyFrames. Since `show_dataframe` accepts any polars LazyFrame, you get an interactive genomic data browser in two lines of code -- no boilerplate needed.

![Genomic Variants Grid](docs/screenshot_genomic_variants.jpg)

### Extra Dependencies

Install with the `[bio]` extra to pull in polars-bio:

```bash
uv add "reflex-mui-datagrid[bio]"
```

This adds [polars-bio](https://pypi.org/project/polars-bio/) >= 0.23.0, which provides `scan_vcf()`, `scan_bam()`, `scan_gff()`, and other genomic file readers -- all returning standard polars LazyFrames.

If you only want quick interactive exploration, the CLI is the simplest option:

```bash
reflex-mui-datagrid variants.vcf
```

### Quick VCF Visualization (two lines)

Because `polars_bio.scan_vcf()` returns a polars LazyFrame, you can pass it straight to `show_dataframe`:

```python
import polars_bio as pb
from reflex_mui_datagrid import show_dataframe

lf = pb.scan_vcf("variants.vcf")  # polars LazyFrame

def index() -> rx.Component:
    return show_dataframe(lf, density="compact", height="540px")
```

That is all you need -- column types, dropdown filters for low-cardinality fields like `filter` and genotype, row IDs, and the MUI toolbar are all set up automatically.

### VCF with Column Descriptions from Headers

For richer display, `bio_lazyframe_to_datagrid` automatically extracts column descriptions from VCF INFO/FORMAT headers and shows them as tooltips or subtitles in the column headers:

```python
import polars_bio as pb
import reflex as rx
from reflex_mui_datagrid import bio_lazyframe_to_datagrid, data_grid

class State(rx.State):
    rows: list[dict] = []
    columns: list[dict] = []

    def load_vcf(self) -> None:
        lf = pb.scan_vcf("variants.vcf")
        self.rows, col_defs = bio_lazyframe_to_datagrid(lf)
        self.columns = [c.dict() for c in col_defs]

def index() -> rx.Component:
    return data_grid(
        rows=State.rows,
        columns=State.columns,
        show_toolbar=True,
        show_description_in_header=True,  # VCF descriptions as subtitles
        density="compact",
        column_header_height=70,
        height="540px",
    )

app = rx.App()
app.add_page(index, on_load=State.load_vcf)
```

`bio_lazyframe_to_datagrid` merges three sources of column descriptions:
1. **VCF specification** -- standard fields (chrom, start, ref, alt, qual, filter, etc.)
2. **INFO fields** -- descriptions from the file's `##INFO` header lines
3. **FORMAT fields** -- descriptions from the file's `##FORMAT` header lines

## Server-Side Scroll-Loading (Large Datasets)

For datasets too large to load into the browser at once (millions of rows), the `LazyFrameGridMixin` provides a complete server-side solution with scroll-driven infinite loading, filtering, and sorting -- all backed by a polars LazyFrame that is never fully collected into memory.

### Quick Example

```python
from pathlib import Path
from reflex_mui_datagrid import LazyFrameGridMixin, lazyframe_grid, scan_file

class MyState(LazyFrameGridMixin, rx.State):
    def load_data(self):
        lf, descriptions = scan_file(Path("my_genome.vcf"))
        yield from self.set_lazyframe(lf, descriptions)

def index() -> rx.Component:
    return rx.box(
        rx.button("Load", on_click=MyState.load_data, loading=MyState.lf_grid_loading),
        rx.cond(MyState.lf_grid_loaded, lazyframe_grid(MyState)),
    )
```

That's it -- you get server-side filtering, sorting, and infinite scroll-loading with no additional wiring.

### `scan_file` -- Auto-Detect File Format

`scan_file` opens any supported file as a polars LazyFrame and extracts column descriptions where available:

```python
from reflex_mui_datagrid import scan_file

# VCF -- auto-extracts column descriptions from headers
lf, descriptions = scan_file(Path("variants.vcf"))

# Parquet -- no descriptions, but LazyFrame is ready
lf, descriptions = scan_file(Path("data.parquet"))

# Also supports: .csv, .tsv, .json, .ndjson, .ipc, .arrow, .feather
```

### `LazyFrameGridMixin` -- State Mixin

`LazyFrameGridMixin` is a Reflex **state mixin** (`mixin=True`) that provides all the state variables and event handlers needed for server-side browsing. Inherit from it **and** `rx.State` in your state class -- each subclass gets its own independent set of `lf_grid_*` vars, so multiple grids on the same page do not interfere:

```python
class MyState(LazyFrameGridMixin, rx.State):
    # Your own state vars
    file_available: bool = False

    def load_data(self):
        lf, descriptions = scan_file(Path("data.parquet"))
        yield from self.set_lazyframe(lf, descriptions, chunk_size=500)
```

**State variables** (all prefixed `lf_grid_` to avoid collisions):

| Variable | Type | Description |
|----------|------|-------------|
| `lf_grid_rows` | `list[dict]` | Currently loaded rows |
| `lf_grid_columns` | `list[dict]` | Column definitions |
| `lf_grid_row_count` | `int` | Total rows matching current filter |
| `lf_grid_loading` | `bool` | Loading indicator |
| `lf_grid_loaded` | `bool` | Whether data has been loaded |
| `lf_grid_stats` | `str` | Last refresh timing info |
| `lf_grid_selected_info` | `str` | Detail string for clicked row |

**`set_lazyframe` parameters:**

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `lf` | `pl.LazyFrame` | *required* | The LazyFrame to browse |
| `descriptions` | `dict[str, str] \| None` | `None` | Column descriptions for tooltips |
| `chunk_size` | `int` | `200` | Rows per scroll chunk |
| `value_options_max_unique` | `int` | `500` | Max distinct values for dropdown filter (queried from full dataset) |
| `eager_value_options_row_limit` | `int` | `50000` | Row count threshold for eager value options computation |
| `column_overrides` | `dict[str, dict[str, Any]] \| None` | `None` | Per-column property overrides (widths, renderers, etc.) |

### Column Overrides

The `column_overrides` parameter lets you customize auto-generated column definitions without reaching into internal cache. Overrides are applied before storing in the cache, so they survive all internal operations (value options computation, filter upgrades).

Keys are field names, values are dicts of camelCase `ColumnDef` properties:

```python
class MyState(LazyFrameGridMixin, rx.State):
    def load_data(self):
        lf = pl.scan_parquet("pgs_scores.parquet")
        yield from self.set_lazyframe(lf, column_overrides={
            # Render PGS IDs as links to the PGS Catalog
            "pgs_id": {
                "width": 140,
                "cellRendererType": "url",
                "cellRendererConfig": {
                    "baseUrl": "https://www.pgscatalog.org/score/",
                    "suffixUrl": "/",
                    "color": "#1565c0",
                },
            },
            # Custom widths for numeric columns
            "n_variants": {"width": 110},
            # Flexible width for text columns
            "trait_reported": {"minWidth": 150, "flex": 2},
            # Hide internal columns
            "ftp_link": {"hide": True},
        })
```

Supported override properties include `width`, `minWidth`, `maxWidth`, `flex`, `hide`, `cellRendererType`, `cellRendererConfig`, `type`, `headerName`, and any other `ColumnDef` attribute in camelCase.

### Cell Renderers

Columns can use built-in cell renderers via `cellRendererType` and `cellRendererConfig`. These work both in `column_overrides` (for server-side grids) and in `ColumnDef` (for client-side grids).

**URL renderer** (`cellRendererType: "url"`):

| Config key | Type | Default | Description |
|------------|------|---------|-------------|
| `baseUrl` | `str` | `""` | Prefix prepended to the cell value |
| `suffixUrl` | `str` | `""` | Suffix appended after the cell value |
| `labelField` | `str` | -- | Row field to use as link text (defaults to cell value) |
| `target` | `str` | `"_blank"` | HTML `target` attribute |
| `color` | `str` | `"inherit"` | CSS color for the link |

The URL is constructed as `baseUrl + cellValue + suffixUrl`. For example, with `baseUrl: "https://www.ebi.ac.uk/gwas/variants/"` and a cell value of `rs2032563`, the link points to `https://www.ebi.ac.uk/gwas/variants/rs2032563`.

**Badge renderer** (`cellRendererType: "badge"`):

| Config key | Type | Description |
|------------|------|-------------|
| `color` | `str` | Default text color |
| `bgColor` | `str` | Default background color |
| `colorMap` | `dict` | `{value: color}` per-value text colors |
| `bgColorMap` | `dict` | `{value: bgColor}` per-value background colors |
| `borderRadius` | `str` | CSS border-radius (default `"16px"`) |
| `padding` | `str` | CSS padding (default `"4px 8px"`) |

**Progress bar renderer** (`cellRendererType: "progress_bar"`):

| Config key | Type | Default | Description |
|------------|------|---------|-------------|
| `minValue` | `number` | `0` | Minimum value for scaling |
| `maxValue` | `number` | `100` | Maximum value for scaling |
| `color` | `str` | `"#1976d2"` | Bar fill color |
| `trackColor` | `str` | `"#e0e0e0"` | Track background color |
| `height` | `str` | `"8px"` | Bar height |
| `showValue` | `bool` | `true` | Show numeric value next to bar |

### `lazyframe_grid` -- Pre-Wired UI Component

Returns a `data_grid(...)` with all server-side handlers already connected:

```python
from reflex_mui_datagrid import lazyframe_grid, lazyframe_grid_stats_bar, lazyframe_grid_detail_box

def my_page() -> rx.Component:
    return rx.fragment(
        lazyframe_grid_stats_bar(MyState),   # row count + timing bar
        lazyframe_grid(MyState, height="600px"),
        lazyframe_grid_detail_box(MyState),  # clicked row details
    )
```

**`lazyframe_grid` parameters:**

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `state_cls` | `type` | *required* | Your state class inheriting `LazyFrameGridMixin` |
| `height` | `str` | `"600px"` | CSS height |
| `width` | `str` | `"100%"` | CSS width |
| `density` | `str` | `"compact"` | Grid density |
| `column_header_height` | `int` | `70` | Header height in px |
| `scroll_end_threshold` | `int` | `260` | Pixels from bottom to trigger next chunk |
| `show_toolbar` | `bool` | `True` | Show MUI toolbar |
| `show_description_in_header` | `bool` | `True` | Show column descriptions as subtitles |
| `debug_log` | `bool` | `True` | Browser console debug logging |
| `on_row_click` | `EventHandler \| None` | `None` | Override default row-click handler |
| `detail_columns` | `list[str] \| None` | `None` | Fields to show in the expandable detail panel |
| `detail_height` | `int \| None` | `None` | Fixed pixel height for the detail panel (auto-computed if `None`) |
| `detail_labels` | `dict[str, str] \| None` | `None` | `{field: label}` display labels for detail panel fields |
| `detail_badge_fields` | `list[str] \| None` | `None` | Fields rendered as pipe-delimited colored badges |
| `detail_badge_colors` | `dict[str, list[str]] \| None` | `None` | Custom `{text: [fg, bg]}` badge colors |

### Expandable Row Detail Panels

The DataGrid supports expandable detail panels that show additional data below each row when clicked. This brings MUI X Pro's detail panel capability to the Community edition using the virtualizer's `setPanels` API.

When `detail_columns` is provided, an expander chevron column appears. Clicking it reveals a panel below that row with the specified fields displayed as label-value pairs.

```python
data_grid(
    rows=State.rows,
    columns=State.columns,
    detail_columns=["description", "notes", "tags"],
    detail_labels={
        "description": "Full Description",
        "notes": "Internal Notes",
        "tags": "Categories",
    },
    detail_height=180,
    height="600px",
)
```

**Badge fields** render pipe-delimited values (`"High | Urgent | Critical"`) as colored inline badges instead of plain text. The built-in color heuristic maps common terms (risk levels, quality tiers, percentiles) to semantic colors. You can override colors per badge text:

```python
data_grid(
    rows=State.rows,
    columns=State.columns,
    detail_columns=["summary", "risk_hint", "notes"],
    detail_badge_fields=["summary", "risk_hint"],
    detail_badge_colors={
        "High Risk": ["#c62828", "#ffcdd2"],
        "Low Risk": ["#2e7d32", "#c8e6c9"],
    },
    height="600px",
)
```

With `lazyframe_grid`, the same props are available:

```python
lazyframe_grid(
    MyState,
    detail_columns=["interpretation", "reference_source"],
    detail_badge_fields=["interpretation"],
    detail_labels={"interpretation": "Clinical Interpretation"},
)
```

**Detail panel props reference:**

| Prop | Type | Description |
|------|------|-------------|
| `detail_columns` | `list[str]` | Field names to display in the detail panel. Can reference any key in the row data, not just visible grid columns. When provided, an expander chevron column is added. |
| `detail_height` | `int` | Fixed pixel height for the panel. When omitted, auto-computed from the number of detail columns. |
| `detail_labels` | `dict[str, str]` | `{field: label}` mapping for display labels. Falls back to the column's `headerName` or the raw field name. |
| `detail_badge_fields` | `list[str]` | Fields whose values are split on `\|` and rendered as colored badges. |
| `detail_badge_colors` | `dict[str, list[str]]` | Custom colors keyed by badge text. Each value is `[foreground_color, background_color]`. Unmatched text falls back to the built-in heuristic. |

### Multiple Independent Grids

Because `LazyFrameGridMixin` is a Reflex mixin (`mixin=True`), you can have multiple independent grids on the same page -- each subclass gets its own `lf_grid_*` state vars:

```python
class ParquetGrid(LazyFrameGridMixin, rx.State):
    def load(self):
        yield from self.set_lazyframe(pl.scan_parquet("data.parquet"))

class CsvGrid(LazyFrameGridMixin, rx.State):
    def load(self):
        yield from self.set_lazyframe(pl.scan_csv("data.csv"))

# ParquetGrid.lf_grid_rows and CsvGrid.lf_grid_rows are independent
```

### How It Works

1. `set_lazyframe` stores the LazyFrame in a module-level cache (never serialised into Reflex state), computes the schema, total row count, and low-cardinality filter options from a bounded sample.
2. Only the first chunk of rows is collected and sent to the frontend.
3. As the user scrolls near the bottom, `handle_lf_grid_scroll_end` collects the next chunk and appends it.
4. Filter and sort changes reset to page 0 and re-query the LazyFrame with Polars expressions -- no full-table collect.

## Running the Example

The project uses [uv workspaces](https://docs.astral.sh/uv/concepts/projects/workspaces/). The example app is a workspace member with a `demo` entrypoint:

```bash
uv sync
uv run demo
```

The demo has six tabs:

| Tab | Description |
|-----|-------------|
| **PRS Results** | Polygenic Risk Scores with expandable detail panels (colored badges, interpretation, percentiles) |
| **PRS (Lazy + Overrides)** | Same PRS data via `LazyFrameGridMixin` with `column_overrides` -- PGS IDs as clickable links to the PGS Catalog, custom column widths |
| **Employee Data** | 20-row inline polars LazyFrame with sorting, dropdown filters, checkbox selection |
| **Genomic Variants (VCF)** | 793 variants loaded via `polars_bio.scan_vcf()`, column descriptions from VCF headers |
| **Longevity Map** | Server-side parquet browsing via `LazyFrameGridMixin`, rsIDs linked to GWAS Catalog |
| **Full Genome (Server-Side)** | ~4.5M variants with server-side scroll-loading, filtering, and sorting via `LazyFrameGridMixin` |

![Employee Data Grid](docs/screenshot_employee_data.jpg)

## API Reference

See [docs/api.md](docs/api.md) for the full API reference.

## License

MIT
