Metadata-Version: 2.3
Name: pandas-plots
Version: 2.0.7
Summary: A collection of helper for table handling and visualization
Keywords: tables,pivot,plotly,venn,plot,vizualization,duckdb
Author: smeisegeier
Author-email: smeisegeier <dexterDSD@googlemail.com>
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.12
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Science/Research
Classifier: Operating System :: OS Independent
Classifier: Topic :: Scientific/Engineering
Requires-Dist: pandas<3
Requires-Dist: plotly>=6.2
Requires-Dist: kaleido>=1
Requires-Dist: matplotlib>=3.10.0
Requires-Dist: matplotlib-inline
Requires-Dist: matplotlib-venn==0.11.10
Requires-Dist: seaborn>=0.13.2
Requires-Dist: jinja2>=3.1.4
Requires-Dist: requests>=2.32.0
Requires-Dist: numpy<2.0.0
Requires-Dist: missingno>=0.5.2
Requires-Dist: duckdb>=1.3.0
Requires-Dist: nbformat>=4.2.0
Requires-Dist: dataframe-image>=0.2.7
Requires-Dist: connection-helper>=0.13.1
Requires-Dist: upsetplot>=0.9.0
Requires-Dist: ipynbname>=2025.8.0.0
Requires-Dist: urllib3>=2.6.3
Requires-Dist: pillow>=12.2.0
Requires-Dist: nbconvert>=7.17.0
Requires-Dist: fonttools>=4.60.2
Requires-Dist: aiohttp>=3.13.3
Requires-Dist: orjson>=3.11.4
Requires-Dist: tornado>=6.5.5
Requires-Python: >=3.11
Project-URL: Bug Tracker, https://github.com/smeisegeier/pandas-plots/issues
Project-URL: Homepage, https://github.com/smeisegeier/pandas-plots
Project-URL: Repository, https://github.com/smeisegeier/pandas-plots
Description-Content-Type: text/markdown

# pandas-plots
<a id="markdown-pandas-plots" name="pandas-plots"></a>

a comprehensive python package for enhanced **data visualization** and **analysis** with pandas dataframes. provides a high-level api for creating beautiful tables, plots, venn diagrams, and utility functions with minimal code.

[![PyPI - Version](https://img.shields.io/pypi/v/pandas-plots)](https://pypi.org/project/pandas-plots/) [![GitHub last commit](https://img.shields.io/github/last-commit/smeisegeier/pandas-plots?logo=github)](https://github.com/smeisegeier/pandas-plots) [![GitHub License](https://img.shields.io/github/license/smeisegeier/pandas-plots?logo=github)](https://github.com/smeisegeier/pandas-plots/blob/main/LICENSE) [![Python Versions](https://img.shields.io/badge/python-3.10_|_3.11_|_3.12_|_3.13-blue.svg?logo=data:image/svg+xml;base64,PHN2ZyB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciIHZpZXdCb3g9IjAgMCAxMDAgMTAwIj4KICA8ZGVmcz4KICAgIDxsaW5lYXJHcmFkaWVudCBpZD0icHlZZWxsb3ciIGdyYWRpZW50VHJhbnNmb3JtPSJyb3RhdGUoNDUpIj4KICAgICAgPHN0b3Agc3RvcC1jb2xvcj0iI2ZlNSIgb2Zmc2V0PSIwLjYiLz4KICAgICAgPHN0b3Agc3RvcC1jb2xvcj0iI2RhMSIgb2Zmc2V0PSIxIi8+CiAgICA8L2xpbmVhckdyYWRpZW50PgogICAgPGxpbmVhckdyYWRpZW50IGlkPSJweUJsdWUiIGdyYWRpZW50VHJhbnNmb3JtPSJyb3RhdGUoNDUpIj4KICAgICAgPHN0b3Agc3RvcC1jb2xvcj0iIzY5ZiIgb2Zmc2V0PSIwLjQiLz4KICAgICAgPHN0b3Agc3RvcC1jb2xvcj0iIzQ2OCIgb2Zmc2V0PSIxIi8+CiAgICA8L2xpbmVhckdyYWRpZW50PgogIDwvZGVmcz4KCiAgPHBhdGggZD0iTTI3LDE2YzAtNyw5LTEzLDI0LTEzYzE1LDAsMjMsNiwyMywxM2wwLDIyYzAsNy01LDEyLTExLDEybC0yNCwwYy04LDAtMTQsNi0xNCwxNWwwLDEwbC05LDBjLTgsMC0xMy05LTEzLTI0YzAtMTQsNS0yMywxMy0yM2wzNSwwbDAtM2wtMjQsMGwwLTlsMCwweiBNODgsNTB2MSIgZmlsbD0idXJsKCNweUJsdWUpIi8+CiAgPHBhdGggZD0iTTc0LDg3YzAsNy04LDEzLTIzLDEzYy0xNSwwLTI0LTYtMjQtMTNsMC0yMmMwLTcsNi0xMiwxMi0xMmwyNCwwYzgsMCwxNC03LDE0LTE1bDAtMTBsOSwwYzcsMCwxMyw5LDEzLDIzYzAsMTUtNiwyNC0xMywyNGwtMzUsMGwwLDNsMjMsMGwwLDlsMCwweiBNMTQwLDUwdjEiIGZpbGw9InVybCgjcHlZZWxsb3cpIi8+CgogIDxjaXJjbGUgcj0iNCIgY3g9IjY0IiBjeT0iODgiIGZpbGw9IiNGRkYiLz4KICA8Y2lyY2xlIHI9IjQiIGN4PSIzNyIgY3k9IjE1IiBmaWxsPSIjRkZGIi8+Cjwvc3ZnPgo=)](https://www.python.org/downloads/)

<!-- TOC -->

- [pandas-plots](#pandas-plots)
    - [📦 installation](#%F0%9F%93%A6-installation)
    - [🚀 features](#%F0%9F%9A%80-features)
    - [📤 publishing](#%F0%9F%93%A4-publishing)
        - [notebook setup](#notebook-setup)
        - [convert to markdown](#convert-to-markdown)
        - [convert to html](#convert-to-html)
    - [🔍 examples](#%F0%9F%94%8D-examples)
        - [styled table](#styled-table)
        - [table description](#table-description)
        - [upset plot](#upset-plot)
        - [uml graph](#uml-graph)
        - [set filter](#set-filter)
        - [sankey diagram](#sankey-diagram)
        - [box plot with violin overlay](#box-plot-with-violin-overlay)
        - [box plot with statistics](#box-plot-with-statistics)
        - [venn diagrams](#venn-diagrams)
    - [📚 api reference](#%F0%9F%93%9A-api-reference)
        - [table utilities tbl](#table-utilities-tbl)
        - [plotting functions pls](#plotting-functions-pls)
        - [helper functions hlp](#helper-functions-hlp)
    - [🧩 prerequisites](#%F0%9F%A7%A9-prerequisites)
    - [🤝 contributing](#%F0%9F%A4%9D-contributing)
    - [📄 license](#%F0%9F%93%84-license)
    - [🏷️ tags](#%F0%9F%8F%B7%EF%B8%8F-tags)

<!-- /TOC -->

<br>

## 📦 installation
<a id="markdown-%F0%9F%93%A6-installation" name="%F0%9F%93%A6-installation"></a>

install package using uv (recommended)

```bash
uv add -U pandas-plots
```

use in python source

```python
from pandas_plots import tbl, pls, hlp, const
```


<br>

## 🚀 features
<a id="markdown-%F0%9F%9A%80-features" name="%F0%9F%9A%80-features"></a>

- **table utilities** (`tbl`): style dataframes as HTML tables with heatmaps, totals, kpi indicators, and percentages; describe column distributions with a single call
- **plotting functions** (`pls`): plotly-based box plots, histograms, stacked bars, sankey diagrams, upset plots, facet charts, and more
- **helper functions** (`hlp`): configure rendering for markdown export, cascade duckdb filters with counts, notebook and file utilities
- **cli** (`cli`): convert Jupyter notebooks to Markdown or HTML for publishing
<!-- - **constants** (`const`): useful constants like color palettes -->

<br>

## 📤 publishing
<a id="markdown-%F0%9F%93%A4-publishing" name="%F0%9F%93%A4-publishing"></a>

> [!TIP]
>this package enables `.ipynb` publishing:
>
>- converts notebooks to `markdown` or `html`
>- styled pandas tables are rendered as images, retaining all features
>- images can be scaled for better visual impression
>- supports `github` and `gitlab` anchors
>- supports github auto theme (`theme="system"`)
>- supports alternative texts in images
>- html output can use image folders or inline base64 encoding
>- supports easy csv export of diagram data
>- can force pdf friendly options to avoid pdf artifacts

### notebook setup
<a id="markdown-notebook-setup" name="notebook-setup"></a>

call `hlp.setup_rendering()` at the top of your notebook to configure alle  rendering settings:

```python
# src/my_notebook.ipynb

from pandas_plots import hlp

hlp.setup_rendering(
    static=True,           # True for static rendering, False for interactive
    apply_dark_theme=False
)
```

### convert to markdown
<a id="markdown-convert-to-markdown" name="convert-to-markdown"></a>

```python
# src/convert.ipynb

from pandas_plots.cli.converter import jupyter_to_md

jupyter_to_md(
    path="src/my_notebook.ipynb",
    output_dir="./docs",      # output folder (created if missing)
    no_input=True,            # strip input cells from output
    execute=True,             # re-execute notebook before converting
    theme="system",           # None | "light" | "dark" | "system"
    chrome_path="/Applications/Chromium.app/Contents/MacOS/Chromium"  #  path to `ungoogled-chromium` binary
)
```

### convert to html
<a id="markdown-convert-to-html" name="convert-to-html"></a>

```python
from pandas_plots.cli.converter import jupyter_to_html

jupyter_to_html(
    path="src/my_notebook.ipynb",
    output_dir="./docs",
    no_input=True,
    execute=False,
    use_base64=False,  # False = images in separate folder, True = inline base64
)
```

> [!WARNING]
> conversion requires `ungoogled-chromium` installed, normal `chromium` wont work anymore (see prerequisites section below)

<br>

## 🔍 examples
<a id="markdown-%F0%9F%94%8D-examples" name="%F0%9F%94%8D-examples"></a>


<!-- <br> -->

### styled table
<a id="markdown-styled-table" name="styled-table"></a>

```python
tbl.pivot_df(
    df[["color", "payment", "fare"]],
    total_mode="sum",
    total_axis="xy",
    data_bar_axis=None,
    total_exclude=True,
    pct_axis="xy",
    precision=0,
    heatmap_axis="xy",
    kpi_mode="rag_abs",
    kpi_rag_list=[1000, 10000],
    swap=True,
    font_size_td=12,
    font_size_th=14,
)
```

<!-- ![pivot_df()](img/2025-12-26-13-46-10.png) -->
![pivot](https://github.com/smeisegeier/pandas-plots/blob/main/img/2025-12-26-13-46-10.png?raw=true)
<br>

### table description
<a id="markdown-table-description" name="table-description"></a>

```python
tbl.describe_df(
    df,
    caption="taxis",
    top_n_uniques=10,
    top_n_chars_in_columns=10,
    top_n_chars_in_index=15,
)
```

<!-- ![table_describe](img/2025-12-26-20-53-27.png) -->
![table](https://github.com/smeisegeier/pandas-plots/blob/main/img/2025-12-26-20-53-27.png?raw=true)

<br>

### upset plot
<a id="markdown-upset-plot" name="upset-plot"></a>

```python
pls.plot_upset(
    df_upset,
    include_false_subsets=False,
    orientation="horizontal",
)
```

<!-- ![upset](img/2025-12-26-20-17-45.png) -->
![upset](https://github.com/smeisegeier/pandas-plots/blob/main/img/2025-12-26-20-17-45.png?raw=true)

<br>

### uml graph
<a id="markdown-uml-graph" name="uml-graph"></a>

```python
metrics = pls.plot_uml_graph()
```

<!-- ![uml_graph](img/2025-12-26-20-46-37.png) -->
![uml](https://github.com/smeisegeier/pandas-plots/blob/main/img/2025-12-26-20-46-37.png?raw=true)

<br>

### set filter
<a id="markdown-set-filter" name="set-filter"></a>

```python
filter = hlp.get_duckdb_filter_n(
    con,
    "from Tumor",
    FILTERS,
    # distinct_metric="z_pat_id",
)
```

```
counts: rows
---
n = 3_241_401                                     (100.0%) ██████████████████████████████
└ [2020-2023.07]:                   n = 2_633_644  (81.3%) ░░░░░░████████████████████████
└ [not z_is_dco]:                   n = 2_547_636  (78.6%) ░░░░░░░███████████████████████
└ [keine M1]:                       n = 2_305_215  (71.1%) ░░░░░░░░░█████████████████████
└ [keine Verstorbenen < 180 Tage]:  n = 2_132_064  (65.8%) ░░░░░░░░░░░███████████████████
└ [lympho- und mesoendokr. Tumore]:    n = 27_653   (0.9%) ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░
```

<br>

### sankey diagram
<a id="markdown-sankey-diagram" name="sankey-diagram"></a>

```python
_ = pls.plot_sankey(width=1000)
```

<!-- ![](img/2025-12-26-20-31-15.png) -->
![sankey](https://github.com/smeisegeier/pandas-plots/blob/main/img/2025-12-26-20-31-15.png?raw=true)

<br>

### box plot with violin overlay
<a id="markdown-box-plot-with-violin-overlay" name="box-plot-with-violin-overlay"></a>

```python
_ = pls.plot_box(df['fare'], height=400, violin=true)
```
<!-- ![plot_box](img/2024-02-13-00-40-27.png) -->
![box](https://github.com/smeisegeier/pandas-plots/blob/main/img/2024-02-13-00-40-27.png?raw=true)

<br>

### box plot with statistics
<a id="markdown-box-plot-with-statistics" name="box-plot-with-statistics"></a>

```python
_ = pls.plot_boxes_large(df[["dropoff_borough","distance"]])
```

<!-- ![box_plot](img/2025-12-26-20-35-37.png) -->
![box](https://github.com/smeisegeier/pandas-plots/blob/main/img/2025-12-26-20-35-37.png?raw=true)
<br>

### venn diagrams
<a id="markdown-venn-diagrams" name="venn-diagrams"></a>

```python
# show venn diagram for 3 sets
set_a = {'ford','ferrari','mercedes', 'bmw'}
set_b = {'opel','bmw','bentley','audi'}
set_c = {'ferrari','bmw','chrysler','renault','peugeot','fiat'}

_df, _details = pls.plot_venn3(
    title="taxis",
    a_set=set_a,
    a_label="cars1",
    b_set=set_b,
    b_label="cars2",
    c_set=set_c,
    c_label="cars3",
    verbose=0,
    size=8,
)
```

![venn](https://github.com/smeisegeier/pandas-plots/blob/main/img/2024-02-19-20-49-52.png?raw=true)

<br>

<br>

## 📚 api reference
<a id="markdown-%F0%9F%93%9A-api-reference" name="%F0%9F%93%9A-api-reference"></a>

### table utilities (tbl)
<a id="markdown-table-utilities-tbl" name="table-utilities-tbl"></a>

| function | description |
|----------|-------------|
| `show_num_df()` | displays a table as styled version with additional information |
| `describe_df()` | alternative version of pandas `describe()` function |
| `descr_db()` | short description for a `duckdb` relation |
| `pivot_df()` | gets a pivot table of a 3 column dataframe (or 2 columns if no weights are given) |
| `print_summary()` | shows statistics for a pandas dataframe or series |

### plotting functions (pls)
<a id="markdown-plotting-functions-pls" name="plotting-functions-pls"></a>

| function | description |
|----------|-------------|
| `plot_box()` | auto annotated boxplot w/ violin option |
| `plot_boxes()` | multiple boxplots (annotation is experimental) |
| `plot_stacked_bars()` | shortcut to stacked bars |
| `plot_bars()` | standardized bar plot for categorical column with confidence intervals |
| `plot_histogram()` | histogram for one or more numerical columns |
| `plot_joints()` | joint plot for exactly two numerical columns |
| `plot_quadrants()` | quickly shows a 2x2 heatmap |
| `plot_facet_stacked_bars()` | stacked bars for a facet value as subplots |
| `plot_sankey()` | generates a sankey diagram |
| `plot_pie()` | generates a pie chart |
| `plot_box_large()` | for large datasets using seaborn |
| `plot_boxes_large()` | for large datasets using seaborn |
| `plot_histogram_large()` | for large datasets using seaborn |
| `plot_upset()` | generates an upset plot based on upsetplot |
| `plot_uml_graph()` | generates a uml graph based on mermaid for structured data |
| `plot_venn2()` | displays a venn diagram for 2 sets |
| `plot_venn3()` | displays a venn diagram for 3 sets |

### helper functions (hlp)
<a id="markdown-helper-functions-hlp" name="helper-functions-hlp"></a>

| function | description |
|----------|-------------|
| `to_series()` | converts a dataframe to a series |
| `mean_confidence_interval()` | calculates mean and confidence interval for a series |
| `wrap_text()` | formats strings or lists to a given width |
| `replace_delimiter_outside_quotes()` | replaces delimiters only outside of quotes in csv imports |
| `create_barcode_from_url()` | creates a barcode from a given url |
| `add_datetime_col()` | adds a datetime column to a dataframe (chainable) |
| `show_package_version()` | prints version of a list of packages |
| `get_os()` | helps identify and ensure operating system at runtime |
| `add_bitmask_label()` | adds a column that resolves a bitmask column into human-readable labels |
| `find_cols()` | finds all columns in a list that contain any of the given stubs |
| `add_measures_to_pyg_config()` | adds measures to a pygwalker config file |
| `get_tum_details()` | prints details of a specific tumor (requires connection to clinical cancer data) |
| `get_duckdb_filter_n()` | print rowcounts for cascading filters in duckdb with ansi bars |
| `print_filter()` | prints filter as markdown sql codeblock |
| `is_ipynb()` | detects if code is running in jupyter notebook |
| `prepend_uv_header()` | prepends uv header to a .py script to make it executable for `uv run` command |
| `create_py_script()` | creates a .py script from a .ipynb file |
| `setup_rendering()` | triggers clean(er) rendering of plots and pandas tables to markdown |
| `find_str_in_duckdb()` | finds a given string in all tables of a DuckDB database |
| `export_plot_data()` | exports the dataframe used for a plot to ./data/<name>.csv |

<br>

## 🧩 prerequisites
<a id="markdown-%F0%9F%A7%A9-prerequisites" name="%F0%9F%A7%A9-prerequisites"></a>

- **python 3.10+**: compatible with python versions 3.10 - 3.13
- **[uv](https://github.com/astral-sh/uv/)**: uv is recommended for package management

> ⚠️ this package depends on `numpy<2.0.0` since `UpSetPlot` is still tied to the previous versions

<br>

## 🤝 contributing
<a id="markdown-%F0%9F%A4%9D-contributing" name="%F0%9F%A4%9D-contributing"></a>

contributions are welcome! please feel free to submit a pull request. for major changes, please open an issue first to discuss what you would like to change.

<br>

## 📄 license
<a id="markdown-%F0%9F%93%84-license" name="%F0%9F%93%84-license"></a>

this project is licensed under the **MIT** license - see the [license](license) file for details.

<br>

## 🏷️ tags
<a id="markdown-%F0%9F%8F%B7%EF%B8%8F-tags" name="%F0%9F%8F%B7%EF%B8%8F-tags"></a>

#pandas #visualizations #statistics #data-science #data-analysis #python