Using Polars with UpSet Plots
This example demonstrates how to use Polars DataFrames with UpSet plots. Polars is a fast DataFrame library written in Rust that can be used as an alternative to pandas.
First, let’s import the necessary libraries and create our sample data using Polars:
import altair_upset as au
import polars as pl
import numpy as np
# Create sample data with realistic social media usage patterns
np.random.seed(42)
n_users = 1000
# Generate binary data for each platform
platforms = ['Instagram', 'TikTok', 'Twitter', 'LinkedIn', 'Facebook']
probabilities = [0.8, 0.6, 0.5, 0.4, 0.7] # Probability of using each platform
# Create data using Polars
data_dict = {}
for platform, prob in zip(platforms, probabilities):
    data_dict[platform] = np.random.choice([0, 1], size=n_users, p=[1 - prob, prob])
data = pl.DataFrame(data_dict)
Basic UpSet Plot with Polars Data
Create a simple UpSet plot from the Polars DataFrame. Note that the UpSet plot function converts a Polars DataFrame to pandas internally; the example below makes that conversion explicit with to_pandas():
# Convert Polars DataFrame to pandas for visualization
pandas_df = data.to_pandas()
au.UpSetAltair(
    data=pandas_df,
    sets=platforms,
    title="Social Media Platform Usage",
    subtitle="Distribution of user activity across social media platforms"
).chart
Working with Different Data Types
Set membership columns in Polars can be stored as Boolean, integer, or float values, and all of these can be used with UpSet plots. Here’s an example that mixes the three:
# Create data with different types
mixed_data = pl.DataFrame({
    'Instagram': pl.Series([True, False, True], dtype=pl.Boolean),  # Boolean
    'TikTok': pl.Series([1, 0, 1], dtype=pl.Int32),  # Integer
    'Twitter': pl.Series([1.0, 0.0, 1.0], dtype=pl.Float64)  # Float
})
# Convert to pandas for visualization
mixed_pandas_df = mixed_data.to_pandas()
# All these types will be handled correctly in the UpSet plot
mixed_plot = au.UpSetAltair(
    data=mixed_pandas_df,
    sets=['Instagram', 'TikTok', 'Twitter'],
    title="Mixed Data Types Example"
).chart
Performance Benefits
When working with large datasets, you can leverage Polars’ fast data manipulation capabilities before creating the UpSet plot. Here’s an example of preprocessing data with Polars:
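# Use Polars for fast data filtering
# The columns hold 0/1 integers, so compare with 1 to get the Boolean mask that filter() expects
active_users = data.filter(
    (pl.col('Instagram') == 1) | (pl.col('TikTok') == 1) | (pl.col('Twitter') == 1)
).to_pandas()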
au.UpSetAltair(
    data=active_users,
    sets=platforms,
    title="Active Social Media Users",
    subtitle="Users active on at least one of Instagram, TikTok, or Twitter"
).chart