Helpers
Various helper functions for working with Polars.
- oi_tools.polars_helpers.clean_col_name(name: str) -> str
Normalize a column name to snake_case.
- Parameters:
name (str) – Raw column name string.
- Returns:
Cleaned column name with whitespace replaced by underscores, camelCase converted to snake_case, and non-alphanumeric characters replaced with underscores.
- Return type:
str
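The three documented transformations can be illustrated with a small standard-library sketch. This is not the implementation in oi_tools: the name clean_col_name_sketch is hypothetical, and the choices to lowercase the result, collapse runs of separators into one underscore, and trim leading and trailing underscores are assumptions that the description above does not confirm.

```python
import re


def clean_col_name_sketch(name: str) -> str:
    """Hypothetical stand-in that mimics the documented behaviour."""
    # Insert an underscore at each lowercase/digit -> uppercase boundary (camelCase).
    s = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name.strip())
    # Replace whitespace and any other non-alphanumeric run with one underscore.
    s = re.sub(r"[^0-9A-Za-z]+", "_", s)
    # Assumed: trim stray underscores and lowercase the result.
    return s.strip("_").lower()


print(clean_col_name_sketch("householdIncome"))   # household_income
print(clean_col_name_sketch("Total Sales ($)"))   # total_sales
print(clean_col_name_sketch("zip-code"))          # zip_code
```

The real function may treat edge cases such as acronyms ("HTTPStatus") or repeated separators differently, so check its output on your own column names.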
- oi_tools.polars_helpers.to_expr(x: str | Expr | int | float) -> pl.Expr
Convert the input to a Polars expression.
- Parameters:
x (str | Expr | int | float) – A Polars expression, column name string, or numeric literal.
- Returns:
A Polars expression.
- Return type:
pl.Expr
- oi_tools.polars_helpers.to_masked_expr(*xs: str | Expr | int | float) -> Sequence[pl.Expr]
Create a set of expressions with a standardized null mask.
All output expressions evaluate to null wherever any input expression is null.
- Parameters:
*xs (str | Expr | int | float) – Expressions to mask.
- Returns:
Expressions that evaluate to null when any input is null.
- Return type:
Sequence[pl.Expr]
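The masking rule can be shown without Polars. The sketch below uses plain Python lists as stand-ins for columns and None as a stand-in for null. The helper mask_columns is hypothetical and only illustrates the semantics described above, not how to_masked_expr builds its expressions.

```python
def mask_columns(*cols):
    """Set every column to None at each row where any column is None."""
    # A row is kept only if all of its values are present.
    keep = [all(v is not None for v in row) for row in zip(*cols)]
    return [[v if ok else None for v, ok in zip(col, keep)] for col in cols]


a, b = mask_columns([1, None, 3], [10, 20, None])
print(a)  # [1, None, None]
print(b)  # [10, None, None]
```

A shared mask of this kind ensures that statistics computed from several columns, such as a weighted mean, all use the same set of rows.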
- oi_tools.polars_helpers.to_selector(x: Collection[str] | Selector) -> cs.Selector
Convert the input to a Polars selector.
- Parameters:
x (Collection[str] | Selector) – A Polars selector or a collection of column names.
- Returns:
A Polars column selector.
- Return type:
cs.Selector
Various helper/utility functions.
- oi_tools.helpers.inflation_adjust(
      col: str | Expr | int | float,
      *,
      from_year: str | Expr | int | float,
      to_year: str | Expr | int | float,
      series: str = 'CUUR0000SA0',
  ) -> Expr
Adjust for inflation using the Consumer Price Index.
Useful references:
- Parameters:
col (str | Expr | int | float) – The column (or columns) to adjust.
from_year (str | Expr | int | float) – The year in which the dollar value is currently measured.
to_year (str | Expr | int | float) – The year to which the dollar value should be adjusted.
series (str) – The CPI series used for inflation adjustment, CUUR0000SA0 by default.
- Return type:
Expr
Examples
>>> df = pl.DataFrame({"income": [50000, 75000], "year": [2010, 2015]})
>>> df.with_columns(
...     income_2023=inflation_adjust("income", from_year="year", to_year=2023)
... )
shape: (2, 3)
┌────────┬──────┬──────────────┐
│ income ┆ year ┆ income_2023  │
│ ---    ┆ ---  ┆ ---          │
│ i64    ┆ i64  ┆ f64          │
╞════════╪══════╪══════════════╡
│ 50000  ┆ 2010 ┆ 69867.832117 │
│ 75000  ┆ 2015 ┆ 96417.767502 │
└────────┴──────┴──────────────┘
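The figures in the example above are consistent with the usual CPI ratio, value × CPI(to_year) / CPI(from_year). The sketch below reproduces them in plain Python. It assumes annual-average index values for series CUUR0000SA0; the three values are quoted from memory and should be checked against the published BLS series. The function inflation_adjust_sketch is hypothetical, and whether the library uses annual averages internally is not stated in this reference.

```python
# Assumed annual-average CPI-U index values (series CUUR0000SA0).
# Illustrative only: verify against BLS data before relying on them.
CPI_ANNUAL_AVG = {2010: 218.056, 2015: 237.017, 2023: 304.702}


def inflation_adjust_sketch(value: float, from_year: int, to_year: int) -> float:
    """Rescale a dollar amount by the ratio of the two years' index values."""
    return value * CPI_ANNUAL_AVG[to_year] / CPI_ANNUAL_AVG[from_year]


print(round(inflation_adjust_sketch(50000, 2010, 2023), 2))  # 69867.83
print(round(inflation_adjust_sketch(75000, 2015, 2023), 2))  # 96417.77
```

Both results agree with the income_2023 column shown above to the cent.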