Metadata-Version: 2.3
Name: clean-column-names
Version: 0.1.0
Summary: Clean pandas DataFrame column names into predictable, consistent case styles.
Author: Dheepak Krishnamurthy
Author-email: Dheepak Krishnamurthy <1813121+kdheepak@users.noreply.github.com>
Requires-Dist: pandas>=3.0.3
Requires-Python: >=3.13
Description-Content-Type: text/markdown

# clean-column-names

Clean pandas `DataFrame` column names into predictable, consistent case styles.

## Installation

```bash
uv add clean-column-names
```

## Usage

```python
import pandas as pd
import clean_column_names

df = pd.DataFrame(
    columns=[
        "First Name",
        "Café Sales ($)",
        "HTTPStatusCode",
        "",
        None,
        "First Name",
    ]
)

df = df.pipe(clean_column_names.clean_column_names)

print(df.columns.tolist())
```

Output:

```python
[
    "first_name",
    "cafe_sales_$",
    "http_status_code",
    "column",
    "column_1",
    "first_name_1",
]
```

`clean_column_names` returns a `DataFrame` with the cleaned column names; the original `DataFrame` is not modified.

## API

```python
df = df.pipe(
    clean_column_names.clean_column_names,
    case="snake",
    replace=None,
    remove_accents=True,
)
```

### Arguments

`df`: A pandas `DataFrame`.

`case`: The target case style, one of the values listed under Case Styles. Defaults to `"snake"`.

`replace`: Optional mapping of literal text replacements to apply before case conversion. Matching
is case-insensitive.

`remove_accents`: When `True`, accented characters are transliterated to ASCII where possible.
Defaults to `True`.

## Case Styles

| `case`       | Example       |
| ------------ | ------------- |
| `"snake"`    | `column_name` |
| `"kebab"`    | `column-name` |
| `"camel"`    | `columnName`  |
| `"pascal"`   | `ColumnName`  |
| `"const"`    | `COLUMN_NAME` |
| `"sentence"` | `Column name` |
| `"title"`    | `Column Name` |
| `"lower"`    | `column name` |
| `"upper"`    | `COLUMN NAME` |
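
The table above can be read as "split the name into words, then join them in a style-specific way". The following is a minimal sketch of that idea using only the standard library. It is not this package's implementation: `split_words` and `to_case` are hypothetical names, and the sketch only recognizes ASCII letters and digits, so symbols such as `$` are dropped here.

```python
import re

# Hypothetical word pattern: an acronym not followed by a lowercase letter,
# an optionally capitalized lowercase word, or a run of digits.
_WORD = re.compile(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|[0-9]+")


def split_words(name: str) -> list[str]:
    """Split a raw name into lowercase words (ASCII only)."""
    return [word.lower() for word in _WORD.findall(name)]


def to_case(name: str, case: str = "snake") -> str:
    """Join the words of `name` according to one of the styles in the table."""
    words = split_words(name)
    if not words:
        return ""
    if case == "snake":
        return "_".join(words)
    if case == "kebab":
        return "-".join(words)
    if case == "camel":
        return words[0] + "".join(w.capitalize() for w in words[1:])
    if case == "pascal":
        return "".join(w.capitalize() for w in words)
    if case == "const":
        return "_".join(words).upper()
    if case == "sentence":
        return " ".join(words).capitalize()
    if case == "title":
        return " ".join(w.capitalize() for w in words)
    if case == "lower":
        return " ".join(words)
    if case == "upper":
        return " ".join(words).upper()
    raise ValueError(f"unknown case style: {case!r}")


print(to_case("HTTPStatusCode", "snake"))  # http_status_code
print(to_case("Column Name", "camel"))     # columnName
```

Splitting first and joining second keeps every style consistent: the styles differ only in separator and capitalization, not in where word boundaries fall.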

## Examples

Each example below starts from the uncleaned `df` defined in Usage.

Use kebab case:

```python
df = df.pipe(
    clean_column_names.clean_column_names,
    case="kebab",
)

print(df.columns.tolist())
```

```python
[
    "first-name",
    "cafe-sales-$",
    "http-status-code",
    "column",
    "column-1",
    "first-name-1",
]
```

Apply replacements before cleaning:

```python
df = df.pipe(
    clean_column_names.clean_column_names,
    replace={"HTTP": "API"},
)

print(df.columns.tolist())
```

```python
[
    "first_name",
    "cafe_sales_$",
    "api_status_code",
    "column",
    "column_1",
    "first_name_1",
]
```
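
To illustrate what a case-insensitive literal replacement means, here is a small standard-library sketch. `apply_replacements` is a hypothetical helper, not part of this package, and the package's own matching rules may differ in details such as replacement order.

```python
import re


def apply_replacements(name: str, replace: dict[str, str]) -> str:
    """Replace each key of `replace` with its value, ignoring case."""
    for old, new in replace.items():
        # re.escape makes `old` match as literal text, not as a pattern.
        # The function replacement keeps backslashes in `new` literal too.
        name = re.sub(
            re.escape(old),
            lambda _match, new=new: new,
            name,
            flags=re.IGNORECASE,
        )
    return name


print(apply_replacements("HTTPStatusCode", {"http": "API"}))  # APIStatusCode
```

Because the replacement runs before case conversion, the replaced text is split into words like any other part of the name.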

Keep accented characters:

```python
df = df.pipe(
    clean_column_names.clean_column_names,
    case="title",
    remove_accents=False,
)

print(df.columns.tolist())
```

```python
[
    "First Name",
    "Café Sales ($)",
    "Http Status Code",
    "Column",
    "Column 1",
    "First Name 1",
]
```
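
For contrast, the transliteration that `remove_accents=True` performs can be approximated with the standard library's `unicodedata` module. This is a sketch under that assumption, not the package's code; `strip_accents` is a hypothetical name, and characters with no Unicode decomposition (such as `ø`) pass through this version unchanged.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Drop combining marks after decomposing characters (NFKD)."""
    decomposed = unicodedata.normalize("NFKD", text)
    # After NFKD, "é" becomes "e" followed by a combining acute accent;
    # unicodedata.combining() is non-zero for such combining marks.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(strip_accents("Café Sales ($)"))  # Cafe Sales ($)
```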

## Behavior Notes

Blank (`""`) and null (`None`) column names are replaced with `column`, rendered in the target case style (for example, `Column` for `"title"`).

If multiple columns clean to the same name, the first keeps its name and later ones get numeric
suffixes starting at `1`, joined with the target case style's separator:

```python
df = pd.DataFrame(columns=["Name", "Name", "Name"])
df = df.pipe(clean_column_names.clean_column_names)

print(df.columns.tolist())
```

```python
["name", "name_1", "name_2"]
```
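
The suffixing rule can be sketched in a few lines of plain Python. `dedupe` is a hypothetical helper written for illustration; how the package resolves a suffixed name that collides with an existing column is an assumption here (this version keeps counting until the name is free).

```python
def dedupe(names: list[str], sep: str = "_") -> list[str]:
    """Make names unique by appending sep + counter to repeats."""
    seen: set[str] = set()
    counts: dict[str, int] = {}
    result: list[str] = []
    for name in names:
        candidate = name
        # Keep incrementing the counter until the candidate is unused.
        while candidate in seen:
            counts[name] = counts.get(name, 0) + 1
            candidate = f"{name}{sep}{counts[name]}"
        seen.add(candidate)
        result.append(candidate)
    return result


print(dedupe(["name", "name", "name"]))       # ['name', 'name_1', 'name_2']
print(dedupe(["column", "column"], sep="-"))  # ['column', 'column-1']
```

Passing the separator explicitly mirrors the behavior described above, where kebab case yields `column-1` and title case yields `Column 1`.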

This package supports ordinary flat columns and pandas `MultiIndex` columns.

## License

MIT
