Metadata-Version: 2.4
Name: qpx-tabular
Version: 0.1.0
Summary: A powerful, production-ready tabular data preprocessing and visualization library.
Author: Punit
License: MIT
Project-URL: Homepage, https://github.com/punitxdev/QPX
Project-URL: Bug Tracker, https://github.com/punitxdev/QPX/issues
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Topic :: Scientific/Engineering :: Information Analysis
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: pandas>=1.5.0
Requires-Dist: numpy>=1.21.0
Requires-Dist: matplotlib>=3.5.0
Requires-Dist: seaborn>=0.11.0
Requires-Dist: scipy>=1.7.0
Provides-Extra: dev
Requires-Dist: pytest>=7.0.0; extra == "dev"
Requires-Dist: pytest-cov>=4.0.0; extra == "dev"
Requires-Dist: mkdocs>=1.4.0; extra == "dev"
Requires-Dist: mkdocs-material>=9.0.0; extra == "dev"
Dynamic: license-file

# QPX Tabular
[![Python Version](https://img.shields.io/badge/python-3.9%2B-blue.svg)](https://www.python.org/downloads/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![Code Coverage](https://img.shields.io/badge/Coverage-61%25-yellow.svg)]()
[![Documentation](https://img.shields.io/badge/Docs-Live-blue.svg)](https://punitxdev.github.io/QPX/)

QPX Tabular is a powerful, production-ready tabular data preprocessing and visualization library designed to accelerate data science workflows. It turns raw, messy pandas DataFrames into machine-learning-ready datasets with a single function call.

## Features

*   **Automated Preprocessing (`auto_preprocess`)**: Handles missing values, drops constant columns, drops high-cardinality nominal columns, encodes categorical columns, and downcasts dtypes to reduce memory usage.
*   **Fail-Loud Architecture**: Built for production. Instead of failing silently, QPX raises an exception (`KeyError`, `ValueError`) as soon as it detects an invalid data configuration.
*   **Comprehensive Data Health Diagnostics**: Get 360-degree views of your dataset's health via `dataset_health` and `statistical_snapshot`.
*   **Beautiful Visualizations**: One-line correlation heatmaps, distribution plots, and hierarchical feature clustering matrices.
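
The fail-loud idea can be illustrated with a minimal, standard-library-only sketch. The helpers `require_columns` and `check_max_onehot` below are hypothetical names made up for this example; they are not part of the QPX API and only show the general pattern of raising `KeyError` or `ValueError` before any work is done.

```python
def require_columns(available, required):
    """Raise KeyError naming every required column that is absent.

    Hypothetical helper for illustration; not a QPX function.
    """
    present = set(available)
    missing = [name for name in required if name not in present]
    if missing:
        raise KeyError(f"Missing required columns: {missing}")


def check_max_onehot(value):
    """Raise ValueError unless value is a positive integer.

    Hypothetical helper for illustration; not a QPX function.
    """
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, int) or value < 1:
        raise ValueError(f"max_onehot must be a positive integer, got {value!r}")
    return value
```

With a DataFrame `df`, a call such as `require_columns(df.columns, ["my_target_column"])` would stop immediately with a descriptive error instead of letting a typo propagate into later steps.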

## Installation

To install `qpx`, clone this repository and install it locally in editable mode with `pip`:

```bash
git clone https://github.com/punitxdev/QPX.git
cd QPX
pip install -e .
```

### Dependencies
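QPX requires Python 3.9 or newer and the following packages:

- `pandas` (>= 1.5.0)
- `numpy` (>= 1.21.0)
- `matplotlib` (>= 3.5.0)
- `seaborn` (>= 0.11.0)
- `scipy` (>= 1.7.0)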

## Quickstart

Clean an entire dataset with one function:

```python
import pandas as pd
from qpx.tabular import preprocessing

# Load your raw data
df = pd.read_csv("my_messy_data.csv")

# Clean, encode, impute, and downcast in one go!
clean_df, report = preprocessing.auto_preprocess(
    df,
    max_onehot=10,
    return_report=True
)

print(report)
```
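
For orientation, the kinds of steps listed under Features can be approximated by hand in plain pandas. This is a rough sketch and not QPX's implementation: the function name `manual_preprocess`, the median/mode imputation strategy, and the reading of `max_onehot` as a cardinality cap are all assumptions made for this example.

```python
import pandas as pd


def manual_preprocess(df: pd.DataFrame, max_onehot: int = 10) -> pd.DataFrame:
    """Hand-rolled approximation of the cleaning steps (illustrative only)."""
    out = df.copy()

    # Drop constant columns: at most one distinct value, counting NaN.
    constant = [c for c in out.columns if out[c].nunique(dropna=False) <= 1]
    out = out.drop(columns=constant)

    # Impute (assumed strategy): median for numeric, most frequent otherwise.
    for col in out.columns:
        if out[col].isna().any():
            if pd.api.types.is_numeric_dtype(out[col]):
                out[col] = out[col].fillna(out[col].median())
            else:
                out[col] = out[col].fillna(out[col].mode().iloc[0])

    # Treat every non-numeric column as categorical (a simplification).
    categorical = [c for c in out.columns
                   if not pd.api.types.is_numeric_dtype(out[c])]
    # Drop columns with more than max_onehot levels; one-hot encode the rest.
    too_wide = [c for c in categorical if out[c].nunique() > max_onehot]
    out = out.drop(columns=too_wide)
    keep = [c for c in categorical if c not in too_wide]
    if keep:
        out = pd.get_dummies(out, columns=keep)

    # Downcast numeric columns to the smallest dtype that holds the values.
    for col in out.columns:
        if pd.api.types.is_integer_dtype(out[col]):
            out[col] = pd.to_numeric(out[col], downcast="integer")
        elif pd.api.types.is_float_dtype(out[col]):
            out[col] = pd.to_numeric(out[col], downcast="float")
    return out
```

Comparing the output of a sketch like this with `clean_df` on a small sample can help when interpreting the report.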

Generate a deep-dive correlation map:

```python
from qpx.tabular import visuals

visuals.corr_map(clean_df, target="my_target_column")
```
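
To inspect the numbers behind such a plot, the correlation of each numeric feature with the target can be computed directly in pandas. The helper `target_correlations` is a hypothetical name for this sketch and is not part of QPX; it uses Pearson correlation, which is the pandas default and may differ from what `corr_map` computes.

```python
import pandas as pd


def target_correlations(df: pd.DataFrame, target: str) -> pd.Series:
    """Return each numeric feature's correlation with the target,
    strongest absolute correlation first (illustrative helper)."""
    if target not in df.columns:
        raise KeyError(f"Target column not found: {target!r}")
    # Pearson correlation over numeric columns only.
    corr = df.corr(numeric_only=True)
    # Keep the target's column, drop its self-correlation, sort by magnitude.
    return corr[target].drop(target).sort_values(key=abs, ascending=False)
```

Calling `target_correlations(clean_df, "my_target_column")` returns a Series indexed by feature name, which is convenient for picking the columns worth plotting.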

## Documentation
The complete API reference and user guide are hosted online at:
**[https://punitxdev.github.io/QPX/](https://punitxdev.github.io/QPX/)**

If you want to build the documentation locally for development:

```bash
pip install -e ".[dev]"
mkdocs serve
```

To publish the documentation to GitHub Pages, simply run:
```bash
mkdocs gh-deploy
```

## License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

Made with love by Punit
