Metadata-Version: 2.4
Name: unslml
Version: 0.1.2
Summary: Simple AutoML library for classification and regression
Author: Naveen
Requires-Python: >=3.8
Description-Content-Type: text/markdown
Requires-Dist: pandas
Requires-Dist: numpy
Requires-Dist: scikit-learn
Requires-Dist: joblib
Dynamic: requires-python

# UNSLML

A simple **AutoML and Machine Learning Library** in Python for classification and regression.

`unslml` automatically detects task types (classification or regression), performs stateful feature engineering, filters extreme outliers, conducts hyperparameter searches, and provides simple one-line model saving and loading.

---

## 🌟 Key Features

* **Auto-Task Detection**: Automatically detects whether your problem is a classification or regression task from the data type of the target column.
* **Smart Numeric Text Parser**: Automatically extracts numerical values from string columns that represent measurements or values (e.g., `"1200 sqft"` -> `1200.0`, `"42 Lac"` -> `4,200,000.0`, `"1.40 Cr"` -> `14,000,000.0`).
* **Robust Outlier Filtering**: Automatically identifies and filters extreme target outliers in regression (e.g., data entry typos) so they do not skew evaluation metrics.
* **Stateful Preprocessing**: Saves the imputation values and categorical encodings learned during training so that identical transformations are applied to test and prediction data.
* **Auto-Hyperparameter Tuning**: Performs grid search cross-validation across multiple standard estimators (Logistic/Linear Regression, Decision Trees, Random Forests, KNN).
* **Smart Performance Scaling**: Sub-samples very large datasets during the parameter search phase to keep the search time short.
* **Pipeline Serialization (Save & Load)**: Prompts you to save the entire pipeline state to a `.pkl` file at the end of training, which can be loaded back with a single line of code.
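
To make the numeric text parsing listed above more concrete, here is a minimal sketch of how such a parser could work. It is not `unslml`'s actual implementation: the function name `parse_numeric_text`, the regular expression, and the multiplier table are assumptions chosen only to reproduce the examples in the feature list.

```python
import re
from typing import Optional

# Hypothetical multiplier table (an assumption, not taken from unslml):
# 1 Lac (lakh) = 100,000 and 1 Cr (crore) = 10,000,000.
UNIT_MULTIPLIERS = {"lac": 1e5, "lakh": 1e5, "cr": 1e7, "crore": 1e7}

# A leading number (optional thousands separators and decimals),
# followed by an optional alphabetic unit.
_NUMERIC_TEXT = re.compile(r"^\s*(-?\d[\d,]*(?:\.\d+)?)\s*([A-Za-z]*)")


def parse_numeric_text(text: str) -> Optional[float]:
    """Return the leading number in `text`, scaled by a known unit, or None."""
    match = _NUMERIC_TEXT.match(text)
    if match is None:
        return None  # the string does not start with a number
    number = float(match.group(1).replace(",", ""))
    unit = match.group(2).lower()
    # Units without a multiplier (such as "sqft") leave the number unscaled.
    return number * UNIT_MULTIPLIERS.get(unit, 1.0)


for raw in ["1200 sqft", "42 Lac", "1.40 Cr", "unknown"]:
    print(raw, "->", parse_numeric_text(raw))
```

Strings that do not start with a number return `None` in this sketch, which leaves the decision to impute or drop them to a later preprocessing step.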

---

## 🚀 Installation

Install the library directly from PyPI using pip:

```bash
pip install unslml
```

---

## 💻 How to Use

### 1. Training & Auto-Saving a Pipeline
Create a script (e.g., `train.py`) to fit the model. The fitting process automatically runs preprocessing, tunes multiple models, reports evaluation scores, and prompts you to save the best model:

```python
from unslml import AutoML

# Initialize AutoML pipeline
ml = AutoML()

# Fit model (auto-detects task type, handles preprocessing & fits best model)
ml.fit(
    file="house_prices.csv",
    target="Price (in rupees)"
)
# Prompt: "Enter the file path to save the best model (default: best_model.pkl): "
```
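
The task auto-detection mentioned in the comment above can be pictured with a small standard-library sketch. The rule shown here (numeric targets mean regression, anything else means classification) and the name `detect_task` are assumptions for illustration; the library's real heuristic may differ, for example in how it treats integer-coded class labels.

```python
from numbers import Number
from typing import Iterable


def detect_task(target_values: Iterable[object]) -> str:
    """Guess the task type from the Python types of the target values."""
    values = [v for v in target_values if v is not None]  # ignore missing targets
    if not values:
        raise ValueError("target column has no usable values")
    # bool is a subclass of int, so treat it explicitly as a class label.
    all_numeric = all(
        isinstance(v, Number) and not isinstance(v, bool) for v in values
    )
    return "regression" if all_numeric else "classification"


print(detect_task([4200000.0, 14000000.0, 8500000.0]))  # numeric prices
print(detect_task(["Resale", "New Property", "Resale"]))  # string labels
```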

### 2. Loading & Predicting on Unseen Data
You can load the saved `.pkl` file (which contains the best model, categorical mappings, and median values) and predict on raw, unprocessed pandas DataFrames:

```python
import pandas as pd
from unslml import AutoML

# Load the entire trained pipeline
ml_loaded = AutoML.load("best_model.pkl")

# New raw sample data to predict
new_houses = pd.DataFrame({
    'location': ['location_name'],
    'Bathroom': [2],
    'Balcony': [1.0],
    'facing': ['North'],
    'Furnishing': ['Semi-Furnished'],
    'Transaction': ['Resale']
})

# Make predictions directly (preprocessing is applied automatically)
predictions = ml_loaded.predict(new_houses)
print("Predicted Prices:", predictions)
```
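
Raw data can be passed straight to `predict` because the values learned during training travel with the saved model. The sketch below illustrates that idea with plain Python and `pickle`. The helpers `fit_state` and `apply_state`, the dictionary layout, and the use of `-1` for unseen categories are hypothetical and do not describe `unslml`'s internals.

```python
import pickle
from statistics import median


def fit_state(rows, numeric_cols, categorical_cols):
    """Learn imputation medians and category codes from the training rows."""
    medians = {
        col: median(r[col] for r in rows if r[col] is not None)
        for col in numeric_cols
    }
    mappings = {}
    for col in categorical_cols:
        categories = sorted({r[col] for r in rows if r[col] is not None})
        mappings[col] = {cat: code for code, cat in enumerate(categories)}
    return {"medians": medians, "mappings": mappings}


def apply_state(state, rows):
    """Transform rows using only what was learned during training."""
    transformed = []
    for row in rows:
        new_row = dict(row)
        for col, med in state["medians"].items():
            if new_row.get(col) is None:
                new_row[col] = med  # impute with the training median
        for col, mapping in state["mappings"].items():
            # -1 marks categories that never appeared in training
            new_row[col] = mapping.get(new_row.get(col), -1)
        transformed.append(new_row)
    return transformed


train_rows = [
    {"Bathroom": 2, "facing": "North"},
    {"Bathroom": 4, "facing": "East"},
    {"Bathroom": None, "facing": "North"},
]
state = fit_state(train_rows, ["Bathroom"], ["facing"])

# Round-trip the learned state through pickle, as a saved pipeline would be.
restored = pickle.loads(pickle.dumps(state))
print(apply_state(restored, [{"Bathroom": None, "facing": "West"}]))
```

Because the medians and mappings are computed once and then only read, new rows are transformed the same way no matter what other rows arrive with them.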
