Metadata-Version: 2.4
Name: data-autoeda
Version: 2.0.0
Summary: Automatic Exploratory Data Analysis, Cleaning, Validation, Visualization, and Smart Insights on ANY dataset.
Author: shubham kumar
License: MIT
Project-URL: Homepage, https://github.com/Shu40/AutoEDA/
Project-URL: Repository, https://github.com/Shu40/AutoEDA/
Requires-Python: >=3.8
Description-Content-Type: text/markdown
Requires-Dist: pandas>=1.3.0
Requires-Dist: numpy>=1.20.0
Requires-Dist: matplotlib>=3.4.0
Requires-Dist: seaborn>=0.11.0
Requires-Dist: scipy>=1.7.0
Requires-Dist: click>=8.0.0
Requires-Dist: rich>=10.0.0
Requires-Dist: reportlab>=3.6.0
Requires-Dist: streamlit>=1.20.0
Requires-Dist: openpyxl>=3.0.0
Requires-Dist: xlrd>=2.0.1
Provides-Extra: fast
Requires-Dist: polars>=0.19.0; extra == "fast"

# AutoEDA

A Python package that performs automatic Exploratory Data Analysis (EDA), data cleaning, data validation, visualization, and smart insights on tabular datasets in CSV, Excel, or JSON format.

## Features
- **Multi-Format Support**: Analyze `.csv`, `.xlsx`, `.xls`, and `.json` effortlessly.
- **Dataset Health Score**: Instantly see a 0-100 score with category (Excellent/Good/Fair/Poor).
- **Chat With Dataset**: Ask natural language questions about your data interactively.
- **Executive Summary**: Get a concise, manager-friendly overview of your dataset.
- **Streamlit Dashboard**: Launch an interactive web interface with `autoeda dashboard`.
- **PDF Report Generation**: Professional PDF report with tables, charts, and insights.
- **Smart Insights Engine**: Generates 10+ business-style data observations automatically.
- **Cleaning Recommendations**: Actionable, numbered cleaning steps for your specific data.
- **Large Dataset Mode**: Smartly samples and optimizes memory for files >100MB or >100k rows.
- **Cleaned Data Export**: Export cleaned data straight from the CLI.
- **Performance Optimized**: Optional Polars backend for lightning-fast loading of large datasets.
- **Visualizations**: Automatically generates relevant charts using Matplotlib and Seaborn.
- **Rich Terminal UI**: Beautiful, organized CLI reports.
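
The thresholds quoted for Large Dataset Mode can be illustrated with a small standard-library check. This is a minimal sketch, not AutoEDA's own implementation: the function names are hypothetical, "100MB" is assumed to mean decimal megabytes, and rows are approximated by counting lines, which overcounts when quoted fields contain newlines.

```python
import os

# Thresholds taken from the feature list; reading "100MB" as decimal
# megabytes is an assumption.
SIZE_LIMIT_BYTES = 100 * 1000 * 1000
ROW_LIMIT = 100_000


def count_data_rows(path):
    """Count lines in a text file, excluding a single header line."""
    with open(path, "rb") as handle:
        total = sum(1 for _ in handle)
    return max(total - 1, 0)


def needs_large_dataset_mode(path, size_limit=SIZE_LIMIT_BYTES, row_limit=ROW_LIMIT):
    """Return True when the file exceeds either the size or the row threshold."""
    if os.path.getsize(path) > size_limit:
        return True  # the file is already too big, so skip the row count
    return count_data_rows(path) > row_limit
```

Checking the file size first avoids reading a very large file just to count its rows.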

## Installation

```bash
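# Basic installation (run from the repository root)
pip install .

# Installation with the Polars backend for large datasets
# (the quotes stop shells such as zsh from expanding the brackets)
pip install ".[fast]"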
```

## Usage

### CLI

```bash
# Complete analysis
autoeda data.csv
autoeda data.xlsx
autoeda data.json

# Clean data and export to cleaned_data.csv
autoeda data.csv --clean

# Generate a PDF report
autoeda data.csv --report

# Executive summary
autoeda data.csv --summary

# Chat with your dataset
autoeda data.csv --ask

# Only visualizations
autoeda data.csv --visualize

# Run everything
autoeda data.csv --all

# Launch the interactive dashboard
autoeda dashboard
```
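
For scripted or batch runs, the commands above can be assembled in Python before being handed to `subprocess.run`. The helper below is a hypothetical sketch, not part of AutoEDA: `build_autoeda_command` and `KNOWN_FLAGS` are illustrative names, and passing several flags in one call is an assumption that the examples above do not confirm.

```python
import shlex

# Flags that appear in the CLI examples above.
KNOWN_FLAGS = ("clean", "report", "summary", "ask", "visualize", "all")


def build_autoeda_command(path, *flags):
    """Return the argv list for one `autoeda` run on `path`."""
    unknown = [flag for flag in flags if flag not in KNOWN_FLAGS]
    if unknown:
        raise ValueError(f"unsupported flag(s): {', '.join(unknown)}")
    return ["autoeda", path] + [f"--{flag}" for flag in flags]


if __name__ == "__main__":
    # Print one report command per input file without executing anything.
    for dataset in ("data.csv", "data.xlsx", "data.json"):
        print(shlex.join(build_autoeda_command(dataset, "report")))
```

Building the argument list separately keeps the flag validation testable without the package installed. Once `autoeda` is on the `PATH`, the returned list can be passed directly to `subprocess.run`.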

### Python API

```python
from autoeda.cli import analyze

# Complete analysis
df = analyze("data.csv")

# Executive summary only
analyze("data.csv", summary=True)

# Clean data
df_clean = analyze("data.csv", clean=True)
```

