Metadata-Version: 2.4
Name: india-housing-datasets
Version: 1.0.0
Summary: Standardized Indian housing datasets for data analysis, visualization, and machine learning practice.
Home-page: https://github.com/Rvbaghel/india_housing_datasets
Author: Vishal Baghel
Author-email: baghelvishal264@gmail.com
License: MIT
Project-URL: Documentation, https://github.com/Rvbaghel/india_housing_datasets#readme
Project-URL: Source, https://github.com/Rvbaghel/india_housing_datasets
Project-URL: Bug Tracker, https://github.com/Rvbaghel/india_housing_datasets/issues
Keywords: india indian housing dataset real estate housing price pandas python machine learning data science
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.7
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: pandas>=1.3.0
Dynamic: author
Dynamic: author-email
Dynamic: classifier
Dynamic: description
Dynamic: description-content-type
Dynamic: home-page
Dynamic: keywords
Dynamic: license
Dynamic: license-file
Dynamic: project-url
Dynamic: requires-dist
Dynamic: requires-python
Dynamic: summary

# 🏠 Indian Housing Datasets - Python Library

A lightweight Python library providing **standardized, synthetic housing datasets for major Indian cities**, suited to learning data science, practicing machine learning, and building housing price prediction models. Every dataset is returned as a **pandas DataFrame**, so it works directly with the rest of the Python data science ecosystem.

Ideal for students, beginners, and ML practitioners exploring **Indian real estate data analysis**, **housing price prediction**, and **regression modeling**.

---

## 📦 Installation

Install via pip:

```bash
pip install india-housing-datasets
```

---

## 🚀 Quick Start

Load a city's housing dataset and explore it as a pandas DataFrame:

```python
from india_housing_datasets import load_housing

# Load Mumbai housing data
df = load_housing("mumbai")

# Explore the data
print(df.head())
df.info()  # prints its summary directly and returns None, so no print() is needed
print(df.describe())

# Check for missing values
print(df.isnull().sum())
```

---

## 📊 Visualization Example

Visualize relationships between housing features:

```python
import matplotlib.pyplot as plt
from india_housing_datasets import load_housing

df = load_housing("bangalore")

# Scatter plot: Area vs Price
df.plot.scatter(x="area_sqft", y="price_lakhs", alpha=0.5, figsize=(10, 6))
plt.title("Housing Prices in Bangalore")
plt.xlabel("Area (sq ft)")
plt.ylabel("Price (Lakhs ₹)")
plt.show()

# Distribution of BHK types
df["bhk"].value_counts().plot(kind="bar", color="steelblue")
plt.title("Distribution of BHK Types")
plt.xlabel("BHK")
plt.ylabel("Count")
plt.show()
```

---

## 🤖 Machine Learning Example

Build a simple **housing price prediction model** using linear regression:

```python
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, r2_score
from india_housing_datasets import load_housing

# Load Delhi housing data
df = load_housing("delhi")

# Prepare features and target
X = df[["area_sqft", "bhk", "bath", "age_years"]]
y = df["price_lakhs"]

# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train model
model = LinearRegression()
model.fit(X_train, y_train)

# Predict and evaluate
y_pred = model.predict(X_test)
print(f"Mean Absolute Error: {mean_absolute_error(y_test, y_pred):.2f} Lakhs")
print(f"R² Score: {r2_score(y_test, y_pred):.2f}")
```

---

## 🌆 Available Cities

The library currently supports housing datasets for the following Indian cities:

- **Mumbai**
- **Delhi**
- **Bangalore**
- **Hyderabad**
- **Chennai**
- **Pune**
- **Ahmedabad**
- **Kolkata**
- **Jaipur**
- **Chandigarh**

Load any city using:

```python
from india_housing_datasets import load_housing
df = load_housing("pune")  # or any other city listed above, e.g., "mumbai", "delhi", "bangalore"
```
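
To compare cities side by side, the per-city frames can be stacked into a single DataFrame. The following is a minimal sketch, assuming `load_housing` accepts the lowercase city names used in the examples above. A small stand-in loader with made-up rows replaces the real import so the snippet runs on its own; swap in `from india_housing_datasets import load_housing` for actual use.

```python
import pandas as pd

# Stand-in for `from india_housing_datasets import load_housing` so this
# sketch runs without the package installed. The rows are made up.
def load_housing(city: str) -> pd.DataFrame:
    return pd.DataFrame(
        {
            "city": [city.title()] * 2,
            "area_sqft": [650, 1200],
            "price_lakhs": [55.0, 110.0],
        }
    )

cities = ["mumbai", "delhi", "bangalore"]

# Load each city and stack the frames into one DataFrame
combined = pd.concat([load_housing(c) for c in cities], ignore_index=True)

# Median price per city, in lakhs
median_price = combined.groupby("city")["price_lakhs"].median()
print(median_price)
```

Because every dataset shares the same columns, `pd.concat` needs no column alignment, and the `city` column identifies where each row came from.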

---

## 📋 Dataset Schema

Each dataset contains the following standardized columns:

| Column        | Type    | Description                                      |
|---------------|---------|--------------------------------------------------|
| `city`        | string  | Name of the city                                 |
| `locality`    | string  | Locality/area within the city                    |
| `area_sqft`   | integer | Built-up area in square feet                     |
| `bhk`         | integer | Number of bedrooms (BHK: bedroom, hall, kitchen) |
| `bath`        | integer | Number of bathrooms                              |
| `floor`       | integer | Floor number                                     |
| `age_years`   | integer | Age of the property in years                     |
| `price_lakhs` | float   | Property price in ₹ lakhs (1 lakh = ₹100,000)    |
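
Since downstream code depends on these column names, a quick check after loading can catch a mismatch early. This is a sketch only: `EXPECTED_COLUMNS` and `missing_columns` are hypothetical helpers, not part of the library, and the sample row is made up to stand in for a loaded dataset.

```python
import pandas as pd

# Column names taken from the schema table; this constant is a local helper
EXPECTED_COLUMNS = [
    "city", "locality", "area_sqft", "bhk",
    "bath", "floor", "age_years", "price_lakhs",
]

def missing_columns(df: pd.DataFrame) -> list:
    """Return the schema columns that are absent from df."""
    return [col for col in EXPECTED_COLUMNS if col not in df.columns]

# Made-up sample row standing in for a frame returned by load_housing
sample = pd.DataFrame(
    {
        "city": ["Pune"],
        "locality": ["Baner"],
        "area_sqft": [900],
        "bhk": [2],
        "bath": [2],
        "floor": [3],
        "age_years": [5],
        "price_lakhs": [72.5],
    }
)

complete = missing_columns(sample)
incomplete = missing_columns(sample.drop(columns=["floor"]))
print(complete)    # []
print(incomplete)  # ['floor']
```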

---

## 💡 Use Cases

This library is designed for:

- **Learning Data Science**: Practice pandas, data cleaning, and exploratory data analysis (EDA)
- **Housing Price Prediction**: Build regression models to predict Indian real estate prices
- **Data Visualization**: Create charts and dashboards with matplotlib, seaborn, or plotly
- **Machine Learning Practice**: Experiment with feature engineering, model training, and evaluation
- **Academic Projects**: Use standardized datasets for coursework and research
- **Portfolio Building**: Showcase data science skills with Indian housing market analysis
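
As an illustration of the EDA and feature engineering use cases, the sketch below derives a price-per-square-foot column and summarizes it by locality. The rows are made up and use only column names from the documented schema; in practice `df` would come from `load_housing`.

```python
import pandas as pd

# Made-up rows in the documented schema; replace with a loaded dataset
df = pd.DataFrame(
    {
        "locality": ["Andheri", "Andheri", "Bandra", "Bandra"],
        "area_sqft": [500, 1000, 800, 1000],
        "price_lakhs": [100.0, 150.0, 240.0, 400.0],
    }
)

# Derived feature: price per square foot in rupees (1 lakh = 100,000 rupees)
df["price_per_sqft"] = df["price_lakhs"] * 100_000 / df["area_sqft"]

# Median price per square foot by locality, most expensive first
summary = (
    df.groupby("locality")["price_per_sqft"]
    .median()
    .sort_values(ascending=False)
)
print(summary)
```

The median is used rather than the mean so that a few unusually expensive listings do not dominate a locality's figure.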

---

## ⚠️ Deprecation Notice

**Important**: The older `fetch_*` dataset functions are **deprecated** and will be removed in a future version.

Please migrate to the new API:

```python
# ❌ Old (Deprecated)
from india_housing_datasets import fetch_mumbai_housing
df = fetch_mumbai_housing()

# ✅ New (Recommended)
from india_housing_datasets import load_housing
df = load_housing("mumbai")
```
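
In a larger codebase where call sites cannot all be updated at once, a small local shim can keep old-style calls working while flagging them. This is a sketch under assumptions: `fetch_mumbai_housing_compat` is a hypothetical function you would define in your own code, not something the library provides, and the stand-in `load_housing` with a made-up row replaces the real import so the snippet runs on its own.

```python
import warnings

import pandas as pd

# Stand-in for `from india_housing_datasets import load_housing`;
# the row is made up so the sketch runs without the package installed.
def load_housing(city: str) -> pd.DataFrame:
    return pd.DataFrame({"city": [city], "price_lakhs": [95.0]})

def fetch_mumbai_housing_compat() -> pd.DataFrame:
    """Hypothetical local shim: warn, then delegate to the new API."""
    warnings.warn(
        "fetch_mumbai_housing is deprecated; call load_housing('mumbai')",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller's line
    )
    return load_housing("mumbai")

# Record warnings to confirm the shim flags the old-style call
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    df = fetch_mumbai_housing_compat()

print(caught[0].category.__name__)
```

Once no call site triggers the warning any more, the shim can be deleted.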

---

## ⚖️ Disclaimer

The datasets provided in this library are **synthetically generated** and standardized to resemble Indian housing markets. They are intended for **educational purposes, data visualization practice, and machine learning experimentation only**.

**This data should not be used for:**
- Real estate investment decisions
- Real-world market analysis or market research
- Commercial applications

For real-world applications, please use authentic data sources.

---

## 👨‍💻 Author

**Vishal Baghel**  
📧 baghelvishal264@gmail.com  
🌐 [GitHub Profile](https://github.com/Rvbaghel/)

---

## 📜 License

**MIT License** © 2025 Vishal Baghel

---

## 🤝 Contributing

Contributions, issues, and feature requests are welcome! See the [issues page](https://github.com/Rvbaghel/india_housing_datasets/issues) to report a bug or propose a change.

---

## ⭐ Support

If you find this library helpful, please consider giving it a ⭐ on GitHub!
