Metadata-Version: 2.4
Name: bm-eval-metrics
Version: 1.0.0
Summary: A package to evaluate bm metrics
Requires-Python: >=3.8
Requires-Dist: build>=1.2.2.post1
Requires-Dist: graphviz>=0.20.3
Requires-Dist: matplotlib>=3.7.5
Requires-Dist: nltk>=3.9.1
Requires-Dist: pandas>=2.0.3
Requires-Dist: scikit-learn>=1.3.2
Requires-Dist: twine>=6.1.0
Description-Content-Type: text/markdown

# 📄 `bm-eval-metrics`

**`bm-eval-metrics`** is a Python package that bundles easy-to-use evaluation metrics and reference implementations for Data Mining (DM) and Information Retrieval (IR) coursework. It lets you access and view the source code of various DM and IR algorithms from a single place.

---

## ✨ Features

* Data Mining algorithms (Hunt's, ID3, Bagging, AdaBoost, Apriori, etc.)
* Information Retrieval metrics (Jaccard, Precision/Recall/F-score, MAP, etc.)
* Near Duplicate Document detection (MinHash & LSH)
* Relevance Feedback (Rocchio & LCA)
* Source code viewer for all modules
* Built on NLTK, pandas, and scikit-learn
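
To make the IR metrics concrete, here is a plain-Python sketch of what Jaccard similarity and precision/recall/F-score compute. This is an illustration of the underlying definitions only, not the package's API (see Quick Start for that); the document names and sets are made up:

```python
# Illustration of the metric definitions in plain Python (not the package API).
retrieved = {"d1", "d2", "d3", "d5"}   # documents a system returned
relevant  = {"d1", "d3", "d4"}         # ground-truth relevant documents

tp = len(retrieved & relevant)          # true positives: relevant AND retrieved
jaccard   = len(retrieved & relevant) / len(retrieved | relevant)
precision = tp / len(retrieved)         # how much of what we returned is relevant
recall    = tp / len(relevant)          # how much of what is relevant we returned
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of P and R

print(f"Jaccard={jaccard:.2f} P={precision:.2f} R={recall:.2f} F1={f1:.2f}")
# → Jaccard=0.40 P=0.50 R=0.67 F1=0.57
```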

---

## 📦 Installation

Install from PyPI:

```bash
pip install bm-eval-metrics
```

---

## 🚀 Quick Start

### Basic Usage

```python
from eval_metrics.DM import adaboost, apriori, metrics
from eval_metrics.IR import eval_metrics, ndd, rel

# Print the source code directly
print("=== DM AdaBoost Module ===")
print(adaboost)

print("\n=== IR Evaluation Metrics ===")
print(eval_metrics)

print("\n=== IR Near Duplicate Documents ===")
print(ndd)
```

---

## 🛠️ Components Overview

| Component | Description |
| --------- | ----------- |
| **Information Retrieval (IR)** | |
| `eval_metrics.IR.all` | Cohesive IR File: MinHash, LSH, Rocchio, Jaccard, VS |
| `eval_metrics.IR.all_vis` | Cohesive IR File + Matplotlib visualizations & Heatmaps |
| `eval_metrics.IR.ndd` | Near Duplicate Documents (MinHash & LSH) |
| `eval_metrics.IR.rel` | Relevance feedback & query expansion (Rocchio & LCA) |
| `eval_metrics.IR.eval_metrics` | Jaccard, PRF, Compression Ratios, MAP metrics & plots |
| **Data Mining (DM)** | |
| `eval_metrics.DM.all` | Cohesive DM File: Hunt's, ID3, Bagging, AdaBoost, Metrics |
| `eval_metrics.DM.all_vis` | Cohesive DM File + Graphviz & Matplotlib visualizations |
| `eval_metrics.DM.apriori` | Apriori algorithm |
| `eval_metrics.DM.adaboost` | Bagging & AdaBoost ensemble classifiers |
| `eval_metrics.DM.bagging` | Bagging ensemble classifier |
| `eval_metrics.DM.hash` | Hash-based mining |
| `eval_metrics.DM.hunts` | Hunt's decision tree algorithm |
| `eval_metrics.DM.hunts_test` | Hunt's decision tree with dataset visualization |
| `eval_metrics.DM.id3` | ID3 decision tree algorithm |
| `eval_metrics.DM.id3_test` | ID3 decision tree with dataset visualization |
| `eval_metrics.DM.metrics` | Classification metrics & curves |
| `eval_metrics.DM.preprocessing` | Data preprocessing utilities |
| `eval_metrics.DM.lib_doc` | Pandas, NumPy, Sklearn cheat sheet (DM & IR logic) |
| `eval_metrics.DM.python_doc` | Python Basics cheat sheet (Sets, Dicts, Comprehensions, etc.) |

---

## 📚 Requirements

* Python 3.8+
* nltk
* pandas
* scikit-learn (for vectorization)
* matplotlib
* graphviz

All dependencies are installed automatically with:

```bash
pip install bm-eval-metrics
```

---

## 📂 Project Structure

```
eval_metrics/
│
├── src/
│   └── eval_metrics/
│       ├── __init__.py
│       ├── DM/
│       │   ├── __init__.py
│       │   ├── adaboost.py
│       │   ├── all.py
│       │   ├── ...
│       │   └── sources/
│       └── IR/
│           ├── __init__.py
│           ├── all.py
│           ├── ...
│           └── sources/
├── pyproject.toml
├── README.md
├── USAGE.md
└── INSTALLATION.md
```

---

## 🤝 Contributing

Contributions are welcome!

1. Fork the repository
2. Create a new branch
3. Commit your changes
4. Open a pull request

---

## 📄 License

This project is licensed under the **MIT License**.

---

## 📬 Support

If you encounter any issues or have feature requests, please open an issue on GitHub.

---
