Metadata-Version: 2.4
Name: stattools-anannya
Version: 0.1.6
Summary: Lightweight Python statistics library for descriptive stats and IQR-based outlier detection.
Author: Anannya Vyas
License: MIT License
        
        Copyright (c) 2026 Anannya Vyas
        
        Permission is hereby granted, free of charge, to any person obtaining a copy
        of this software and associated documentation files (the "Software"), to deal
        in the Software without restriction, including without limitation the rights
        to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
        copies of the Software, and to permit persons to whom the Software is
        furnished to do so, subject to the following conditions:
        
        The above copyright notice and this permission notice shall be included in all
        copies or substantial portions of the Software.
        
        THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
        IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
        FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
        AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
        LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
        OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
        SOFTWARE.
        
Project-URL: Homepage, https://github.com/Anannya-Vyas/my-python-library
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Dynamic: license-file

# 📊 StatTools

<p align="center">
  <em>A lightweight, zero-dependency Python statistics library for learning and analysis</em>
</p>




## 📢 A Note from the Author

> **Hi! I'm a student, and this is my first Python package.** 🎓
>
> I didn't create StatTools to compete with established libraries or to impress anyone—I built it as a **learning exercise** to understand how Python packages work, how to structure code properly, and how to publish to PyPI. This project helped me practice fundamental concepts like package structure, testing, documentation, and distribution.
>
> **StatTools is not a production-ready, feature-complete statistics library.** It's a student project that implements basic statistical functions as a learning journey. I'm sharing it openly because someone else learning Python might find it useful, or at least see how a beginner approaches building their first package.
>
> I plan to improve and expand it over time as I learn more. If you're a student like me, feel free to explore the code, suggest improvements, or even fork it for your own learning!
>
> — Anannya Vyas

## Acknowledgments

**Special thanks to my teacher [Lovnish Verma](https://github.com/lovnishverma)** for inspiring me to take on this project. Their own package, [**snapmyenv**](https://pypi.org/project/snapmyenv/), served as motivation and a reference for how to structure and publish a Python library. This wouldn't exist without their guidance and encouragement!

---

**StatTools** is a lightweight, zero-dependency statistics library designed to solve the "I need quick stats without NumPy" problem for students, educators, and developers. It provides essential descriptive statistics and outlier detection using only Python's standard library—making it perfect for learning environments, academic projects, and situations where you need reliable statistical analysis without heavy frameworks.

Because StatTools is pure Python, it runs wherever Python 3.8+ runs, with no compilation step and no third-party dependencies to conflict with your environment.

## 🚀 Key Features

- 📈 **Descriptive Statistics**: Calculate mean, median, and percentiles with straightforward, textbook-accurate implementations
- 📊 **Dispersion Measures**: Compute Interquartile Range (IQR) for understanding data spread
- 🔍 **Outlier Detection**: Identify anomalies using the widely used IQR rule (Tukey's fences)
- 🛡️ **Zero Dependencies**: Built using only Python's standard library—install it anywhere without conflicts
- ✅ **Tested**: Backed by a pytest suite that you can run yourself (see "Running Tests")
- 🪶 **Lightweight**: Minimal footprint, maximum clarity
- 📚 **Educational**: Clean, readable code that mirrors statistical textbook definitions

## 📦 Installation

```bash
pip install stattools-anannya==0.1.6
```

## ⚡ Quick Start

### The "Instant Analysis" Workflow

**Step 1: Import and Analyze**

```python
import stattools

# Your dataset
grades = [78, 82, 85, 88, 90, 92, 95, 45, 98, 100]

# Get insights instantly
print(f"Class Average: {stattools.mean(grades):.1f}")
print(f"Median Score: {stattools.median(grades):.1f}")
print(f"Top 25% Threshold: {stattools.percentile(grades, 75):.1f}")
print(f"Score Spread (IQR): {stattools.iqr(grades):.1f}")
print(f"Outliers: {stattools.detect_outliers_iqr(grades)}")
```

**Output:**
```
Class Average: 85.3
Median Score: 89.0
Top 25% Threshold: 94.2
Score Spread (IQR): 11.5
Outliers: [45]
```

### Common Use Cases

**Quality Control:**
```python
from stattools import mean, iqr, detect_outliers_iqr

# Product weights in grams
weights = [500, 502, 498, 501, 503, 499, 520, 497, 500, 502]

avg_weight = mean(weights)
variability = iqr(weights)
defects = detect_outliers_iqr(weights)

print(f"Average: {avg_weight:.2f}g (IQR: {variability:.2f}g)")
print(f"Defective items: {defects}")
```

**Financial Screening:**
```python
from stattools import percentile, detect_outliers_iqr

# Daily returns (%)
returns = [0.5, -0.3, 0.8, -0.2, 0.4, 12.5, -0.1, 0.6]

normal_range = percentile(returns, 95)
anomalies = detect_outliers_iqr(returns)

print(f"95% of returns below: {normal_range:.2f}%")
print(f"Abnormal trading days: {anomalies}")
```

## 📖 API Reference

### `mean(data)` → float

Calculates the arithmetic mean (average) of a dataset.

**Parameters:**
- `data` (list/tuple): Numeric values

**Returns:** Float representing the mean

**Example:**
```python
stattools.mean([10, 20, 30, 40, 50])  # Returns: 30.0
```

---

### `median(data)` → float

Finds the middle value of the dataset once it is sorted; the input does not need to be sorted beforehand. For even-length datasets, returns the average of the two middle values.

**Parameters:**
- `data` (list/tuple): Numeric values

**Returns:** Float representing the median

**Example:**
```python
stattools.median([1, 2, 3, 4, 5])  # Returns: 3.0
stattools.median([1, 2, 3, 4])     # Returns: 2.5
```
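
The rule described above can be written out in a few lines. This is a minimal sketch for illustration, not StatTools' actual source: the name `median_sketch` and the empty-input `ValueError` are assumptions.

```python
def median_sketch(data):
    """Hypothetical re-implementation of the median rule described above."""
    if not data:
        raise ValueError("median_sketch() requires at least one value")
    ordered = sorted(data)           # work on a sorted copy; the input is untouched
    mid = len(ordered) // 2
    if len(ordered) % 2 == 1:        # odd length: a single middle value
        return float(ordered[mid])
    # even length: average the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median_sketch([1, 2, 3, 4, 5]))  # 3.0
print(median_sketch([4, 1, 3, 2]))     # 2.5 (unsorted input is fine)
```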

---

### `percentile(data, p)` → float

Calculates the p-th percentile using linear interpolation between closest ranks.

**Parameters:**
- `data` (list/tuple): Numeric values
- `p` (int/float): Percentile to calculate (0-100)

**Returns:** Float representing the percentile value

**Example:**
```python
stattools.percentile([10, 20, 30, 40, 50], 75)  # Returns: 40.0
stattools.percentile([1, 2, 3, 4, 5], 50)       # Returns: 3.0 (same as median)
```
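
To make "linear interpolation between closest ranks" concrete, here is a minimal sketch. It assumes the fractional rank is computed as `(p / 100) * (n - 1)`, which reproduces both examples above; the name `percentile_sketch` and the input checks are illustrative and may differ from what StatTools does internally.

```python
import math


def percentile_sketch(data, p):
    """Hypothetical percentile using linear interpolation between closest ranks."""
    if not data:
        raise ValueError("percentile_sketch() requires at least one value")
    if not 0 <= p <= 100:
        raise ValueError("p must be between 0 and 100")
    ordered = sorted(data)
    rank = (p / 100) * (len(ordered) - 1)   # fractional index into the sorted data
    lower = math.floor(rank)
    upper = math.ceil(rank)
    fraction = rank - lower                  # how far to move from lower toward upper
    return float(ordered[lower] + fraction * (ordered[upper] - ordered[lower]))


print(percentile_sketch([10, 20, 30, 40, 50], 75))  # 40.0
print(percentile_sketch([1, 2, 3, 4, 5], 50))       # 3.0
print(percentile_sketch([1, 2, 3, 4], 50))          # 2.5 (interpolates between 2 and 3)
```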

---

### `iqr(data)` → float

Computes the Interquartile Range (Q3 - Q1), a measure of statistical dispersion.

**Parameters:**
- `data` (list/tuple): Numeric values

**Returns:** Float representing the IQR

**Example:**
```python
stattools.iqr([1, 2, 3, 4, 5, 6, 7, 8, 9])  # Returns: 4.0
```
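
If you want to cross-check an IQR without StatTools, the standard library's `statistics.quantiles` (Python 3.8+) can compute quartiles. The sketch below assumes the `"inclusive"` method, which gives the same result as the example above; `iqr_sketch` is a hypothetical helper, not part of the StatTools API.

```python
import statistics


def iqr_sketch(data):
    """Hypothetical IQR built on the standard library's quartiles."""
    # n=4 yields the three quartile cut points Q1, Q2 (median), Q3
    q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return q3 - q1


print(iqr_sketch([1, 2, 3, 4, 5, 6, 7, 8, 9]))  # 4.0 (Q1 = 3.0, Q3 = 7.0)
```

Note that `method="exclusive"` (the `statistics.quantiles` default) uses a different quartile convention and can return a different IQR for the same data.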

---

### `detect_outliers_iqr(data, multiplier=1.5)` → list

Identifies outliers using the IQR method. Values are considered outliers if they fall outside:
- **Lower bound:** Q1 - (multiplier × IQR)
- **Upper bound:** Q3 + (multiplier × IQR)

**Parameters:**
- `data` (list/tuple): Numeric values
- `multiplier` (float): Sensitivity factor (default: 1.5, standard statistical practice)

**Returns:** List of outlier values

**Example:**
```python
data = [5, 7, 8, 10, 12, 100]
stattools.detect_outliers_iqr(data)                  # Returns: [100]
stattools.detect_outliers_iqr(data, multiplier=3.0)  # Less sensitive; still returns: [100]
```

**Interpretation:**
- `multiplier=1.5` (default): Standard outlier detection
- `multiplier=3.0`: Extreme outliers only
- Lower multipliers → more sensitive (flags more values)
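
The effect of the multiplier is easiest to see on data where it changes the result. The sketch below is an illustration under stated assumptions: quartiles come from `statistics.quantiles` with the `"inclusive"` method, values exactly on a bound are not flagged, and the names `iqr_fences` and `outliers_sketch` are hypothetical rather than StatTools functions.

```python
import statistics


def iqr_fences(data, multiplier=1.5):
    """Return the (lower, upper) bounds of the IQR rule."""
    q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    spread = q3 - q1
    return q1 - multiplier * spread, q3 + multiplier * spread


def outliers_sketch(data, multiplier=1.5):
    """Return the values that fall strictly outside the bounds."""
    lower, upper = iqr_fences(data, multiplier)
    return [x for x in data if x < lower or x > upper]


data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 20]
print(iqr_fences(data))                       # (-3.5, 14.5)
print(outliers_sketch(data))                  # [20]
print(outliers_sketch(data, multiplier=3.0))  # [] (upper bound moves to 21.25)
```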

## 🔍 What Makes StatTools Different?

StatTools makes different trade-offs from the heavyweight scientific computing libraries:

| Feature | StatTools | NumPy/SciPy/Pandas |
|---------|-----------|-------------------|
| **Dependencies** | None (pure Python) | Compiled C/Fortran binaries |
| **Install Size** | ~10 KB | 50-100+ MB |
| **Learning Curve** | Minimal | Steep |
| **Platform Issues** | Unlikely (pure Python) | Possible where prebuilt wheels are unavailable |
| **Code Clarity** | Readable textbook implementations | Optimized C wrappers |
| **Best For** | Learning, teaching, simple scripts | Production data science |

## 💡 Real-World Examples

### Example 1: Grade Analysis System

```python
from stattools import mean, median, percentile, detect_outliers_iqr

class GradeAnalyzer:
    def __init__(self, scores):
        self.scores = scores
    
    def summary(self):
        # Compute the 25th percentile once instead of once per score
        q1 = percentile(self.scores, 25)
        return {
            'average': mean(self.scores),
            'median': median(self.scores),
            'top_10_percent': percentile(self.scores, 90),
            'struggling_students': [s for s in self.scores if s < q1],
            'anomalies': detect_outliers_iqr(self.scores)
        }

# Usage
analyzer = GradeAnalyzer([78, 82, 85, 88, 90, 92, 95, 45, 98, 100])
report = analyzer.summary()
print(report)
```

### Example 2: Manufacturing Quality Dashboard

```python
from stattools import mean, iqr, detect_outliers_iqr

def quality_check(measurements, tolerance_iqr=5.0):
    """
    Check if manufacturing process is within acceptable variability.
    """
    avg = mean(measurements)
    spread = iqr(measurements)
    defects = detect_outliers_iqr(measurements)
    
    status = "PASS" if spread <= tolerance_iqr and len(defects) == 0 else "FAIL"
    
    return {
        'status': status,
        'average': avg,
        'variability': spread,
        'defect_count': len(defects),
        'defective_items': defects
    }

# Daily production run
batch = [500.1, 499.8, 500.3, 500.0, 499.9, 500.2, 515.0]
print(quality_check(batch))
# Approximate output (floats rounded for readability):
# {'status': 'FAIL', 'average': 502.19, 'variability': 0.3,
#  'defect_count': 1, 'defective_items': [515.0]}
```

### Example 3: Sports Performance Tracking

```python
from stattools import median, percentile

# Player sprint times (seconds)
sprint_times = [10.2, 10.5, 10.3, 10.4, 10.6, 10.1, 10.5, 10.3]

typical_time = median(sprint_times)
personal_best = min(sprint_times)
consistency_target = percentile(sprint_times, 25)  # Fastest 25% of runs (lower times are better)

print(f"Typical Performance: {typical_time}s")
print(f"Personal Best: {personal_best}s")
print(f"Consistency Target (25th percentile): {consistency_target}s")
```

## 🧪 Running Tests

StatTools uses pytest for comprehensive testing.

**Install pytest:**
```bash
pip install pytest
```

**Run all tests:**
```bash
python -m pytest
```

**Run with verbose output:**
```bash
python -m pytest -v
```

**Generate coverage report:**
```bash
pip install pytest-cov
python -m pytest --cov=stattools --cov-report=html
```

All tests should pass ✅

## 📁 Project Structure

```
stattools/
├── stattools/
│   ├── __init__.py          # Package initialization & public API
│   ├── descriptive.py       # Mean, median, percentile functions
│   └── outliers.py          # IQR calculation & outlier detection
├── tests/
│   └── test_stattools.py    # Comprehensive test suite
├── README.md                # This documentation
├── LICENSE                  # MIT License
├── setup.py                 # Package configuration
├── .gitignore               # Git exclusions
└── requirements-dev.txt     # Development dependencies
```

## ⚠️ Limitations

- **Performance**: Optimized for clarity over speed. For datasets with millions of rows, consider NumPy/Pandas.
- **Scope**: Focuses on descriptive statistics. Does not include inferential statistics (t-tests, ANOVA, regression, etc.).
- **Data Types**: Expects numeric data (int/float). Does not handle categorical data or timestamps.
- **Missing Data**: Does not have built-in handling for NaN/None values. Clean your data first.
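
For the missing-data limitation, "clean your data first" can be as simple as filtering before you call any StatTools function. The helper below is a hypothetical sketch (`drop_missing` is not part of StatTools) that removes `None` and NaN values and leaves everything else in order.

```python
import math


def drop_missing(values):
    """Return a new list without None or NaN entries (hypothetical helper)."""
    cleaned = []
    for v in values:
        if v is None:
            continue                                   # skip missing entries
        if isinstance(v, float) and math.isnan(v):
            continue                                   # skip NaN placeholders
        cleaned.append(v)
    return cleaned


raw = [78, None, 85.5, float("nan"), 92]
print(drop_missing(raw))  # [78, 85.5, 92]
```

The cleaned list can then be passed to functions such as `mean` or `detect_outliers_iqr` as usual.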

## 🗺️ Roadmap

Future enhancements under consideration:

- [ ] Standard deviation and variance
- [ ] Mode calculation (handling multimodal distributions)
- [ ] Z-score outlier detection
- [ ] Covariance and correlation
- [ ] Summary statistics report generator
- [ ] Support for weighted statistics
- [ ] Basic data validation utilities

**Want to see a feature?** Open an issue or submit a PR!

## 💻 Development

### Setup Development Environment

```bash
# Clone the repository
git clone https://github.com/Anannya-Vyas/my-python-library.git
cd my-python-library

# Install in editable mode with dev dependencies
pip install -e ".[dev]"
```

### Running Checks

```bash
# Run tests
pytest

# Check code formatting (if using Black)
black --check stattools/

# Type checking (if using mypy)
mypy stattools/
```

## 🤝 Contributing

Contributions are welcome! Whether it's bug fixes, new features, documentation improvements, or examples—your help makes StatTools better for everyone.

**How to contribute:**

1. **Fork** the repository
2. **Create** a feature branch (`git checkout -b feature/amazing-feature`)
3. **Write tests** for your changes
4. **Ensure all tests pass** (`pytest`)
5. **Commit** your changes (`git commit -m 'Add amazing feature'`)
6. **Push** to your fork (`git push origin feature/amazing-feature`)
7. **Open** a Pull Request

**Contribution Guidelines:**
- All new functions must include docstrings and examples
- Maintain zero-dependency philosophy (standard library only)
- Add tests for all new functionality
- Keep code readable and educational
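
As a rough illustration of these guidelines, a contributed function might look like the sketch below. It uses population variance (one of the roadmap items) purely as an example; the name `variance`, its signature, and its error handling are assumptions, not an agreed design.

```python
def variance(data):
    """Return the population variance of ``data``.

    Hypothetical example of a contribution: standard library only,
    with a docstring and a runnable doctest.

    >>> variance([2, 4, 4, 4, 5, 5, 7, 9])
    4.0
    """
    if not data:
        raise ValueError("variance() requires at least one value")
    avg = sum(data) / len(data)
    # Average squared distance from the mean (divide by n, not n - 1)
    return sum((x - avg) ** 2 for x in data) / len(data)


if __name__ == "__main__":
    import doctest

    doctest.testmod()  # runs the example embedded in the docstring
```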

## 🐛 Found a Bug?

Open an issue on [GitHub Issues](https://github.com/Anannya-Vyas/my-python-library/issues) with:

- **Clear description** of the problem
- **Steps to reproduce** the issue
- **Expected behavior** vs. actual behavior
- **Python version** and operating system
- **Sample data** (if applicable)

## 📄 Changelog

### v1.0.0
- Initial release
- Core descriptive statistics (mean, median, percentile)
- IQR calculation
- IQR-based outlier detection
- Comprehensive test coverage
- Published on PyPI

## 📄 License

This project is licensed under the **MIT License** — see the [LICENSE](LICENSE) file for details.

You are free to use, modify, and distribute this software, provided the copyright notice and permission notice are included in all copies or substantial portions of it.

## 👩‍💻 Author

**Anannya Vyas**

- GitHub: [@Anannya-Vyas](https://github.com/Anannya-Vyas)
- PyPI: [stattools-anannya](https://pypi.org/project/stattools-anannya/)
- Project: [my-python-library](https://github.com/Anannya-Vyas/my-python-library)

## ⭐ Show Your Support

If StatTools helped you with your project, consider:

- ⭐ **Starring** the repository on [GitHub](https://github.com/Anannya-Vyas/my-python-library)
- 📢 **Sharing** it with classmates, colleagues, and on social media
- 🐛 **Reporting bugs** to help improve the library
- 💡 **Contributing** new features or documentation improvements

---

<p align="center">
  <strong>Made by a student learning Python package development</strong>
</p>
