Metadata-Version: 2.4
Name: pydepspy
Version: 0.1.1
Summary: Security scanner for Python dependencies - detects typosquats, CVEs, malicious hooks, unused deps, and metadata anomalies
Author-email: Tanuj Saxena <tanuj.saxena.rks@gmail.com>
License: MIT
Keywords: security,dependencies,scanning,vulnerability,typosquat
Classifier: Development Status :: 4 - Beta
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Security
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: click>=8.1.0
Requires-Dist: rich>=13.0.0
Requires-Dist: httpx>=0.24.0
Requires-Dist: packaging>=23.0
Requires-Dist: pyyaml>=6.0
Requires-Dist: pypi-simple>=1.0.0
Provides-Extra: dev
Requires-Dist: pytest>=7.0.0; extra == "dev"
Requires-Dist: pytest-asyncio>=0.21.0; extra == "dev"
Requires-Dist: pytest-cov>=4.0.0; extra == "dev"
Requires-Dist: black==23.9.1; extra == "dev"
Requires-Dist: ruff>=0.1.0; extra == "dev"
Requires-Dist: mypy>=1.0.0; extra == "dev"
Dynamic: license-file

# PyDepSpy - Python Dependency Security Scanner

![PyDepSpy](https://img.shields.io/badge/Security-Python%20Dependencies-blue)

**PyDepSpy** is a CLI tool (currently in beta) that scans Python project dependencies for security vulnerabilities and suspicious packages. It runs **5 independent detection engines in parallel** across your dependency tree, detecting typosquats, CVEs, malicious install hooks, unused dependencies, and metadata anomalies.

## Why PyDepSpy?

✅ **5 Detection Engines** - Comprehensive scanning:
  - **Typosquat Detection**: Finds packages similar to popular ones (e.g., `requets` instead of `requests`)
  - **CVE Scanner**: Queries [OSV.dev](https://osv.dev) for known vulnerabilities
  - **Malicious Hooks**: AST-scans `setup.py` for suspicious install-time code execution
  - **Ghost Dependencies**: Identifies declared dependencies that are never imported (unnecessary attack surface)
  - **Metadata Auditor**: Flags warning signs in PyPI metadata (no homepage, brand-new packages, suspicious emails)
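
To make the typosquat idea concrete, here is a minimal sketch using only the standard library. It is not PyDepSpy's actual implementation: the `POPULAR` list, the `possible_typosquat` name, and the `0.85` similarity threshold are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical shortlist of popular names; a real scanner would load a much larger list.
POPULAR = ("requests", "numpy", "django", "flask", "pandas")


def possible_typosquat(name, popular=POPULAR, threshold=0.85):
    """Return the popular package that `name` closely resembles, or None."""
    candidate = name.lower().replace("_", "-")
    if candidate in popular:
        return None  # an exact match to a known package is not a typosquat
    best, best_score = None, 0.0
    for target in popular:
        # ratio() is 1.0 for identical strings and falls toward 0.0 as they diverge
        score = SequenceMatcher(None, candidate, target).ratio()
        if score > best_score:
            best, best_score = target, score
    return best if best_score >= threshold else None


print(possible_typosquat("requets"))
print(possible_typosquat("click"))
```

A lower threshold catches more lookalikes at the cost of more false positives, so the value is worth tuning against a list of known-good names.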

✅ **Parallel Execution** - Scans 50 packages in ~3 seconds (not 50 seconds)

✅ **Local Caching** - A second scan of the same project reuses cached API responses (24-hour TTL), so it skips most network calls

✅ **Beautiful Output** - Rich terminal tables with color-coded severity

✅ **Multiple Formats** - Terminal, JSON, and SARIF (GitHub Code Scanning)

✅ **CI/CD Ready** - Exit code 1 on CRITICAL/HIGH findings, composable in pipelines
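
The install-hook engine listed above is described as an AST scan of `setup.py`. The sketch below shows one way such a scan could work with the standard `ast` module. The `SUSPICIOUS_CALLS` set and both function names are hypothetical, and the real detector may look for different patterns.

```python
import ast

# Illustrative set of call names treated as suspicious at install time (an assumption).
SUSPICIOUS_CALLS = {"exec", "eval", "os.system", "subprocess.run", "subprocess.Popen"}


def dotted_name(node):
    """Rebuild a dotted name such as 'os.system' from a Name/Attribute chain."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
        return ".".join(reversed(parts))
    return None  # e.g. a call on a subscript or another call result


def find_suspicious_calls(source):
    """Return sorted (line, call_name) pairs for suspicious calls in the source text."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            name = dotted_name(node.func)
            if name in SUSPICIOUS_CALLS:
                hits.append((node.lineno, name))
    return sorted(hits)


setup_py = "import os\nfrom setuptools import setup\nos.system('echo hi')\nsetup(name='x')\n"
print(find_suspicious_calls(setup_py))
```

Because the source is only parsed and never executed, this kind of check is safe to run on untrusted packages.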

## Installation

```bash
pip install pydepspy
```

Or from source:

```bash
git clone https://github.com/tanuj437/pydepspy.git
cd pydepspy
pip install -e .
```

## Quick Start

```bash
# Scan current project (terminal)
pydepspy scan

# Scan a specific directory (example used below)
pydepspy scan --project F:\\Vedika

# Output as JSON
pydepspy scan --format json

# Save to file
pydepspy scan --output report.json

# Use SARIF for GitHub Code Scanning
pydepspy scan --format sarif --output results.sarif
```

## How I used it

Example: I scanned the project at `F:\Vedika` and produced JSON and terminal reports with these commands:

```powershell
# Run a JSON scan (PowerShell)
python -m pydepspy.cli scan --project F:\Vedika --format json

# Run the terminal report
python -m pydepspy.cli scan --project F:\Vedika
```

Sample JSON output (trimmed):

```json
{
  "version": "1.0",
  "findings": [
    {"package": "torch", "severity": "CRITICAL", "type": "CVE", "cve_id": "GHSA-47fc-vmwq-366v", "fix": "Upgrade to >= 1.13.1"},
    {"package": "numpy", "severity": "MEDIUM", "type": "CVE", "cve_id": "GHSA-6p56-wp2h-9hxr", "fix": "Upgrade to >= 1.21"},
    {"package": "huggingface_hub", "severity": "LOW", "type": "METADATA_ANOMALY"}
  ],
  "statistics": {"total": 8, "by_severity": {"CRITICAL": 1, "HIGH": 0, "MEDIUM": 1, "LOW": 6}}
}
```

Actionable next steps after a scan:

- Immediately address CRITICAL/HIGH findings (upgrade or remove packages).
- Inspect UNUSED_DEP items and remove unused dependencies from your manifest.
- Pin or avoid brand-new packages until vetted.
- Re-run PyDepSpy and your test suite after changes.

These examples show the exact commands and a real scan result to make it clear how to use PyDepSpy in your workflow.
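
Before removing a dependency flagged as unused, it can help to double-check which modules your code really imports. The following is a rough, standard-library sketch of that check, not PyDepSpy's own detector. The function names are hypothetical, and it naively assumes the import name matches the distribution name (which is false for packages such as `pyyaml`, imported as `yaml`).

```python
import ast


def imported_modules(source):
    """Collect top-level module names imported by a piece of Python source."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            found.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # level == 0 skips relative imports, which refer to local code
            found.add(node.module.split(".")[0])
    return found


def _normalize(name):
    return name.lower().replace("-", "_")


def unused_dependencies(declared, sources):
    """Return declared names that are never imported in any source string."""
    used = set()
    for src in sources:
        used |= {_normalize(m) for m in imported_modules(src)}
    return sorted(d for d in declared if _normalize(d) not in used)


code = "import requests\nfrom rich.table import Table\n"
print(unused_dependencies(["requests", "rich", "click"], [code]))
```

Treat the result as a hint to investigate: plugins, entry points, and dynamically imported modules will not show up in a static import scan.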

## Usage

### Basic Scan

```bash
$ pydepspy scan
🔍 Scanning .
✓ No issues found across 24 packages
```

### With Findings

```bash
$ pydepspy scan
🔍 Scanning .

┏━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┓
┃ Severity ┃ Package       ┃ Type              ┃ Message            ┃
┡━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━┩
│ CRITICAL │ requets       │ TYPOSQUAT         │ Possible typosquat │
│ HIGH     │ django        │ CVE               │ CVE-2024-XXXXX     │
│ MEDIUM   │ unused-pkg    │ UNUSED_DEP        │ Never imported     │
└──────────┴───────────────┴───────────────────┴────────────────────┘

1 CRITICAL • 1 HIGH • 1 MEDIUM across 24 packages
```

### Command Line Options

```
Options:
  --project PATH         Path to Python project [default: current directory]
  --format TEXT          Output format: text, json, sarif [default: text]
  --output FILE          Write output to file instead of stdout
  --cache / --no-cache   Use local cache for API responses [default: cache enabled]
  --help                 Show help message
```

## Exit Codes

- `0` - No CRITICAL or HIGH findings
- `1` - One or more CRITICAL or HIGH findings detected
- Other - Error occurred during scanning

This makes PyDepSpy composable in CI pipelines:

```bash
pydepspy scan || exit 1  # Fail the build on CRITICAL/HIGH findings or a scan error
```
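
If a pipeline needs a stricter or looser policy than the built-in exit codes, one option is to gate on the JSON report instead. The sketch below is an assumption-laden example rather than a PyDepSpy feature: `gate` and `SEVERITY_ORDER` are hypothetical names, and the report shape follows the JSON samples in this README.

```python
import json

# Severities from least to most serious, as they appear in the JSON samples.
SEVERITY_ORDER = ["LOW", "MEDIUM", "HIGH", "CRITICAL"]


def gate(report_text, fail_at="HIGH"):
    """Return 1 when any finding is at or above `fail_at`, else 0."""
    threshold = SEVERITY_ORDER.index(fail_at)
    findings = json.loads(report_text).get("findings", [])
    blocking = [
        f for f in findings if SEVERITY_ORDER.index(f["severity"]) >= threshold
    ]
    return 1 if blocking else 0


report = json.dumps({"findings": [{"package": "numpy", "severity": "MEDIUM"}]})
print(gate(report))                    # default policy mirrors the documented rule
print(gate(report, fail_at="MEDIUM"))  # stricter policy
```

With the default `fail_at="HIGH"`, the function reproduces the documented behavior of exiting with `1` only on CRITICAL or HIGH findings.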

## Output Formats

### Terminal (Default)

Beautiful, human-readable Rich tables with color-coded severity.

### JSON

Structured output for programmatic processing:

```json
{
  "version": "1.0",
  "findings": [
    {
      "package": "malicious-pkg",
      "severity": "CRITICAL",
      "type": "TYPOSQUAT",
      "message": "Possible typosquat of: requests",
      "cve_id": null,
      "fix": null
    }
  ],
  "statistics": {
    "total": 1,
    "by_severity": {"CRITICAL": 1, "HIGH": 0, "MEDIUM": 0, "LOW": 0}
  }
}
```
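
As an example of programmatic processing, the following sketch recomputes the `statistics` block from the `findings` list, which is a cheap sanity check on a saved report. The `summarize` function is hypothetical, and the field names are taken from the sample above.

```python
import json
from collections import Counter

LEVELS = ("CRITICAL", "HIGH", "MEDIUM", "LOW")


def summarize(report_text):
    """Rebuild the statistics section from the findings in a JSON report."""
    report = json.loads(report_text)
    counts = Counter(f["severity"] for f in report["findings"])
    return {
        "total": len(report["findings"]),
        "by_severity": {level: counts.get(level, 0) for level in LEVELS},
    }


sample = json.dumps({
    "version": "1.0",
    "findings": [
        {"package": "malicious-pkg", "severity": "CRITICAL", "type": "TYPOSQUAT"}
    ],
})
print(summarize(sample))
```

Note that a trimmed report, such as the one in the "How I used it" section, will not match its original `statistics` because some findings were removed.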

### SARIF

For GitHub Code Scanning integration:

```bash
pydepspy scan --format sarif --output results.sarif
```

Then upload to GitHub:

```bash
gh codeql github upload-results --sarif=results.sarif --repository=...
```
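
For readers unfamiliar with SARIF, the sketch below shows roughly how findings could map onto a minimal SARIF 2.1.0 document. It is not the output PyDepSpy produces: the `to_sarif` function and the severity-to-level mapping are assumptions, and a real upload to GitHub will likely also need a `locations` entry on each result.

```python
import json

# Assumed mapping from PyDepSpy severities to SARIF result levels.
SARIF_LEVELS = {"CRITICAL": "error", "HIGH": "error", "MEDIUM": "warning", "LOW": "note"}


def to_sarif(findings):
    """Wrap a list of finding dicts in a minimal SARIF 2.1.0 structure."""
    results = [
        {
            "ruleId": f["type"],
            "level": SARIF_LEVELS.get(f["severity"], "warning"),
            "message": {"text": f"{f['package']}: {f.get('message') or f['type']}"},
        }
        for f in findings
    ]
    return {
        "$schema": "https://json.schemastore.org/sarif-2.1.0.json",
        "version": "2.1.0",
        "runs": [{"tool": {"driver": {"name": "PyDepSpy"}}, "results": results}],
    }


finding = {
    "package": "requets",
    "severity": "CRITICAL",
    "type": "TYPOSQUAT",
    "message": "Possible typosquat of: requests",
}
print(json.dumps(to_sarif([finding]), indent=2))
```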

## Supported Formats

PyDepSpy automatically detects and parses dependencies from:

- `requirements.txt` - Standard pip format
- `pyproject.toml` - PEP 621 and Poetry format
- `setup.py` - Setuptools install_requires
- `Pipfile` / `Pipfile.lock` - Pipenv format
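
To illustrate what parsing a manifest involves, here is a simplified, standard-library sketch for `requirements.txt`-style text. It is not PyDepSpy's parser: `parse_requirements` is a hypothetical name, and the regex ignores URLs, hashes, and other edge cases. For production use, `packaging.requirements.Requirement` (from the `packaging` dependency) is a more robust choice.

```python
import re

# name, optional [extras], then a version specifier up to any ';' marker
_REQ = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)\s*(\[[^\]]*\])?\s*([^;#]*)")


def parse_requirements(text):
    """Return (name, specifier) pairs from requirements.txt-style text."""
    packages = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line or line.startswith("-"):
            continue  # skip blanks and options such as -r / -e
        match = _REQ.match(line)
        if match:
            name = match.group(1).lower().replace("_", "-")
            packages.append((name, match.group(3).strip()))
    return packages


example = """# runtime deps
click>=8.1.0
Rich
-r other.txt
httpx[http2]>=0.24.0  # async client
pyyaml>=6.0; python_version >= '3.9'
"""
print(parse_requirements(example))
```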

## Caching

API responses are cached locally at `~/.pydepspy/cache.db` with a 24-hour TTL:

```bash
# Use cache (default)
pydepspy scan

# Disable cache
pydepspy scan --no-cache

# Clear cache
rm ~/.pydepspy/cache.db
```
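
For a sense of how a SQLite-backed TTL cache can work, here is a minimal sketch. The `TTLCache` class, its table layout, and the `now` parameter (added to make expiry easy to test) are assumptions and do not describe the actual schema of `cache.db`.

```python
import json
import sqlite3
import time


class TTLCache:
    """Tiny SQLite-backed key/value cache with a fixed time-to-live."""

    def __init__(self, path=":memory:", ttl_seconds=24 * 60 * 60):
        self.ttl = ttl_seconds
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS cache "
            "(key TEXT PRIMARY KEY, value TEXT, stored_at REAL)"
        )

    def set(self, key, value, now=None):
        stored_at = time.time() if now is None else now
        with self.db:  # commits the transaction on success
            self.db.execute(
                "INSERT OR REPLACE INTO cache VALUES (?, ?, ?)",
                (key, json.dumps(value), stored_at),
            )

    def get(self, key, now=None):
        current = time.time() if now is None else now
        row = self.db.execute(
            "SELECT value, stored_at FROM cache WHERE key = ?", (key,)
        ).fetchone()
        if row is None or current - row[1] > self.ttl:
            return None  # missing or expired entry
        return json.loads(row[0])


cache = TTLCache(ttl_seconds=10)
cache.set("osv:requests", {"vulns": []}, now=100.0)
print(cache.get("osv:requests", now=105.0))  # still fresh
print(cache.get("osv:requests", now=111.0))  # past the TTL
```

Checking expiry at read time keeps the code simple; stale rows stay on disk until they are overwritten or the file is deleted.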

Repeat scans within the 24-hour TTL are served from the cache and skip the API calls. 🚀

## Development

### Setup

```bash
git clone https://github.com/tanuj437/pydepspy.git
cd pydepspy
pip install -e ".[dev]"
```

### Run Tests

```bash
pytest
pytest --cov=pydepspy
```

### Run Linters

```bash
black pydepspy tests
ruff check pydepspy tests
mypy pydepspy
```

## Architecture

PyDepSpy uses a modular, detector-based architecture:

- **Parsers** - Extract dependencies from various formats
- **Detectors** - Independent scanning engines that return standardized `Finding` objects
- **Aggregator** - Deduplicates and scores findings
- **Reporters** - Render output in different formats
- **Cache** - SQLite-backed with TTL
- **CLI** - Click-based entry point

Each detector runs concurrently via `asyncio.gather()` for maximum performance.
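
The flow described above (detectors returning `Finding` objects, run concurrently, then deduplicated) can be sketched in a few lines. Everything here is illustrative: the `Finding` fields, the two stand-in detector coroutines, and the `scan` function are assumptions, not the real classes in the package.

```python
import asyncio
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen makes findings hashable, so a set can deduplicate them
class Finding:
    package: str
    severity: str
    type: str
    message: str


async def typosquat_check(package):
    await asyncio.sleep(0)  # stand-in for real I/O such as an API call
    if package == "requets":
        return [Finding(package, "CRITICAL", "TYPOSQUAT", "Possible typosquat of: requests")]
    return []


async def metadata_check(package):
    await asyncio.sleep(0)
    return []


async def scan(packages, detectors):
    """Run every detector on every package concurrently and deduplicate results."""
    tasks = [detector(pkg) for pkg in packages for detector in detectors]
    results = await asyncio.gather(*tasks)
    unique = {finding for batch in results for finding in batch}
    return sorted(unique, key=lambda f: (f.package, f.type))


findings = asyncio.run(scan(["requets", "click"], [typosquat_check, metadata_check]))
for finding in findings:
    print(finding)
```

Since the work is I/O-bound, `asyncio.gather()` lets slow network calls overlap rather than run one after another.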

## Roadmap

### v0.1.x (Current)
- ✅ Typosquat detection
- ✅ CVE scanner
- ✅ Install hook inspector
- ✅ Ghost dependency finder
- ✅ Metadata auditor
- ✅ Terminal, JSON, SARIF output
- ✅ Caching

### v0.2.0 (Planned)
- [ ] `--fix` flag for auto-upgrading vulnerable packages
- [ ] GitHub Actions integration (SARIF upload + CI enforcement)
- [ ] More metadata anomaly checks
- [ ] Custom rule support
- [ ] Real-time PyPI monitoring

## PyPI & Library Usage

PyDepSpy is available as a CLI and as a Python library for integration in other tools.

Install from PyPI:

```bash
pip install pydepspy
```

Use as a CLI (console script):

```bash
pydepspy scan --project /path/to/project --format json
```

Or import the library in Python:

```python
import asyncio

from pydepspy.aggregator import aggregate_findings
from pydepspy.detectors import CVEDetector, TyposquatDetector
from pydepspy.parser import parse_pyproject

# Parse the declared dependencies
packages = parse_pyproject('pyproject.toml')


async def run_scan(packages):
    """Run every detector against every (name, version) pair concurrently."""
    detectors = [TyposquatDetector(), CVEDetector()]
    tasks = [det.check(pkg, ver) for pkg, ver in packages for det in detectors]
    results = await asyncio.gather(*tasks)
    # Flatten the per-task results, treating an empty/None result as no findings
    findings = [f for res in results for f in (res or [])]
    return aggregate_findings([findings])


# Example usage
# aggregated = asyncio.run(run_scan(packages))
# print(aggregated)
```

## Publishing checklist

Follow these steps to publish a new release to PyPI:

1. Bump version in `pyproject.toml` (already bumped to 0.1.1).
2. Update `CHANGELOG.md` with notable changes.
3. Ensure tests and linters pass:
   ```bash
   python -m ruff check pydepspy tests
   python -m black --check --target-version py39 pydepspy tests
   python -m pytest -q
   ```
4. Build distributions:
   ```bash
   python -m pip install --upgrade build twine
   python -m build
   ```
5. Verify artifacts in `./dist`.
6. Upload to TestPyPI first:
   ```bash
   python -m twine upload --repository testpypi dist/*
   ```
7. After verification, upload to PyPI:
   ```bash
   python -m twine upload dist/*
   ```
8. Tag the release and push tags:
   ```bash
   git tag -a v0.1.1 -m "Release v0.1.1"  # annotated; --follow-tags skips lightweight tags
   git push --follow-tags
   ```

Notes:
- Use a clean venv to avoid dependency conflicts when building/publishing.
- PyPI no longer accepts GPG signatures on uploads; for authenticity, consider signing the Git tag instead (`git tag -s`).

## License

MIT License - see [LICENSE](LICENSE) file

## Contributing

Contributions welcome! Please:

1. Fork the repo
2. Create a feature branch (`git checkout -b feature/something`)
3. Add tests
4. Submit a PR

## Acknowledgments

- [OSV.dev](https://osv.dev) - Free vulnerability database
- [PyPI Simple API](https://peps.python.org/pep-0503/) - Package metadata
- [Rich](https://rich.readthedocs.io/) - Beautiful terminal output
- [Click](https://click.palletsprojects.com/) - CLI framework

## Contact

- 🐛 Issues: [GitHub Issues](https://github.com/tanuj437/pydepspy/issues)
- 💬 Discussions: [GitHub Discussions](https://github.com/tanuj437/pydepspy/discussions)
- 📧 Email: tanuj.saxena.rks@gmail.com

---

**Made with ❤️ for Python security**
