Metadata-Version: 2.4
Name: binarysniffer
Version: 1.11.3
Summary: A high-performance CLI and library for detecting open source components in binaries through semantic signature matching
Author-email: "Oscar Valenzuela B." <oscar.valenzuela.b@gmail.com>
License: Apache-2.0
Project-URL: Homepage, https://github.com/SemClone/binarysniffer
Project-URL: Bug Tracker, https://github.com/SemClone/binarysniffer/issues
Project-URL: Documentation, https://github.com/SemClone/binarysniffer/tree/main/docs
Keywords: binary-analysis,license-compliance,signature-matching,oss-detection,semantic-analysis
Classifier: Development Status :: 4 - Beta
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: System Administrators
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Security
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: System :: Software Distribution
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
License-File: AUTHORS.md
Requires-Dist: click>=8.1.0
Requires-Dist: tqdm>=4.66.0
Requires-Dist: xxhash>=3.5.0
Requires-Dist: zstandard>=0.23.0
Requires-Dist: pybloom-live>=4.0.0
Requires-Dist: python-magic>=0.4.27
Requires-Dist: pygments>=2.18.0
Requires-Dist: rich>=13.0.0
Requires-Dist: tabulate>=0.9.0
Requires-Dist: osslili>=1.5.6
Requires-Dist: upmex>=1.6.7
Provides-Extra: fuzzy
Requires-Dist: python-tlsh>=4.5.0; extra == "fuzzy"
Provides-Extra: android
Requires-Dist: androguard>=4.1.0; extra == "android"
Provides-Extra: archives
Requires-Dist: py7zr>=0.21.0; extra == "archives"
Requires-Dist: rarfile>=4.2; extra == "archives"
Requires-Dist: python-debian>=0.1.49; extra == "archives"
Provides-Extra: dev
Requires-Dist: pytest>=8.0.0; extra == "dev"
Requires-Dist: pytest-cov>=5.0.0; extra == "dev"
Requires-Dist: black>=24.0.0; extra == "dev"
Requires-Dist: mypy>=1.8.0; extra == "dev"
Requires-Dist: ruff>=0.3.0; extra == "dev"
Requires-Dist: pre-commit>=3.6.0; extra == "dev"
Provides-Extra: fast
Requires-Dist: lz4>=4.3.0; extra == "fast"
Requires-Dist: numpy>=1.24.0; extra == "fast"
Requires-Dist: scikit-learn>=1.4.0; extra == "fast"
Dynamic: license-file

# BinarySniffer - Binary Component Detection and Security Analysis

[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
[![Python 3.8+](https://img.shields.io/badge/python-3.8+-blue.svg)](https://www.python.org/downloads/)
[![PyPI version](https://img.shields.io/pypi/v/binarysniffer.svg)](https://pypi.org/project/binarysniffer/)

A high-performance CLI tool and Python library for detecting open source components and security threats in binaries through semantic signature matching. It specializes in analyzing mobile apps (APK/IPA), Java archives, ML models, and source code to identify OSS components, their licenses, and potential security risks.

## Features

- **Binary Component Detection**: Identify 188+ OSS components in compiled binaries using semantic signatures
- **ML Model Security Analysis**: Comprehensive security scanning with MITRE ATT&CK mapping
- **Multi-Format Support**: APK/IPA, JAR/WAR, ELF/PE/Mach-O, ML models (pickle, ONNX, SafeTensors)
- **SEMCL.ONE Integration**: Works seamlessly with osslili, purl2notices, and other ecosystem tools

## Installation

```bash
pip install binarysniffer
```

For development (the `dev` extra pulls in pytest, black, mypy, ruff, and pre-commit):
```bash
git clone https://github.com/SemClone/binarysniffer.git
cd binarysniffer
pip install -e ".[dev]"
```

With performance extras (quoted so that shells such as zsh do not expand the brackets):
```bash
pip install "binarysniffer[fast]"
```

## Quick Start

```bash
# Analyze a binary file
binarysniffer analyze /path/to/binary

# ML model security scan
binarysniffer ml-scan model.pkl --deep

# Generate SBOM
binarysniffer analyze app.apk --format cyclonedx -o sbom.json
```

## Usage

### CLI Usage

```bash
# Basic analysis
binarysniffer analyze app.apk

# ML model security analysis
binarysniffer ml-scan model.pkl --risk-threshold 0.5

# Directory scanning with recursion
binarysniffer analyze /path/to/project -r

# Generate CycloneDX SBOM
binarysniffer analyze app.jar --format cyclonedx -o app-sbom.json

# Extract package inventory
binarysniffer inventory app.apk --with-hashes -o inventory.json
```
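When the CLI is driven from a build script, it can help to assemble the argument list in one place. The sketch below is a hypothetical wrapper, not part of BinarySniffer: `build_analyze_command` and `run_analyze` are illustrative names, and only the flags shown above (`-r`, `--format`, `-o`) are used.

```python
import shlex
import subprocess
from typing import List, Optional


def build_analyze_command(
    target: str,
    recursive: bool = False,
    output_format: Optional[str] = None,
    output_path: Optional[str] = None,
) -> List[str]:
    """Assemble an argv list for `binarysniffer analyze` (hypothetical helper)."""
    cmd = ["binarysniffer", "analyze", target]
    if recursive:
        cmd.append("-r")
    if output_format:
        cmd += ["--format", output_format]
    if output_path:
        cmd += ["-o", output_path]
    return cmd


def run_analyze(target: str, **options) -> int:
    """Run the command and return its exit code; needs binarysniffer on PATH."""
    cmd = build_analyze_command(target, **options)
    print("Running:", shlex.join(cmd))
    # check=False leaves exit-code handling to the caller
    return subprocess.run(cmd, check=False).returncode
```

Calling `run_analyze("app.apk", output_format="cyclonedx", output_path="sbom.json")` would then mirror the SBOM example from the Quick Start. Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.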

### Python API

```python
from binarysniffer import EnhancedBinarySniffer

# Initialize analyzer
sniffer = EnhancedBinarySniffer()

# Analyze a file
result = sniffer.analyze_file("app.apk")
for match in result.matches:
    print(f"{match.component} - {match.confidence:.2%}")
    print(f"License: {match.license}")

# ML security analysis
from binarysniffer.ml_security import MLSecurityAnalyzer

analyzer = MLSecurityAnalyzer()
risks = analyzer.analyze_model("model.pkl")
```
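A common follow-up is to summarize matches by license. The sketch below assumes only the three attributes used in the example above (`component`, `confidence`, `license`); the `Match` dataclass is a stand-in so the snippet runs without BinarySniffer installed, and `group_by_license` is a hypothetical helper rather than a library function.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass
class Match:
    """Stand-in for a result match; models only the attributes shown above."""
    component: str
    confidence: float
    license: str


def group_by_license(matches: Iterable, min_confidence: float = 0.5) -> Dict[str, List[str]]:
    """Group component names by license, dropping low-confidence matches."""
    grouped = defaultdict(list)
    for m in matches:
        if m.confidence >= min_confidence:
            # An empty license string is reported under "unknown"
            grouped[m.license or "unknown"].append(m.component)
    return {lic: sorted(names) for lic, names in sorted(grouped.items())}
```

With a real result object, `group_by_license(result.matches)` should work as long as each match exposes those three attributes.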

## Core Capabilities

### Binary Analysis
- Advanced format support (ELF, PE, Mach-O) via LIEF
- Android DEX bytecode analysis
- Static library (.a) support
- Symbol and import extraction

### Archive Support
- Mobile apps (APK, IPA)
- Java archives (JAR, WAR)
- Python packages (wheel, egg)
- Linux packages (DEB, RPM)
- Extended formats (7z, RAR, Zstandard)

### ML Model Security (v1.10.0+)
- Safe pickle file analysis
- ONNX and SafeTensors validation
- PyTorch/TensorFlow native formats
- 100% detection rate on known exploits
- SARIF output for CI/CD integration
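To illustrate what "safe" pickle analysis can mean in general, the sketch below walks a pickle's opcode stream with the standard library's `pickletools` and lists the callables it would import, without ever unpickling the data. This is not BinarySniffer's implementation; `find_pickle_imports` is a hypothetical helper, and the `STACK_GLOBAL` handling is a simplification.

```python
import pickletools
from typing import List

# Opcodes that push a text string onto the pickle stack
_STRING_OPCODES = {"SHORT_BINUNICODE", "BINUNICODE", "BINUNICODE8", "UNICODE"}


def find_pickle_imports(data: bytes) -> List[str]:
    """List 'module.name' references in a pickle stream without executing it."""
    imports = []
    recent_strings = []
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name in _STRING_OPCODES:
            recent_strings.append(arg)
        elif opcode.name == "GLOBAL":
            # pickletools reports the argument as "module name"
            module, name = arg.split(" ", 1)
            imports.append(f"{module}.{name}")
        elif opcode.name == "STACK_GLOBAL" and len(recent_strings) >= 2:
            # Heuristic: assume the two most recent strings are module and name.
            # Strings fetched back from the memo (BINGET etc.) are not tracked.
            module, name = recent_strings[-2:]
            imports.append(f"{module}.{name}")
    return imports
```

A scanner built on this idea would compare the returned names against a deny list (for example `os.system` or `builtins.eval`) before any code gets a chance to run.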

### Signature Database
- 188+ OSS components covered
- 1,400+ high-quality signatures
- Automatic license detection
- Security severity classification

## Integration with SEMCL.ONE

BinarySniffer is a core component of the SEMCL.ONE ecosystem:

- Complements **osslili** for source code license detection
- Works with **purl2notices** for comprehensive attribution
- Integrates with **ospac** for policy evaluation
- Supports **upmex** for package metadata extraction

## Configuration

Settings are stored as JSON in `~/.binarysniffer/config.json`:

```json
{
  "signature_sources": [
    "https://signatures.binarysniffer.io/core.xmdb"
  ],
  "min_confidence": 0.5,
  "parallel_workers": 4,
  "auto_update": true
}
```
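For scripts that need to read the same settings, the sketch below loads the file and fills in missing keys. It is a hypothetical helper, not BinarySniffer's own loader: `load_config` and `DEFAULTS` are illustrative names, the default values simply copy the example above, and the range checks are assumptions. It also assumes the file is plain JSON, which does not permit `#` comment lines.

```python
import json
from pathlib import Path
from typing import Any, Dict

# Defaults copied from the example configuration; treat them as assumptions
DEFAULTS: Dict[str, Any] = {
    "signature_sources": ["https://signatures.binarysniffer.io/core.xmdb"],
    "min_confidence": 0.5,
    "parallel_workers": 4,
    "auto_update": True,
}

DEFAULT_PATH = Path.home() / ".binarysniffer" / "config.json"


def load_config(path: Path = DEFAULT_PATH) -> Dict[str, Any]:
    """Return DEFAULTS overlaid with any values found in the JSON file."""
    config = dict(DEFAULTS)
    if path.is_file():
        config.update(json.loads(path.read_text(encoding="utf-8")))
    # Assumed constraints: confidence is a fraction, workers is a positive count
    if not 0.0 <= config["min_confidence"] <= 1.0:
        raise ValueError("min_confidence must be between 0 and 1")
    if config["parallel_workers"] < 1:
        raise ValueError("parallel_workers must be at least 1")
    return config
```

A missing file is treated as "use the defaults" rather than as an error, which keeps first-run behaviour simple.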

## Documentation

- [User Guide](docs/USER_GUIDE.md) - Comprehensive usage examples
- [API Reference](docs/API_REFERENCE.md) - Python API documentation
- [ML Security](docs/ML_SECURITY.md) - ML model security analysis
- [Signature Management](docs/SIGNATURE_MANAGEMENT.md) - Creating and managing signatures
- [Architecture](docs/ARCHITECTURE.md) - System design and internals

## Advanced Topics

- [TLSH Fuzzy Matching](docs/TLSH_FUZZY_MATCHING.md) - Detecting modified components
- [Creating Signatures](docs/CREATING_SIGNATURES.md) - Contributing new signatures
- [Installation Guide](docs/INSTALLATION.md) - Platform-specific setup
- [Package Verification](docs/PACKAGE_VERIFICATION.md) - Archive analysis

## Contributing

We welcome contributions! Please see [CONTRIBUTING.md](CONTRIBUTING.md) for details on:
- Code of conduct
- Development setup
- Submitting pull requests
- Signature contributions

## Support

For support and questions:
- [GitHub Issues](https://github.com/SemClone/binarysniffer/issues) - Bug reports and feature requests
- [Documentation](https://github.com/SemClone/binarysniffer/tree/main/docs) - Complete project documentation
- [SEMCL.ONE Community](https://semcl.one) - Ecosystem support and discussions

## License

Apache License 2.0 - see the [LICENSE](LICENSE) file for details.

## Authors

See [AUTHORS.md](AUTHORS.md) for a list of contributors.

---

*Part of the [SEMCL.ONE](https://semcl.one) ecosystem for comprehensive OSS compliance and code analysis.*
