Metadata-Version: 2.4
Name: cve-report-aggregator
Version: 0.10.0
Summary: Aggregate and deduplicate vulnerability scan reports from Grype and Trivy
Project-URL: Homepage, https://github.com/mkm29/cve-report-aggregator
Project-URL: Repository, https://github.com/mkm29/cve-report-aggregator
Project-URL: Issues, https://github.com/mkm29/cve-report-aggregator/issues
Author-email: Mitchell Murphy <mitchell.murphy@defenseunicorns.com>
License: MIT
License-File: LICENSE
Keywords: cve,grype,sbom,security,trivy,vulnerability
Classifier: Development Status :: 4 - Beta
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: System Administrators
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Security
Classifier: Topic :: Software Development :: Quality Assurance
Requires-Python: >=3.12
Requires-Dist: click>=8.3.0
Requires-Dist: openai[aiohttp]>=2.5.0
Requires-Dist: pydantic-settings>=2.7.1
Requires-Dist: pydantic>=2.12.2
Requires-Dist: pyyaml>=6.0.2
Requires-Dist: rich-click>=1.9.3
Requires-Dist: rich>=14.2.0
Requires-Dist: structlog>=25.1.0
Provides-Extra: dev
Requires-Dist: mypy>=1.18.2; extra == 'dev'
Requires-Dist: pytest-cov>=7.0.0; extra == 'dev'
Requires-Dist: pytest>=8.4.1; extra == 'dev'
Requires-Dist: ruff>=0.14.1; extra == 'dev'
Description-Content-Type: text/markdown

# CVE Report Aggregation and Deduplication Tool

[![Python Version](https://img.shields.io/badge/python-3.12%20%7C%203.13-blue.svg)](https://www.python.org/downloads/)
[![PyPI version](https://img.shields.io/pypi/v/cve-report-aggregator.svg)](https://pypi.org/project/cve-report-aggregator/)
[![PyPI downloads](https://img.shields.io/pypi/dm/cve-report-aggregator.svg)](https://pypi.org/project/cve-report-aggregator/)
[![uv](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/uv/main/assets/badge/v0.json)](https://github.com/astral-sh/uv)
[![License](https://img.shields.io/badge/license-MIT-green.svg)](LICENSE)
[![CI](https://github.com/mkm29/cve-report-aggregator/actions/workflows/test.yml/badge.svg)](https://github.com/mkm29/cve-report-aggregator/actions/workflows/test.yml)
[![codecov](https://codecov.io/gh/mkm29/cve-report-aggregator/branch/main/graph/badge.svg?token=mJcMNSlBIM)](https://codecov.io/gh/mkm29/cve-report-aggregator)
[![Latest Release](https://img.shields.io/github/v/release/mkm29/cve-report-aggregator)](https://github.com/mkm29/cve-report-aggregator/releases)
[![Docker](https://img.shields.io/badge/docker-available-blue.svg)](https://github.com/mkm29/cve-report-aggregator/pkgs/container/cve-report-aggregator)
[![Code style: ruff](https://img.shields.io/badge/code%20style-ruff-000000.svg)](https://github.com/astral-sh/ruff)

![CVE Report Aggregator Logo](./images/logo.png)

A Python package for aggregating and deduplicating Grype and Trivy vulnerability scan reports.

## Features

- **Self-Contained Docker Image**: Includes all scanning tools (Grype, Syft, Trivy, UDS CLI) in a single hardened
  Alpine-based image
- **Supply Chain Security**: SLSA Level 3 compliant with signed images, SBOMs, and provenance attestations
- **AI-Powered CVE Enrichment**: Optional OpenAI integration for automated vulnerability mitigation analysis
- **Production-Ready Package**: Installable via pip/pipx with proper dependency management
- **Rich Terminal Output**: Beautiful, color-coded tables and progress indicators using the Rich library
- **Multi-Scanner Support**: Works with both Grype and Trivy scanners
- **SBOM Auto-Scan**: Automatically detects and scans Syft SBOM files with Grype
- **Auto-Conversion**: Automatically converts Grype reports to CycloneDX format for Trivy scanning
- **CVE Deduplication**: Combines identical vulnerabilities across multiple scans
- **Automatic Null CVSS Filtering**: Filters out invalid CVSS scores (null, N/A, or zero) from all vulnerability reports
- **CVSS 3.x-Based Severity Selection**: Optional mode to select highest severity based on actual CVSS 3.x base scores
- **Scanner Source Tracking**: Identifies which scanner (Grype or Trivy) provided the vulnerability data
- **Occurrence Tracking**: Counts how many times each CVE appears
- **Parallel Processing**: Concurrent package downloading with configurable worker pools (10-14x speedup)
- **Flexible CLI**: Click-based interface with rich-click styling and sensible defaults
- **Extensive Test Coverage**: Comprehensive pytest suite (237 tests, 91% coverage)
- **Security Hardened**: Non-root user (UID 1001), minimal Alpine base, pinned dependencies, and vulnerability-scanned

## Configuration

CVE Report Aggregator supports flexible configuration through multiple sources with the following precedence (highest to
lowest):

1. **CLI Arguments** - Command-line flags and options
1. **YAML Configuration File** - `.cve-aggregator.yaml` or `.cve-aggregator.yml`
1. **Environment Variables** - Prefixed with `CVE_AGGREGATOR_`
1. **Default Values**
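
The precedence order above can be pictured as a layered lookup. The following is a minimal sketch, not the package's actual loader: the `resolve` and `env_settings` helpers are hypothetical names, and the defaults are copied from the CLI options table.

```python
from collections import ChainMap

ENV_PREFIX = "CVE_AGGREGATOR_"  # prefix stated in this README

# Illustrative defaults, taken from the CLI options table
DEFAULTS = {"scanner": "grype", "mode": "highest-score", "log_level": "INFO"}


def env_settings(environ):
    """Collect CVE_AGGREGATOR_* variables as lower-case setting names."""
    return {
        key[len(ENV_PREFIX):].lower(): value
        for key, value in environ.items()
        if key.startswith(ENV_PREFIX)
    }


def resolve(cli, yaml_file, environ):
    """Merge sources; ChainMap returns the first mapping that has a key."""
    # Drop unset CLI options so they do not shadow lower-priority sources.
    cli_set = {k: v for k, v in cli.items() if v is not None}
    return dict(ChainMap(cli_set, yaml_file, env_settings(environ), DEFAULTS))


settings = resolve(
    cli={"scanner": "trivy", "mode": None},
    yaml_file={"mode": "first-occurrence", "log_level": "WARNING"},
    environ={"CVE_AGGREGATOR_LOG_LEVEL": "DEBUG", "CVE_AGGREGATOR_INPUT_DIR": "/scans"},
)
print(settings)
```

In this example the CLI supplies `scanner`, the YAML file supplies `mode` and `log_level` (overriding the environment), and the environment supplies `input_dir`.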

### CLI Options

| Option                      | Short | Description                                                                       | Default            |
| --------------------------- | ----- | --------------------------------------------------------------------------------- | ------------------ |
| `--input-dir`               | `-i`  | Input directory containing scan reports or SBOMs                                  | `./reports`        |
| `--scanner`                 | `-s`  | Scanner type to process (`grype` or `trivy`)                                      | `grype`            |
| `--log-level`               | `-l`  | Logging level (DEBUG, INFO, WARNING, ERROR, CRITICAL)                             | `INFO`             |
| `--mode`                    | `-m`  | Aggregation mode: `highest-score`, `first-occurrence`, `grype-only`, `trivy-only` | `highest-score`    |
| `--enrich-cves`             |       | Enable CVE enrichment with OpenAI                                                 | `false`            |
| `--openai-api-key`          |       | OpenAI API key (defaults to `OPENAI_API_KEY` env var)                             | None               |
| `--openai-model`            |       | OpenAI model to use for enrichment                                                | `gpt-5-nano`       |
| `--openai-reasoning-effort` |       | Reasoning effort level (`low`, `medium`, `high`)                                  | `medium`           |
| `--max-cves-to-enrich`      |       | Maximum number of CVEs to enrich                                                  | None (all)         |
| `--enrich-severity-filter`  |       | Severity levels to enrich (can be used multiple times)                            | `Critical`, `High` |
| `--help`                    | `-h`  | Show help message and exit                                                        | N/A                |
| `--version`                 |       | Show version and exit                                                             | N/A                |

### YAML Configuration File

Create a `.cve-aggregator.yaml` or `.cve-aggregator.yml` file in your project directory:

```yaml
# Scanner and processing settings
scanner: grype                          # Scanner type: grype or trivy
mode: highest-score                     # Aggregation mode
log_level: INFO                         # Logging level

input_dir: ./reports                    # Input directory for reports

# Parallel processing
maxWorkers: 14                          # Concurrent download workers (auto-detect if omitted)

# Remote package downloads
downloadRemotePackages: true            # Enable remote SBOM downloads
registry: registry.defenseunicorns.com
organization: sld-45
packages:
  - name: gitlab
    version: 18.4.2-uds.0-unicorn
    architecture: amd64
  - name: gitlab-runner
    version: 18.4.0-uds.0-unicorn
    architecture: amd64

# CVE Enrichment (OpenAI)
enrich_cves: true                       # Enable CVE enrichment
openai_model: gpt-5-nano                # OpenAI model (gpt-5-nano, gpt-4o, etc.)
openai_reasoning_effort: medium         # Reasoning effort (low, medium, high)
max_cves_to_enrich: 50                  # Limit enrichment count (optional)
enrich_severity_filter:                 # Severity levels to enrich
  - Critical
  - High
```

See [.cve-aggregator.example.yaml](.cve-aggregator.example.yaml) for a complete example.

### Environment Variables

All configuration options can be set via environment variables with the `CVE_AGGREGATOR_` prefix:

```bash
# Scanner settings
export CVE_AGGREGATOR_SCANNER=grype
export CVE_AGGREGATOR_MODE=highest-score
export CVE_AGGREGATOR_LOG_LEVEL=DEBUG

# Input/output
export CVE_AGGREGATOR_INPUT_DIR=/path/to/reports
export CVE_AGGREGATOR_OUTPUT_FILE=/path/to/output.json

# Parallel processing
export CVE_AGGREGATOR_MAX_WORKERS=14

# Remote packages
export CVE_AGGREGATOR_DOWNLOAD_REMOTE_PACKAGES=true
export CVE_AGGREGATOR_REGISTRY=registry.example.com
export CVE_AGGREGATOR_ORGANIZATION=my-org

# CVE Enrichment
export OPENAI_API_KEY=sk-...                            # OpenAI API key (no prefix)
export CVE_AGGREGATOR_ENRICH_CVES=true
export CVE_AGGREGATOR_OPENAI_MODEL=gpt-5-nano
export CVE_AGGREGATOR_OPENAI_REASONING_EFFORT=medium
export CVE_AGGREGATOR_MAX_CVES_TO_ENRICH=50
```

### Configuration Examples

#### Basic Usage with Defaults

```bash
# Process reports from ./reports/ with default settings
cve-report-aggregator

# Output: $HOME/output/unified-YYYYMMDDhhmmss.json
```

#### Custom Scanner and Verbosity

```bash
# Use Trivy scanner with debug logging
cve-report-aggregator --scanner trivy --log-level DEBUG
```

#### CVE Enrichment

```bash
# Enable AI-powered enrichment for Critical and High CVEs
export OPENAI_API_KEY=sk-...
cve-report-aggregator --enrich-cves

# Customize enrichment settings
cve-report-aggregator \
  --enrich-cves \
  --openai-model gpt-5-mini \
  --openai-reasoning-effort high \
  --max-cves-to-enrich 10 \
  --enrich-severity-filter Critical
```

#### Remote Package Downloads

```yaml
# .cve-aggregator.yaml
downloadRemotePackages: true
registry: registry.defenseunicorns.com
organization: sld-45
maxWorkers: 14
packages:
  - name: gitlab
    version: 18.4.2-uds.0-unicorn
```

```bash
# Run with config file
cve-report-aggregator --config .cve-aggregator.yaml
```

## Performance

CVE Report Aggregator supports **parallel processing**, which significantly speeds up runs with large package sets.

### Parallel Package Downloading

When downloading SBOM reports from remote registries (e.g., using UDS Zarf), packages are downloaded concurrently using
a configurable worker pool:

```yaml
# .cve-aggregator.yaml
maxWorkers: 14  # Number of concurrent download workers (optional)
```

**Performance Improvement:**

- **Before**: Sequential downloads (~150 seconds for 14 packages)
- **After**: Parallel downloads (~10-15 seconds for 14 packages)
- **Speedup**: **10-14x faster** for the download phase

**Auto-Detection:** If `maxWorkers` is not specified, the optimal worker count is automatically detected using the
formula: `min(32, cpu_count + 4)`. Set to `1` to disable parallelization.
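
As a quick illustration of that formula, here is a small sketch. The `default_worker_count` helper is a hypothetical name, and falling back to one CPU when the count is unknown is an assumption, not documented behavior.

```python
import os


def default_worker_count(cpu_count=None):
    """Return min(32, cpu_count + 4), the formula quoted above."""
    if cpu_count is None:
        # os.cpu_count() may return None when the count cannot be determined
        cpu_count = os.cpu_count() or 1
    return min(32, cpu_count + 4)


print(default_worker_count(8))   # 12
print(default_worker_count(64))  # 32
```

The cap of 32 keeps very large machines from opening an excessive number of concurrent registry connections.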

**Thread Safety:** Parallel workers guard shared state with a lock (`Lock()`), so concurrent downloads cannot corrupt
the aggregated results.
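
The general pattern looks roughly like the sketch below. It assumes a thread pool from the standard library and uses a stand-in `download` function; the package's real download code is not shown in this README.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

results = {}
results_lock = threading.Lock()


def download(package):
    """Stand-in for a real download; returns a fake report path."""
    return f"/tmp/{package}.json"


def download_and_record(package):
    path = download(package)
    # Only one worker at a time may mutate the shared dictionary.
    with results_lock:
        results[package] = path


packages = ["gitlab", "gitlab-runner"]
with ThreadPoolExecutor(max_workers=2) as pool:
    # list() drains the iterator so worker exceptions surface here
    list(pool.map(download_and_record, packages))

print(sorted(results))  # ['gitlab', 'gitlab-runner']
```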

For detailed information about the optimization plan and future phases (parallel SBOM scanning and report processing),
see [PERFORMANCE_OPTIMIZATION.md](PERFORMANCE_OPTIMIZATION.md).

## Prerequisites

**Optional (depending on scanner choice):**

- [grype](https://github.com/anchore/grype) - For Grype scanning (default scanner)
- [syft](https://github.com/anchore/syft) - For converting reports to CycloneDX format (Trivy workflow)
- [trivy](https://github.com/aquasecurity/trivy) - For Trivy scanning

```bash
# Install Grype
brew install grype

# Install syft (for Trivy workflow)
brew install syft

# Install trivy
brew install aquasecurity/trivy/trivy
```

## Installation

### Using Docker (Recommended)

The easiest way to use CVE Report Aggregator is via the pre-built Docker image, which includes all necessary scanning
tools (Grype, Syft, Trivy, UDS CLI):

```bash
# Pull the latest signed image from GitHub Container Registry
docker pull ghcr.io/mkm29/cve-report-aggregator:latest

# Or build locally
docker build -t cve-report-aggregator .

# Or use Docker Compose
docker compose run cve-aggregator --help

# Run with mounted volumes for reports and output
docker run --rm \
  -v $(pwd)/reports:/workspace/reports:ro \
  -v $(pwd)/output:/home/cve-aggregator/output \
  ghcr.io/mkm29/cve-report-aggregator:latest \
  --input-dir /workspace/reports \
  --log-level DEBUG

# Note: Output files are automatically saved to $HOME/output with timestamped names
# Format: <package_name>-YYYYMMDDhhmmss.json (e.g., gitlab-20251019182051.json)
```

#### Image Security & Supply Chain

All container images are built with enterprise-grade security:

- **Signed with Cosign**: Keyless signing using GitHub OIDC identity
- **SBOM Included**: CycloneDX and SPDX attestations attached to every image
- **Provenance**: SLSA Level 3 compliant build attestations
- **Multi-Architecture**: Supports both amd64 and arm64 (Apple Silicon)
- **Vulnerability Scanned**: Regularly scanned with Grype and Trivy

##### Verify Image Signature

```bash
# Install cosign
brew install cosign

# Verify the image signature
cosign verify ghcr.io/mkm29/cve-report-aggregator:latest \
  --certificate-identity-regexp='https://github.com/mkm29/cve-report-aggregator' \
  --certificate-oidc-issuer='https://token.actions.githubusercontent.com'

# Output shows verified signature with GitHub Actions identity
```

##### Download and Verify SBOM

```bash
# Download CycloneDX SBOM (JSON format)
cosign verify-attestation ghcr.io/mkm29/cve-report-aggregator:latest \
  --type cyclonedx \
  --certificate-identity-regexp='https://github.com/mkm29/cve-report-aggregator' \
  --certificate-oidc-issuer='https://token.actions.githubusercontent.com' | \
  jq -r '.payload' | base64 -d | jq . > sbom-cyclonedx.json

# Download SPDX SBOM (JSON format)
cosign verify-attestation ghcr.io/mkm29/cve-report-aggregator:latest \
  --type spdx \
  --certificate-identity-regexp='https://github.com/mkm29/cve-report-aggregator' \
  --certificate-oidc-issuer='https://token.actions.githubusercontent.com' | \
  jq -r '.payload' | base64 -d | jq . > sbom-spdx.json

# View all attestations and signatures
cosign tree ghcr.io/mkm29/cve-report-aggregator:latest
```

##### Download Build Provenance

```bash
# Download SLSA provenance attestation
cosign verify-attestation ghcr.io/mkm29/cve-report-aggregator:latest \
  --type slsaprovenance \
  --certificate-identity-regexp='https://github.com/mkm29/cve-report-aggregator' \
  --certificate-oidc-issuer='https://token.actions.githubusercontent.com' | \
  jq -r '.payload' | base64 -d | jq . > provenance.json
```

#### Available Image Tags

Images are published to GitHub Container Registry with the following tags:

- `latest` - Latest stable release (recommended for production)
- `v*.*.*` - Specific version tags (e.g., `v0.5.1`, `v0.5.2`)
- `rc` - Release candidate builds (for testing pre-release versions)

```bash
# Pull specific version
docker pull ghcr.io/mkm29/cve-report-aggregator:v0.5.1

# Pull latest stable
docker pull ghcr.io/mkm29/cve-report-aggregator:latest

# Pull release candidate (if available)
docker pull ghcr.io/mkm29/cve-report-aggregator:rc
```

All tags are signed and include full attestations (signature, SBOM, provenance).

## CVE Enrichment

CVE Report Aggregator supports optional AI-powered enrichment using OpenAI GPT models to automatically analyze
vulnerabilities in the context of UDS Core security controls. This feature generates concise, actionable mitigation
summaries that explain how defense-in-depth security measures help protect against specific CVEs.

### Key Features

- **gpt-5-nano with Batch API**: Cost-optimized analysis with 50% discount on already low token costs
- **Asynchronous Processing**: Submits all CVEs to OpenAI Batch API and polls for completion
- **UDS Core Security Context**: Analyzes 20+ NetworkPolicies and 19 Pepr admission policies
- **Single-Sentence Summaries**: Format "UDS helps to mitigate {CVE_ID} by {explanation}"
- **Configurable Reasoning Effort**: Tune analysis depth with `low`, `medium`, or `high` settings
- **Severity Filtering**: Default enrichment for Critical and High severity only
- **Flexible Configuration**: CLI, YAML, or environment variables

**Note:** Batch API enrichment typically completes within minutes to hours (up to 24-hour maximum). The CLI will poll
for completion automatically and display progress updates.
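
A polling loop of this kind can be sketched as follows. This is an illustration under assumptions rather than the CLI's implementation: `wait_for_batch` is a hypothetical helper, the status source is injected as a callable, and the demo uses canned statuses instead of calling OpenAI.

```python
import time

# Terminal states of an OpenAI batch job
TERMINAL = {"completed", "failed", "expired", "cancelled"}


def wait_for_batch(get_status, interval=30.0, timeout=24 * 3600, sleep=time.sleep):
    """Poll get_status() until it returns a terminal state or timeout elapses."""
    waited = 0.0
    while True:
        status = get_status()
        if status in TERMINAL:
            return status
        if waited >= timeout:
            raise TimeoutError(f"batch still '{status}' after {waited:.0f}s")
        sleep(interval)
        waited += interval


# Demo with canned statuses and a no-op sleep; a real caller might pass
# lambda: client.batches.retrieve(batch_id).status from the openai package.
statuses = iter(["validating", "in_progress", "finalizing", "completed"])
final = wait_for_batch(lambda: next(statuses), sleep=lambda _: None)
print(final)  # completed
```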

### Quick Start

```bash
# Set API key
export OPENAI_API_KEY=sk-...

# Enable enrichment (enriches Critical and High severity CVEs by default)
cve-report-aggregator --enrich-cves

# Customize enrichment with higher reasoning effort
cve-report-aggregator \
  --enrich-cves \
  --openai-model gpt-5-mini \
  --openai-reasoning-effort high \
  --max-cves-to-enrich 10 \
  --enrich-severity-filter Critical
```

### Reasoning Effort

The `openai_reasoning_effort` parameter controls how deeply the AI model analyzes each CVE:

- **`low`**: Faster, more concise analysis with lower token usage
- **`medium`** (default): Balanced analysis with good quality and reasonable token usage
- **`high`**: Most thorough analysis with higher quality but increased token usage

**When to adjust:**

- Use `low` for large CVE sets where speed and cost are priorities
- Use `medium` (default) for most production use cases
- Use `high` for critical vulnerabilities requiring detailed analysis

**Note:** The `reasoning_effort` parameter is only supported by GPT-5 models (gpt-5-nano, gpt-5-mini). The temperature
parameter is fixed at 1.0 for GPT-5 models as required by OpenAI.

```bash
# Example: High-quality analysis for critical CVEs only
cve-report-aggregator \
  --enrich-cves \
  --openai-reasoning-effort high \
  --enrich-severity-filter Critical
```

### Cost Optimization

The system achieves extremely low costs through:

1. **gpt-5-nano**: Ultra cost-effective model ($0.150/1M input, $0.600/1M output tokens)
1. **OpenAI Batch API**: 50% cost discount compared to synchronous API calls
1. **Single-Sentence Format**: 80% fewer output tokens (100 vs 500 tokens per CVE)
1. **Severity Filtering**: ~70% fewer CVEs enriched (Critical/High only by default)

**Batch API Benefits:**

The OpenAI Batch API processes requests asynchronously with significant cost savings:

- **50% cost discount** on all API calls (applied automatically)
- Processes all CVEs in a single batch submission
- Results available within 24 hours (typically much faster)
- Automatic retry and error handling

**Cost Examples (gpt-5-nano with Batch API @ 50% discount):**

- 10 CVEs: ~$0.0006 (11,000 tokens @ $0.075/1M input, $0.300/1M output)
- 100 CVEs: ~$0.006 (110,000 tokens)
- 1,000 CVEs: ~$0.06 (1,100,000 tokens)

**Comparison with Standard Pricing:**

- 100 CVEs with Batch API (gpt-5-nano): $0.006
- 100 CVEs without Batch API (gpt-5-nano): $0.012
- 100 CVEs with GPT-4: ~$12.00
- **Cost Reduction vs GPT-4: 99.95%**
- **Cost Reduction vs Synchronous API: 50%**

### Output Format

Enrichments are added to the unified report under the `enrichments` key:

```json
{
  "enrichments": {
    "CVE-2024-12345": {
      "cve_id": "CVE-2024-12345",
      "mitigation_summary": "UDS helps to mitigate CVE-2024-12345 by enforcing non-root container execution through Pepr admission policies and blocking unauthorized external network access via default-deny NetworkPolicies.",
      "analysis_model": "gpt-5-nano",
      "analysis_timestamp": "2025-01-20T12:34:56.789Z"
    }
  },
  "summary": {
    "enrichment": {
      "enabled": true,
      "total_cves": 150,
      "enriched_cves": 45,
      "model": "gpt-5-nano",
      "severity_filter": ["Critical", "High"]
    }
  }
}
```
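
Downstream tooling can read these keys with the standard library. The sketch below parses a trimmed copy of the example above; a real report would be loaded from the timestamped file in `$HOME/output/`.

```python
import json

# Trimmed copy of the example above; a real report would be read from disk.
report = json.loads("""
{
  "enrichments": {
    "CVE-2024-12345": {
      "cve_id": "CVE-2024-12345",
      "mitigation_summary": "UDS helps to mitigate CVE-2024-12345 by ...",
      "analysis_model": "gpt-5-nano"
    }
  },
  "summary": {"enrichment": {"enabled": true, "total_cves": 150, "enriched_cves": 45}}
}
""")

stats = report["summary"]["enrichment"]
coverage = stats["enriched_cves"] / stats["total_cves"]
print(f"{stats['enriched_cves']}/{stats['total_cves']} CVEs enriched ({coverage:.0%})")

# .get() tolerates reports produced without --enrich-cves
for cve_id, entry in report.get("enrichments", {}).items():
    print(cve_id, "->", entry["mitigation_summary"])
```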

For complete documentation and advanced usage, see [CVE_ENRICHMENT.md](CVE_ENRICHMENT.md).

## Docker Credentials Management

The Docker container supports two methods for providing registry credentials:

1. **Build-Time Secrets**
1. **Environment Variables**

### Method 1: Build-Time Secrets (Recommended)

**Best for**: Private container images where credentials can be baked in securely.

Create a credentials file in JSON format with `username`, `password`, and `registry` fields:

```bash
cat > docker/config.json <<EOF
{
  "username": "myuser",
  "password": "mypassword",
  "registry": "ghcr.io"
}
EOF
chmod 600 docker/config.json
```

**Important**: Always encrypt the credentials file with SOPS before committing:

```bash
# Encrypt the credentials file to a separate file
sops -e docker/config.json > docker/config.json.enc

# Or encrypt in place
sops -e -i docker/config.json
```

Build the image with the secret:

```bash
# If using encrypted file, decrypt first
sops -d docker/config.json.enc > docker/config.json.dec

# Build with the decrypted credentials
docker buildx build \
  --secret id=credentials,src=./docker/config.json.dec \
  -f docker/Dockerfile \
  -t cve-report-aggregator:latest .

# Remove decrypted file after build
rm docker/config.json.dec
```

Or build directly with unencrypted file (for local development):

```bash
docker buildx build \
  --secret id=credentials,src=./docker/config.json \
  -f docker/Dockerfile \
  -t cve-report-aggregator:latest .
```

The credentials will be stored in the image at `$DOCKER_CONFIG/config.json` (defaults to
`/home/cve-aggregator/.docker/config.json`) in proper Docker authentication format with base64-encoded credentials.
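
For reference, the standard Docker `config.json` layout stores `username:password` as a base64 string under `auths.<registry>.auth`. The sketch below shows that conversion; the `to_docker_config` helper is hypothetical, and the actual build step in the Dockerfile may differ.

```python
import base64
import json


def to_docker_config(creds):
    """Convert {"username", "password", "registry"} to Docker's config.json layout."""
    token = f"{creds['username']}:{creds['password']}".encode("utf-8")
    return {
        "auths": {
            creds["registry"]: {"auth": base64.b64encode(token).decode("ascii")}
        }
    }


config = to_docker_config(
    {"username": "myuser", "password": "mypassword", "registry": "ghcr.io"}
)
print(json.dumps(config, indent=2))
```

Base64 is an encoding, not encryption, which is why an image containing this file must be treated as a secret.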

Run the container (no runtime credentials needed - uses baked-in config.json):

```bash
docker run --rm cve-report-aggregator:latest --help
```

**Important**: This method bakes credentials into the image. Only use for private registries and **never** push images
with credentials to public registries.

### Method 2: Environment Variables (Development Only)

**Warning**: This method exposes the password in process listings and Docker inspect output. Only use for
development/testing.

```bash
docker run -it --rm \
  -e REGISTRY_URL="$UDS_URL" \
  -e UDS_USERNAME="$UDS_USERNAME" \
  -e UDS_PASSWORD="$UDS_PASSWORD" \
  cve-report-aggregator:latest --help
```

### How Credentials Are Handled

The `entrypoint.sh` script checks for Docker authentication on startup:

1. **Docker config.json** (Build-Time): Checks if `$DOCKER_CONFIG/config.json` exists

   - If found: Skips all credential checks and login - uses existing Docker auth
   - Location: `/home/cve-aggregator/.docker/config.json`

1. **Environment Variables** (if config.json not found): Requires all three variables:

   - `REGISTRY_URL` - Registry URL (e.g., `registry.defenseunicorns.com`)
   - `UDS_USERNAME` - Registry username
   - `UDS_PASSWORD` - Registry password

If config.json doesn't exist and environment variables are not provided, the container exits with an error.
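
The decision order described above can be summarized in a few lines. This is a Python sketch of the logic only (the real `entrypoint.sh` is a shell script), and `choose_auth` is a hypothetical name.

```python
import tempfile
from pathlib import Path

REQUIRED_VARS = ("REGISTRY_URL", "UDS_USERNAME", "UDS_PASSWORD")


def choose_auth(environ, docker_config_dir):
    """Return "config" or "env", or exit if neither source is usable."""
    if (Path(docker_config_dir) / "config.json").is_file():
        return "config"  # baked-in auth wins; no login needed
    missing = [name for name in REQUIRED_VARS if not environ.get(name)]
    if missing:
        raise SystemExit(f"missing credentials: {', '.join(missing)}")
    return "env"


with tempfile.TemporaryDirectory() as tmp:
    env = {"REGISTRY_URL": "registry.example.com", "UDS_USERNAME": "u", "UDS_PASSWORD": "p"}
    print(choose_auth(env, tmp))  # env
    (Path(tmp) / "config.json").write_text("{}")
    print(choose_auth({}, tmp))   # config
```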

**Important**: Mounting your local `~/.docker/config.json` file into the container will **not** work. The Docker config
must be baked into the image during build (Method 1) or you must use environment variables (Method 2). UDS/Zarf requires
credentials in a specific format that differs from standard Docker auth.

### From Source

```bash
# Clone the repository
git clone https://github.com/mkm29/cve-report-aggregator.git
cd cve-report-aggregator

# Install in development mode
pip install -e .

# Or install with dev dependencies
pip install -e ".[dev]"
```

### From PyPI

```bash
# Install globally
pip install cve-report-aggregator

# Or install with pipx (recommended)
pipx install cve-report-aggregator
```

## Usage

### Basic Usage (Default Locations)

Process reports from `./reports/` and automatically save timestamped output to `$HOME/output/`:

```bash
cve-report-aggregator
# Output: $HOME/output/unified-YYYYMMDDhhmmss.json
```

### Use Trivy Scanner

Automatically convert reports to CycloneDX and scan with Trivy:

```bash
cve-report-aggregator --scanner trivy
```

### Process SBOM Files

The script automatically detects and scans Syft SBOM files:

```bash
cve-report-aggregator -i /path/to/sboms --log-level DEBUG
```

### Custom Input Directory

```bash
# Specify custom input directory (output still goes to $HOME/output with timestamp)
cve-report-aggregator -i /path/to/reports
```

### Verbose Mode

Enable detailed processing output:

```bash
cve-report-aggregator --log-level DEBUG
```

### Combined Options

```bash
cve-report-aggregator -i ./scans --scanner trivy --log-level DEBUG
# Output: $HOME/output/<package>-YYYYMMDDhhmmss.json
```

### Use Highest Severity Across Scanners

When scanning with multiple scanners (or multiple runs of the same scanner), automatically select the highest severity
rating:

```bash
# Scan the same image with both Grype and Trivy, use highest severity
grype myapp:latest -o json > reports/grype-app.json
trivy image myapp:latest -f json -o reports/trivy-app.json
cve-report-aggregator -i reports/ --mode highest-score
# Output: $HOME/output/unified-YYYYMMDDhhmmss.json
```

This is particularly useful when:

- Combining results from multiple scanners with different severity assessments
- Ensuring conservative (worst-case) severity ratings for compliance
- Aggregating multiple scans over time where severity data may have been updated
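
To make the idea concrete, here is a minimal sketch of highest-score selection during deduplication. The flat record shape (`id`, `scanner`, `severity`, `cvss`) is a simplification invented for this example and does not match the real Grype or Trivy report schemas.

```python
def deduplicate(findings):
    """Collapse findings by ID, keeping the highest-scoring record and a count."""
    merged = {}
    for finding in findings:
        entry = merged.get(finding["id"])
        if entry is None:
            merged[finding["id"]] = {**finding, "count": 1}
            continue
        entry["count"] += 1
        if finding["cvss"] > entry["cvss"]:
            # Adopt the stronger rating and remember which scanner supplied it.
            entry.update(
                cvss=finding["cvss"],
                severity=finding["severity"],
                scanner=finding["scanner"],
            )
    return merged


findings = [
    {"id": "CVE-2023-4863", "scanner": "grype", "severity": "High", "cvss": 8.8},
    {"id": "CVE-2023-4863", "scanner": "trivy", "severity": "Critical", "cvss": 9.8},
    {"id": "CVE-2023-4973", "scanner": "grype", "severity": "High", "cvss": 7.5},
]
for cve_id, entry in deduplicate(findings).items():
    print(cve_id, entry["severity"], entry["cvss"], entry["scanner"], entry["count"])
# CVE-2023-4863 Critical 9.8 trivy 2
# CVE-2023-4973 High 7.5 grype 1
```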

**Note:** All output files are automatically saved to `$HOME/output/` with timestamped filenames in the format
`<package_name>-YYYYMMDDhhmmss.json`. When processing multiple packages, each gets its own timestamped file (e.g.,
`gitlab-20251019182051.json`, `gitlab-runner-20251019182055.json`).
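
The filename pattern maps directly onto a `strftime` format string. A small sketch, with `output_name` as a hypothetical helper:

```python
from datetime import datetime


def output_name(package, when, ext="json"):
    """Build <package_name>-YYYYMMDDhhmmss.<ext> from a datetime."""
    return f"{package}-{when.strftime('%Y%m%d%H%M%S')}.{ext}"


print(output_name("gitlab", datetime(2025, 10, 19, 18, 20, 51)))
# gitlab-20251019182051.json
```

Because the timestamp is zero-padded and ordered from year to second, sorting filenames alphabetically also sorts them chronologically for a given package.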

For complete configuration options, see the [Configuration](#configuration) section.

## Output Formats

The tool generates reports in two formats for maximum flexibility:

### 1. JSON Format (Unified Report)

The unified report includes:

#### Metadata

- Generation timestamp
- Scanner type and version
- Source report count and filenames

#### Summary

- Total vulnerability occurrences
- Unique vulnerability count
- Severity breakdown (Critical, High, Medium, Low, Negligible, Unknown)
- Per-image scan results

#### Vulnerabilities (Deduplicated)

For each unique CVE/GHSA:

- Vulnerability ID
- Occurrence count
- Selected scanner (which scanner provided the vulnerability data)
- Severity and CVSS scores
- Fix availability and versions
- All affected sources (images and artifacts)
- Detailed match information

### 2. CSV Format (Simplified Export)

A simplified CSV export is automatically generated alongside each unified JSON report for easy consumption in
spreadsheet applications and reporting tools.

**Filename Format**: `<package_name>-<timestamp>.csv`

**Columns**:

- `CVE ID`: Vulnerability identifier
- `Severity`: Severity level (Critical, High, Medium, Low, etc.)
- `Count`: Number of occurrences across all scanned images
- `CVSS`: Highest CVSS 3.x score (or "N/A" if unavailable)
- `Impact`: Impact analysis from OpenAI enrichment (if enabled)
- `Mitigation`: Mitigation summary from OpenAI enrichment (if enabled)

**Example**:

```csv
"CVE ID","Severity","Count","CVSS","Impact","Mitigation"
"CVE-2023-4863","Critical","5","9.8","Without UDS Core controls, this critical vulnerability...","UDS helps to mitigate CVE-2023-4863 by..."
"CVE-2023-4973","High","3","7.5","This vulnerability could allow...","UDS helps to mitigate CVE-2023-4973 by..."
```

**Features**:

- Sorted by severity (Critical > High > Medium > Low) and CVSS score
- Includes enrichment data when CVE enrichment is enabled
- UTF-8 encoded with proper CSV escaping
- Compatible with Excel, Google Sheets, and data analysis tools

**Location**: `$HOME/output/<package_name>-<timestamp>.csv`
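
The sort order listed under Features (severity first, then CVSS score) can be reproduced when post-processing the CSV. The sketch below is an assumption about tie-breaking, not the exporter's code: unknown severities and "N/A" scores are placed last, and `CVE-2023-0001` is a made-up row for illustration.

```python
import csv
import io

SEVERITY_RANK = {"Critical": 0, "High": 1, "Medium": 2, "Low": 3}


def sort_rows(rows):
    """Order rows by severity rank, then by CVSS score descending."""
    def key(row):
        rank = SEVERITY_RANK.get(row["Severity"], len(SEVERITY_RANK))
        try:
            score = float(row["CVSS"])
        except ValueError:
            score = 0.0  # "N/A" sorts last within its severity
        return (rank, -score)
    return sorted(rows, key=key)


sample = io.StringIO(
    '"CVE ID","Severity","Count","CVSS"\n'
    '"CVE-2023-4973","High","3","7.5"\n'
    '"CVE-2023-0001","High","1","N/A"\n'
    '"CVE-2023-4863","Critical","5","9.8"\n'
)
for row in sort_rows(csv.DictReader(sample)):
    print(row["CVE ID"], row["Severity"], row["CVSS"])
# CVE-2023-4863 Critical 9.8
# CVE-2023-4973 High 7.5
# CVE-2023-0001 High N/A
```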

## Development

### Running Tests

```bash
# Run all tests
pytest

# Run with coverage
pytest --cov=cve_report_aggregator --cov-report=html

# Run specific test file
pytest tests/test_severity.py
```

### Code Quality

```bash
# Format code
ruff format src/ tests/

# Lint code
ruff check src/ tests/

# Type checking
mypy src/
```

### Building the Package

```bash
# Build distribution packages
python -m build

# Install locally
pip install dist/cve_report_aggregator-0.10.0-py3-none-any.whl
```

## Project Structure

```text
cve-report-aggregator/
├── src/
│   └── cve_report_aggregator/
│       ├── __init__.py           # Package exports and metadata
│       ├── main.py               # CLI entry point
│       ├── models.py             # Type definitions
│       ├── utils.py              # Utility functions
│       ├── severity.py           # CVSS and severity logic
│       ├── scanner.py            # Scanner integrations
│       ├── aggregator.py         # Deduplication engine
│       └── report.py             # Report generation
├── tests/
│   ├── __init__.py
│   ├── conftest.py               # Pytest fixtures
│   ├── test_severity.py          # Severity tests
│   └── test_aggregator.py        # Aggregation tests
├── pyproject.toml                # Project configuration
├── README.md                     # This file
└── LICENSE                       # MIT License
```

## Example Workflows

### Docker E2E Workflow

```bash
# Scan container images and aggregate with Docker
docker run --rm \
  -v /var/run/docker.sock:/var/run/docker.sock:ro \
  -v $(pwd)/reports:/workspace/reports \
  -v $(pwd)/output:/home/cve-aggregator/output \
  ghcr.io/mkm29/cve-report-aggregator:latest bash -c "\
    grype nginx:latest -o json > /workspace/reports/nginx.json && \
    grype postgres:15 -o json > /workspace/reports/postgres.json && \
    cve-report-aggregator --input-dir /workspace/reports --log-level DEBUG"

# View results (find the most recent timestamped file)
jq '.summary' "$(ls -t output/unified-*.json | head -1)"
```

### Grype Workflow (Default)

```bash
# Scan multiple container images with Grype
grype registry.io/app/service1:v1.0 -o json > reports/service1.json
grype registry.io/app/service2:v1.0 -o json > reports/service2.json
grype registry.io/app/service3:v1.0 -o json > reports/service3.json

# Aggregate all reports (output saved to $HOME/output with timestamp)
cve-report-aggregator --log-level DEBUG

# Query results with jq (use the timestamped file)
REPORT=$(ls -t $HOME/output/unified-*.json | head -1)
jq '.summary' "$REPORT"
jq '.vulnerabilities[] | select(.vulnerability.severity == "Critical")' "$REPORT"
```
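
The same queries can be run from Python when `jq` is not available. This sketch mirrors the `ls -t | head -1` and `select(...)` steps above; the helper names are hypothetical, and the `vulnerability.id` field is an assumption based on Grype's report layout.

```python
import json
from pathlib import Path


def latest_report(directory, pattern="unified-*.json"):
    """Return the most recently modified report, like `ls -t | head -1`."""
    candidates = sorted(Path(directory).glob(pattern), key=lambda p: p.stat().st_mtime)
    if not candidates:
        raise FileNotFoundError(f"no files matching {pattern} in {directory}")
    return candidates[-1]


def critical_ids(report):
    """IDs of entries whose .vulnerability.severity is "Critical"."""
    return [
        item["vulnerability"]["id"]
        for item in report.get("vulnerabilities", [])
        if item["vulnerability"].get("severity") == "Critical"
    ]


# In-memory stand-in; a real run might use
# json.loads(latest_report(Path.home() / "output").read_text())
report = {"vulnerabilities": [
    {"vulnerability": {"id": "CVE-2023-4863", "severity": "Critical"}},
    {"vulnerability": {"id": "CVE-2023-4973", "severity": "High"}},
]}
print(critical_ids(report))  # ['CVE-2023-4863']
```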

### SBOM Workflow

```bash
# Generate SBOMs with Syft (or use Zarf-generated SBOMs)
syft registry.io/app/service1:v1.0 -o json > sboms/service1.json
syft registry.io/app/service2:v1.0 -o json > sboms/service2.json

# Script automatically detects and scans SBOMs with Grype
cve-report-aggregator -i ./sboms --log-level DEBUG

# Results include all vulnerabilities found (use timestamped file)
REPORT=$(ls -t $HOME/output/unified-*.json | head -1)
jq '.summary.by_severity' "$REPORT"
```

### Trivy Workflow

```bash
# Start with Grype reports (script will convert to CycloneDX)
grype registry.io/app/service1:v1.0 -o json > reports/service1.json
grype registry.io/app/service2:v1.0 -o json > reports/service2.json

# Aggregate and scan with Trivy (auto-converts to CycloneDX)
cve-report-aggregator --scanner trivy --log-level DEBUG

# Or scan SBOMs directly with Trivy
cve-report-aggregator -i ./sboms --scanner trivy --log-level DEBUG

# View most recent output
REPORT=$(ls -t $HOME/output/unified-*.json | head -1)
jq '.summary' "$REPORT"
```

## License

MIT License - See LICENSE file for details

## Contributing

Contributions are welcome! Please:

1. Fork the repository
1. Create a feature branch
1. Add tests for new functionality
1. Ensure all tests pass
1. Submit a pull request

## Changelog

See [CHANGELOG.md](CHANGELOG.md) for version history and changes.
