Metadata-Version: 2.4
Name: vehlo-code-scanner
Version: 0.1.3
Summary: Multi-tenant security scanning platform wrapping Automated Security Helper (ASH)
Project-URL: Homepage, https://github.com/Vehlo-CyberSec/vehlo-code-scanner
Project-URL: Repository, https://github.com/Vehlo-CyberSec/vehlo-code-scanner
Requires-Python: >=3.12
Requires-Dist: alembic<2.0,>=1.13
Requires-Dist: authlib<2.0,>=1.3
Requires-Dist: boto3>=1.34
Requires-Dist: celery<6.0,>=5.3
Requires-Dist: fastapi<1.0,>=0.115
Requires-Dist: httpx<1.0,>=0.27
Requires-Dist: itsdangerous<3.0,>=2.1
Requires-Dist: psycopg[binary]<4.0,>=3.1
Requires-Dist: pydantic<3.0,>=2.0
Requires-Dist: redis<6.0,>=5.0
Requires-Dist: rich<14.0,>=13.5
Requires-Dist: sqlalchemy<3.0,>=2.0
Requires-Dist: typer<1.0,>=0.16
Requires-Dist: uvicorn[standard]<1.0,>=0.30
Provides-Extra: dev
Requires-Dist: pytest-cov<6.0,>=5.0; extra == 'dev'
Requires-Dist: pytest<9.0,>=8.0; extra == 'dev'
Provides-Extra: scan
Requires-Dist: vehlo-ash<4,>=3.2.5; extra == 'scan'
Description-Content-Type: text/markdown

# Vehlo Code Scanner

Multi-tenant security scanning platform wrapping AWS's [Automated Security Helper (ASH)](https://github.com/awslabs/automated-security-helper). Runs ASH across Vehlo's 500+ repos, centralizes findings in PostgreSQL, and surfaces them through a React dashboard with triage workflows and analytics.

## Architecture

```
CI/CD pipeline → ASH scan → vcs CLI (--push) → POST /api/v1/scans → FastAPI + PostgreSQL → React Dashboard
```

Three components:

- **CLI / scanner** — Python package (`vcs`) that wraps ASH. Supports container, local, and pre-commit modes. Outputs rich terminal tables, optionally pushes results to the central API, and can fail the build on a severity threshold.
- **API service** — FastAPI monolith handling ingest, findings lifecycle, and analytics. Backed by PostgreSQL. Auto-resolves findings when they disappear from subsequent scans.
- **Dashboard** — React + Vite SPA. Overview, findings browser, per-repo drill-down, scan history, and analytics.
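
The auto-resolve behaviour mentioned for the API service can be pictured as a set difference between the findings still open for a repo and the findings reported by the newest scan. The snippet below is only a minimal sketch of that idea, not the service's actual code: the `Finding` dataclass, its `fingerprint` field, and the `"open"` / `"resolved"` status strings are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    """Hypothetical stand-in for a stored finding (not the real ORM model)."""
    fingerprint: str      # assumed stable identifier for the same finding across scans
    status: str = "open"  # assumed status values: "open" / "resolved"


def auto_resolve(stored: list[Finding], latest_fingerprints: set[str]) -> list[Finding]:
    """Mark open findings as resolved when the latest scan no longer reports them.

    Returns the findings whose status changed.
    """
    resolved = []
    for finding in stored:
        if finding.status == "open" and finding.fingerprint not in latest_fingerprints:
            finding.status = "resolved"
            resolved.append(finding)
    return resolved


stored = [Finding("a1"), Finding("b2"), Finding("c3", status="resolved")]
changed = auto_resolve(stored, latest_fingerprints={"a1"})
print([f.fingerprint for f in changed])  # ['b2']
```

Only `b2` changes here: `a1` is still reported by the latest scan, and `c3` was already resolved.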

## Quick Start

```bash
# Start all services
docker compose up

# API → http://localhost:8002
# Dashboard → http://localhost:5175
```

### Installation

Distribution is **dual-mode**: a public path (PyPI + ECR Public, no AWS
account needed) and an AWS-gated path (CodeArtifact + private ECR) for
internal users. The tool itself is open to install — the **API token** gates
pushing results and **SSO** gates viewing them. The `[scan]` extra adds the
ASH engine (`vehlo-ash`, a rename-only repackaging of AWS's Apache-2.0
automated-security-helper — see `packaging/vehlo-ash/`); without it you still
get the CLI, `--push`, and the client.

**Public (no AWS):**

```bash
# pipx (recommended: recent macOS/Linux system Pythons refuse bare pip installs, see PEP 668)
pipx install 'vehlo-code-scanner[scan]'    # quotes matter in zsh

# or pip, inside a venv
pip install 'vehlo-code-scanner[scan]'

# or Docker from ECR Public (ASH bundled, nothing else to install)
docker run --rm -v "$PWD:/src" \
  public.ecr.aws/w1f0w6h2/vehlo-code-scanner:latest scan /src --mode local
```

**AWS-gated (internal):** authenticate once with `aws sso login`, then:

```bash
# pip via CodeArtifact (configures the index, then installs)
aws codeartifact login --tool pip \
  --domain "$VCS_CA_DOMAIN" --repository "$VCS_CA_REPO"
pip install 'vehlo-code-scanner[scan]'

# Docker via ECR
aws ecr get-login-password --region "$AWS_REGION" \
  | docker login --username AWS --password-stdin "$ECR_REGISTRY"
docker run --rm -v "$PWD:/src" \
  "$ECR_REGISTRY/vehlo-code-scanner:latest" scan /src
```

**GitHub Actions** (the calling job needs `permissions: id-token: write`):

```yaml
- uses: Vehlo-CyberSec/vehlo-code-scanner@v1
  with:
    api-url: ${{ vars.VCS_API_URL }}
    api-token: ${{ secrets.VCS_API_TOKEN }}
    image: ${{ vars.VCS_ECR_IMAGE }}      # full ECR URI
    aws-role: ${{ secrets.VCS_AWS_ROLE }} # OIDC role with ECR pull
    aws-region: ${{ vars.AWS_REGION }}
    fail-on: high
```

> Upstream ASH is distributed only from git, and its PyPI name is squatted, so
> the engine is an optional extra rather than a hard dependency. Without it,
> `vcs scan` prints install guidance. The container images bundle ASH;
> pip users get it via the `[scan]` extra (`vehlo-ash` from PyPI or
> CodeArtifact, depending on the index you install from).

For local development from a checkout:

```bash
uv tool install '.[scan]'   # or: uv sync --group local-scan  (ASH from git)
```

After installation, `vcs` (and `vcs-admin`) are available on your PATH.

### Local development

```bash
# Install Python deps
uv sync

# Apply migrations
VCS_DATABASE_URL=postgresql+psycopg://vcs:<password>@localhost:5433/vcs alembic upgrade head
# (dev password: see docker-compose.yml)

# Run API
VCS_DATABASE_URL=... uvicorn vcs.api.app:create_app --factory --reload

# Run dashboard (in ./dashboard)
npm run dev
```

### Running a scan

```bash
# Scan current directory and print results
vcs scan .

# Scan and push results to central API
VCS_API_URL=http://localhost:8002 VCS_API_TOKEN=<token> vcs scan . --push

# Fail CI if critical or high findings exist
vcs scan . --push --fail-on high
```

## Project Structure

```
src/vcs/
├── api/
│   ├── routes/         # health, ingest, findings, repos, scans, overview, analytics
│   ├── services/       # ingest logic, auto-resolve
│   ├── app.py          # FastAPI factory
│   ├── deps.py         # DB session injection
│   └── schemas.py      # Pydantic request/response models
├── models/             # SQLAlchemy ORM: org, group, user, repo, scan, finding, api_token
├── scanner/            # ASH wrapper: runner, parser, result models
├── cli/                # Typer CLI: scan command, rich output
├── client/             # HTTP client for pushing results to API
├── db.py               # Database connection factory
├── enums.py            # Severity, FindingStatus, etc.
└── config.py           # Settings from environment

dashboard/src/
├── pages/              # Overview, Findings, FindingDetail, Repos, RepoDetail, Scans, Analytics
├── components/         # Layout, SeverityBadge, StatusBadge, Pagination, Panel, etc.
├── api/                # TanStack Query hooks
└── types.ts            # TypeScript types

alembic/versions/       # Database migrations
tests/                  # Unit + integration tests
```

## Environment Variables

| Variable | Description | Default |
|---|---|---|
| `VCS_DATABASE_URL` | PostgreSQL connection string | required |
| `VCS_API_URL` | Central API base URL (CLI push) | required for `--push` |
| `VCS_API_TOKEN` | Bearer token for API auth (CLI push) | required for `--push` |
| `VCS_REDIS_URL` | Celery broker/result backend | `redis://localhost:6380/0` |
| `VCS_S3_ENDPOINT` | S3/MinIO endpoint for raw-result archiving | `http://localhost:9002` |
| `VCS_SESSION_SECRET` | Secret for signing dashboard session cookies | dev default (**required** once OIDC is configured — startup fails without it) |
| `VCS_CORS_ORIGINS` | Comma-separated allowed CORS origins; empty value = deny cross-origin (prod) | unset → any localhost port (dev) |
| `VCS_OIDC_ISSUER` | OIDC issuer URL (enables dashboard SSO) | unset → SSO disabled |
| `VCS_OIDC_CLIENT_ID` / `VCS_OIDC_CLIENT_SECRET` | OIDC client credentials | unset |
| `VCS_OIDC_REDIRECT_URI` | OIDC callback URL | `.../api/v1/auth/callback` |
| `VCS_OIDC_GROUPS_CLAIM` | Claim holding the user's group names | `groups` |
| `VCS_DASHBOARD_URL` | Post-login redirect target | `http://localhost:5175` |
| `VCS_DEV_LOGIN` | **Dev only** — enables `/api/v1/auth/dev-login` (one-click local session, no IdP). Never set in prod. | unset → disabled |
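
`VCS_CORS_ORIGINS` distinguishes between being unset (dev: any localhost port) and being set to an empty value (prod: deny cross-origin), which is easy to get wrong when parsing. The sketch below shows one possible way to honour that distinction. It is an assumption-laden illustration, not the project's `config.py`: `cors_settings` and the localhost regex are hypothetical, and the returned keys simply mirror the `allow_origins` / `allow_origin_regex` keyword names of Starlette's `CORSMiddleware`.

```python
# Assumed pattern for "any localhost port"; the real dev default may differ.
LOCALHOST_ANY_PORT = r"^http://localhost(:\d+)?$"


def cors_settings(env: dict[str, str]) -> dict[str, object]:
    """Translate VCS_CORS_ORIGINS into CORS middleware keyword arguments."""
    raw = env.get("VCS_CORS_ORIGINS")
    if raw is None:
        # Unset: dev behaviour, allow any localhost port.
        return {"allow_origins": [], "allow_origin_regex": LOCALHOST_ANY_PORT}
    # Set (possibly empty): only the listed origins; an empty list denies cross-origin.
    origins = [o.strip() for o in raw.split(",") if o.strip()]
    return {"allow_origins": origins, "allow_origin_regex": None}


print(cors_settings({}))
print(cors_settings({"VCS_CORS_ORIGINS": ""}))
print(cors_settings({"VCS_CORS_ORIGINS": "https://a.example, https://b.example"}))
```

Passing the environment in as a plain dict (for example `dict(os.environ)`) keeps the function easy to test without touching the process environment.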

## Running Tests

```bash
pytest
pytest --cov=vcs --cov-report=term-missing
```
