Metadata-Version: 2.4
Name: certisigma-census
Version: 0.6.1
Summary: Cryptographic file inventory and exfiltration detection — powered by CertiSigma
Project-URL: Homepage, https://certisigma.ch
Project-URL: Documentation, https://developers.certisigma.ch/sdk
Project-URL: Repository, https://github.com/massimocavallin/certisigma-census
Project-URL: Issues, https://github.com/massimocavallin/certisigma-census/issues
Author: Ten Sigma Sagl
License-Expression: MIT
Keywords: attestation,breach-detection,cryptography,file-integrity,forensics
Classifier: Development Status :: 3 - Alpha
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: System Administrators
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Security :: Cryptography
Classifier: Topic :: System :: Filesystems
Requires-Python: >=3.10
Requires-Dist: certisigma>=1.5.0
Requires-Dist: click>=8.1
Requires-Dist: tomli>=2.0; python_version < '3.11'
Provides-Extra: dev
Requires-Dist: fpdf2>=2.8.0; extra == 'dev'
Requires-Dist: pytest-cov>=4.0; extra == 'dev'
Requires-Dist: pytest>=7.0; extra == 'dev'
Requires-Dist: watchdog>=4.0.0; extra == 'dev'
Provides-Extra: report
Requires-Dist: fpdf2>=2.8.0; extra == 'report'
Provides-Extra: watch
Requires-Dist: watchdog>=4.0.0; extra == 'watch'
Description-Content-Type: text/markdown

# CertiSigma Census

Cryptographic file inventory and exfiltration detection — powered by [CertiSigma](https://certisigma.ch).

Census scans directories, computes SHA-256 hashes, attests them via the CertiSigma API (three-layer cryptographic proof: ECDSA T0, qualified TSA T1, Bitcoin T2), and maintains a local manifest. When suspect files surface, Census compares their hashes against the registry to prove — with cryptographic certainty — whether they match inventoried assets.

## Installation

```bash
pip install certisigma-census

# With watch mode (filesystem monitoring)
pip install "certisigma-census[watch]"

# With PDF report generation
pip install "certisigma-census[report]"

# Everything
pip install "certisigma-census[watch,report]"
```

Requires Python 3.10+. On Python 3.10, TOML config parsing relies on the `tomli` backport, which is installed automatically; Python 3.11 and later need no extra package.

## Quick Start

### 1. Inventory scan

```bash
export CERTISIGMA_API_KEY=cs_...

# Scan a directory and attest all file hashes
census scan /path/to/sensitive-files --source inventory-hr

# Dry run — hash only, no attestation
census scan /path/to/files --dry-run

# Scan only PDFs and Word docs, skip files over 100 MB
census scan /data --include "*.pdf" --include "*.docx" --max-size 100M

# Resume an interrupted scan
census scan /data --source quarterly --manifest inventory.db --resume
```

By default, the scan writes a SQLite manifest to `<dir>/.census-manifest.db`, mapping each hash to its file path, size, and attestation metadata. Pass `--manifest` to write it elsewhere.

### 2. Breach comparison

```bash
# Compare suspect files against the CertiSigma registry
census compare /path/to/suspect-files --manifest /path/to/.census-manifest.db

# Save report as JSON or CSV
census compare /suspect --output report.json
census compare /suspect --output report.csv
```

Exit code: `0` if no matches, `1` if matches found.
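
Because matching is done on content hashes, a renamed or relocated copy still matches. The following is a minimal Python sketch of that idea, not Census's implementation. It assumes a hypothetical in-memory inventory keyed by SHA-256; the real lookup goes through the CertiSigma registry and the SQLite manifest, whose schemas are not shown here. The function names are illustrative.

```python
import hashlib


def match_suspects(suspects: dict[str, bytes], inventory: dict[str, str]) -> dict[str, str]:
    """Map each suspect name to the inventoried path with the same SHA-256.

    `suspects` maps a suspect file name to its content; `inventory` maps a
    hex digest to the path recorded at scan time. Both are illustrative
    stand-ins, not Census data structures.
    """
    matches = {}
    for name, content in suspects.items():
        digest = hashlib.sha256(content).hexdigest()
        if digest in inventory:
            matches[name] = inventory[digest]
    return matches


def exit_code(matches: dict[str, str]) -> int:
    """Mirror the documented convention: 0 if no matches, 1 if any match."""
    return 1 if matches else 0
```

A script wrapping `census compare` can branch on the same 0/1 convention without parsing the report.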

### 3. Manifest status and export

```bash
# Show summary
census status /path/to/.census-manifest.db

# Export manifest as CSV for compliance reporting
census export manifest.db --format csv --output inventory.csv

# Export as JSON
census export manifest.db --format json --output inventory.json
```

### 4. Evidence verification

```bash
# Verify a hash against the CertiSigma registry
census verify a1b2c3d4e5f67890...

# Verify a file (hash it first, then check)
census verify /path/to/document.pdf --file

# Save OpenTimestamps proof
census verify a1b2c3... --save-ots proof.ots
```

No API key required — all verification endpoints are public.

### 5. Integrity check

```bash
# Check files against manifest baseline
census integrity manifest.db

# Strict mode: exit 1 on any discrepancy
census integrity manifest.db --strict
```

100% local operation — no API calls, no network needed.
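
A local integrity check comes down to re-hashing each file and comparing the result with the recorded baseline. The sketch below illustrates this under stated assumptions: the baseline is a plain `dict` of relative path to SHA-256 (a stand-in for the manifest, whose schema is not documented here), and a "discrepancy" means a modified or missing file. All names are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in 64 KiB chunks to keep memory use flat."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_integrity(baseline: dict[str, str], root: Path) -> dict[str, list[str]]:
    """Classify every baseline entry as ok, modified, or missing."""
    report: dict[str, list[str]] = {"ok": [], "modified": [], "missing": []}
    for relative, expected in sorted(baseline.items()):
        target = root / relative
        if not target.is_file():
            report["missing"].append(relative)
        elif sha256_of(target) != expected:
            report["modified"].append(relative)
        else:
            report["ok"].append(relative)
    return report


def strict_exit_code(report: dict[str, list[str]]) -> int:
    """Assumed reading of --strict: 1 on any modified or missing file."""
    return 1 if report["modified"] or report["missing"] else 0
```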

### 6. Forensic reports

```bash
# HTML report (always available, zero dependencies)
census report manifest.db -o report.html

# PDF report (requires: pip install "certisigma-census[report]")
census report manifest.db -o report.pdf --evidence --integrity

# Evidence bundle: ZIP with report + OTS proofs + checksums
census report manifest.db -o bundle.zip --bundle --evidence
```

### 7. Manifest diff

```bash
# Compare two manifests
census diff baseline.db current.db

# HTML diff report
census diff baseline.db current.db -o diff.html

# Machine-readable output (exit code is a bitmask: 0=none, 1=added, 2=removed, 4=modified)
census diff baseline.db current.db --json
```

### 8. Standalone hashing

```bash
# Hash a file
census hash document.pdf

# Hash a directory
census hash /path/to/files

# Verify against known hash
census hash document.pdf --verify a1b2c3d4e5...
```

### 9. Attestation tracking

```bash
# Check attestation status
census track att_12345

# Wait for Bitcoin anchoring
census track att_12345 --poll --timeout 7200
```

### 10. Self-diagnostic

```bash
# Run all health checks
census doctor

# Check including a specific manifest
census doctor --manifest inventory.db

# Machine-readable output for CI
census doctor --json
```

### 11. Manifest merging

```bash
# Merge manifests from different servers
census merge server1.db server2.db -o combined.db

# Merge with glob
census merge scans/*.db -o full-inventory.db --json
```

### 12. Configuration

```bash
# Create config template
census config init --project

# View effective config
census config show

# Enable shell completions
eval "$(census completion bash)"
```

### 13. Watch mode (continuous monitoring)

```bash
# Watch a directory for changes and attest new/modified files
census watch /path/to/files --source "production"

# Dry run — hash only, no attestation
census watch /data --dry-run

# Network mount — use polling
census watch /mnt/share --polling --poll-interval 10
```

Requires: `pip install "certisigma-census[watch]"`

## How It Works

1. **Scan** — Census walks the directory, computes SHA-256 for each file (streamed, constant memory), and builds a local manifest.
2. **Attest** — Hashes are sent in batches (up to 100 per call) to the CertiSigma API. Each hash receives a three-layer cryptographic proof (T0 ECDSA signature, T1 qualified TSA timestamp, T2 Bitcoin anchor).
3. **Compare** — Suspect files are hashed and verified against the registry via `POST /verify/batch`. Matches prove the file was previously inventoried, regardless of filename or directory structure changes.

The original file content **never leaves** the client. Only SHA-256 hashes are transmitted.
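
The scan and attest steps can be illustrated with a short Python sketch: a streamed SHA-256 that reads fixed-size chunks, and a helper that groups hashes into batches of at most 100. This is an approximation under assumptions, not Census's code. The 1 MiB chunk size and the function names are hypothetical; only the batch limit of 100 comes from the description above.

```python
import hashlib
from itertools import islice
from pathlib import Path
from typing import Iterable, Iterator

CHUNK_SIZE = 1024 * 1024  # assumed read size; the real value is not documented here
BATCH_SIZE = 100          # batch limit stated in the Attest step


def sha256_file(path: Path, chunk_size: int = CHUNK_SIZE) -> str:
    """Hash a file chunk by chunk so memory use does not grow with file size."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        while chunk := handle.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()


def batched(hashes: Iterable[str], size: int = BATCH_SIZE) -> Iterator[list[str]]:
    """Yield lists of at most `size` hashes, preserving input order."""
    iterator = iter(hashes)
    while batch := list(islice(iterator, size)):
        yield batch
```

Only the hex digests produced by `sha256_file` would be handed to a batch call; the file bytes stay on the local machine.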

## Features

| Feature | Description | Docs |
|---------|-------------|------|
| **File filters** | `--include`, `--exclude` globs; `--min-size`, `--max-size` | [scanning.md](docs/features/scanning.md) |
| **Resume scans** | `--resume` skips unchanged files, preserves attestation state | [scanning.md](docs/features/scanning.md) |
| **CSV/JSON export** | Compare reports and manifest export in both formats | [comparison.md](docs/features/comparison.md) |
| **Retry with backoff** | Automatic retry on 429/5xx with exponential backoff | [retry-and-resilience.md](docs/features/retry-and-resilience.md) |
| **Structured logging** | `--log-format json` for SIEM/ELK integration | [logging.md](docs/features/logging.md) |
| **Progress bars** | Visual feedback for scan, attest, and compare operations | [scanning.md](docs/features/scanning.md) |
| **SQLite manifest** | WAL mode, indexed lookups, auto-migration from JSON | [manifest.md](docs/features/manifest.md) |
| **Watch mode** | Continuous filesystem monitoring with batch attestation | [watching.md](docs/features/watching.md) |
| **Evidence verification** | Full T0/T1/T2 chain, OTS proof export | [evidence.md](docs/features/evidence.md) |
| **Integrity check** | Tamper detection against manifest baseline | [integrity.md](docs/features/integrity.md) |
| **Forensic reports** | HTML, PDF, evidence bundles (ZIP) | [reporting.md](docs/features/reporting.md) |
| **Manifest diff** | Compare snapshots, AIDE-style exit codes, HTML reports | [diff.md](docs/features/diff.md) |
| **Standalone hashing** | SHA-256 without manifests or API calls | [hash.md](docs/features/hash.md) |
| **Attestation tracking** | Monitor T0/T1/T2 progression with `--poll` | [tracking.md](docs/features/tracking.md) |
| **Config files** | TOML config with user/project precedence | [config.md](docs/features/config.md) |
| **Shell completions** | bash, zsh, fish via `census completion` | — |
| **Self-diagnostic** | API health, config, inotify, manifest integrity | [doctor.md](docs/features/doctor.md) |
| **Manifest merging** | Combine manifests from distributed scans | [merge.md](docs/features/merge.md) |
| **JSON output** | `--json` on scan, compare, verify, integrity, status, doctor, merge, diff, hash, track | — |

Full documentation: [`docs/features/`](docs/features/)
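
The "Retry with backoff" row above refers to a common pattern: retry on 429 and 5xx responses, doubling the wait each time. A minimal Python sketch follows. The attempt limit, base delay, and function names are assumptions for illustration; Census's actual parameters are described in the linked retry documentation, not here.

```python
import time
from typing import Callable

# Status codes treated as transient: 429 plus the whole 5xx range.
RETRYABLE = {429} | set(range(500, 600))


def call_with_backoff(
    send: Callable[[], int],
    max_attempts: int = 5,       # assumed limit
    base_delay: float = 1.0,     # assumed first delay in seconds
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call `send` until it returns a non-retryable status or attempts run out.

    `send` is a hypothetical callable that performs one request and returns
    the HTTP status code. The last status seen is returned either way.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    status = send()
    for attempt in range(max_attempts - 1):
        if status not in RETRYABLE:
            break
        # Wait base_delay, then 2x, then 4x, ... before each retry.
        sleep(base_delay * (2 ** attempt))
        status = send()
    return status
```

Passing `sleep` as a parameter keeps the function testable without real waiting. Production code would usually add random jitter to the delay so that many clients do not retry in lockstep.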

## CLI Reference

### Global options

| Option | Description |
|--------|-------------|
| `-v` / `--verbose` | Enable debug logging |
| `--log-format text\|json` | Log output format (default: text) |
| `--version` | Show version |

### `census scan`

| Option | Description |
|--------|-------------|
| `--source LABEL` | Source label for attestations |
| `--manifest PATH` | Manifest output path (default: `<dir>/.census-manifest.db`) |
| `--api-key KEY` | API key (or set `CERTISIGMA_API_KEY`) |
| `--base-url URL` | Override API base URL |
| `--dry-run` | Hash only, no attestation |
| `--resume` | Resume interrupted scan |
| `--include GLOB` | Include files matching pattern (repeatable) |
| `--exclude GLOB` | Exclude files matching pattern (repeatable) |
| `--min-size SIZE` | Skip files smaller than SIZE (e.g. `1K`, `10M`) |
| `--max-size SIZE` | Skip files larger than SIZE (default: `5G`) |
| `--json` | Machine-readable JSON summary |
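
The filter options combine glob patterns with size bounds. The Python sketch below shows one plausible way they could interact. Several points are assumptions, not documented behaviour: size suffixes are treated as binary multiples (1K = 1024 bytes), bounds are inclusive, patterns are matched against the file name, and an exclude match wins over an include match. The function names are hypothetical.

```python
import re
from fnmatch import fnmatchcase
from typing import Sequence

# Assumption: K/M/G are binary multiples.
_MULTIPLIERS = {"": 1, "K": 1024, "M": 1024**2, "G": 1024**3}


def parse_size(text: str) -> int:
    """Turn a size such as '1K', '10M' or '5G' into a byte count."""
    match = re.fullmatch(r"(\d+)([KMG]?)", text.strip().upper())
    if match is None:
        raise ValueError(f"unrecognised size: {text!r}")
    number, unit = match.groups()
    return int(number) * _MULTIPLIERS[unit]


DEFAULT_MAX_SIZE = parse_size("5G")  # documented default for --max-size


def is_selected(
    name: str,
    size: int,
    include: Sequence[str] = (),
    exclude: Sequence[str] = (),
    min_size: int = 0,
    max_size: int = DEFAULT_MAX_SIZE,
) -> bool:
    """Decide whether a file passes the size bounds and glob filters."""
    if not (min_size <= size <= max_size):
        return False
    if any(fnmatchcase(name, pattern) for pattern in exclude):
        return False
    # With no include patterns, every file that was not excluded is kept.
    return not include or any(fnmatchcase(name, pattern) for pattern in include)
```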

### `census compare`

| Option | Description |
|--------|-------------|
| `--manifest PATH` | Local manifest for cross-referencing |
| `--output PATH` | Save report (`.json` or `.csv` by extension) |
| `--include/--exclude/--min-size/--max-size` | Same filters as scan |
| `--json` | Machine-readable JSON output (stdout only; `--output` ignored — see stderr note) |

### `census export`

| Option | Description |
|--------|-------------|
| `--format csv\|json` | Output format (default: csv) |
| `--output PATH` | Output file (default: stdout) |

### `census verify`

| Option | Description |
|--------|-------------|
| `--file` | Treat argument as a file path (hash it first) |
| `--save-ots PATH` | Save OTS proof to this path |
| `--json` | Machine-readable JSON output |
| `--api-key KEY` | API key (optional for verify) |
| `--base-url URL` | Override API base URL |

### `census integrity`

| Option | Description |
|--------|-------------|
| `--json` | Machine-readable JSON output |
| `--output PATH` | Save results (`.csv` or `.json` by extension) |
| `--strict` | Exit with code 1 on any discrepancy |

### `census report`

| Option | Description |
|--------|-------------|
| `-o`/`--output PATH` | Output file (`.html`, `.pdf`, or `.zip`) **required** |
| `--evidence` | Fetch T0/T1/T2 evidence chain for attested files |
| `--integrity` | Run integrity check and include results |
| `--bundle` | Generate evidence bundle (ZIP) |
| `--api-key KEY` | API key (needed only with `--evidence`) |

### `census status`

| Option | Description |
|--------|-------------|
| `--json` | Machine-readable JSON output |

### `census doctor`

| Option | Description |
|--------|-------------|
| `--manifest PATH` | Check health of a specific manifest file |
| `--json` | Machine-readable JSON output |
| `--api-key KEY` | API key |
| `--base-url URL` | Override API base URL |

### `census merge`

| Option | Description |
|--------|-------------|
| `-o`/`--output PATH` | Output manifest path **required** |
| `--json` | Machine-readable JSON summary |

### `census diff`

| Option | Description |
|--------|-------------|
| `--json` | Machine-readable JSON output |
| `-o`/`--output PATH` | Save report (`.html`, `.csv`, or `.json` by extension) |
| `--summary` | Show only counts, no individual file details |

Exit codes form a bitmask: 0 = no changes, 1 = added, 2 = removed, 4 = modified. Values are OR'd together, so an exit code of 5 means files were both added and modified.
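
To make the bitmask concrete, here is a small Python sketch that derives such a code from two path-to-hash maps and decodes it again. The flag values come from the line above. The `dict` inputs, the function names, and the choice to treat a path whose hash changed as "modified" are illustrative assumptions, not Census internals.

```python
from enum import IntFlag


class DiffExit(IntFlag):
    NONE = 0
    ADDED = 1
    REMOVED = 2
    MODIFIED = 4


def diff_exit_code(baseline: dict[str, str], current: dict[str, str]) -> int:
    """OR together one flag per kind of change between two path->hash maps."""
    code = DiffExit.NONE
    if current.keys() - baseline.keys():
        code |= DiffExit.ADDED
    if baseline.keys() - current.keys():
        code |= DiffExit.REMOVED
    if any(baseline[path] != current[path] for path in baseline.keys() & current.keys()):
        code |= DiffExit.MODIFIED
    return int(code)


def describe(code: int) -> list[str]:
    """Decode an exit code into the change kinds it encodes."""
    flags = (DiffExit.ADDED, DiffExit.REMOVED, DiffExit.MODIFIED)
    return [flag.name.lower() for flag in flags if code & flag]
```

A CI job can test a single bit, for example `code & 4`, to fail only when existing files were modified.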

### `census hash`

| Option | Description |
|--------|-------------|
| `--verify HASH` | Compare computed hash against expected SHA-256 |
| `--json` | Output as JSON array |

### `census track`

| Option | Description |
|--------|-------------|
| `--poll` | Continuously check until T2 level reached |
| `--poll-interval SECS` | Seconds between checks (default: 60) |
| `--timeout SECS` | Max time to poll (default: 3600) |
| `--json` | Machine-readable JSON output |
| `--api-key KEY` | API key |
| `--base-url URL` | Override API base URL |
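
The `--poll`, `--poll-interval`, and `--timeout` options describe a standard poll-until-done loop. A hedged Python sketch is shown below. The `check` callable and the use of the strings `"T0"`, `"T1"`, `"T2"` as its return values are hypothetical; only the defaults of 60 and 3600 seconds come from the table above.

```python
import time
from typing import Callable, Optional


def poll_until(
    check: Callable[[], str],
    target: str = "T2",
    interval: float = 60.0,    # documented default for --poll-interval
    timeout: float = 3600.0,   # documented default for --timeout
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Optional[str]:
    """Call `check` repeatedly until it returns `target` or time runs out.

    Returns the target level on success and None on timeout.
    """
    deadline = clock() + timeout
    while True:
        level = check()
        if level == target:
            return level
        remaining = deadline - clock()
        if remaining <= 0:
            return None
        # Never sleep past the deadline.
        sleep(min(interval, remaining))
```

Using a monotonic clock means the timeout is unaffected by wall-clock adjustments during a long wait.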

### `census config`

| Action | Description |
|--------|-------------|
| `show` | Display effective merged config |
| `init` | Create a template config file |
| `paths` | Show config file locations |
| `--project` | Act on project `.census.toml` |
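
An "effective merged config" implies that user-level and project-level files are layered. The Python sketch below shows one way to do that. It assumes the project file overrides the user file, which this document does not state, and the keys `source`, `max_size`, and `exclude` are hypothetical examples, not the actual config schema.

```python
import sys
from typing import Any

if sys.version_info >= (3, 11):
    import tomllib
else:  # Python 3.10: the tomli backport exposes the same API
    import tomli as tomllib


def merge_config(base: dict[str, Any], override: dict[str, Any]) -> dict[str, Any]:
    """Return `base` updated with `override`, merging nested tables key by key."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = value
    return merged


# Hypothetical file contents, inlined as strings for the example.
user = tomllib.loads('source = "default"\n[scan]\nmax_size = "5G"\nexclude = ["*.tmp"]\n')
project = tomllib.loads('source = "inventory-hr"\n[scan]\nmax_size = "100M"\n')

# Assumed precedence: project settings win over user settings.
effective = merge_config(user, project)
```

Keys set only at the user level, such as `exclude` here, survive the merge, while keys present in both take the project value.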

### `census completion`

Takes a shell name: `bash`, `zsh`, or `fish`.

```bash
eval "$(census completion bash)"   # bash
eval "$(census completion zsh)"    # zsh
census completion fish | source    # fish
```

### `census watch`

| Option | Description |
|--------|-------------|
| `--debounce SECS` | Quiet period before processing (default: 2.0s) |
| `--batch-interval SECS` | Max time between attestation batches (default: 30s) |
| `--scan-on-start / --no-scan-on-start` | Baseline scan before watching (default: on) |
| `--on-delete ignore\|mark\|remove` | Action on file deletion (default: ignore) |
| `--polling` | Use PollingObserver for NFS/CIFS mounts |
| `--poll-interval SECS` | Polling interval (default: 5s) |
| `--source/--manifest/--api-key/--dry-run` | Same as `census scan` |
| `--include/--exclude/--min-size/--max-size` | Same filters as scan |

Requires: `pip install "certisigma-census[watch]"`
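
The `--debounce` option waits for a quiet period so that a file still being written is not hashed mid-write. A minimal Python sketch of that idea is below. The `Debouncer` class is hypothetical and is not how Census or `watchdog` exposes this; only the 2.0 second default comes from the table above.

```python
class Debouncer:
    """Hold back paths until no event has been seen for `quiet_period` seconds."""

    def __init__(self, quiet_period: float = 2.0) -> None:
        self.quiet_period = quiet_period
        self._last_event: dict[str, float] = {}

    def record(self, path: str, now: float) -> None:
        """Note a filesystem event; a repeat event restarts the quiet period."""
        self._last_event[path] = now

    def ready(self, now: float) -> list[str]:
        """Return and forget the paths that have been quiet long enough."""
        due = sorted(
            path
            for path, seen in self._last_event.items()
            if now - seen >= self.quiet_period
        )
        for path in due:
            del self._last_event[path]
        return due
```

Timestamps are passed in explicitly so the logic can be exercised without a real clock; a watcher loop would supply `time.monotonic()`.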

## Dependencies

- [`certisigma`](https://pypi.org/project/certisigma/) — Official CertiSigma Python SDK
- [`click`](https://click.palletsprojects.com/) — CLI framework

Optional:
- [`watchdog`](https://pypi.org/project/watchdog/) — Filesystem monitoring (only for `census watch`)
- [`fpdf2`](https://pypi.org/project/fpdf2/) — PDF report generation (only for `census report` with `.pdf` output)

## License

MIT — Ten Sigma Sagl
