How It Works #
Census computes SHA-256 hashes of files, sends only hashes to the CertiSigma API, and receives three layers of cryptographic proof:
- T0 — ECDSA Signature — Immediate, server signs the hash with P-256
- T1 — TSA Timestamp — Minutes, Merkle tree + RFC 3161 Time Stamping Authority
- T2 — Bitcoin Anchor — Hours, Merkle root anchored via OpenTimestamps
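The T1 and T2 layers both rely on a Merkle tree that commits a batch of hashes to a single root. The sketch below shows the general idea only. The pairing rule (duplicate the last node on odd levels, parent = SHA-256 of the two concatenated child digests) is an assumption for illustration; the source does not describe how CertiSigma builds its tree.

```python
import hashlib


def merkle_root(leaf_hashes: list[bytes]) -> bytes:
    """Fold a list of 32-byte digests into one root digest.

    Assumption: odd-sized levels duplicate their last node, and a parent
    is SHA-256(left || right). The real service may use another layout.
    """
    if not leaf_hashes:
        raise ValueError("need at least one leaf")
    level = list(leaf_hashes)
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # pad so every node has a sibling
        level = [
            hashlib.sha256(level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
    return level[0]


# Hypothetical leaves: digests standing in for three attested file hashes.
leaves = [hashlib.sha256(name.encode()).digest() for name in ("a.txt", "b.txt", "c.txt")]
root = merkle_root(leaves)
print(root.hex())
```

Changing any leaf changes the root, which is why anchoring only the root is enough to commit to the whole batch.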
Installation #
Requires Python 3.10+. On Python 3.10, TOML config support uses tomli (installed automatically).
# Base install
pip install certisigma-census
# With watch mode (filesystem monitoring)
pip install "certisigma-census[watch]"
# With PDF report generation
pip install "certisigma-census[report]"
# Everything
pip install "certisigma-census[watch,report]"
Quick Start #
Inventory scan
export CERTISIGMA_API_KEY=cs_...
# Scan and attest all file hashes
census scan /path/to/sensitive-files --source inventory-hr
# Dry run (hash only, no attestation)
census scan /path/to/files --dry-run
Breach comparison
# Compare suspect files against the registry
census compare /path/to/suspect-files --manifest inventory.db
# Exit 0 = no matches, 1 = exfiltration detected
Integrity check
# Check files against manifest baseline (100% local)
census integrity manifest.db
# Differential: only new findings since last run
census integrity manifest.db --since auto --write-state auto
GitHub Action #
Census is available as a composite GitHub Action, so no Docker image is built or pulled. SARIF results are uploaded to the GitHub Security tab automatically.
# Breach detection with SARIF upload
- uses: certisigma/census-action@v1
  with:
    command: compare
    target: ./artifacts
    manifest: ./inventory.db
  env:
    CERTISIGMA_API_KEY: ${{ secrets.CERTISIGMA_API_KEY }}
# Integrity check (no API key needed)
- uses: certisigma/census-action@v1
  with:
    command: integrity
    manifest: ./inventory.db
| Input | Required | Default | Description |
|---|---|---|---|
| command | Yes | — | scan, integrity, compare, bulk-scan |
| target | No | . | Directory to scan or check |
| manifest | No | .census-manifest.db | Manifest file path |
| api-key | No | — | API key (or set env var) |
| format | No | auto | text, json, jsonl, sarif (sarif only for compare) |
| upload-sarif | No | true | Auto-upload SARIF to Security tab |
| source | No | — | Audit label for the scan |
| exit-zero | No | false | Report-only mode (compare/bulk-scan) |
| version | No | latest | Pin certisigma-census version |
| extra-args | No | — | Additional CLI flags |
| python-version | No | 3.12 | Python version for setup-python |
Commands #
Census provides 30+ commands. Options and examples for each are listed below.
census scan <dir>
Walk directory, compute SHA-256 hashes, attest in batch, save manifest.
| Option | Description |
|---|---|
| --source LABEL | Source label for attestations |
| --manifest PATH | Manifest output path |
| --dry-run | Hash only, no attestation |
| --resume | Resume interrupted scan |
| --workers N | Parallel hashing (1–8) |
| --attest-manifest | Attest manifest’s own hash |
| --include/--exclude | Glob patterns for filtering |
| --min-size/--max-size | Size filters (e.g. 1K, 100M) |
| --json | Machine-readable output |
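As a rough picture of the hashing half of a scan, the sketch below walks a directory, applies a filename glob and a minimum size, and streams each file through SHA-256 in chunks. The function names and the `include` / `min_size` parameters are hypothetical stand-ins for the filters described above, not Census internals.

```python
import fnmatch
import hashlib
from pathlib import Path


def sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in fixed-size chunks so large files are never fully loaded."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hash_tree(root: Path, include: str = "*", min_size: int = 0) -> dict[str, str]:
    """Map relative POSIX path -> SHA-256 hex digest for files passing the filters."""
    result: dict[str, str] = {}
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        if not fnmatch.fnmatch(path.name, include):
            continue  # filename does not match the glob
        if path.stat().st_size < min_size:
            continue  # smaller than the requested minimum
        result[path.relative_to(root).as_posix()] = sha256_file(path)
    return result
```

Only the resulting digests would need to leave the machine; the path-to-hash mapping stays local.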
census compare <dir>
Hash suspect files and verify against the CertiSigma registry.
| Option | Description |
|---|---|
| --manifest PATH | Local manifest for cross-reference |
| --format text\|json\|sarif\|jsonl | Output format |
| --detailed | Enriched results (source, T0/T1/T2 level) |
| --exit-zero | Report-only mode (always exit 0) |
| --summary | Counts only, no match details |
| --on-match CMD | Execute CMD on matches (JSON on stdin) |
census integrity <manifest>
Tamper detection against manifest baseline. 100% local, no API calls.
| Option | Description |
|---|---|
| --strict | Exit 1 on any discrepancy |
| --since PATH | Differential mode (auto = sidecar) |
| --write-state PATH | Save state for next run |
| --format text\|json\|jsonl | Output format |
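Differential mode can be thought of as "report only what the previous run did not already report, then remember this run". The sketch below illustrates that idea with a small JSON state file. The state format and the `path` / `kind` finding fields are assumptions made for the example, not the sidecar format Census actually writes.

```python
import json
from pathlib import Path


def finding_key(finding: dict) -> str:
    """Stable identity for a finding (hypothetical fields: path and kind)."""
    return f"{finding['path']}:{finding['kind']}"


def new_findings(findings: list[dict], state_path: Path) -> list[dict]:
    """Return findings absent from the previous run, then persist the current set."""
    seen: set[str] = set()
    if state_path.exists():
        seen = set(json.loads(state_path.read_text()))
    fresh = [f for f in findings if finding_key(f) not in seen]
    # Overwrite the state so the next run compares against this one.
    state_path.write_text(json.dumps(sorted(finding_key(f) for f in findings)))
    return fresh
```

Run twice against the same findings, the second call returns an empty list, which is the behavior a scheduled check usually wants.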
census diff <base> <target>
Compare two manifests. AIDE-style bitmask exit codes (1=added, 2=removed, 4=modified).
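Because the exit code is a bitmask, a single value can report several kinds of change at once (for example, 5 means added and modified). A minimal sketch of encoding and decoding those bits follows; the helper names are illustrative, and any additional bits Census may reserve (such as for errors) are not covered here.

```python
ADDED, REMOVED, MODIFIED = 1, 2, 4  # bit values from the description above


def diff_exit_code(added: int, removed: int, modified: int) -> int:
    """Combine per-category counts into one bitmask exit code."""
    code = 0
    if added:
        code |= ADDED
    if removed:
        code |= REMOVED
    if modified:
        code |= MODIFIED
    return code


def describe_exit_code(code: int) -> list[str]:
    """List the change categories encoded in an exit code."""
    names = []
    if code & ADDED:
        names.append("added")
    if code & REMOVED:
        names.append("removed")
    if code & MODIFIED:
        names.append("modified")
    return names


print(describe_exit_code(diff_exit_code(added=2, removed=0, modified=1)))
```

In a CI script, testing individual bits lets a pipeline fail on removals or modifications while tolerating additions.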
census bulk-scan <dir>
Bulk leak detection via /scan endpoint. Up to 50K hashes per call with auto-chunking.
| Option | Description |
|---|---|
| --dry-run | Hash only, no API call |
| --exit-zero | Report-only mode |
| --summary | Counts only |
| --source LABEL | Incident tracking label |
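Auto-chunking amounts to splitting the hash list into slices no larger than the per-call limit. The sketch below shows that step in isolation, using the 50K limit quoted above; the `chunked` helper is hypothetical and makes no API calls.

```python
from typing import Iterator

MAX_HASHES_PER_CALL = 50_000  # limit quoted in the command description


def chunked(hashes: list[str], size: int = MAX_HASHES_PER_CALL) -> Iterator[list[str]]:
    """Yield consecutive slices of at most `size` hashes, preserving order."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(hashes), size):
        yield hashes[start:start + size]


# Placeholder strings stand in for real SHA-256 hex digests.
batches = list(chunked([f"hash-{i}" for i in range(120_001)]))
print([len(batch) for batch in batches])
```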
census verify <hash|--file>
Verify a hash or file against the registry. Full T0/T1/T2 evidence chain. No API key required.
census verify-manifest <manifest>
Full-chain verification: all manifest hashes against the registry.
census update <manifest>
AIDE-style baseline update: detect → review → accept. New entries are unattested.
census report <manifest> -o <file>
Forensic reports: HTML (zero deps), PDF (fpdf2), evidence bundles (ZIP with OTS proofs).
census watch <dir>
Continuous filesystem monitoring via native OS events. Requires [watch] extra.
census seal / verify-seal
HMAC-SHA256 tamper-evidence seal for manifests (Tripwire/AIDE pattern).
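The idea behind an HMAC seal is that anyone holding the key can detect a modified manifest, while someone without the key cannot forge a matching seal. The sketch below shows the primitive using Python's standard library. How Census derives or stores the key, and exactly which bytes it seals, is not stated in the source, so `seal` and `verify_seal` here are illustrative helpers only.

```python
import hashlib
import hmac


def seal(manifest_bytes: bytes, key: bytes) -> str:
    """Compute an HMAC-SHA256 tag over the manifest contents."""
    return hmac.new(key, manifest_bytes, hashlib.sha256).hexdigest()


def verify_seal(manifest_bytes: bytes, key: bytes, expected: str) -> bool:
    """Recompute the tag and compare in constant time."""
    return hmac.compare_digest(seal(manifest_bytes, key), expected)


key = b"example-key-kept-off-the-scanned-host"  # hypothetical key material
original = b"manifest contents"
tag = seal(original, key)
print(verify_seal(original, key, tag))
print(verify_seal(original + b" tampered", key, tag))
```

The seal is only as strong as the key's secrecy, so the key should live somewhere an attacker who can edit the manifest cannot also read.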
census export / hash / track / stats
Manifest export (CSV/JSON/sha256sum), standalone hashing, attestation tracking, org statistics.
census compliance-report <manifest>
Generate compliance reports mapping Census data to NIS2, DORA, or ISO 27001 requirements. 100% local — no API calls.
census compliance-report manifest.db -o report.html
census compliance-report manifest.db --template dora -o report.html
census compliance-report manifest.db --template iso27001 --json
census compliance-report manifest.db --integrity -o report.html
| Option | Description |
|---|---|
| --template nis2\|dora\|iso27001 | Compliance framework (default: nis2) |
| -o, --output PATH | Output file (.html or .json) |
| --integrity / --no-integrity | Run integrity check and include results |
| --json | Machine-readable JSON output |
census status <manifest>
Show manifest summary: total files, attested/pending counts, root directory, schema version.
census doctor / config / completion
Self-diagnostic (--manifest, --json), TOML configuration (config init, config show, config paths), shell completions (bash/zsh/fish).
census audit-log / snapshot
Tamper-evident JSONL audit log (audit-log show, verify, clear). Named snapshots for compliance baselines (snapshot create, list, diff, delete).
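A hash-chained log makes tampering evident because each entry commits to the hash of the one before it, so editing or dropping an entry breaks every later link. The sketch below shows one way such a chain can be built and verified. The entry layout (`event`, `prev`, `hash`) and the all-zero genesis value are assumptions for illustration, not the Census log schema.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder for "no previous entry"


def _entry_hash(event: dict, prev: str) -> str:
    """Hash the canonical JSON form of an entry's content plus its back-link."""
    body = json.dumps({"event": event, "prev": prev}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_entry(lines: list[str], event: dict) -> None:
    """Append a JSONL entry that links to the hash of the previous entry."""
    prev = json.loads(lines[-1])["hash"] if lines else GENESIS
    entry = {"event": event, "prev": prev, "hash": _entry_hash(event, prev)}
    lines.append(json.dumps(entry, sort_keys=True))


def verify_chain(lines: list[str]) -> bool:
    """Check every entry's own hash and its link to the preceding entry."""
    prev = GENESIS
    for line in lines:
        entry = json.loads(line)
        if entry["prev"] != prev:
            return False
        if _entry_hash(entry["event"], entry["prev"]) != entry["hash"]:
            return False
        prev = entry["hash"]
    return True
```

Appending needs only the last entry's hash, which fits the tail-read approach mentioned in the architecture notes.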
census share / tag / derived-list / annotate / metadata / key-rotate / key-gen
Forensic cooperation: share tokens, structured tagging, HMAC-derived lists, annotations, key rotation.
Output Formats #
| Format | Flag | Use case |
|---|---|---|
| Text | (default) | Human-readable terminal output |
| JSON | --json or --format json | CI/CD automation, machine parsing |
| JSONL | --format jsonl | SIEM/ELK streaming, log pipelines |
| SARIF | --format sarif | GitHub Security tab, VS Code, Defect Dojo |
| CSV | --output report.csv | Spreadsheets, compliance reporting |
| sha256sum | --format sha256sum | GNU coreutils compatible (sha256sum -c) |
| HTML/PDF | -o report.html, -o report.pdf | Forensic reports (census report, compliance-report) |
| ZIP | --bundle | Evidence bundle (report + OTS proofs + SHA256SUMS) |
All JSON output includes census_version and elapsed_seconds for forensic traceability. JSONL streams end with a _summary trailer.
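A consumer of the JSONL stream typically needs to separate per-item records from the closing summary. The sketch below assumes the trailer is a JSON object carrying a top-level `_summary` key; the source names the trailer but not its exact shape, and the sample record fields are invented for the example.

```python
import json
from typing import Optional


def split_jsonl(stream_text: str) -> tuple[list[dict], Optional[dict]]:
    """Separate ordinary records from the summary trailer.

    Assumption: the trailer is the object that has a "_summary" key.
    """
    records: list[dict] = []
    summary: Optional[dict] = None
    for line in stream_text.splitlines():
        if not line.strip():
            continue  # tolerate blank lines between records
        obj = json.loads(line)
        if "_summary" in obj:
            summary = obj["_summary"]
        else:
            records.append(obj)
    return records, summary


# Hypothetical stream: field names other than _summary are made up.
sample = "\n".join([
    json.dumps({"path": "a.txt", "match": False}),
    json.dumps({"path": "b.txt", "match": True}),
    json.dumps({"_summary": {"total": 2, "matches": 1}}),
])
records, summary = split_jsonl(sample)
print(len(records), summary)
```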
Forensic Features #
- Evidence chain — census verify with T0/T1/T2 details, OTS proof export
- Forensic reports — HTML, PDF, evidence bundles (ZIP with OTS proofs + SHA256SUMS)
- Audit log — Tamper-evident JSONL with SHA-256 hash chain (census audit-log verify)
- Named snapshots — Compliance baselines with diff comparison
- Manifest seal — HMAC-SHA256 tamper-evidence (Tripwire/AIDE pattern)
- Differential integrity — --since auto --write-state auto for new-findings-only mode
- Baseline update — AIDE-style detect → review → accept workflow
- Forensic annotation — Case IDs, notes, tags with AES-256-GCM zero-knowledge encryption
Cooperation #
Share forensic data with third parties without exposing original content.
- Derived lists — HMAC-SHA256 opaque hash lists for third-party breach detection. The third party can match suspects without seeing your inventory.
- Share tokens — Time-limited, use-limited tokens for chain of custody.
- Structured tagging — Key-value classification with encrypted tags and cursor-paginated query.
- Annotations — Add forensic notes, case IDs, and metadata to attestations.
# Create an opaque derived list from your manifest
census derived-list create --manifest ./inventory.db --label "Q1 2026"
# Third party matches their suspects
census derived-list match <list_id> --list-key <hex64> --hashes-file suspects.txt
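The matching step can be pictured as both sides applying the same keyed HMAC to their hashes and comparing the results, so the shared list never contains the original SHA-256 values. The sketch below shows that idea with the standard library. The exact derivation Census uses is not documented in the source, so `derive`, `build_derived_list`, and `match_suspects` are hypothetical, as is the example key.

```python
import hashlib
import hmac


def derive(hash_hex: str, list_key: bytes) -> str:
    """Turn a SHA-256 hex digest into an opaque keyed value."""
    return hmac.new(list_key, bytes.fromhex(hash_hex), hashlib.sha256).hexdigest()


def build_derived_list(inventory_hashes: list[str], list_key: bytes) -> set[str]:
    """Owner side: publish derived values instead of the raw inventory hashes."""
    return {derive(h, list_key) for h in inventory_hashes}


def match_suspects(derived: set[str], suspect_hashes: list[str], list_key: bytes) -> list[str]:
    """Third-party side: derive each suspect hash and test membership."""
    return [h for h in suspect_hashes if derive(h, list_key) in derived]


list_key = bytes.fromhex("11" * 32)  # example 64-hex-character key
inventory = [hashlib.sha256(b"payroll").hexdigest(), hashlib.sha256(b"contracts").hexdigest()]
derived = build_derived_list(inventory, list_key)
suspects = [hashlib.sha256(b"contracts").hexdigest(), hashlib.sha256(b"unrelated").hexdigest()]
print(match_suspects(derived, suspects, list_key))
```

The third party learns only which of its own suspect files match; entries it has no file for remain opaque.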
CI/CD Integration #
Census is designed for automation. Exit codes, report-only mode, and SARIF output integrate with any CI/CD pipeline.
| Feature | Description |
|---|---|
| --exit-zero | Report-only: always exit 0 (upload SARIF without gating) |
| --summary | Counts only, no match details (concise CI logs) |
| --format sarif | SARIF v2.1.0 for GitHub Security tab upload |
| --on-match CMD | Execute command with results on stdin when matches > 0 |
| --format jsonl | Streaming output for SIEM/ELK log pipelines |
| --no-color | Disable colored output (also respects NO_COLOR env var) |
| -q / --quiet | Suppress info output (errors and JSON always shown) |
Use certisigma/census-action@v1 for seamless CI/CD integration. See the GitHub Action section.
Configuration #
Census resolves each setting from the following sources, highest priority first:
- CLI flags (highest priority)
- Environment variables (CERTISIGMA_API_KEY, CERTISIGMA_BASE_URL)
- Project config (.census.toml in current directory)
- User config (~/.config/census/config.toml)
# Create a project config template
census config init --project
# View effective configuration
census config show
# Shell completions
eval "$(census completion bash)"
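The precedence order (CLI flags, then environment variables, then project config, then user config) behaves like a layered lookup where the first layer that defines a setting wins. The sketch below models that with `collections.ChainMap`; the setting names are hypothetical, and treating `None` as "not set" is an assumption about how unset flags are represented.

```python
from collections import ChainMap


def effective_config(cli: dict, env: dict, project: dict, user: dict) -> dict:
    """Merge layers so earlier arguments override later ones.

    Values of None are treated as "not set" and fall through to the next layer.
    """
    layers = [
        {key: value for key, value in layer.items() if value is not None}
        for layer in (cli, env, project, user)
    ]
    return dict(ChainMap(*layers))


# Hypothetical setting names, used only to show the fall-through.
config = effective_config(
    cli={"base_url": None, "source": "inventory-hr"},
    env={"base_url": "https://env.example"},
    project={"base_url": "https://project.example", "workers": 4},
    user={"workers": 2, "format": "json"},
)
print(config)
```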
Security Model #
- Content never leaves the client — Only SHA-256 hashes are transmitted to the API. The original file content stays on your infrastructure.
- Zero-knowledge metadata — Annotations and tag values can be encrypted client-side with AES-256-GCM before sending to the API. The server stores ciphertext only.
- HMAC-derived lists — Third-party breach detection uses HMAC-SHA256 derivation. The third party sees opaque derived hashes, not your original inventory.
- Manifest is local — The hash-to-filepath mapping lives on your filesystem. CertiSigma never sees file paths or directory structure.
- API key scoping — RBAC scoped keys allow read-only access for analysts with full audit trail.
config show and doctor output.
Compliance Mapping #
Census provides cryptographic evidence chains that map to regulatory requirements:
| Requirement | Framework | Census capability |
|---|---|---|
| Asset inventory | NIS2 Art.21, ISO 27001 A.8.1 | census scan + manifest |
| Change detection | NIS2 Art.21, DORA Art.9 | census integrity + differential |
| Incident response evidence | NIS2 Art.23, DORA Art.17 | census compare + forensic reports |
| Data integrity verification | DORA Art.11, ISO 27001 A.14 | census verify-manifest |
| Audit trail | NIS2 Art.21, ISO 27001 A.12.4 | census audit-log (tamper-evident) |
| Third-party risk | NIS2 Art.21, DORA Art.28 | Derived lists + share tokens |
| Data classification | ISO 27001 A.8.2 | Structured tagging + encryption |
| Cryptographic controls | ISO 27001 A.10, DORA Art.9 | T0/T1/T2 proof chain, AES-256-GCM |
| Supply chain integrity | NIS2 Art.21(2d) | census seal + verify-seal |
| Continuous monitoring | DORA Art.9(2) | census watch + systemd timers |
Architecture #
Census is a client of the CertiSigma API. It uses the published Python SDK and treats it as a black box.
| Component | Description |
|---|---|
| CLI | Click-based, 30+ commands, global flags (-v, -q, --no-color) |
| Manifest | SQLite (WAL mode), schema v2, auto-migration from JSON |
| Scanner | Streamed SHA-256, parallel hashing (ProcessPoolExecutor), glob filters |
| Watcher | watchdog + producer/consumer, debounce, batch attestation |
| Retry | Exponential backoff on 429/5xx with Retry-After header |
| Reports | HTML (zero deps), PDF (fpdf2), ZIP bundles with OTS proofs |
| Audit | JSONL with SHA-256 hash chain, tail-read for last hash |
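The retry behavior in the table can be sketched as a small delay calculation: honor a numeric Retry-After header when the server sends one, otherwise back off exponentially up to a cap. The base delay, the cap, and the function names below are assumptions; the source does not give the values Census uses, and a production client would usually add random jitter.

```python
from typing import Optional


def should_retry(status: int) -> bool:
    """Retry on rate limiting (429) and on server errors (5xx)."""
    return status == 429 or 500 <= status < 600


def retry_delay(
    attempt: int,
    retry_after: Optional[str] = None,
    base: float = 1.0,
    cap: float = 60.0,
) -> float:
    """Seconds to wait before retry number `attempt` (0-based).

    Only the delta-seconds form of Retry-After is parsed here; an HTTP-date
    value falls back to exponential backoff.
    """
    if retry_after is not None:
        try:
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # not a plain number, so ignore the header
    return min(cap, base * (2 ** attempt))


print([retry_delay(n) for n in range(5)])
```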
Global Options #
| Option | Description |
|---|---|
| -v / --verbose | Enable debug logging |
| -q / --quiet | Suppress informational output |
| --log-format text\|json | Log output format |
| --no-color | Disable colored output (also NO_COLOR env) |
| --version | Show version |