How It Works #

  1. Scan: SHA-256 hash every file
  2. Attest: Three-layer proof (T0/T1/T2)
  3. Compare: Detect exfiltration

Census computes SHA-256 hashes of files, sends only hashes to the CertiSigma API, and receives three layers of cryptographic proof:

  1. T0 — ECDSA Signature — Immediate, server signs the hash with P-256
  2. T1 — TSA Timestamp — Minutes, Merkle tree + RFC 3161 Time Stamping Authority
  3. T2 — Bitcoin Anchor — Hours, Merkle root anchored via OpenTimestamps
Zero knowledge: Original file content never leaves the client. Only 64-character hex hashes are transmitted.
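
As a rough illustration of the client-side step, the sketch below hashes files with Python's standard library and keeps only the hex digests. It is not the Census implementation; the helper names `sha256_file` and `hash_tree` are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_file(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Return the SHA-256 hex digest of a file, read in fixed-size chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Stream the file so large inputs never have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hash_tree(root: Path) -> dict[str, str]:
    """Map each regular file under root (by relative path) to its digest."""
    return {
        str(path.relative_to(root)): sha256_file(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }
```

Only the values of the returned mapping (64-character hex strings) would need to leave the machine; the paths and file contents stay local.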

Installation #

Requires Python 3.10+. On Python 3.10, TOML config support relies on tomli, which is installed automatically.

bash
# Base install
pip install certisigma-census

# With watch mode (filesystem monitoring)
pip install "certisigma-census[watch]"

# With PDF report generation
pip install "certisigma-census[report]"

# Everything
pip install "certisigma-census[watch,report]"

Quick Start #

Inventory scan

bash
export CERTISIGMA_API_KEY=cs_...

# Scan and attest all file hashes
census scan /path/to/sensitive-files --source inventory-hr

# Dry run (hash only, no attestation)
census scan /path/to/files --dry-run

Breach comparison

bash
# Compare suspect files against the registry
census compare /path/to/suspect-files --manifest inventory.db

# Exit 0 = no matches, 1 = exfiltration detected

Integrity check

bash
# Check files against manifest baseline (100% local)
census integrity manifest.db

# Differential: only new findings since last run
census integrity manifest.db --since auto --write-state auto

GitHub Action #

Composite action with zero Docker overhead. SARIF auto-upload to GitHub Security tab.

yaml
# Breach detection with SARIF upload
- uses: certisigma/census-action@v1
  with:
    command: compare
    target: ./artifacts
    manifest: ./inventory.db
  env:
    CERTISIGMA_API_KEY: ${{ secrets.CERTISIGMA_API_KEY }}

# Integrity check (no API key needed)
- uses: certisigma/census-action@v1
  with:
    command: integrity
    manifest: ./inventory.db

Input           Required  Default              Description
command         Yes       (none)               scan, integrity, compare, bulk-scan
target          No        .                    Directory to scan or check
manifest        No        .census-manifest.db  Manifest file path
api-key         No        (none)               API key (or set env var)
format          No        auto                 text, json, jsonl, sarif (sarif only for compare)
upload-sarif    No        true                 Auto-upload SARIF to Security tab
source          No        (none)               Audit label for the scan
exit-zero       No        false                Report-only mode (compare/bulk-scan)
version         No        latest               Pin certisigma-census version
extra-args      No        (none)               Additional CLI flags
python-version  No        3.12                 Python version for setup-python

Commands #

Census provides 30+ commands.

census scan <dir>

Walk directory, compute SHA-256 hashes, attest in batch, save manifest.

Option                 Description
--source LABEL         Source label for attestations
--manifest PATH        Manifest output path
--dry-run              Hash only, no attestation
--resume               Resume interrupted scan
--workers N            Parallel hashing (1–8)
--attest-manifest      Attest manifest’s own hash
--include/--exclude    Glob patterns for filtering
--min-size/--max-size  Size filters (e.g. 1K, 100M)
--json                 Machine-readable output

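
The glob and size filters can be pictured as a predicate applied to each file before hashing. The sketch below is illustrative only: `parse_size` and `is_selected` are hypothetical helpers, and it assumes that K/M/G suffixes are binary multiples (1K = 1024 bytes), which this page does not specify.

```python
from fnmatch import fnmatch

# Assumption: binary multiples. Census may interpret the suffixes differently.
_UNITS = {"": 1, "K": 1024, "M": 1024**2, "G": 1024**3}


def parse_size(text: str) -> int:
    """Convert a size such as '1K' or '100M' to a byte count."""
    text = text.strip().upper()
    suffix = text[-1] if text and text[-1] in "KMG" else ""
    number = text[: len(text) - len(suffix)]
    return int(number) * _UNITS[suffix]


def is_selected(name, size, include=("*",), exclude=(), min_size=0, max_size=None):
    """Return True if a file passes the size bounds and the glob patterns."""
    if size < min_size or (max_size is not None and size > max_size):
        return False
    if any(fnmatch(name, pattern) for pattern in exclude):
        return False
    return any(fnmatch(name, pattern) for pattern in include)
```

In this sketch an exclude match always wins over an include match; whether Census resolves conflicts the same way is not stated here.
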
census compare <dir>

Hash suspect files and verify against the CertiSigma registry.

Option                          Description
--manifest PATH                 Local manifest for cross-reference
--format text|json|sarif|jsonl  Output format
--detailed                      Enriched results (source, T0/T1/T2 level)
--exit-zero                     Report-only mode (always exit 0)
--summary                       Counts only, no match details
--on-match CMD                  Execute CMD on matches (JSON on stdin)

census integrity <manifest>

Tamper detection against manifest baseline. 100% local, no API calls.

Option                    Description
--strict                  Exit 1 on any discrepancy
--since PATH              Differential mode (auto = sidecar)
--write-state PATH        Save state for next run
--format text|json|jsonl  Output format

census diff <base> <target>

Compare two manifests. AIDE-style bitmask exit codes (1=added, 2=removed, 4=modified).
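
Because the exit code is a bitmask, several kinds of change can be reported at once. A minimal decoding sketch, assuming the three bits combine with bitwise OR and that 0 means no differences; `decode_diff_exit_code` is a hypothetical helper, not part of Census:

```python
# Bit values as documented for census diff.
DIFF_FLAGS = {1: "added", 2: "removed", 4: "modified"}


def decode_diff_exit_code(code: int) -> list[str]:
    """Return the change categories encoded in a bitmask exit code."""
    return [name for bit, name in DIFF_FLAGS.items() if code & bit]
```

Under these assumptions, `decode_diff_exit_code(5)` returns `["added", "modified"]`, and an empty list means the two manifests match.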

census bulk-scan <dir>

Bulk leak detection via /scan endpoint. Up to 50K hashes per call with auto-chunking.

Option          Description
--dry-run       Hash only, no API call
--exit-zero     Report-only mode
--summary       Counts only
--source LABEL  Incident tracking label

census verify <hash|--file>

Verify a hash or file against the registry. Full T0/T1/T2 evidence chain. No API key required.

census verify-manifest <manifest>

Full-chain verification: all manifest hashes against the registry.

census update <manifest>

AIDE-style baseline update: detect → review → accept. New entries are unattested.

census report <manifest> -o <file>

Forensic reports: HTML (zero deps), PDF (fpdf2), evidence bundles (ZIP with OTS proofs).

census watch <dir>

Continuous filesystem monitoring via native OS events. Requires [watch] extra.

census seal / verify-seal

HMAC-SHA256 tamper-evidence seal for manifests (Tripwire/AIDE pattern).
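
The general idea behind an HMAC seal can be sketched as follows. This assumes the seal is computed over the raw manifest bytes with a secret key; the actual input and key handling in Census may differ, and `compute_seal` and `check_seal` are hypothetical names.

```python
import hashlib
import hmac


def compute_seal(manifest_bytes: bytes, key: bytes) -> str:
    """Return an HMAC-SHA256 tag (hex) over the manifest bytes."""
    return hmac.new(key, manifest_bytes, hashlib.sha256).hexdigest()


def check_seal(manifest_bytes: bytes, key: bytes, expected_hex: str) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(compute_seal(manifest_bytes, key), expected_hex)
```

Unlike a plain hash, the tag cannot be recomputed by someone who modifies the manifest but does not hold the key.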

census export / hash / track / stats

Manifest export (CSV/JSON/sha256sum), standalone hashing, attestation tracking, org statistics.

census compliance-report <manifest>

Generate compliance reports mapping Census data to NIS2, DORA, or ISO 27001 requirements. 100% local — no API calls.

bash
census compliance-report manifest.db -o report.html
census compliance-report manifest.db --template dora -o report.html
census compliance-report manifest.db --template iso27001 --json
census compliance-report manifest.db --integrity -o report.html

Option                         Description
--template nis2|dora|iso27001  Compliance framework (default: nis2)
-o, --output PATH              Output file (.html or .json)
--integrity / --no-integrity   Run integrity check and include results
--json                         Machine-readable JSON output

census status <manifest>

Show manifest summary: total files, attested/pending counts, root directory, schema version.

census doctor / config / completion

Self-diagnostic (--manifest, --json), TOML configuration (config init, config show, config paths), shell completions (bash/zsh/fish).

census audit-log / snapshot

Tamper-evident JSONL audit log (audit-log show, verify, clear). Named snapshots for compliance baselines (snapshot create, list, diff, delete).

census share / tag / derived-list / annotate / metadata / key-rotate / key-gen

Forensic cooperation: share tokens, structured tagging, HMAC-derived lists, annotations, key rotation.

Output Formats #

Format     Flag                          Use case
Text       (default)                     Human-readable terminal output
JSON       --json or --format json       CI/CD automation, machine parsing
JSONL      --format jsonl                SIEM/ELK streaming, log pipelines
SARIF      --format sarif                GitHub Security tab, VS Code, Defect Dojo
CSV        --output report.csv           Spreadsheets, compliance reporting
sha256sum  --format sha256sum            GNU coreutils compatible (sha256sum -c)
HTML/PDF   -o report.html, -o report.pdf Forensic reports (census report, compliance-report)
ZIP        --bundle                      Evidence bundle (report + OTS proofs + SHA256SUMS)

All JSON output includes census_version and elapsed_seconds for forensic traceability. JSONL streams end with a _summary trailer.
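
For reference, the sha256sum format listed in the table is the GNU coreutils checksum layout: the hex digest, a space, a mode marker (a space for text mode, `*` for binary), then the file name. The sketch below writes lines in that layout from a path-to-digest mapping; `sha256sum_line` and `to_sha256sum` are hypothetical helpers, and file names containing newlines or backslashes (which GNU escapes) are not handled.

```python
def sha256sum_line(digest_hex: str, path: str, binary: bool = False) -> str:
    """Format one entry the way `sha256sum` prints it."""
    marker = "*" if binary else " "
    return f"{digest_hex} {marker}{path}"


def to_sha256sum(entries: dict[str, str]) -> str:
    """Render a {path: digest} mapping as a checksum file, sorted by path."""
    return "".join(
        sha256sum_line(digest, path) + "\n"
        for path, digest in sorted(entries.items())
    )
```

A file produced this way can be checked with `sha256sum -c` from the directory the paths are relative to.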

Forensic Features #

  • Evidence chain: census verify with T0/T1/T2 details, OTS proof export
  • Forensic reports: HTML, PDF, evidence bundles (ZIP with OTS proofs + SHA256SUMS)
  • Audit log: Tamper-evident JSONL with SHA-256 hash chain (census audit-log verify)
  • Named snapshots: Compliance baselines with diff comparison
  • Manifest seal: HMAC-SHA256 tamper-evidence (Tripwire/AIDE pattern)
  • Differential integrity: --since auto --write-state auto for new-findings-only mode
  • Baseline update: AIDE-style detect → review → accept workflow
  • Forensic annotation: Case IDs, notes, tags with AES-256-GCM zero-knowledge encryption
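
To illustrate what a SHA-256 hash chain over a JSONL log involves, here is a minimal sketch. The record layout (`event`, `prev_hash`, `hash`), the all-zero genesis value, and the function names are assumptions made for this example; the real audit-log schema is not documented on this page.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed starting value for the first entry


def entry_hash(prev_hash: str, event: dict) -> str:
    """Hash the previous link together with a canonical JSON form of the event."""
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_entry(log: list[str], event: dict) -> None:
    """Append one JSONL record, linking it to the hash of the last line."""
    prev_hash = json.loads(log[-1])["hash"] if log else GENESIS
    record = {"event": event, "prev_hash": prev_hash,
              "hash": entry_hash(prev_hash, event)}
    log.append(json.dumps(record, sort_keys=True))


def verify_chain(log: list[str]) -> bool:
    """Recompute every link; any edited or removed line breaks the chain."""
    prev_hash = GENESIS
    for line in log:
        record = json.loads(line)
        if record["prev_hash"] != prev_hash:
            return False
        if record["hash"] != entry_hash(prev_hash, record["event"]):
            return False
        prev_hash = record["hash"]
    return True
```

Appending only needs the hash stored in the last line, while verification walks the log from the start.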

Cooperation #

Share forensic data with third parties without exposing original content.

  • Derived lists — HMAC-SHA256 opaque hash lists for third-party breach detection. The third party can match suspects without seeing your inventory.
  • Share tokens — Time-limited, use-limited tokens for chain of custody.
  • Structured tagging — Key-value classification with encrypted tags and cursor-paginated query.
  • Annotations — Add forensic notes, case IDs, and metadata to attestations.
bash
# Create an opaque derived list from your manifest
census derived-list create --manifest ./inventory.db --label "Q1 2026"

# Third party matches their suspects
census derived-list match <list_id> --list-key <hex64> --hashes-file suspects.txt
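
Conceptually, a derived list replaces each inventory hash with a keyed HMAC of it, so the list reveals nothing without the key and only confirms hashes the other party already holds. The sketch below assumes HMAC-SHA256 over the raw 32-byte hash with a 32-byte list key (matching the `<hex64>` key above); the exact derivation Census uses is not specified here, and all function names are hypothetical.

```python
import hashlib
import hmac


def derive(hash_hex: str, list_key: bytes) -> str:
    """Turn a SHA-256 file hash into an opaque, key-dependent value."""
    return hmac.new(list_key, bytes.fromhex(hash_hex), hashlib.sha256).hexdigest()


def build_derived_list(inventory_hashes, list_key: bytes) -> set[str]:
    """Owner side: derive every inventory hash under the list key."""
    return {derive(h, list_key) for h in inventory_hashes}


def match_suspects(suspect_hashes, derived_list: set[str], list_key: bytes) -> list[str]:
    """Third-party side: return the suspect hashes that appear in the list."""
    return [h for h in suspect_hashes if derive(h, list_key) in derived_list]
```

The third party can test hashes it already has but cannot read the original inventory hashes out of the list.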

CI/CD Integration #

Census is designed for automation. Exit codes, report-only mode, and SARIF output integrate with any CI/CD pipeline.

Feature         Description
--exit-zero     Report-only: always exit 0 (upload SARIF without gating)
--summary       Counts only, no match details (concise CI logs)
--format sarif  SARIF v2.1.0 for GitHub Security tab upload
--on-match CMD  Execute command with results on stdin when matches > 0
--format jsonl  Streaming output for SIEM/ELK log pipelines
--no-color      Disable colored output (also respects NO_COLOR env var)
-q / --quiet    Suppress info output (errors and JSON always shown)

GitHub Action: Use certisigma/census-action@v1 to run Census in GitHub workflows. See the GitHub Action section above.

Configuration #

Census resolves settings from the following sources, in descending order of precedence. Both config files use TOML:

  1. CLI flags (highest priority)
  2. Environment variables (CERTISIGMA_API_KEY, CERTISIGMA_BASE_URL)
  3. Project config (.census.toml in current directory)
  4. User config (~/.config/census/config.toml)
bash
# Create a project config template
census config init --project

# View effective configuration
census config show

# Shell completions
eval "$(census completion bash)"
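
The precedence order listed in this section amounts to a layered lookup in which the first source that defines a setting wins. A minimal sketch of that idea, using hypothetical setting names (`api_key`, `base_url`, `workers`) and a hypothetical `effective_config` helper rather than Census internals:

```python
from collections import ChainMap


def effective_config(cli: dict, env: dict, project: dict, user: dict) -> ChainMap:
    """Combine the four sources; earlier arguments take precedence."""
    layers = [
        # Drop unset values so they do not shadow lower-priority sources.
        {key: value for key, value in layer.items() if value is not None}
        for layer in (cli, env, project, user)
    ]
    # ChainMap searches its maps left to right and returns the first hit.
    return ChainMap(*layers)
```

For example, a `base_url` set in the project file would override the one in the user file, and a CLI flag would override both.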

Security Model #

  • Content never leaves the client — Only SHA-256 hashes are transmitted to the API. The original file content stays on your infrastructure.
  • Zero-knowledge metadata — Annotations and tag values can be encrypted client-side with AES-256-GCM before sending to the API. The server stores ciphertext only.
  • HMAC-derived lists — Third-party breach detection uses HMAC-SHA256 derivation. The third party sees opaque derived hashes, not your original inventory.
  • Manifest is local — The hash-to-filepath mapping lives on your filesystem. CertiSigma never sees file paths or directory structure.
  • API key scoping — RBAC scoped keys allow read-only access for analysts with full audit trail.
Important: API keys should never be committed to source control. Use environment variables or a secrets manager. Census masks keys in config show and doctor output.

Compliance Mapping #

Census provides cryptographic evidence chains that map to regulatory requirements:

Requirement                  Framework                      Census capability
Asset inventory              NIS2 Art.21, ISO 27001 A.8.1   census scan + manifest
Change detection             NIS2 Art.21, DORA Art.9        census integrity + differential
Incident response evidence   NIS2 Art.23, DORA Art.17       census compare + forensic reports
Data integrity verification  DORA Art.11, ISO 27001 A.14    census verify-manifest
Audit trail                  NIS2 Art.21, ISO 27001 A.12.4  census audit-log (tamper-evident)
Third-party risk             NIS2 Art.21, DORA Art.28       Derived lists + share tokens
Data classification          ISO 27001 A.8.2                Structured tagging + encryption
Cryptographic controls       ISO 27001 A.10, DORA Art.9     T0/T1/T2 proof chain, AES-256-GCM
Supply chain integrity       NIS2 Art.21(2d)                census seal + verify-seal
Continuous monitoring        DORA Art.9(2)                  census watch + systemd timers

Architecture #

Census is a client of the CertiSigma API. It uses the published Python SDK and treats it as a black box.

Component  Description
CLI        Click-based, 30+ commands, global flags (-v, -q, --no-color)
Manifest   SQLite (WAL mode), schema v2, auto-migration from JSON
Scanner    Streamed SHA-256, parallel hashing (ProcessPoolExecutor), glob filters
Watcher    watchdog + producer/consumer, debounce, batch attestation
Retry      Exponential backoff on 429/5xx with Retry-After header
Reports    HTML (zero deps), PDF (fpdf2), ZIP bundles with OTS proofs
Audit      JSONL with SHA-256 hash chain, tail-read for last hash

SDK integration: Census consumes certisigma from PyPI. For API-level integration details, see the SDK documentation.
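
The retry behavior in the table follows a common pattern: retry on 429 and 5xx responses, prefer the server's Retry-After value when present, and otherwise back off exponentially. The sketch below shows one way to compute the wait time. The base delay, cap, and function names are assumptions for illustration, not Census defaults, and only the delay-in-seconds form of Retry-After is handled.

```python
import random
from typing import Optional


def should_retry(status: int) -> bool:
    """Retry on rate limiting (429) and on server errors (5xx)."""
    return status == 429 or 500 <= status <= 599


def retry_delay(attempt: int, retry_after: Optional[str] = None,
                base: float = 0.5, cap: float = 30.0, jitter: float = 0.0) -> float:
    """Seconds to wait before retry number `attempt` (0-based)."""
    if retry_after is not None:
        try:
            # Honor the server's hint, bounded by the cap.
            return min(cap, max(0.0, float(retry_after)))
        except ValueError:
            pass  # HTTP-date values fall through to exponential backoff
    delay = min(cap, base * (2 ** attempt))
    # Optional random jitter spreads out clients that failed together.
    return delay + random.uniform(0, jitter)
```

With the defaults above, successive attempts wait 0.5, 1, 2, 4 seconds and so on, up to the 30-second cap.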

Global Options #

Option                  Description
-v / --verbose          Enable debug logging
-q / --quiet            Suppress informational output
--log-format text|json  Log output format
--no-color              Disable colored output (also NO_COLOR env)
--version               Show version