Metadata-Version: 2.4
Name: nodesafe
Version: 0.3.1
Summary: Security scanner for ComfyUI custom nodes and node-based workflow plugins
Project-URL: Homepage, https://github.com/neuregex/nodesafe
Project-URL: Repository, https://github.com/neuregex/nodesafe
Project-URL: Issues, https://github.com/neuregex/nodesafe/issues
Project-URL: Documentation, https://github.com/neuregex/nodesafe/blob/main/README.md
Project-URL: Changelog, https://github.com/neuregex/nodesafe/blob/main/CHANGELOG.md
Author: nodesafe contributors
License: Apache-2.0
License-File: LICENSE
Keywords: comfyui,custom-nodes,diffusion,malware-detection,scanner,security
Classifier: Development Status :: 3 - Alpha
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Information Technology
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Security
Classifier: Topic :: Software Development :: Quality Assurance
Requires-Python: >=3.10
Requires-Dist: click>=8.1
Requires-Dist: pyahocorasick>=2.0
Requires-Dist: pydantic>=2.0
Requires-Dist: rapidfuzz>=3.0
Requires-Dist: rich>=13.0
Requires-Dist: tomli>=2.0; python_version < '3.11'
Provides-Extra: anomaly
Requires-Dist: scikit-learn>=1.4; extra == 'anomaly'
Requires-Dist: torch>=2.0; extra == 'anomaly'
Provides-Extra: ast
Requires-Dist: libcst>=1.0; extra == 'ast'
Provides-Extra: dev
Requires-Dist: pyright>=1.1; extra == 'dev'
Requires-Dist: pytest-cov>=4.1; extra == 'dev'
Requires-Dist: pytest>=8.0; extra == 'dev'
Requires-Dist: ruff>=0.4; extra == 'dev'
Provides-Extra: full
Requires-Dist: faiss-cpu>=1.7; extra == 'full'
Requires-Dist: libcst>=1.0; extra == 'full'
Requires-Dist: litellm>=1.50; extra == 'full'
Requires-Dist: ollama>=0.3; extra == 'full'
Requires-Dist: scikit-learn>=1.4; extra == 'full'
Requires-Dist: semgrep>=1.50; extra == 'full'
Requires-Dist: sentence-transformers>=3.0; extra == 'full'
Requires-Dist: torch>=2.0; extra == 'full'
Requires-Dist: transformers>=4.40; extra == 'full'
Requires-Dist: xgboost>=2.0; extra == 'full'
Provides-Extra: llm
Requires-Dist: litellm>=1.50; extra == 'llm'
Requires-Dist: ollama>=0.3; extra == 'llm'
Provides-Extra: ml
Requires-Dist: scikit-learn>=1.4; extra == 'ml'
Requires-Dist: xgboost>=2.0; extra == 'ml'
Provides-Extra: semgrep
Requires-Dist: semgrep>=1.50; extra == 'semgrep'
Provides-Extra: similarity
Requires-Dist: faiss-cpu>=1.7; extra == 'similarity'
Requires-Dist: sentence-transformers>=3.0; extra == 'similarity'
Requires-Dist: transformers>=4.40; extra == 'similarity'
Description-Content-Type: text/markdown

# nodesafe

> Security scanner for ComfyUI custom nodes — and the emerging standard for node-based workflow plugin security.

[![CI](https://github.com/neuregex/nodesafe/actions/workflows/ci.yml/badge.svg)](https://github.com/neuregex/nodesafe/actions/workflows/ci.yml)
[![License: Apache 2.0](https://img.shields.io/badge/License-Apache_2.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
[![Python 3.10+](https://img.shields.io/badge/python-3.10+-blue.svg)](https://www.python.org/downloads/)

`nodesafe` scans third-party plugins and nodes before you install them in node-based workflow tools. It detects malicious code with a cascading pipeline that combines static analysis, signature matching, machine learning, and optional LLM-based semantic analysis. The first target is the ComfyUI ecosystem.

## Why this exists

In **June 2024**, ComfyUI_LLMVISION stole browser credentials and crypto wallets from hundreds of users. In **April 2026**, a botnet compromised 1,000+ ComfyUI instances by auto-installing malicious nodes via the Manager. The custom_nodes ecosystem is large, fast-moving, and largely unverified.

`nodesafe` scans before you install.

## Quick start

```bash
pip install nodesafe
nodesafe scan /path/to/custom_node
```

Or run it directly without installing:

```bash
uvx nodesafe scan /path/to/custom_node
```

## How it works

The scanner is a 9-layer cascading pipeline in which each layer is more expensive than the previous one. Most clean nodes pass in under 100 ms; only ambiguous cases escalate to the later layers.

| Layer | Technique | Cost |
|-------|-----------|------|
| 0 | Hash matching against malware database | μs |
| 1 | Bloom filter of malicious URLs | μs |
| 2 | Aho-Corasick over dangerous patterns | ms |
| 3 | AST analysis (optional Semgrep backend) | ms |
| 4 | Typosquatting + OSV vulnerability check | ms |
| 5 | ML classifier (Naive Bayes + XGBoost) | tens of ms |
| 6 | Anomaly detection (Isolation Forest + Autoencoder) | tens of ms |
| 7 | Semantic similarity (CodeBERT embeddings + FAISS) | hundreds of ms |
| 8 | LLM review (optional, local-first via Ollama) | seconds |

**Current state (v0.1):** Layers 0 and 1 are functional. Layers 2-3 are being built in the M1 sprint. The remaining layers are scheduled in the M2-M4 roadmap.
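
The early-exit idea behind the cascade can be sketched roughly as follows. This is a minimal illustration, not nodesafe's actual implementation: `Verdict`, `LayerResult`, `run_cascade`, both toy layers, and the signature data are hypothetical names and values introduced for the example. Cheap layers run first, and the first layer that reaches a decision ends the scan before the costlier layers are touched.

```python
import hashlib
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class Verdict(Enum):
    CLEAN = "clean"
    SUSPICIOUS = "suspicious"
    MALICIOUS = "malicious"


@dataclass
class LayerResult:
    layer: int
    verdict: Verdict
    reason: str


# A layer inspects raw file bytes and returns a result only when it has
# something to report; None means "ambiguous, escalate to the next layer".
Layer = Callable[[bytes], Optional[LayerResult]]

# Hypothetical signature data, for illustration only.
KNOWN_BAD_SHA256 = {hashlib.sha256(b"evil payload").hexdigest()}
DANGEROUS_PATTERNS = (b"eval(", b"exec(", b"os.system(")


def layer0_hash_match(data: bytes) -> Optional[LayerResult]:
    """Layer 0: exact hash lookup against known malware."""
    digest = hashlib.sha256(data).hexdigest()
    if digest in KNOWN_BAD_SHA256:
        return LayerResult(0, Verdict.MALICIOUS, f"known hash {digest[:12]}")
    return None


def layer2_patterns(data: bytes) -> Optional[LayerResult]:
    """Layer 2 stand-in: naive substring scan instead of Aho-Corasick."""
    for pattern in DANGEROUS_PATTERNS:
        if pattern in data:
            return LayerResult(2, Verdict.SUSPICIOUS, f"pattern {pattern.decode()!r}")
    return None


def run_cascade(data: bytes, layers: list[Layer]) -> LayerResult:
    """Run layers in order of cost; in this sketch any flag ends the scan."""
    for layer in layers:
        result = layer(data)
        if result is not None:
            return result
    return LayerResult(-1, Verdict.CLEAN, "no layer flagged the input")


layers = [layer0_hash_match, layer2_patterns]
print(run_cascade(b"import os\nos.system('curl x | sh')", layers))
```

A real pipeline would also let a layer declare an input decisively clean, which is how most benign nodes would exit early; the sketch omits that branch for brevity.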

## Features

- ✓ **Pure static analysis** — never executes scanned code
- ✓ **Zero telemetry by default** — this policy is immutable
- ✓ **Works offline** (after the first signature update)
- ✓ **Multiple output formats**: JSON, SARIF (GitHub Code Scanning), Markdown
- ✓ **GitHub Action ready** — see the example workflow
- ✓ **Pre-commit hook ready** — for CI/CD of custom_nodes repositories
- ✓ **Local-first LLM analysis** — Ollama by default, cloud opt-in with BYO key
- ✓ **OSS Apache 2.0** — no freemium, no hidden SaaS, no paid whitelisting
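
To illustrate what "pure static analysis" means in practice, the sketch below uses Python's standard `ast` module to flag direct calls to a few risky builtins. It only parses source text; nothing in the scanned code is imported or run. The `RISKY_CALLS` set and the `find_risky_calls` helper are hypothetical and are not nodesafe's rule set.

```python
import ast

# Hypothetical, deliberately tiny rule set for illustration.
RISKY_CALLS = {"eval", "exec", "compile", "__import__"}


def find_risky_calls(source: str) -> list[tuple[int, str]]:
    """Return (line, name) for each direct call to a risky builtin.

    The source is parsed into a syntax tree and never executed.
    """
    tree = ast.parse(source)
    findings = []
    for node in ast.walk(tree):
        # Only plain-name calls such as exec(...); attribute calls like
        # base64.b64decode(...) have an ast.Attribute as their func.
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id in RISKY_CALLS:
                findings.append((node.lineno, node.func.id))
    return sorted(findings)


sample = "import base64\npayload = base64.b64decode('cHJpbnQoMSk=')\nexec(payload)\n"
print(find_risky_calls(sample))  # → [(3, 'exec')]
```

A name-based check like this is easy to evade (for example via `getattr` or aliasing), which is one reason a single technique is not enough on its own.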

## Usage

### Scan a directory

```bash
nodesafe scan /path/to/custom_node
```

### JSON output

```bash
nodesafe scan /path/to/custom_node --format json
```

### Integrate with GitHub Code Scanning (SARIF)

```bash
nodesafe scan custom_nodes/ --format sarif > nodesafe.sarif
```
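
To inspect the resulting file locally, a short script can tally findings by severity. The sketch below relies only on the generic SARIF 2.1.0 layout (`runs[].results[].level`). Which rule IDs and levels nodesafe actually emits is not documented here, so the sample document is invented for illustration and `count_by_level` is a hypothetical helper.

```python
import json
from collections import Counter


def count_by_level(sarif: dict) -> Counter:
    """Tally results per SARIF level across all runs.

    A missing `level` is counted as "warning" here, which is a
    simplification of the SARIF defaulting rules.
    """
    counts: Counter = Counter()
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            counts[result.get("level", "warning")] += 1
    return counts


# Invented sample; for a real scan, load the file instead:
#   with open("nodesafe.sarif", encoding="utf-8") as fh:
#       sarif = json.load(fh)
sample = json.loads("""
{
  "version": "2.1.0",
  "runs": [{
    "tool": {"driver": {"name": "nodesafe"}},
    "results": [
      {"ruleId": "EXAMPLE001", "level": "error", "message": {"text": "example finding"}},
      {"ruleId": "EXAMPLE002", "message": {"text": "example finding without a level"}}
    ]
  }]
}
""")
print(dict(count_by_level(sample)))
```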

### Run only the cheap layers (fast, no ML)

```bash
nodesafe scan /path/to/custom_node --layers 0,1,2,3
```

### Update signatures

```bash
nodesafe update
```

### Verify installation

```bash
nodesafe doctor
```

## Retrospective analysis

Would nodesafe have detected the historical incidents? The table below is a thought experiment that walks each case through the pipeline design; the figures are estimates, not measurements:

| Incident | Detection layer(s) | Estimated time | Estimated verdict (score) |
|----------|--------------------|----------------|---------------------------|
| LLMVISION (Jun 2024) | Layers 2-3 | ~30-50 ms | malicious (0.98) |
| Pickai (Mar-Jun 2025) | Layers 2-3 + 5-7 | ~100 ms | malicious (0.92) |
| Mining botnet (Apr 2026) | Layers 2-3 + Manager gate | <50 ms | malicious (0.95) |

Full analysis in [`docs/retrospective-analysis.md`](docs/retrospective-analysis.md).

## Honest limitations

`nodesafe` is **static analysis**, not a sandbox. Its limits:

- **It does not prevent upstream supply chain attacks** (a legitimate provider being compromised). It detects the malware when it is distributed in nodes, not the original compromise.
- **It is not a replacement for the Manager** — it is complementary; ideally integrated.
- **It does not monitor runtime behavior** — that is the job of an IDS/EDR.
- **False positives happen** — the policy is conservative, but every flag shows exactly what triggered the alert so you can decide.

## Configuration

Configuration lives in `~/.config/nodesafe/config.toml`. The file is optional; the defaults are sensible:

```toml
[scanner]
default_layers = "0,1,2,3,4,5,6"   # Layers 7 and 8 NOT included by default
fail_on = "suspicious"

[llm]
enabled = false                     # OFF by default. Conscious opt-in.
provider = "local"                  # local-first if enabled

[llm.local]
endpoint = "http://localhost:11434" # Ollama
model = "qwen2.5-coder:7b-instruct"

[telemetry]
enabled = false                     # ALWAYS false. Immutable policy.
```
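
For tooling that needs to read this file, the following sketch loads it with the standard-library TOML parser (or the `tomli` backport on Python 3.10, matching the package's declared dependency) and overlays it on the defaults shown above. The `parse_config`, `load_config`, and `enabled_layers` functions, the `DEFAULTS` dict, and the merge behavior are assumptions made for illustration, not nodesafe's internal API.

```python
import sys
from pathlib import Path

if sys.version_info >= (3, 11):
    import tomllib
else:  # Python 3.10: the backport exposes the same API
    import tomli as tomllib

# Defaults mirrored from the example config (illustrative subset).
DEFAULTS = {
    "scanner": {"default_layers": "0,1,2,3,4,5,6", "fail_on": "suspicious"},
    "llm": {"enabled": False, "provider": "local"},
    "telemetry": {"enabled": False},
}


def parse_config(text: str) -> dict:
    """Merge TOML text over DEFAULTS, one top-level table at a time."""
    user = tomllib.loads(text)
    merged = {section: dict(values) for section, values in DEFAULTS.items()}
    for section, values in user.items():
        if isinstance(values, dict):
            merged.setdefault(section, {}).update(values)
    return merged


def load_config(path: Path) -> dict:
    """Read the config file if it exists; otherwise use DEFAULTS only."""
    text = path.read_text(encoding="utf-8") if path.is_file() else ""
    return parse_config(text)


def enabled_layers(config: dict) -> list[int]:
    """Turn the comma-separated layer string into a list of ints."""
    return [int(part) for part in config["scanner"]["default_layers"].split(",")]


config = load_config(Path.home() / ".config" / "nodesafe" / "config.toml")
print(enabled_layers(config))
```

With no config file present, the final line prints the default layer list, `[0, 1, 2, 3, 4, 5, 6]`.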

## Roadmap

- **v0.1 (M1, current):** Layers 0-3, functional MVP, silent launch
- **v0.5 (M2):** Layer 5 ML + Semgrep + OSV integration + first public wave
- **v1.0 (M3):** Layers 6-7 + PR to ComfyUI-Manager + formal launch
- **v1.5 (M4):** Layer 8 LLM + public report + consolidated community
- **v2+ (Year 2):** `.nodesafe` standard portable to other node-based ecosystems

Full plan in [`ARCHITECTURE.md`](ARCHITECTURE.md).

## Contributing

PRs welcome. See [CONTRIBUTING.md](CONTRIBUTING.md).

**Especially welcome:**
- Contributions of **new malware signatures** — see [`signatures/README.md`](signatures/README.md)
- **False positive reports** for legitimate nodes
- **Missed detection reports** — open an issue with the `[missed-detection]` tag
- **Semgrep rules** specific to ComfyUI / diffusion patterns

## Acknowledgments

Inspired by Hugging Face's `safetensors` push, [Snyk Labs' research](https://labs.snyk.io/resources/hacking-comfyui-through-custom-nodes/) on ComfyUI attack vectors, and the hard-won findings of [u/_roblaughter_](https://www.reddit.com/r/StableDiffusion/), who discovered LLMVISION at his own expense.

## License

Apache 2.0. See [LICENSE](LICENSE).

## Long-term vision

ComfyUI is the most urgent case, not the only one. The whole category of node-based tools with executable plugins (LangFlow, Flowise, Node-RED, n8n, etc.) shares the same structural problem. In the long term, `.nodesafe` aspires to become a **portable manifest artifact** that any ecosystem can adopt, analogous to how `.safetensors` became the standard for ML model weights.

Versions 2 and 3 of the project will formalize the standard and involve working with maintainers of other ecosystems. For now, the focus is squarely on ComfyUI.
