Metadata-Version: 2.4
Name: bootlegg
Version: 0.2.0
Summary: Detect polyfill.io and Funnull CDN malware on GitHub Pages and arbitrary websites
License: MIT
Project-URL: Homepage, https://github.com/hwang628/bootleg
Project-URL: Bug Tracker, https://github.com/hwang628/bootleg/issues
Keywords: security,polyfill,funnull,cdn,malware,scanner
Classifier: Development Status :: 4 - Beta
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Security
Requires-Python: >=3.10
Description-Content-Type: text/markdown
Requires-Dist: aiohttp>=3.9

# bootlegg

Detect GitHub Pages sites loading scripts from Funnull-controlled CDNs —
polyfill.io, BootCSS, BootCDN, Staticfile, and a growing set of typosquat fronts.

**[Check your site →](https://moogician.github.io/bootleg-guard/)**

Our scan found **1,960 GitHub Pages sites** still loading malicious CDN scripts as of June 2026:
786 via polyfill.io (weaponized June 2024) and 1,191 via Funnull's BootCSS / BootCDN / Staticfile
CDNs (malicious since June 2023, OFAC-sanctioned May 2025). The repositories behind the infected
sites collectively hold over 530,000 GitHub stars, including microsoft/AirSim (18k ⭐),
deeplearning-ai/machine-learning-yearning-cn (7.8k ⭐), and CyC2018/CS-Notes (184k ⭐), the primary
technical interview reference for Chinese software engineers.

---

## Install

```
pip install bootlegg
```

Or run it as a module from a source checkout, without installing:

```
python3 -m bootlegg https://user.github.io/repo/
```

## Usage

```
bootlegg https://user.github.io/repo/
```

For github.io URLs, bootlegg automatically finds the source repo and runs two checks:

1. **Source scan**: searches the GitHub code search API for CDN references in the repo's files
2. **Live crawl**: fetches the site (mobile UA first, desktop UA as fallback), walks linked pages
   up to `--max-pages` (default: 30), and scans each page for malicious script tags
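
How bootlegg maps a github.io URL to its source repo is not spelled out here. A minimal sketch, assuming the standard GitHub Pages naming convention (the first path segment names a project repo, and an empty path means the `<owner>.github.io` user site), might look like this. The function name `guess_pages_repo` is hypothetical and not part of bootlegg.

```python
from typing import Optional, Tuple
from urllib.parse import urlparse


def guess_pages_repo(url: str) -> Optional[Tuple[str, str]]:
    """Guess (owner, repo) for a github.io URL; return None for any other host."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    suffix = ".github.io"
    if not host.endswith(suffix):
        return None
    owner = host[: -len(suffix)]
    if not owner or "." in owner:
        return None
    # Assumption: the first path segment is a project repo name.
    # With no path at all, fall back to the user site repo.
    segments = [s for s in parsed.path.split("/") if s]
    repo = segments[0] if segments else f"{owner}.github.io"
    return owner, repo
```

This is only a heuristic: a path such as `/blog/` can also be a subdirectory of the user site, so a real implementation would likely confirm the guess against the GitHub API.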

```
# A GitHub token raises the source-scan rate limit from 10 to 30 requests/min
bootlegg https://user.github.io/repo/ --token ghp_xxx
# or: export GITHUB_TOKEN=ghp_xxx

# Any site (no GitHub source search)
bootlegg https://example.com --no-github

# Single-page check, no crawl
bootlegg https://user.github.io/ --max-pages 1

# JSON output for scripting; exits 1 if infected
# (in a pipeline, the shell reports that status only with `set -o pipefail`)
bootlegg https://user.github.io/ --json | jq .summary
```
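
The live crawl and its `--max-pages` cap can be pictured as a breadth-first walk over same-host links. The sketch below illustrates that idea and is not bootlegg's actual code: `crawl` and the `fetch` callable are hypothetical names, and the mobile/desktop user-agent fallback is left out.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class _PageParser(HTMLParser):
    """Collects <script src> and <a href> values from one HTML page."""

    def __init__(self):
        super().__init__()
        self.scripts = []
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "script" and attrs.get("src"):
            self.scripts.append(attrs["src"])
        elif tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])


def crawl(start_url, fetch, max_pages=30):
    """Breadth-first walk of same-host pages.

    `fetch` is a hypothetical callable mapping a URL to an HTML string,
    or to None on failure. Returns {page_url: [absolute script URLs]}.
    """
    host = urlparse(start_url).netloc
    queue = deque([start_url])
    seen = {start_url}
    found = {}
    fetched = 0
    while queue and fetched < max_pages:
        url = queue.popleft()
        fetched += 1  # failed fetches also count against the page budget
        html = fetch(url)
        if html is None:
            continue
        parser = _PageParser()
        parser.feed(html)
        found[url] = [urljoin(url, src) for src in parser.scripts]
        for href in parser.links:
            # Resolve relative links and drop #fragments before deduplicating.
            target = urldefrag(urljoin(url, href)).url
            if urlparse(target).netloc == host and target not in seen:
                seen.add(target)
                queue.append(target)
    return found
```

With `max_pages=1` the loop fetches only the start URL, which corresponds to the single-page check shown above.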

## What it detects

| CDN | Status | Notes |
|-----|--------|-------|
| polyfill.io | **Malicious** | Acquired by Funnull Feb 2024; malware injected Jun 2024 |
| cdn.polyfill.io | **Malicious** | Same domain, different subdomain |
| polyfill.cn / polyfill.com | **Malicious** | Mirror / typosquat |
| bootcss.com | **Malicious** | Confirmed Funnull operator; malicious since Jun 2023 |
| bootcdn.net | **Malicious** | Confirmed Funnull operator |
| staticfile.org / staticfile.net | **Malicious** | Confirmed Funnull; OFAC-sanctioned May 2025 |
| jquecy.com | **Malicious** | Typosquats jQuery |
| jsdclivr.com | **Malicious** | Typosquats jsDelivr |
| clondflare.com | **Malicious** | Typosquats Cloudflare |
| bytedauce.com | **Malicious** | Typosquats ByteDance |
| bdustatic.com | **Malicious** | Typosquats Baidu's static CDN (bdstatic.com) |
| ailyunoss.com | **Malicious** | Typosquats Alibaba Cloud OSS |
| cdn1.ai | **Suspected** | Post-sanction Funnull front, stood up Jun 2025 |
| bolecnd.com | **Suspected** | Post-sanction Funnull CDN front |
| yunray.ai | **Suspected** | Post-sanction Funnull CDN front |
| cdn5.com | **Suspected** | Post-sanction Funnull CDN front |
| ctgcdn.com | **Suspected** | Post-sanction Funnull CDN front |
| macoms.la / unionadjs.com | **C2 infra** | Funnull redirect / C2 infrastructure |
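
One way to check a script URL against a list like this is to compare its hostname to each domain on a dot boundary, so that `cdn.polyfill.io` matches `polyfill.io` while an unrelated host such as `polyfill-fastly.io` does not. The sketch below assumes that matching rule; `BLOCKED_DOMAINS` is a small illustrative subset of the table and `classify_script_src` is a hypothetical name, not bootlegg's API.

```python
from urllib.parse import urlparse

# Illustrative subset of the table above, not the full list.
BLOCKED_DOMAINS = {
    "polyfill.io": "malicious",
    "bootcss.com": "malicious",
    "bootcdn.net": "malicious",
    "staticfile.org": "malicious",
    "staticfile.net": "malicious",
    "cdn1.ai": "suspected",
    "macoms.la": "c2",
}


def classify_script_src(src: str):
    """Return (domain, status) if src points at a listed domain, else None."""
    # urlparse handles absolute and protocol-relative ("//host/path") URLs;
    # relative paths have no hostname and therefore never match.
    host = (urlparse(src).hostname or "").lower()
    for domain, status in BLOCKED_DOMAINS.items():
        # Match the domain itself or any subdomain, never a mere suffix.
        if host == domain or host.endswith("." + domain):
            return domain, status
    return None
```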

## Fix

`bootlegg` can automatically patch local files:

```bash
# Scan a local directory and show what would change
bootlegg ./my-site

# Apply fixes in-place (originals backed up as <file>.bak)
bootlegg ./my-site --fix

# Or a single file
bootlegg index.html --fix
```
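
The dry-run versus `--fix` behavior described in the comments above (preview first, then write in place with a `<file>.bak` backup) can be sketched as follows. This is an illustration under assumptions, not bootlegg's implementation: `apply_fix` is a hypothetical name, and plain substring replacement stands in for whatever matching the tool really uses.

```python
import shutil
from pathlib import Path


def apply_fix(path, replacements, write=False):
    """Replace each old substring with its new value in a text file.

    Returns the number of matches found. With write=False nothing is
    modified (dry run). With write=True the original is first copied to
    <file>.bak and the file is then rewritten in place.
    """
    path = Path(path)
    text = path.read_text(encoding="utf-8")
    count = 0
    for old, new in replacements.items():
        count += text.count(old)
        text = text.replace(old, new)
    if write and count:
        # Keep a backup next to the original before overwriting it.
        shutil.copy2(path, path.with_name(path.name + ".bak"))
        path.write_text(text, encoding="utf-8")
    return count
```

Keying the replacement on a prefix such as `//polyfill.io/` rather than the bare domain avoids accidentally rewriting `cdn.polyfill.io` into a host that does not exist.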

Auto-replacements applied by `--fix`:

| Malicious CDN | Safe replacement |
|---|---|
| polyfill.io / polyfill.cn / polyfill.com | `polyfill-fastly.io` (drop-in) |
| bootcss.com / bootcdn.net | `cdnjs.cloudflare.com` |
| staticfile.org / staticfile.net | `cdnjs.cloudflare.com`* |

\* staticfile.org uses `/{lib}/{ver}/` paths vs. cdnjs's `/ajax/libs/{lib}/{ver}/` — verify those URLs load after fixing.
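
To make the path difference concrete, here is a minimal sketch of the rewrite, assuming the `/{lib}/{ver}/` layout noted above. `staticfile_to_cdnjs` is a hypothetical helper and the example URLs are illustrative. It does not check that cdnjs hosts the same library, version, or file name.

```python
from urllib.parse import urlparse, urlunparse

STATICFILE_DOMAINS = ("staticfile.org", "staticfile.net")


def staticfile_to_cdnjs(url: str) -> str:
    """Rewrite a staticfile.org/.net asset URL to the cdnjs path layout.

    URLs on any other host are returned unchanged.
    """
    parts = urlparse(url)
    host = (parts.hostname or "").lower()
    if not any(host == d or host.endswith("." + d) for d in STATICFILE_DOMAINS):
        return url
    # /{lib}/{ver}/{file}  ->  /ajax/libs/{lib}/{ver}/{file}
    new_path = "/ajax/libs" + parts.path
    return urlunparse((
        parts.scheme or "https",  # protocol-relative URLs become https
        "cdnjs.cloudflare.com",
        new_path,
        parts.params,
        parts.query,
        parts.fragment,
    ))
```

For example, `https://cdn.staticfile.org/jquery/3.6.0/jquery.min.js` becomes `https://cdnjs.cloudflare.com/ajax/libs/jquery/3.6.0/jquery.min.js`. The rewritten URL still needs a manual load check, since library names and available versions can differ between the two CDNs.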

Typosquats and C2 infrastructure (macoms.la etc.) are flagged but not auto-replaced — remove those `<script>` tags manually.

## Scan data

[`infected_sites.md`](infected_sites.md) — 1,960 GitHub Pages sites confirmed
loading malicious CDN scripts across two June 2026 scans (subdomain BFS crawl up to 30 pages
per site + Sourcegraph-based discovery).

## Background

In February 2024, the polyfill.io domain was acquired by Funnull Technology Inc.,
a Chinese CDN operator. In June 2024, Cloudflare and Sansec discovered that Funnull
had modified the served JavaScript to inject malware targeting mobile browsers —
redirecting users to gambling and adult sites via fake browser-update popups.
Over 100,000 sites were affected globally at peak.

Sansec and Censys later confirmed (via shared Cloudflare account credentials) that
BootCSS, BootCDN, and Staticfile are operated by the same entity and had been
injecting malicious code since at least June 2023, a year before the polyfill
incident became public. The US Treasury sanctioned Funnull / Triad Nexus in May 2025.

References:
- [Sansec: polyfill.io supply chain attack](https://sansec.io/research/polyfill-supply-chain-attack)
- [Cloudflare: polyfill.io now available on cdnjs](https://blog.cloudflare.com/polyfill-io-now-available-on-cdnjs-reduce-your-supply-chain-risk)
- [OFAC sanction: Funnull / Triad Nexus](https://ofac.treasury.gov/recent-actions/20250515)
