Metadata-Version: 2.4
Name: cdn-detect
Version: 0.1.0
Summary: Identify the CDN, WAF, or bot-management vendor in front of a domain.
Author-email: DDactic <contact@ddactic.net>
License: MIT
Project-URL: Homepage, https://github.com/DDactic/cdn-detect
Project-URL: Issues, https://github.com/DDactic/cdn-detect/issues
Keywords: cdn,waf,security,fingerprinting,cloudflare,akamai,fastly
Classifier: Development Status :: 4 - Beta
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Topic :: Security
Classifier: Topic :: Internet :: WWW/HTTP
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: dnspython>=2.4
Requires-Dist: requests>=2.31
Dynamic: license-file

# cdn-detect

[![tests](https://github.com/DDactic/cdn-detect/actions/workflows/test.yml/badge.svg)](https://github.com/DDactic/cdn-detect/actions/workflows/test.yml)
[![PyPI](https://img.shields.io/pypi/v/cdn-detect.svg)](https://pypi.org/project/cdn-detect/)
[![Python](https://img.shields.io/pypi/pyversions/cdn-detect.svg)](https://pypi.org/project/cdn-detect/)
[![License: MIT](https://img.shields.io/badge/license-MIT-blue.svg)](LICENSE)

Identify the CDN, WAF, or bot-management vendor in front of a domain. Vendor-neutral, no API keys, no cloud account.

```
$ cdn-detect cloudflare.com github.com
domain:        cloudflare.com
ip:            104.16.132.229
is_cdn:        True
cdn_provider:  Cloudflare
methods:       ip_prefix, http_header

domain:        github.com
ip:            140.82.121.4
asn:           AS36459  (GITHUB)
is_cdn:        False
methods:
```

Why publish this? Most "what CDN is this site on?" tools either guess from a single signal (headers only, or DNS only) and miss obvious cases, or wrap a paid intel feed. `cdn-detect` runs four independent detection strategies (CNAME chain, IP prefix, ASN lookup, and response headers) plus cookie-based bot-management fingerprints, and tells you which ones fired. No registration, no rate limits, MIT-licensed.

## Install

```bash
pip install cdn-detect
```

Requires Python 3.10+ and outbound DNS + HTTPS.

## Use

**CLI:**

```bash
cdn-detect example.com
cdn-detect example.com --json
cdn-detect example.com --no-http     # DNS-only, no HTTP round-trip
cdn-detect a.com b.com c.com         # multiple domains
```

**Library:**

```python
from cdn_detect import detect

result = detect("example.com")
print(result.cdn_provider)        # 'Cloudflare', 'Akamai', None, ...
print(result.detection_methods)   # ['cname', 'http_header']
print(result.bot_management)      # 'Cloudflare Bot Management', None, ...
print(result.as_dict())           # JSON-serialisable
```

## What it detects

| Signal | What's matched | Example |
|---|---|---|
| **CNAME chain** | Vendor-owned hostnames in the resolution chain | `evil.com → evil.com.cdn.cloudflare.net.` |
| **IP prefix** | Hard-coded vendor IPv4 ranges | `104.16.0.0/12 → Cloudflare` |
| **ASN** | Origin AS number via Team Cymru DNS | `AS13335 → Cloudflare` |
| **ASN org** | Vendor name in AS org string | `AS54113 FASTLY → Fastly` |
| **HTTP headers** | Vendor-specific response headers | `cf-ray:`, `x-amz-cf-id:`, `x-akamai-*` |
| **Cookies** | Vendor bot-management cookies | `__cf_bm`, `_abck`, `incap_ses_*` |
| **Tunnels** | Reverse-tunnel and zero-trust providers | `cfargotunnel.com`, `*.ts.net`, `ngrok.io` |

Currently knows about: Cloudflare, Akamai, Fastly, CloudFront, Imperva, Sucuri, Radware, StackPath, Edgio, Bunny CDN, KeyCDN, Reblaze, Azure Front Door / CDN, Google Cloud CDN, Vercel, Netlify, F5/Volterra, GitHub Pages — plus bot-management from Cloudflare, Akamai, Imperva, Radware, AWS WAF, PerimeterX/HUMAN, DataDome, Kasada — plus tunnels from Cloudflare, Tailscale, ngrok, Twingate, ZeroTier, Azure Dev Tunnels, and others.
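
To make the IP-prefix and ASN rows concrete, here is a minimal standard-library sketch of the two DNS-level checks. It is not the package's implementation: `match_prefix`, `cymru_origin_qname`, and `parse_cymru_txt` are hypothetical names, the prefix table holds only the two ranges quoted in this README, and the TXT answer in the usage note is an illustrative string rather than a captured response. The query-name format (reversed octets under `origin.asn.cymru.com`) is the one Team Cymru documents for IPv4.

```python
import ipaddress
from typing import Optional, Tuple

# Illustrative subset: only the two ranges quoted in this README.
PREFIXES = {
    "104.16.0.0/12": "Cloudflare",
    "23.32.0.0/11": "Akamai",
}
NETWORKS = [(ipaddress.ip_network(cidr), vendor) for cidr, vendor in PREFIXES.items()]


def match_prefix(ip: str) -> Optional[str]:
    """Return the vendor whose range contains ip, or None if no range matches."""
    addr = ipaddress.ip_address(ip)
    for network, vendor in NETWORKS:
        if addr in network:
            return vendor
    return None


def cymru_origin_qname(ip: str) -> str:
    """Build the TXT query name for Team Cymru's IPv4 origin-ASN zone."""
    octets = ip.split(".")
    return ".".join(reversed(octets)) + ".origin.asn.cymru.com"


def parse_cymru_txt(txt: str) -> Tuple[int, str]:
    """Parse 'ASN | prefix | CC | registry | date' into (asn, prefix)."""
    fields = [field.strip() for field in txt.strip('"').split("|")]
    # A multi-origin prefix lists several ASNs separated by spaces; keep the first.
    return int(fields[0].split()[0]), fields[1]
```

For example, `match_prefix("104.16.132.229")` returns `"Cloudflare"`, and `cymru_origin_qname("104.16.132.229")` yields the name to query for a TXT record. An answer shaped like `13335 | 104.16.0.0/12 | US | arin | 2014-03-28` would parse to `(13335, "104.16.0.0/12")`.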

## Why multiple strategies

A single signal misses a lot:

- Cloudflare proxied behind another reverse proxy: HTTP headers are stripped, but ASN still resolves to AS13335.
- Akamai customer with a custom hostname: CNAME chain has no `akamai.net` reference, but the IP lands in `23.32.0.0/11`.
- Customer behind Cloudflare Tunnel (`cfargotunnel.com`): the resolved IP belongs to Cloudflare, but the *protection model* is fundamentally different — `tunnel_provider` surfaces this distinction.
- Sites running an on-prem WAF that doesn't add response headers: cookies often still leak the vendor.

`detection_methods` lists every strategy that fired, so you can judge how well supported the verdict is.
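
The last case above, where headers are silent but cookies still leak the vendor, can be sketched as follows. This is an assumption-laden illustration, not the package's code: `fingerprint` and the rule tables are hypothetical, they contain only the header and cookie names listed in this README, and the `cookie` method label is assumed (only `http_header`, `ip_prefix`, and `cname` appear in the examples here).

```python
from typing import Iterable, Mapping, Optional

# Hypothetical rule tables built from the names mentioned in this README.
EXACT_HEADERS = {"cf-ray": "Cloudflare", "x-amz-cf-id": "CloudFront"}
HEADER_PREFIXES = {"x-akamai-": "Akamai"}
COOKIE_PREFIXES = {"__cf_bm": "Cloudflare", "_abck": "Akamai", "incap_ses_": "Imperva"}


def _by_prefix(name: str, rules: Mapping[str, str]) -> Optional[str]:
    """Return the vendor of the first rule whose prefix starts name, else None."""
    return next((vendor for prefix, vendor in rules.items() if name.startswith(prefix)), None)


def fingerprint(headers: Mapping[str, str], cookie_names: Iterable[str]) -> dict:
    """Check headers and cookies independently and record which check fired."""
    header_vendor = None
    for name in headers:
        key = name.lower()  # header names are case-insensitive
        header_vendor = EXACT_HEADERS.get(key) or _by_prefix(key, HEADER_PREFIXES)
        if header_vendor:
            break

    cookie_vendor = None
    for cookie in cookie_names:
        cookie_vendor = _by_prefix(cookie, COOKIE_PREFIXES)
        if cookie_vendor:
            break

    methods = []
    if header_vendor:
        methods.append("http_header")
    if cookie_vendor:
        methods.append("cookie")  # assumed label, see lead-in
    return {"header_vendor": header_vendor, "cookie_vendor": cookie_vendor, "methods": methods}
```

A response with only a generic `Server` header but an `incap_ses_*` cookie comes back with `header_vendor` set to `None`, `cookie_vendor` set to `"Imperva"`, and `methods` equal to `["cookie"]`, which is the situation a headers-only tool would miss.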

## What it doesn't do

- Doesn't enumerate every vendor IP range. The IP-prefix list is curated, not exhaustive — for production-grade matching, supplement with vendor-published feeds (Cloudflare `/ips-v4`, AWS `ip-ranges.json`, etc.).
- Doesn't probe the origin behind the CDN. That's a separate problem (and a much more invasive one). For origin discovery, see [DDactic](https://ddactic.net) or commercial DAST tooling.
- Doesn't try to fingerprint custom on-prem appliances by deep behavior — only by easily observable signals.
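
If you do supplement the curated list with vendor-published feeds, the parsing side might look like the sketch below. It assumes you have already downloaded the feed bodies yourself; `parse_cidr_lines` and `parse_aws_ranges` are hypothetical helpers, not part of `cdn-detect`, and the sample data in the usage note uses documentation-only address ranges rather than real vendor prefixes. The AWS field names (`prefixes`, `ip_prefix`, `service`) follow the published `ip-ranges.json` layout.

```python
import ipaddress
import json
from typing import List, Tuple

Network = Tuple[ipaddress.IPv4Network, str]


def parse_cidr_lines(text: str, vendor: str) -> List[Network]:
    """Parse a plain-text feed with one IPv4 CIDR per line (the /ips-v4 style)."""
    return [
        (ipaddress.ip_network(line.strip()), vendor)
        for line in text.splitlines()
        if line.strip()
    ]


def parse_aws_ranges(text: str, service: str = "CLOUDFRONT",
                     vendor: str = "CloudFront") -> List[Network]:
    """Extract the IPv4 prefixes of one service from an ip-ranges.json body."""
    doc = json.loads(text)
    return [
        (ipaddress.ip_network(entry["ip_prefix"]), vendor)
        for entry in doc.get("prefixes", [])
        if entry.get("service") == service
    ]


def lookup(ip: str, networks: List[Network]) -> List[str]:
    """Return every vendor whose prefix contains ip."""
    addr = ipaddress.ip_address(ip)
    return [vendor for network, vendor in networks if addr in network]
```

Usage, with made-up sample bodies standing in for the downloaded feeds:

```python
aws_body = json.dumps({"prefixes": [
    {"ip_prefix": "192.0.2.0/24", "service": "CLOUDFRONT"},
    {"ip_prefix": "198.51.100.0/24", "service": "EC2"},
]})
networks = parse_cidr_lines("203.0.113.0/24\n", "Cloudflare") + parse_aws_ranges(aws_body)
print(lookup("192.0.2.10", networks))
```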

## Contributing signatures

The signature data lives in [`src/cdn_detect/signatures.py`](src/cdn_detect/signatures.py). PRs adding new providers or expanding IP ranges are very welcome. Please include a one-line citation (vendor doc, public IP feed URL, or observable example) in the PR description.

## License

MIT. See [LICENSE](LICENSE).

## Maintained by

[DDactic](https://ddactic.net) — DDoS resilience testing and external attack-surface management.
