browser-recon.
SCAN · scn_8a4b9c2f

Scan report for shopvanguard.com

https://shopvanguard.com  ·  captured 2 May 2026, 14:22 IST  ·  142 requests, 11 endpoints, 3m 18s
Stack: Next.js 14
Anti-bot: Cloudflare BM
Auth: JWT + cookie
Fingerprinting: JA3 + canvas
Recommended approach

Scrapable with curl_cffi + residential proxies, ~3 hrs of setup, ~$0.80 per 1k requests at scale.

Tooling: curl_cffi (92% confidence)
Proxy type: Residential (71% confidence)
Cost band: $0.50 – $1.40 per 1k req (range)
01 / DETECTION

Anti-bot stack

Cloudflare Bot Management is active in standard mode (not "I'm Under Attack"). No interactive challenge fired during the session — JS challenge was solved silently on first navigation.

Cloudflare Bot Management
Medium · standard mode
JS challenge issued on initial navigation, cleared via cf_clearance cookie. No Turnstile widget detected. __cf_bm rotates every 30 minutes.
cf-ray: 8d2a91f4eb1c4b73-BOM
server: cloudflare
set-cookie: __cf_bm=qH7fJ...; cf_clearance=bxQp...
challenge-platform: h/b/orchestrate/v1
JA3 / TLS fingerprinting
High · deterministic
Active TLS fingerprint inspection observed. The stock Python requests library and aiohttp will very likely be blocked, because each emits a JA3 hash that Cloudflare flags as automated.
expected JA3: 769,49195-49199-...,0-23-65281-10-11-...,29-23-24,0
default python-requests JA3: known-blocked
recommendation: curl_cffi (impersonate=chrome120) or tls-client
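A minimal sketch of the curl_cffi recommendation, assuming the third-party curl_cffi package is installed. The helper names (request_kwargs, fetch) and the proxy URL shown in the usage note are illustrative assumptions, not part of the captured session.

```python
from typing import Optional

# Browser profile named in the recommendation above.
IMPERSONATE = "chrome120"


def request_kwargs(proxy_url: Optional[str] = None, timeout: int = 30) -> dict:
    """Build keyword arguments for a curl_cffi request.

    proxy_url is a hypothetical residential proxy endpoint supplied by the caller.
    """
    kwargs = {"impersonate": IMPERSONATE, "timeout": timeout}
    if proxy_url:
        # Route both schemes through the same proxy.
        kwargs["proxies"] = {"http": proxy_url, "https": proxy_url}
    return kwargs


def fetch(url: str, proxy_url: Optional[str] = None):
    """GET a page with a Chrome-like TLS handshake instead of the default Python one."""
    # Imported lazily so the helper above works even where curl_cffi is missing.
    from curl_cffi import requests

    return requests.get(url, **request_kwargs(proxy_url))
```

Usage would be along the lines of fetch("https://shopvanguard.com", proxy_url="http://user:pass@proxy.example:8000"), where the proxy address is a placeholder. Whether chrome120 remains an accepted profile should be confirmed by replay testing.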
Canvas + WebGL fingerprinting
Low · passive
FingerprintJS Pro is loaded, but its signal appears to feed analytics rather than gate access. Safe to ignore for headless HTTP scraping; relevant only if using Playwright at scale.
02 / ARCHITECTURE

Site architecture

Next.js 14 with server-side rendering and React Server Components. The first page-load HTML embeds the serialized product data in a __NEXT_DATA__ script tag, so no browser is needed to extract listings: the JSON is already sitting in the markup.

<script id="__NEXT_DATA__" type="application/json">
{"props":{"pageProps":{"products":[{...}, ...]}},"page":"/category/[slug]"}
</script>
Implication: for category and product pages, a single GET + HTML parse extracts everything. No JS execution required. JSON API fallback exists for paginated/dynamic content — see endpoint inventory below.
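The single GET + HTML parse described above can be sketched with the Python standard library alone. The props.pageProps.products path follows the captured snippet; the product fields in the sample markup (id, name) are placeholders, since the real objects were elided in the capture.

```python
import json
from html.parser import HTMLParser


class NextDataParser(HTMLParser):
    """Collects the text content of <script id="__NEXT_DATA__">."""

    def __init__(self):
        super().__init__()
        self._inside = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("id") == "__NEXT_DATA__":
            self._inside = True

    def handle_endtag(self, tag):
        if tag == "script":
            self._inside = False

    def handle_data(self, data):
        if self._inside:
            self.chunks.append(data)


def extract_products(html: str) -> list:
    """Return the products array from the embedded JSON, or [] if it is absent."""
    parser = NextDataParser()
    parser.feed(html)
    parser.close()
    raw = "".join(parser.chunks).strip()
    if not raw:
        return []
    data = json.loads(raw)
    return data.get("props", {}).get("pageProps", {}).get("products", [])


# Sample markup with placeholder product fields.
SAMPLE_HTML = (
    '<html><body><script id="__NEXT_DATA__" type="application/json">'
    '{"props":{"pageProps":{"products":[{"id":1,"name":"Demo"}]}},'
    '"page":"/category/[slug]"}'
    "</script></body></html>"
)

if __name__ == "__main__":
    print(extract_products(SAMPLE_HTML))
```

A stream parser is used instead of a regular expression so that other script tags on the page cannot be matched by accident.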
03 / ENDPOINTS

Endpoint inventory

11 distinct API endpoints captured, deduplicated from 142 raw requests. Static assets (images, fonts, analytics beacons) filtered out.

Method  Path                   Purpose                     Auth
GET     /api/products          Catalog listing, paginated  none
GET     /api/products/[id]     Product detail              none
GET     /api/categories        Category tree               none
POST    /api/search            Algolia-backed search       app key in body
GET     /api/inventory/[sku]   Stock check                 none
POST    /api/cart/add          Cart mutation               session cookie
GET     /api/user/me           Profile                     JWT bearer
POST    /api/auth/login        Login → JWT                 credentials
GET     /api/recommendations   Related items               none
POST    /api/reviews           Submit review               JWT bearer
GET     /api/sitemap           Full URL list (gold)        none
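For planning, the inventory above can be held as plain data and filtered for the endpoints that need no credentials. This is an illustrative sketch; the Endpoint type and public_reads helper are assumed names, while the rows are copied from the table.

```python
from typing import List, NamedTuple


class Endpoint(NamedTuple):
    method: str
    path: str
    purpose: str
    auth: str


# Rows transcribed from the endpoint inventory.
ENDPOINTS = [
    Endpoint("GET", "/api/products", "Catalog listing, paginated", "none"),
    Endpoint("GET", "/api/products/[id]", "Product detail", "none"),
    Endpoint("GET", "/api/categories", "Category tree", "none"),
    Endpoint("POST", "/api/search", "Algolia-backed search", "app key in body"),
    Endpoint("GET", "/api/inventory/[sku]", "Stock check", "none"),
    Endpoint("POST", "/api/cart/add", "Cart mutation", "session cookie"),
    Endpoint("GET", "/api/user/me", "Profile", "JWT bearer"),
    Endpoint("POST", "/api/auth/login", "Login -> JWT", "credentials"),
    Endpoint("GET", "/api/recommendations", "Related items", "none"),
    Endpoint("POST", "/api/reviews", "Submit review", "JWT bearer"),
    Endpoint("GET", "/api/sitemap", "Full URL list", "none"),
]


def public_reads(endpoints: List[Endpoint]) -> List[str]:
    """Paths that are read-only (GET) and require no authentication."""
    return [e.path for e in endpoints if e.method == "GET" and e.auth == "none"]


if __name__ == "__main__":
    for path in public_reads(ENDPOINTS):
        print(path)
```

Six of the eleven endpoints fall into this unauthenticated, read-only group, which covers catalog, category, stock, recommendation, and sitemap data.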
04 / DIFFICULTY

What makes this site hard (or easy)

Rather than a single 1–10 score, here's what we actually observed and how each factor affects effort.

05 / COST

Cost estimate at 1M requests/month

Based on observed signals and current market rates for proxy and tooling. Verify with replay testing (Tier 2) before committing to client pricing.

curl_cffi (free, OSS) library license: MIT
$0 included
Residential proxy (Bright Data / Decodo / Oxylabs) ~300 GB at avg 300 KB per response
$450 – $900 $1.50–3.00 / GB
Compute (single VPS, async) can sustain ~50 req/sec on 2 vCPU
$20 – $40 monthly
Total estimate per 1M requests/month
$470 – $940 $0.47 – $0.94 / 1k
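The totals above can be reproduced from the line items. The sketch below sums the low and high ends of each range and converts to a per-1k figure; it also estimates wall-clock time at the quoted ~50 req/sec. Variable and function names are illustrative.

```python
# (low, high) monthly cost in USD for each line item in the estimate.
TOOLING = (0.0, 0.0)      # curl_cffi, MIT licensed
PROXY = (450.0, 900.0)    # residential proxy traffic
COMPUTE = (20.0, 40.0)    # single VPS

MONTHLY_REQUESTS = 1_000_000
SUSTAINED_RPS = 50        # quoted throughput on 2 vCPU


def total_range(*items):
    """Sum the low ends and the high ends separately."""
    return (sum(i[0] for i in items), sum(i[1] for i in items))


def per_thousand(total_usd: float, requests: int) -> float:
    """Cost per 1,000 requests."""
    return total_usd / (requests / 1000)


low, high = total_range(TOOLING, PROXY, COMPUTE)
hours = MONTHLY_REQUESTS / SUSTAINED_RPS / 3600

if __name__ == "__main__":
    print(f"total: ${low:.0f} - ${high:.0f}")
    print(f"per 1k: ${per_thousand(low, MONTHLY_REQUESTS):.2f} - "
          f"${per_thousand(high, MONTHLY_REQUESTS):.2f}")
    print(f"runtime at {SUSTAINED_RPS} req/sec: {hours:.1f} h")
```

At the quoted throughput, 1M requests take roughly 5.6 hours of sustained crawling, so a single VPS leaves ample headroom within a month.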
06 / EVIDENCE

Raw evidence trail

Every claim above is derived from data captured in your session.

Captured response headers (5 representative)
HTTP/2 200
server: cloudflare
cf-ray: 8d2a91f4eb1c4b73-BOM
cf-cache-status: DYNAMIC
content-type: text/html; charset=utf-8
set-cookie: __cf_bm=qH7fJlR...; HttpOnly; Secure; SameSite=None
set-cookie: cf_clearance=bxQp...; HttpOnly; Secure
strict-transport-security: max-age=31536000
x-frame-options: SAMEORIGIN
x-vercel-id: bom1::iad1::abc123-1714659712345-7fa8d
JS files matched to anti-bot vendors (3)
/cdn-cgi/challenge-platform/h/b/orchestrate/v1 → Cloudflare BM
/_next/static/chunks/fp-pro.bundle.js → FingerprintJS Pro
/cdn-cgi/bm/cv/669835187/api.js → Cloudflare BM (telemetry)
Cookies observed (names only, values redacted)
__cf_bm        HttpOnly Secure SameSite=None    TTL=30m
cf_clearance   HttpOnly Secure                  TTL=30d
session_id     HttpOnly Secure SameSite=Lax     TTL=session
sv_jwt         HttpOnly Secure SameSite=Strict  TTL=24h
_vrcid         (analytics, not session-critical)
sv_cart_token  Secure                           TTL=30d
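The TTLs listed above determine how long a captured cookie set stays usable. As a rough sketch, the shorthand values can be converted to seconds to find the cookie that expires first; the helper names are illustrative, and _vrcid is omitted because no TTL was recorded for it.

```python
from typing import Dict, Optional, Tuple

# TTLs as recorded in the cookie listing ("session" means no fixed lifetime).
COOKIE_TTLS = {
    "__cf_bm": "30m",
    "cf_clearance": "30d",
    "session_id": "session",
    "sv_jwt": "24h",
    "sv_cart_token": "30d",
}

UNIT_SECONDS = {"m": 60, "h": 3600, "d": 86400}


def ttl_seconds(ttl: str) -> Optional[int]:
    """Convert shorthand such as '30m' to seconds; None for session cookies."""
    if ttl == "session":
        return None
    return int(ttl[:-1]) * UNIT_SECONDS[ttl[-1]]


def shortest_lived(ttls: Dict[str, str]) -> Tuple[str, int]:
    """Name and lifetime (in seconds) of the cookie that expires first."""
    timed = {}
    for name, ttl in ttls.items():
        seconds = ttl_seconds(ttl)
        if seconds is not None:
            timed[name] = seconds
    return min(timed.items(), key=lambda item: item[1])


if __name__ == "__main__":
    print(shortest_lived(COOKIE_TTLS))
```

Here __cf_bm is the limiting cookie at 1,800 seconds, which matches the 30-minute rotation noted in the detection section.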
Sample request payload — POST /api/search
{
  "query": "running shoes",
  "filters": {"category": "footwear"},
  "page": 1,
  "perPage": 24,
  "appId": "VANGUARD_PROD_PUBLIC",
  "apiKey": "9f3b...c12d"
}
note: appId + apiKey are public Algolia credentials, visible in client bundle. Direct Algolia access possible; bypasses /api/search entirely.
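A small sketch of building this payload for successive pages. The field names mirror the captured request; the function name is illustrative, the API key is left for the caller to supply (it is truncated in the capture), and sending an empty filters object when no category is given is an assumption that has not been tested against the endpoint.

```python
import json
from typing import Optional


def build_search_payload(
    query: str,
    page: int,
    app_id: str,
    api_key: str,
    category: Optional[str] = None,
    per_page: int = 24,
) -> dict:
    """Assemble a request body in the shape observed for POST /api/search."""
    return {
        "query": query,
        # Assumption: an empty object is acceptable when no filter is applied.
        "filters": {"category": category} if category else {},
        "page": page,
        "perPage": per_page,
        "appId": app_id,
        "apiKey": api_key,
    }


if __name__ == "__main__":
    # "<apiKey>" is a placeholder; the real value was redacted in the capture.
    for page in (1, 2, 3):
        body = build_search_payload(
            "running shoes", page, "VANGUARD_PROD_PUBLIC", "<apiKey>",
            category="footwear",
        )
        print(json.dumps(body))
```

Only the page field changes between iterations; the upper bound on perPage and the total page count would need to be read from a live response.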