[Crawl4AI Documentation (v0.7.x)](https://crawl4ai.docslib.dev/)

## Handling Lazy-Loaded Images

Many sites now lazy-load images as the user scrolls. If you want to ensure they appear in your final crawl (and in `result.media`), consider:

1. `wait_for_images=True` – Wait for images to fully load.
2. `scan_full_page` – Force the crawler to scroll the entire page, triggering lazy loads.
3. `scroll_delay` – Add small delays between scroll steps.

Note: If the site requires multiple "Load More" triggers or complex interactions, see the [Page Interaction docs](https://crawl4ai.docslib.dev/core/page-interaction/). For sites with virtual scroll (Twitter/Instagram-style), see the [Virtual Scroll docs](https://crawl4ai.docslib.dev/advanced/virtual-scroll/).

### Example: Ensuring Lazy Images Appear

```
import asyncio
from crawl4ai import AsyncWebCrawler, CrawlerRunConfig, BrowserConfig
from crawl4ai.async_configs import CacheMode

async def main():
    config = CrawlerRunConfig(
        # Force the crawler to wait until images are fully loaded
        wait_for_images=True,

        # Option 1: If you want to automatically scroll the page to load images
        scan_full_page=True,  # Tells the crawler to try scrolling the entire page
        scroll_delay=0.5,     # Delay (seconds) between scroll steps

        # Option 2: If the site uses a 'Load More' button or JS triggers for images,
        # you can also specify js_code or wait_for logic here

        cache_mode=CacheMode.BYPASS,
        verbose=True
    )

    async with AsyncWebCrawler(config=BrowserConfig(headless=True)) as crawler:
        result = await crawler.arun("https://www.example.com/gallery", config=config)

        if result.success:
            images = result.media.get("images", [])
            print("Images found:", len(images))

            for i, img in enumerate(images[:5]):
                print(f"[Image {i}] URL: {img['src']}, Score: {img.get('score','N/A')}")
        else:
            print("Error:", result.error_message)

if __name__ == "__main__":
    asyncio.run(main())
```


Explanation:

- `wait_for_images=True`
  The crawler tries to ensure images are fully loaded before finalizing the HTML.
- `scan_full_page=True`
  Tells the crawler to attempt scrolling from top to bottom. Each scroll step helps trigger lazy loading.
- `scroll_delay=0.5`
  Pauses half a second between each scroll step, giving the site time to finish loading images before continuing.

When to use:

- Lazy loading: If images only appear once the user scrolls them into the viewport, `scan_full_page` + `scroll_delay` helps the crawler capture them.
- Heavier pages: If a page is extremely long, be aware that a full scan can be slow. Adjust `scroll_delay` or the maximum number of scroll steps as needed.

## Combining with Other Link & Media Filters

You can still combine the lazy-loading logic with the usual `exclude_external_images`, `exclude_domains`, or link filtering:

```
config = CrawlerRunConfig(
    wait_for_images=True,
    scan_full_page=True,
    scroll_delay=0.5,

    # Filter out external images if you only want local ones
    exclude_external_images=True,

    # Exclude links to specific domains
    exclude_domains=["spammycdn.com"],
)
```



This approach captures all images from the main domain while ignoring external ones, and the crawler physically scrolls the entire page so that lazy loading is triggered.

## Tips & Troubleshooting

1. Long pages
   - Setting `scan_full_page=True` on extremely long or infinite-scroll pages can be resource-intensive.
   - Consider using [hooks](https://crawl4ai.docslib.dev/core/page-interaction/) or specialized logic to load specific sections or repeatedly trigger "Load More".

2. Mixed image behavior
   - Some sites load images in batches as you scroll. If images are being missed, increase `scroll_delay` or issue partial scrolls in a loop via JS code or hooks.
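As a minimal sketch of the partial-scroll approach, a list of JS snippets can be passed via `js_code` (the three-step split and the post-script delay below are illustrative assumptions, not values from the original):

```
from crawl4ai import CrawlerRunConfig

# Sketch: scroll the page in thirds so each batch of lazy images
# has a chance to load before the next scroll. Step count is arbitrary.
config = CrawlerRunConfig(
    wait_for_images=True,
    js_code=[
        "window.scrollTo(0, document.body.scrollHeight / 3);",
        "window.scrollTo(0, 2 * document.body.scrollHeight / 3);",
        "window.scrollTo(0, document.body.scrollHeight);",
    ],
    # Give the page a moment after the scripts run before capturing HTML
    delay_before_return_html=1.0,
)
```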

3. Combining with dynamic waits
   - If the site swaps placeholder images for real ones only after a certain event, use `wait_for="css:img.loaded"` or a custom JS `wait_for`.
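A minimal sketch of that dynamic wait, assuming the site marks fully loaded images with a `loaded` CSS class (a site-specific assumption):

```
from crawl4ai import CrawlerRunConfig

# Wait until an <img> carries the (hypothetical) "loaded" class
# before the crawler captures the final HTML.
config = CrawlerRunConfig(
    wait_for_images=True,
    wait_for="css:img.loaded",
)
```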

4. Caching
   - If `cache_mode` is enabled, repeated crawls may skip some network fetches. If you suspect caching is hiding newly added images, set `cache_mode=CacheMode.BYPASS` to force a fresh fetch.

With lazy-loading support and the `wait_for_images` and `scan_full_page` settings, you can capture the complete image gallery or feed you expect, even when the site only loads images as the user scrolls. Combined with the standard media filtering and domain exclusion strategies, this gives you a complete link-and-media handling solution.
