{% extends "core/base.html" %} {% load static %} {% load i18n %} {% block breadcrumbs %} {% endblock %} {% block extra_head %} {% endblock %} {% block body %}

{% if stdout %}

Add new URLs to your archive: results

                {{ stdout | safe }}
                


  Add more URLs ➕
{% else %}
{% csrf_token %}

Create a new Crawl

A Crawl is a job that processes URLs and creates a Snapshot (archived copy) for each URL discovered.
The settings below apply to the entire crawl and all snapshots it creates.

Chrome Extension: Get the extension
💡 Tip: Instantly save a single URL by visiting: {{ web_base_url }}/web/https://example.com/url_to_save

{{ form.url.label_tag }}
0 URLs detected
{{ form.url }}
{% if form.url.errors %}
{{ form.url.errors }}
{% endif %}
{{ form.tag.label_tag }}
{{ form.tag }} {% if form.tag.errors %}
{{ form.tag.errors }}
{% endif %}
Tags will be applied to all snapshots created by this crawl.
{{ form.persona.label_tag }}
{{ form.persona }} {% if form.persona.errors %}
{{ form.persona.errors }}
{% endif %}
Authentication and configuration settings to use when saving URLs (cookies, user agent, resolution, timeouts, etc.). {% if can_override_crawl_config %} Create new profile / import from Chrome → {% endif %}
{{ form.permissions.label_tag }}
{{ form.permissions }} {% if form.permissions.errors %}
{{ form.permissions.errors }}
{% endif %}
Public: listed publicly. Unlisted: served only via direct links. Private: requires admin login.
{{ form.depth.label_tag }} {{ form.depth }} {% if form.depth.errors %}
{{ form.depth.errors }}
{% endif %}
Controls how many links deep the crawl will follow from the starting URLs.
{{ form.url_filters }} {% if form.url_filters.errors %}
{{ form.url_filters.errors }}
{% endif %}
{{ form.max_urls.label_tag }}
{{ form.max_urls }} {% if form.max_urls.errors %}
{{ form.max_urls.errors }}
{% endif %}
0 = unlimited. Whole numbers, e.g. 25, 300.
{{ form.crawl_max_size.label_tag }}
{{ form.crawl_max_size }} {% if form.crawl_max_size.errors %}
{{ form.crawl_max_size.errors }}
{% endif %}
0 = unlimited. Sizes: 45mb, 1.5gb, 2tb.
{{ form.crawl_timeout.label_tag }}
{{ form.crawl_timeout }} {% if form.crawl_timeout.errors %}
{{ form.crawl_timeout.errors }}
{% endif %}
0 = unlimited. Otherwise must be longer than 10 seconds, e.g. 11, 1.5m, 1hr.
{{ form.timeout.label_tag }}
{{ form.timeout }} {% if form.timeout.errors %}
{{ form.timeout.errors }}
{% endif %}
Must be longer than 10 seconds, e.g. 11, 1.5m, 1hr.
{{ form.snapshot_max_size.label_tag }}
{{ form.snapshot_max_size }} {% if form.snapshot_max_size.errors %}
{{ form.snapshot_max_size.errors }}
{% endif %}
0 = unlimited. Sizes: 45mb, 1.5gb, 2tb.
{{ form.delete_after.label_tag }}
{{ form.delete_after }} {% if form.delete_after.errors %}
{{ form.delete_after.errors }}
{% endif %}
0 = keep forever. Durations: 1hr, 30d, 3mo.
{{ form.crawl_max_concurrent_snapshots.label_tag }}
{{ form.crawl_max_concurrent_snapshots }} {% if form.crawl_max_concurrent_snapshots.errors %}
{{ form.crawl_max_concurrent_snapshots.errors }}
{% endif %}
Whole numbers, e.g. 1, 4, 12.
{{ form.notes.label_tag }} {{ form.notes }} {% if form.notes.errors %}
{{ form.notes.errors }}
{% endif %}
Optional description for this crawl (visible in the admin interface).
{% if can_override_crawl_config %}

Crawl Plugins

Select which archiving methods to run for all snapshots in this crawl. If none are selected, all available plugins will be used. View plugin details →

Presets: {% for persona in recent_personas %} {% endfor %}
{% include "core/plugin_config_grid.html" with plugin_groups=form.plugin_groups %}

Advanced Crawl Options

Additional settings that control how this crawl processes URLs and creates snapshots.

{{ form.schedule.label_tag }} {{ form.schedule }} {% if form.schedule.errors %}
{{ form.schedule.errors }}
{% endif %}
Optional: Schedule this crawl to repeat automatically. Examples:
daily - Run once per day
weekly - Run once per week
0 */6 * * * - Every 6 hours (cron format)
0 0 * * 0 - Every Sunday at midnight (cron format)
{{ form.start_paused }} {{ form.start_paused.label_tag }} {% if form.start_paused.errors %}
{{ form.start_paused.errors }}
{% endif %}
Create the crawl in a paused state. No snapshots will be created until you resume it.
{{ form.config.label_tag }} {{ form.config }} {% if form.config.errors %}
{{ form.config.errors }}
{% endif %}
Override any config option for this crawl (e.g., TIMEOUT, USER_AGENT, CHROME_BINARY). URL_ALLOWLIST, URL_DENYLIST, and ENABLED_PLUGINS are updated automatically from the fields above.
{% endif %}



{% if absolute_add_path %} {% endif %} {% endif %}
{% endblock %} {% block footer %}{% endblock %} {% block sidebar %}{% endblock %}