{% extends "core/base.html" %} {% load static %} {% load i18n %} {% block breadcrumbs %} {% endblock %} {% block extra_head %} {% endblock %} {% block body %}


{% if stdout %}

Add new URLs to your archive: results

                {{ stdout | safe }}
                


  Add more URLs ➕
{% else %}
{% csrf_token %}

Create a new Crawl

A Crawl is a job that processes URLs and creates a Snapshot (an archived copy) for each URL it discovers. The settings below apply to the entire crawl and all snapshots it creates.

{{ form.url.label_tag }}
0 URLs detected
{{ form.url }}
{% if form.url.errors %}
{{ form.url.errors }}
{% endif %}
{{ form.tag.label_tag }}
{{ form.tag }} {% if form.tag.errors %}
{{ form.tag.errors }}
{% endif %}
Tags will be applied to all snapshots created by this crawl.
{{ form.persona.label_tag }}
{{ form.persona }} {% if form.persona.errors %}
{{ form.persona.errors }}
{% endif %}
Authentication + configuration settings to use when saving URLs (cookies, user agent, resolution, timeouts, etc.) {% if can_override_crawl_config %} Create new profile / import from Chrome → {% endif %}
{{ form.permissions.label_tag }}
{{ form.permissions }} {% if form.permissions.errors %}
{{ form.permissions.errors }}
{% endif %}
Public: listed publicly. Unlisted: served only via direct link. Private: requires admin login.
{{ form.depth.label_tag }} {{ form.depth }} {% if form.depth.errors %}
{{ form.depth.errors }}
{% endif %}
Controls how many links deep the crawl will follow from the starting URLs.
{{ form.url_filters }} {% if form.url_filters.errors %}
{{ form.url_filters.errors }}
{% endif %}
{{ form.max_urls.label_tag }}
{{ form.max_urls }} {% if form.max_urls.errors %}
{{ form.max_urls.errors }}
{% endif %}
0 = unlimited. Whole numbers, e.g. 25, 300.
{{ form.crawl_max_size.label_tag }}
{{ form.crawl_max_size }} {% if form.crawl_max_size.errors %}
{{ form.crawl_max_size.errors }}
{% endif %}
0 = unlimited. Sizes: 45mb, 1.5gb, 2tb.
{{ form.crawl_timeout.label_tag }}
{{ form.crawl_timeout }} {% if form.crawl_timeout.errors %}
{{ form.crawl_timeout.errors }}
{% endif %}
0 = unlimited. Must be >10s: 11, 1.5m, 1hr.
{{ form.timeout.label_tag }}
{{ form.timeout }} {% if form.timeout.errors %}
{{ form.timeout.errors }}
{% endif %}
Must be >10s: 11, 1.5m, 1hr.
{{ form.snapshot_max_size.label_tag }}
{{ form.snapshot_max_size }} {% if form.snapshot_max_size.errors %}
{{ form.snapshot_max_size.errors }}
{% endif %}
0 = unlimited. Sizes: 45mb, 1.5gb, 2tb.
{{ form.delete_after.label_tag }}
{{ form.delete_after }} {% if form.delete_after.errors %}
{{ form.delete_after.errors }}
{% endif %}
0 = keep forever. Durations: 1hr, 30d, 3mo.
{{ form.crawl_max_concurrent_snapshots.label_tag }}
{{ form.crawl_max_concurrent_snapshots }} {% if form.crawl_max_concurrent_snapshots.errors %}
{{ form.crawl_max_concurrent_snapshots.errors }}
{% endif %}
Whole numbers, e.g. 1, 4, 12.
{{ form.notes.label_tag }} {{ form.notes }} {% if form.notes.errors %}
{{ form.notes.errors }}
{% endif %}
Optional description for this crawl (visible in the admin interface).
{% if can_override_crawl_config %}

Crawl Plugins

Select which archiving methods to run for all snapshots in this crawl. If none are selected, all available plugins will be used. View plugin details →

Quick Select: {% for persona in recent_personas %} {% endfor %}
{% include "core/plugin_config_grid.html" with plugin_groups=form.plugin_groups %}

Advanced Crawl Options

Additional settings that control how this crawl processes URLs and creates snapshots.

{{ form.schedule.label_tag }} {{ form.schedule }} {% if form.schedule.errors %}
{{ form.schedule.errors }}
{% endif %}
Optional: Schedule this crawl to repeat automatically. Examples:
daily - Run once per day
weekly - Run once per week
0 */6 * * * - Every 6 hours (cron format)
0 0 * * 0 - Every Sunday at midnight (cron format)
{{ form.index_only }} {{ form.index_only.label_tag }} {% if form.index_only.errors %}
{{ form.index_only.errors }}
{% endif %}
Create the crawl and queue snapshots without running archive plugins yet.
{{ form.config.label_tag }} {{ form.config }} {% if form.config.errors %}
{{ form.config.errors }}
{% endif %}
Override any config option for this crawl (e.g. TIMEOUT, USER_AGENT, CHROME_BINARY). URL_ALLOWLIST, URL_DENYLIST, and ENABLED_PLUGINS are updated automatically from the fields above.
{% endif %}



{% if absolute_add_path %} {% endif %} {% endif %}
{% endblock %} {% block footer %}{% endblock %} {% block sidebar %}{% endblock %}