{% from 'partials/hero_cta.html' import hero_cta %} {% from 'partials/optimizer_preset_selector.html' import optimizer_preset_selector %} {% if tab_name == 'publishing' %} {% elif tab_name == 'hardware' %} {% elif tab_name == 'model' %} {% elif tab_name == 'basic' %} {% elif tab_name == 'training' %} {% elif tab_name == 'validation' %} {% endif %} {% if prompt_libraries is defined %} {% endif %} {% if tab_name == 'publishing' %} {% elif tab_name == 'hardware' %} {% elif tab_name == 'model' %} {% elif tab_name == 'basic' %} {% elif tab_name == 'training' %} {% elif tab_name == 'validation' %} {% endif %}
{% if tab_name == 'publishing' %} {% call hero_cta( theme='green', icon_class='fa-cloud-upload-alt', icon_shape='rounded', title='Publishing to HuggingFace Hub', description='Configure automatic uploads of your trained models and checkpoints to the HuggingFace Hub. This allows you to share your work, version your models, and make them accessible to others.', features=[ {'icon': 'fa-key', 'color': 'text-warning', 'label': 'Authentication'}, {'icon': 'fa-code-branch', 'color': 'text-info', 'label': 'Repository Config'}, {'icon': 'fa-upload', 'color': 'text-success', 'label': 'Auto-Upload'}, ], show_condition='showHeroCTA()', dismiss_method='dismissHeroCTA()', cta_primary={'label': 'Dismiss', 'icon': 'fa-check', 'action': 'dismissHeroCTA()'}, tip_text='You can upload checkpoints manually from the Checkpoints tab, or enable automatic uploads here.' ) %}
Authentication

Connect your HuggingFace account using an access token. Tokens are stored securely in your local HuggingFace cache (~/.cache/huggingface/token).

Repository Settings

Configure your Hub repository ID, visibility (public/private), and whether to push checkpoints automatically during training.

Automatic Uploads

Enable "Push to Hub" to automatically upload checkpoints as they're saved. Great for long training runs where you want continuous backups.

{% endcall %}
{% elif tab_name == 'hardware' %} {% call hero_cta( theme='purple', icon_class='fa-microchip', icon_shape='rounded', title='Hardware & Distributed Training', description='Configure multi-GPU training, distributed computing, and system resource settings. Most users can leave these at their defaults — SimpleTuner auto-detects your hardware.', features=[ {'icon': 'fa-project-diagram', 'color': 'text-info', 'label': 'Multi-GPU'}, {'icon': 'fa-server', 'color': 'text-warning', 'label': 'Multi-Node'}, {'icon': 'fa-cogs', 'color': 'text-success', 'label': 'Worker Threads'}, ], show_condition='showHeroCTA()', dismiss_method='dismissHeroCTA()', cta_primary={'label': 'Dismiss', 'icon': 'fa-check', 'action': 'dismissHeroCTA()'}, tip_text='For single-GPU training, you typically don\'t need to change anything here. These settings become important for multi-GPU or multi-machine setups.' ) %}
Multi-GPU (FSDP/DeepSpeed)

Enable Fully Sharded Data Parallel (FSDP) or DeepSpeed to split your model across multiple GPUs, reducing memory per device and enabling larger batch sizes.

Multi-Node Training

Scale beyond a single machine. Note: Multi-node via WebUI is experimental — for production use, follow DISTRIBUTED.md and use the command-line approach.

Worker & Thread Settings

Tune the number of data loading workers and PyTorch threads to optimize CPU utilization during preprocessing and training.

{% endcall %}
{% elif tab_name == 'basic' %}

Quick Start — Easy Mode

Get training quickly with these essential settings. This simplified view covers the core options you need to start your first training run. For advanced configuration, dismiss this panel to access the full form.

1 Project Identity

Give your training run a name so you can track it in logging platforms and find your checkpoints later.

Groups related training runs together in WandB or TensorBoard. Think of it as a folder name.
Identifies this specific training attempt. Use something descriptive like "flux-style-v2" or "character-lora-attempt3".
WandB provides cloud-based dashboards with loss graphs, sample images, and easy sharing. TensorBoard runs locally. None disables logging entirely.
Where checkpoints and trained weights will be saved. Use absolute paths (like /home/user/models) or relative paths from the SimpleTuner directory.
Continue training from a previous checkpoint. "Latest" automatically finds the most recent one. Useful if training was interrupted or you want to train longer. Note: Cloud training runs are currently stateless and do not support resuming from checkpoints.
2 Dataset Configuration

Connect your training images. Create dataset configurations in the Datasets tab first, then select them here.

The dataset plan that defines where your images are located and how they should be processed. Create new ones in the Datasets tab.
Images processed per training step (per GPU). Higher = faster but needs more VRAM. Start with 1-4.
Advanced Dataset Options
3 Dataset Defaults
These are fallback values. They apply to new datasets or datasets where you haven't specified these options. Individual dataset entries can override these settings.
Default target resolution in pixels. Must be divisible by 64. SDXL/Flux typically use 1024, SD1.5 uses 512. Higher = better quality but more VRAM.
Smallest acceptable image size. Images smaller than this (on any side) would have to be upscaled to reach the training resolution, which can harm quality. Leave empty for no minimum. Recommended: set near your training resolution.
Without this limit, very large images (e.g., 4000×3000) get randomly cropped to training resolution, often cutting out the subject entirely and training on meaningless background patches. Set this to downsample oversized images first, preserving the full scene before cropping.
When downsampling oversized images, resize them to this target before cropping. Works with Maximum Image Size. Leave empty to disable.
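The interaction between the resolution rule and the two size limits can be sketched in a few lines. This is a minimal illustration, not SimpleTuner's implementation: the function names are hypothetical, and it assumes the maximum is checked against the longer side while the downsample target applies to the shorter side.

```python
def snap_to_multiple(value: int, multiple: int = 64) -> int:
    """Round a resolution down to the nearest multiple (at least one multiple)."""
    return max(multiple, (value // multiple) * multiple)


def downsample_size(width: int, height: int, max_side: int, target_side: int) -> tuple:
    """Return the size an image would be resized to before cropping.

    Assumption: an image counts as oversized when its longer side exceeds
    max_side, and it is then scaled so its shorter side equals target_side.
    Images within the limit are returned unchanged.
    """
    if max(width, height) <= max_side:
        return (width, height)
    scale = target_side / min(width, height)
    return (round(width * scale), round(height * scale))
```

Under these assumptions, a 4000×3000 image with a maximum of 2048 and a target of 1536 is resized to 2048×1536 before cropping, so the crop still covers most of the scene.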
Text Files: Look for photo.txt alongside photo.jpg (matching filename, different extension).
Filename: Use the image filename as the caption.
Instance Prompt: Same caption for all images (set below).
Parquet: Load from structured dataset files.
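As a rough sketch of how the first three caption strategies differ, the helper below resolves a caption for one image. The strategy keys and the function name are illustrative assumptions, the filename strategy is simplified to the bare file stem, and Parquet loading is left out.

```python
from pathlib import Path


def resolve_caption(image_path: str, strategy: str, instance_prompt: str = "") -> str:
    """Pick a caption for one image according to the chosen strategy."""
    path = Path(image_path)
    if strategy == "textfile":
        # photo.jpg -> photo.txt in the same directory
        return path.with_suffix(".txt").read_text(encoding="utf-8").strip()
    if strategy == "filename":
        # Simplification: use the filename without its extension as-is
        return path.stem
    if strategy == "instanceprompt":
        # Same caption for every image
        return instance_prompt
    raise ValueError("unsupported caption strategy: " + strategy)
```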
{% elif tab_name == 'training' %}

Training Parameters — Easy Mode

Configure how long and how fast your model learns. These settings control the training process itself—get them right and your model improves steadily; get them wrong and you waste time or damage quality.

1 Training Duration

Choose how long to train. You can specify either epochs (passes through your dataset) or a fixed number of steps. For small datasets (under 100 images), 1-3 epochs is typical. For larger datasets, you might use fewer epochs or set a step limit.

One epoch = one complete pass through all your training images. More epochs let the model see your data more times, but too many cause overfitting: the model memorizes instead of learning. Start with 1-3 for LoRA, 5-10 for full fine-tuning.
If set, training stops after this many steps regardless of epochs. Useful for consistent training runs across different dataset sizes. Common ranges: 500-2000 for LoRA, 5000-20000 for full fine-tuning. Leave empty to use epochs instead.
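To see how epochs and steps relate, the arithmetic can be sketched as follows. The function names are hypothetical, and the sketch assumes no gradient accumulation and no dataset repeats, both of which would change the real step count.

```python
import math


def steps_per_epoch(num_images: int, batch_size: int, num_gpus: int = 1) -> int:
    """One step consumes batch_size images on each GPU."""
    return math.ceil(num_images / (batch_size * num_gpus))


def total_steps(num_images: int, batch_size: int, epochs: int,
                max_steps: int = 0, num_gpus: int = 1) -> int:
    """A non-zero max_steps wins over epochs, as described above."""
    if max_steps:
        return max_steps
    return epochs * steps_per_epoch(num_images, batch_size, num_gpus)
```

For example, 80 images at batch size 4 give 20 steps per epoch, so 3 epochs come to 60 steps.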
2 Checkpointing

Checkpoints are snapshots of your model during training. They let you resume if training is interrupted, compare quality at different stages, and pick the best version if you overtrain.

Creates a checkpoint every N training steps. Lower values (100-250) give more recovery points but use more disk space. Higher values (500-1000) save less often. Match this to your training length—if training 1000 steps, 250 gives you 4 checkpoints to compare.
Additionally save at the end of every N epochs. Useful for long training runs where you want guaranteed saves at epoch boundaries. Leave empty to only use step-based checkpointing.
Maximum checkpoints kept on disk. Older ones are automatically deleted. Set to 0 for unlimited (warning: can use significant disk space). 3-5 is usually enough to find the best training point.
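The step interval and the checkpoint limit combine in a simple way. The sketch below uses hypothetical helper names and assumes the oldest checkpoints are pruned first.

```python
def checkpoint_steps(total_steps: int, every_n_steps: int) -> list:
    """Steps at which a checkpoint is written."""
    return list(range(every_n_steps, total_steps + 1, every_n_steps))


def kept_checkpoints(saved_steps: list, limit: int) -> list:
    """Checkpoints still on disk after pruning; a limit of 0 keeps everything."""
    if limit == 0:
        return list(saved_steps)
    return list(saved_steps)[-limit:]
```

A 1000-step run with an interval of 250 saves at steps 250, 500, 750, and 1000. With a limit of 3, the step-250 checkpoint is gone by the end of the run.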
3 Learning Rate

The learning rate controls how much the model changes with each training step. Too high and training becomes unstable or "fries" the model. Too low and progress is painfully slow. This is often the most impactful hyperparameter.

LoRA typical range: 1e-4 to 5e-4 (0.0001 to 0.0005)
Full fine-tuning: 1e-6 to 1e-5 (0.000001 to 0.00001)
If you see NaN losses or wildly jumping values, your LR is too high. If loss barely moves, it's too low.
Constant: Same LR throughout—simple and predictable.
Constant with Warmup: Starts low, ramps up, then stays flat—helps stability.
Cosine/Sine: Gradually decreases LR—often produces smoother results for longer runs.
Number of steps to gradually increase LR from zero. Prevents early instability when the model hasn't "warmed up" yet. Typical: 5-10% of total steps. For a 1000-step run, try 50-100 warmup steps.
The LR decays to this value by the end of training. Applies only to the cosine, sine, and polynomial schedulers. Usually set 10-100x lower than your starting LR (e.g., if LR is 1e-4, try lr_end of 1e-6).
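Warmup, decay, and the end value fit together as in the sketch below, which uses a linear warmup followed by cosine decay. It is an illustration under that assumption only; the schedulers shipped with the trainer may shape the curve differently, and the function name is hypothetical.

```python
import math


def lr_at_step(step: int, total_steps: int, base_lr: float,
               warmup_steps: int = 0, lr_end: float = 0.0) -> float:
    """Linear warmup from zero, then cosine decay from base_lr to lr_end."""
    if warmup_steps > 0 and step < warmup_steps:
        return base_lr * step / warmup_steps
    decay_steps = max(1, total_steps - warmup_steps)
    progress = min(1.0, (step - warmup_steps) / decay_steps)
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return lr_end + (base_lr - lr_end) * cosine
```

For a 1000-step run with 100 warmup steps, the LR reaches its starting value at step 100 and lands on lr_end at step 1000.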
4 Optimizer & Stability

The optimizer is the algorithm that actually updates model weights based on gradients. Different optimizers have different memory requirements and behaviors.

{{ optimizer_preset_selector('getOptimizerPresetCards()', 'selectedOptimizerPreset === preset.key', 'selectOptimizerPreset') }}
adamw_bf16: Fast, reliable default for most GPUs with bfloat16 support.
adamw8bit: Uses less VRAM with minimal quality loss—good for memory-constrained setups.
prodigy: Auto-tunes learning rate—set LR to 1.0 and let it figure things out (experimental).
Clips gradients that exceed this magnitude, preventing "exploding gradients" that can destabilize training. Default 2.0 works well for most cases. If training is unstable with difficult data, try lowering to 1.0. Set to 0 to disable clipping entirely (not recommended).
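Gradient clipping is commonly implemented as a rescale by the global norm, which is what this plain-Python sketch shows. It assumes norm-based clipping over a flat list of gradient values; the function name is hypothetical, and a real trainer would operate on tensors.

```python
import math


def clip_by_global_norm(grads: list, max_norm: float) -> tuple:
    """Scale gradients so their L2 norm does not exceed max_norm.

    Returns the (possibly rescaled) gradients and the norm before clipping.
    A max_norm of 0 disables clipping.
    """
    total_norm = math.sqrt(sum(g * g for g in grads))
    if max_norm <= 0 or total_norm <= max_norm:
        return (list(grads), total_norm)
    scale = max_norm / total_norm
    return ([g * scale for g in grads], total_norm)
```

Gradients of [3.0, 4.0] have norm 5.0. With a limit of 2.0 they are scaled by 0.4, which keeps their direction and shrinks only the magnitude.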
5 Flow Schedule (Flux / SD3 / Flow Models)

Flow-matching models (like Flux) use a noise schedule that can be shifted to change what the model focuses on learning. This only applies to flow-based architectures—ignore for standard diffusion models.

Automatically calculates optimal schedule shift based on your training resolution. Enable this for a hands-off approach. When enabled, the manual shift value is ignored.
Higher values (3-5): Focus on composition and large-scale structure.
Lower values (1-2): Focus on fine details and textures.
Default of 3.0 is balanced. Adjust based on what your training data emphasizes.
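One common formulation of schedule shift, used for SD3-style flow-matching schedules, remaps each noise level sigma in [0, 1] as shown here. Whether the trainer applies exactly this form is an assumption, and the function name is hypothetical.

```python
def shift_sigma(sigma: float, shift: float) -> float:
    """Remap a noise level in [0, 1]; shift > 1 pushes it toward more noise."""
    return shift * sigma / (1.0 + (shift - 1.0) * sigma)
```

With a shift of 3.0, a mid-schedule sigma of 0.5 moves to 0.75, so more of training is spent at high noise levels where composition is decided. A shift of 1.0 leaves the schedule unchanged.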
{% elif tab_name == 'validation' %}

Validation — Easy Mode

Validation generates sample images during training so you can see how your model is learning. These previews help you catch problems early—like overfitting or wrong concepts—before wasting hours on a bad run.

1 Validation Schedule

Control how often validation images are generated. More frequent validation gives you better visibility into training progress but adds overhead. Balance between insight and training speed.

Generate validation images every N training steps. Lower values (50-100) catch problems faster but slow training. Higher values (250-500) are less intrusive. For quick 500-step runs, try 100. For longer runs, 250-500 works well.
Additionally validate at the end of every N epochs. Complements step-based validation. Leave empty to only use step interval.
Denoising steps for validation images. More steps = better quality but slower. 20-30 is typical. Use fewer (10-15) for speed during training; you're checking concepts, not final quality.
2 Validation Prompts

These prompts will be used to generate preview images during training. Choose prompts that test what you're training—if you're training a character, use prompts featuring that character. If training a style, use prompts that should show that style.

The prompt used to generate validation images. Include your trigger word if training a concept. This tests whether your training is working—if the generated images don't match what you expect, something's wrong.
What to avoid in generated images. Common values: "blurry, cropped, ugly, deformed". Leave empty to disable negative prompting.
3 Prompt Library (Optional)

Instead of a single prompt, use a library of prompts to test your model from multiple angles. Each validation run cycles through your library, giving you diverse samples to evaluate training progress.

Choose an existing prompt library or create a new one below.
prompts in library:
Edit Prompt Library
Prompt Entries:
4 Generation Settings

Control how validation images are generated. Consistent seeds help you compare training progress across checkpoints; random seeds give variety but make comparison harder.

How closely to follow the prompt. 7-8 is standard. Higher (10-15) = more literal. For distilled models (Flux schnell), use 1.0.
When enabled, each validation uses a random seed for variety. Disable to use a fixed seed for consistent comparison.
Using the same seed produces the same "random" noise, making it easier to compare outputs across training steps.
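The effect of a fixed seed can be demonstrated with Python's standard random module. This is only an analogy for the starting noise of a validation image: the function name is hypothetical and real pipelines use a tensor generator instead.

```python
import random


def make_noise(seed: int, n: int = 4) -> list:
    """Draw n Gaussian samples from a generator seeded with the given value."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]
```

Calling make_noise(42) twice returns identical values, while make_noise(43) returns different ones. With a fixed seed, any change between two validation images therefore comes from the model weights, not from the noise.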
{% elif tab_name == 'model' %}

Model Configuration — Easy Mode

Configure the essential model settings quickly. For advanced options, dismiss this panel to access the full form.

1 Choose Your Model
2 Training Approach
Train a small adapter that modifies the base model. Uses far less memory and is ideal for most use cases.
Train all model weights. Requires significant VRAM and aggressive memory optimization. Very costly on cloud.
LoRA Rank

Lower ranks (4-32) are best for style transfer and simple concepts. They train faster and need less data.

Higher ranks (64-128) can learn more complex concepts but require more training steps, more data, and higher batch sizes to converge properly.

If changing rank without adjusting other settings: lower the learning rate for higher ranks, raise it for lower ranks.
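Rank determines adapter size directly: a LoRA adapter on a linear layer adds two small matrices of shape (rank × in_features) and (out_features × rank). The helper name below is hypothetical and the layer dimensions in the usage note are an example, not tied to any specific model.

```python
def lora_param_count(in_features: int, out_features: int, rank: int) -> int:
    """Trainable parameters a LoRA adapter adds to one linear layer."""
    return rank * (in_features + out_features)
```

For a 4096×4096 layer, rank 16 adds 131,072 trainable parameters and rank 64 adds 524,288, against 16,777,216 in the full weight matrix. The adapter grows linearly with rank, which is why higher ranks need more data and steps to converge.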

3 Memory Optimization
Use Memory Presets
Not sure where to start? Presets automatically configure optimal settings based on your model and available hardware.
Or configure manually
RamTorch
Streams model weights from system RAM during the forward and backward passes. Fast, but requires substantial system memory (64GB+ recommended).
Block Swap
Moves transformer blocks to CPU memory. Uses less system RAM than RamTorch but adds latency per block.
Offload Amount / blocks (%)
20% for light offload, 50% for balanced, 80-100% for maximum VRAM savings (slowest).
Block Swap
Not available for this model architecture.
VAE Options
Only enable if you get OOM errors during pre-caching. Mostly needed for high resolutions or video models.
How many images to process at once during VAE caching. Higher = faster but uses more VRAM. Reducing this has no impact on quality. For most datasets, 1 is fine.
4 Quantization Reduce precision to save VRAM
Quantizing the base model has moderate quality impact but saves significant VRAM.
Text encoder quantization has the highest quality impact. Only needed for very large encoders (Mistral, Llama).
Warning: Text encoder quantization significantly affects output quality. Only recommended for very large encoders.
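The VRAM saved by quantization follows from bits per parameter. The estimate below covers weights only and ignores activations, optimizer state, and quantization overhead such as scale factors. The function name and the 12-billion-parameter figure are illustrative assumptions.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9
```

A hypothetical 12-billion-parameter base model needs about 24 GB for weights at 16-bit precision, 12 GB at 8-bit, and 6 GB at 4-bit.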
Training Enhancements
{% endif %}
{{ tab_config.title if tab_config else tab_name|title }} Configuration
{% if tab_config and tab_config.description %}

{{ tab_config.description }}

{% endif %}
{% if tab_name in ('publishing', 'hardware') %} {% elif tab_name in ('model', 'basic', 'training', 'validation') %} {% endif %}
{% set disabled_args = fields | selectattr('disabled') | map(attribute='arg_name') | reject('equalto', None) | list %} {% if disabled_args %} {% endif %} {% if sections %} {% for section in sections %} {% set section_fields = fields | selectattr('section_id', 'equalto', section.id) | list %} {% set has_parent = section_fields[0].get('parent_section') if section_fields else false %} {% if not has_parent %}
{% if section.icon %}{% endif %} {{ section.title }} {% if section.get('advanced') %} Advanced {% endif %}
{% if section.description %}

{{ section.description }}

{% endif %} {% if tab_name == 'validation' and section.id == 'prompt_management' %}
Prompt controls disabled.
Dataset-caption-driven validation is active; prompts are managed automatically for this model.
{% endif %} {% if section.get('template') %}
{% include section.template %}
{% elif section_fields %} {% set subsections = {} %} {% for field in section_fields %} {% set subsection_name = field.get('subsection', 'general') %} {% if subsection_name not in subsections %} {% set _ = subsections.update({subsection_name: []}) %} {% endif %} {% set _ = subsections[subsection_name].append(field) %} {% endfor %}
{% for subsection_name, subsection_fields in subsections.items() %} {% if subsections|length > 1 and subsection_name != 'general' %}
{{ subsection_name | title | replace('_', ' ') }}
{% endif %} {% for field in subsection_fields %} {% include 'partials/form_field_htmx.html' %} {% endfor %} {% if not loop.last and subsections|length > 1 %}

{% endif %} {% endfor %}
{% elif section.empty_message %}
{{ section.empty_message }}
{% endif %} {% for subsection in sections %} {% set subsection_fields = fields | selectattr('section_id', 'equalto', subsection.id) | list %} {% set subsection_parent = subsection_fields[0].get('parent_section') if subsection_fields else false %} {% if subsection_parent == section.id %}
{% if subsection.icon %}{% endif %} {{ subsection.title }} Advanced
{% if subsection.description %}

{{ subsection.description }}

{% endif %} {% if subsection.get('template') %}
{% include subsection.template %}
{% elif subsection_fields %}
{% for field in subsection_fields %} {% include 'partials/form_field_htmx.html' %} {% endfor %}
{% elif subsection.empty_message %}
{{ subsection.empty_message }}
{% endif %}
{% endif %} {% endfor %}
{% endif %} {% endfor %} {% set unsectioned_fields = fields | selectattr('section_id', 'undefined') | list %} {% if unsectioned_fields %}
{% for field in unsectioned_fields %} {% include 'partials/form_field_htmx.html' %} {% endfor %}
{% endif %} {% else %}
{% for field in fields %} {% include 'partials/form_field_htmx.html' %} {% endfor %}
{% endif %}
{% if tab_name in ('publishing', 'hardware', 'model', 'basic', 'training', 'validation') %}
{% endif %}