Training Configuration
Full fine-tuning trains all model parameters; LoRA and DoRA train small adapter weights (parameter-efficient)
Use -1 to adapt all layers, or specify the number of layers to adapt (LoRA/DoRA only)
Percentage of validation set used for quick checks
Fraction of data used for validation
When enabled, training stops if validation loss stops improving (disabled by default)
Training Estimates: Configure parameters to see estimates
Learning Rate Schedule Preview
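For reference, a minimal sketch of how these fields might be collected into a single training configuration, assuming a Python front end; the key names are illustrative, not any particular library's schema.

    # Illustrative only: how the configuration fields above might be gathered.
    # Key names are hypothetical and do not follow a specific library's schema.
    training_config = {
        "fine_tune_type": "lora",      # "full", "lora", or "dora"
        "num_layers": -1,              # -1 = adapt all layers (LoRA/DoRA only)
        "validation_split": 0.1,       # fraction of data held out for validation
        "validation_fraction": 0.25,   # share of the validation set used for quick checks
        "early_stopping": False,       # stop when validation loss stops improving
        "learning_rate": 1e-5,
        "warmup_steps": 100,
        "lr_decay": "cosine",
        "weight_decay": 0.01,
    }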
Training Status

No training in progress

Training Sessions

Loading checkpoints...

Training Session
Iteration: -
Train Loss: -
Val Loss: -
Perplexity: -
Tokens/sec: -
Trained Tokens: -
Epoch: -
Memory (GB): -
Learning Rate: -
Warmup Steps: -
LR Decay: -
Weight Decay: -
Training Progress
Elapsed: -- Remaining: -- 0%
Top 3 Checkpoints

No training data available

All Available Checkpoints
This list shows all available checkpoints for the selected training session.
Training Sessions

Loading training sessions...

Training Comparison Dashboard

Training Session Comparison

Select 2 or more training sessions from the left panel to compare their performance metrics.

Available Comparisons:
  • Loss Curves
  • Perplexity Evolution
  • Loss Stability
  • Generalization Gap
Tips:
  • Use same training dataset
  • Compare similar model sizes
  • Consider training duration
  • Review hyperparameters
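Two of these comparisons are direct functions of the logged losses. A minimal sketch, assuming per-iteration train and validation losses are available as plain lists (names illustrative):

    import math

    def perplexity(loss):
        # Perplexity is the exponential of the cross-entropy loss.
        return math.exp(loss)

    def generalization_gap(train_losses, val_losses):
        # Gap between validation and training loss at each logged step;
        # a growing gap is a common sign of overfitting.
        return [v - t for t, v in zip(train_losses, val_losses)]

    # Example: compare the final gap of two sessions.
    session_a = generalization_gap([2.1, 1.8, 1.6], [2.2, 1.9, 1.8])
    session_b = generalization_gap([2.0, 1.5, 1.1], [2.1, 1.9, 1.9])
    print(round(session_a[-1], 2), round(session_b[-1], 2))  # 0.2 vs 0.8 -> session_b overfits more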
Model Fusion
Choose the trained adapter to fuse. The base model will be automatically detected from the adapter's configuration.
Fusion Progress
Ready for Fusion

Select a trained adapter, then click "Start Fusion" to begin. The base model will be automatically detected.
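A minimal sketch of how base-model detection can work, assuming the adapter directory contains an adapter_config.json; the file name and the keys checked below vary between tools and are assumptions:

    import json
    from pathlib import Path

    def detect_base_model(adapter_dir):
        """Return the base model recorded in the adapter's config, if any."""
        config_path = Path(adapter_dir) / "adapter_config.json"  # assumed filename
        config = json.loads(config_path.read_text())
        # Different tools store the base model under different keys;
        # check the common ones in order.
        for key in ("base_model_name_or_path", "model", "base_model"):
            if key in config:
                return config[key]
        return None

    # Example usage (hypothetical path):
    # print(detect_base_model("adapters/my-run"))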

Model Quantization
Choose from local models or HuggingFace cache
Lower bits = smaller model, slightly lower quality
Smaller groups = better quality, larger file
Quantized Models

No quantized models yet

Quantization Progress
Ready for Quantization

Select a model and click "Start Quantization" to begin

Quantization Tips
4-bit vs 8-bit: 4-bit provides better compression (~75% size reduction) but slightly lower quality. 8-bit offers better quality with a ~50% size reduction.
Group Size: Smaller groups (e.g., 32) provide better quality but larger files; 64 is the recommended balance.
Performance: Quantized models are faster to load and use less memory, making them ideal for deployment.
Compatibility: Quantized models work with the same inference code as the original models.
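The size numbers above follow from simple arithmetic: each weight shrinks from 16 bits to the chosen bit width, plus a small per-group overhead for its scale and offset. A sketch of the estimate, assuming fp16 originals and one fp16 scale and offset per group (a common layout, but an assumption here):

    def quantized_size_ratio(bits, group_size, scale_bits=16, offset_bits=16):
        """Approximate quantized size as a fraction of the fp16 original."""
        # Each group of `group_size` weights stores one scale and one offset.
        bits_per_weight = bits + (scale_bits + offset_bits) / group_size
        return bits_per_weight / 16.0

    for bits in (4, 8):
        for group in (32, 64, 128):
            ratio = quantized_size_ratio(bits, group)
            print(f"{bits}-bit, group {group}: ~{(1 - ratio):.0%} smaller")
    # 4-bit, group 64 -> ~72% smaller (the "~75%" above);
    # 8-bit, group 64 -> ~47% smaller (the "~50%" above).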
Model Configuration
Tip: You can select just an adapter to automatically load it with its base model!
Temperature
Top P
Rep. Penalty
Seed
Set seed for reproducible generation (same seed = same output)
Shows text as it's generated
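These controls act on the model's next-token probabilities. A self-contained sketch of that sampling logic with NumPy, independent of any particular inference library; the function and parameter names are illustrative:

    import numpy as np

    def sample_next_token(logits, generated, temperature=0.7, top_p=0.9,
                          repetition_penalty=1.1, seed=None):
        """Pick the next token id from raw logits (illustrative, library-agnostic)."""
        rng = np.random.default_rng(seed)  # same seed -> same output
        logits = np.asarray(logits, dtype=np.float64).copy()

        # Repetition penalty: make already-generated tokens less likely.
        for tok in set(generated):
            if logits[tok] > 0:
                logits[tok] /= repetition_penalty
            else:
                logits[tok] *= repetition_penalty

        # Temperature: <1 sharpens the distribution, >1 flattens it.
        scaled = logits / temperature
        probs = np.exp(scaled - np.max(scaled))
        probs /= probs.sum()

        # Top-p (nucleus): keep the smallest set of tokens whose mass >= top_p.
        order = np.argsort(probs)[::-1]
        cumulative = np.cumsum(probs[order])
        keep = order[: int(np.searchsorted(cumulative, top_p)) + 1]
        kept_probs = probs[keep] / probs[keep].sum()

        return int(rng.choice(keep, p=kept_probs))

    # Example: with a fixed seed the same token is chosen every time.
    logits = [2.0, 1.0, 0.5, -1.0]
    print(sample_next_token(logits, generated=[0], seed=42))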
Generation Output

Model ready! Type a message below to start the conversation.

Loading...