⚙️ Build Your Configuration
📁 Output & Logging
🏋️ Training Parameters
📈 Learning Rate
💾 Evaluation & Checkpoints
🔄 Reproducibility
Quick Presets
Generated Config
⚠️ Configuration Warnings
🧠 Quick Check
If you have limited GPU memory, which parameter should you adjust first?
Batch size. Reducing the batch size lowers memory usage, and raising gradient accumulation by the same factor keeps the effective batch size unchanged, so training remains stable.
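The trade-off between batch size and gradient accumulation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `effective_batch_size` and `shrink_for_memory` and their parameter names are hypothetical and do not come from the configuration builder or any specific library.

```python
def effective_batch_size(per_device_batch_size, grad_accum_steps, num_devices=1):
    """Number of samples contributing to each optimizer step."""
    return per_device_batch_size * grad_accum_steps * num_devices


def shrink_for_memory(per_device_batch_size, grad_accum_steps, factor=2):
    """Divide the per-device batch size by `factor` and multiply the
    accumulation steps by the same factor, so the effective batch size
    stays the same while each forward/backward pass sees fewer samples."""
    if factor < 1 or per_device_batch_size % factor != 0:
        raise ValueError("factor must be >= 1 and divide the batch size evenly")
    return per_device_batch_size // factor, grad_accum_steps * factor


# Hypothetical starting point: batch size 32, no accumulation.
old_bs, old_accum = 32, 1
new_bs, new_accum = shrink_for_memory(old_bs, old_accum, factor=4)

print(f"before: batch={old_bs}, accum={old_accum}, "
      f"effective={effective_batch_size(old_bs, old_accum)}")
print(f"after:  batch={new_bs}, accum={new_accum}, "
      f"effective={effective_batch_size(new_bs, new_accum)}")
```

In this sketch each pass processes a quarter of the samples while the effective batch size is unchanged. The cost is that every optimizer step now takes four forward/backward passes instead of one.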