Q-Learning Config Builder

W19D1: Build Your Hyperparameter Configuration


🎯 Tune Your Q-Learning Agent

Configure your Q-Learning hyperparameters and see how they affect learning! After running vanilla_starter.py, experiment with different settings here.

1. Configure
2. Download
3. Run Locally
4. Compare Results

👤 Student Information

💡 Quick Presets

Start with a preset to see different learning behaviors, then customize.

📚 Learning Parameters

Learning rate (alpha): 0.20
Range: 0.01 (slow) to 1.0 (fast)
What it does: How much to update Q-values after each step.
Q(s,a) = Q(s,a) + alpha * (target - Q(s,a))
High = learns fast but can be unstable. Low = learns slowly but more stably.
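To make the update rule concrete, here is a minimal sketch of one tabular update. It assumes the standard Q-learning target (reward plus gamma times the best Q-value of the next state), a dictionary-based Q-table, and two actions; the function name `q_update` and its arguments are illustrative and may not match what vanilla_starter.py uses.

```python
from collections import defaultdict


def q_update(q_table, state, action, reward, next_state, done,
             alpha=0.20, gamma=0.99, n_actions=2):
    """Apply one tabular Q-learning update in place and return the new value."""
    # Assumed target: immediate reward plus the discounted best value of the
    # next state. No future value is added once the episode has ended.
    if done:
        best_next = 0.0
    else:
        best_next = max(q_table[(next_state, a)] for a in range(n_actions))
    target = reward + gamma * best_next

    # Move the old estimate a fraction alpha of the way toward the target.
    q_table[(state, action)] += alpha * (target - q_table[(state, action)])
    return q_table[(state, action)]


# Hypothetical single step: all Q-values start at 0.0.
q = defaultdict(float)
new_value = q_update(q, state=(0, 0), action=1, reward=1.0,
                     next_state=(0, 1), done=False)
```

With alpha = 0.20 and an empty table, the first update moves the estimate 20% of the way from 0 toward the target of 1.0.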
Discount factor (gamma): 0.990
Range: 0.8 (short-sighted) to 0.999 (far-sighted)
What it does: How much to value future rewards vs immediate rewards.
High (0.99) = plan ahead. Low (0.8) = focus on immediate reward.
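A quick way to see the difference is to compute how much a reward far in the future still counts. The sketch below assumes the usual discounting scheme, where a reward k steps ahead is weighted by gamma to the power k; the helper name `discounted_return` is illustrative.

```python
def discounted_return(rewards, gamma):
    """Sum rewards, weighting the reward k steps ahead by gamma**k."""
    total = 0.0
    # Work backwards so each step adds its reward plus the discounted future.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


for gamma in (0.8, 0.99):
    weight_after_50 = gamma ** 50          # weight of a reward 50 steps ahead
    rough_horizon = 1.0 / (1.0 - gamma)    # rule-of-thumb planning horizon
    print(f"gamma={gamma}: reward 50 steps ahead weighs {weight_after_50:.5f}, "
          f"rough horizon {rough_horizon:.0f} steps")
```

At 0.8 a reward 50 steps away is weighted at roughly 0.00001, so it is effectively ignored; at 0.99 it still carries a weight of roughly 0.6.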

🎲 Exploration Parameters (Epsilon-Greedy)

Epsilon start: 1.00
Epsilon end: 0.01
Epsilon decay: 0.990
Range: 0.95 (fast decay) to 0.999 (slow decay)
Epsilon-Greedy: With probability epsilon, take a random action (explore). Otherwise, take the best known action (exploit).
epsilon = epsilon * decay after each episode.
Start high (explore), decay to low (exploit what you learned).
Explore vs Exploit: Too much exploration = never uses what it learned. Too little = might miss better strategies. The decay balances this over time.
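The action choice and the decay schedule can be sketched as follows. This assumes epsilon is multiplied by the decay once per episode and is never allowed to drop below the end value; the function names are illustrative, not taken from vanilla_starter.py.

```python
import random


def choose_action(q_values, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))                      # explore
    return max(range(len(q_values)), key=lambda a: q_values[a])  # exploit


def decay_epsilon(epsilon, decay=0.990, epsilon_end=0.01):
    """Shrink epsilon by the decay factor, floored at epsilon_end (assumed)."""
    return max(epsilon_end, epsilon * decay)


# Record the epsilon used in each of 500 episodes with the default settings.
epsilon = 1.00
schedule = []
for episode in range(500):
    schedule.append(epsilon)
    epsilon = decay_epsilon(epsilon)
```

With a decay of 0.990, epsilon falls from 1.00 to the 0.01 floor after roughly 460 episodes. A decay of 0.95 gets there in about 90, which leaves far less time to explore.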

⏱ Training Parameters

Bins: Tabular Q-Learning needs discrete states, so we divide each continuous observation value into bins (buckets).
More bins = finer distinctions = potentially better learning, but more states to explore before the Q-table fills in.
Pole angle uses 2x as many bins as the other values because it matters most for balance.
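Binning can be sketched with the standard library alone. The example assumes a CartPole-style observation of four values (cart position, cart velocity, pole angle, pole angular velocity); the value ranges and the base bin count of 6 are illustrative assumptions, not settings read from this page or from vanilla_starter.py.

```python
from bisect import bisect_right


def make_edges(low, high, n_bins):
    """Return the n_bins - 1 interior edges that split [low, high] evenly."""
    width = (high - low) / n_bins
    return [low + width * i for i in range(1, n_bins)]


def to_bin(value, edges):
    """Map a continuous value to a bin index in 0..len(edges).

    Values outside the range fall into the first or last bin.
    """
    return bisect_right(edges, value)


BINS = 6  # hypothetical base bin count
# Assumed (low, high, n_bins) per observation value; pole angle gets 2x bins.
SPECS = [
    (-2.4, 2.4, BINS),        # cart position
    (-3.0, 3.0, BINS),        # cart velocity (clipped range is an assumption)
    (-0.21, 0.21, 2 * BINS),  # pole angle in radians
    (-3.0, 3.0, BINS),        # pole angular velocity (assumed range)
]
EDGES = [make_edges(low, high, n) for low, high, n in SPECS]


def discretize(observation):
    """Turn a 4-value observation into a tuple usable as a Q-table key."""
    return tuple(to_bin(v, edges) for v, edges in zip(observation, EDGES))


state = discretize((0.1, 0.1, 0.01, 0.1))
```

With these assumed counts the table has 6 × 6 × 12 × 6 = 2,592 states. Doubling every bin count multiplies that by 16, which is why more bins need more episodes.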

✅ Validation

Student name required
Epsilon range valid (start >= end)
Episode count valid
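The same three checks can be reproduced locally before a run. This is a sketch only: the key names (`student_name`, `epsilon_start`, `epsilon_end`, `episodes`) are hypothetical and may differ from the keys in the downloaded JSON, and "valid" for the episode count is assumed to mean a positive integer.

```python
def validate_config(config):
    """Return a list of problems found; an empty list means the config passes."""
    errors = []

    # Check 1: a non-blank student name.
    if not str(config.get("student_name", "")).strip():
        errors.append("Student name required")

    # Check 2: both epsilon values are numbers and start >= end.
    start = config.get("epsilon_start")
    end = config.get("epsilon_end")
    numeric = isinstance(start, (int, float)) and isinstance(end, (int, float))
    if not (numeric and start >= end):
        errors.append("Epsilon range invalid (need start >= end)")

    # Check 3: episode count is a positive integer (assumed meaning of valid).
    episodes = config.get("episodes")
    if not isinstance(episodes, int) or episodes <= 0:
        errors.append("Episode count invalid")

    return errors


# Made-up example values, for illustration only.
problems = validate_config({
    "student_name": "Ada",
    "epsilon_start": 1.00,
    "epsilon_end": 0.01,
    "episodes": 500,
})
```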

📄 Configuration JSON


💻 How to Use Your Config

  1. Download your config JSON file
  2. Place it in the same folder as vanilla_starter.py
  3. Edit vanilla_starter.py to load your config (or copy the Python code)
  4. Run and compare your results to the default!
  5. Try different presets to see how parameters affect learning
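Step 3 might look something like the sketch below. The file name `config.json` and the key names are assumptions, so check them against the JSON you actually downloaded. The fallback values are the ones shown on this page, and any key missing from the file keeps its fallback.

```python
import json
from pathlib import Path

# Fallback values taken from the defaults shown on this page.
# The key names are hypothetical.
DEFAULTS = {
    "alpha": 0.20,
    "gamma": 0.990,
    "epsilon_start": 1.00,
    "epsilon_end": 0.01,
    "epsilon_decay": 0.990,
}


def load_config(path="config.json"):
    """Load hyperparameters from a JSON file, falling back to DEFAULTS."""
    config = dict(DEFAULTS)
    config_path = Path(path)
    if config_path.exists():
        with config_path.open(encoding="utf-8") as f:
            config.update(json.load(f))  # file values override the defaults
    return config


config = load_config()
alpha = config["alpha"]
gamma = config["gamma"]
```

Falling back to the defaults when the file is missing means the script still runs, so a default run and a custom run can be compared using the same code.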

🐍 Python Code (paste into vanilla_starter.py)