Commit 14864868 authored by Yuxin Wu

fix typo in dqn

parent d381a5d8
@@ -153,7 +153,7 @@ class Model(ModelDesc):
         lr = symbf.get_scalar_var('learning_rate', 1e-3, summary=True)
         opt = tf.train.AdamOptimizer(lr, epsilon=1e-3)
         return optimizer.apply_grad_processors(
-            opt, [gradproc.GlobalNormalClip(10), gradproc.SummaryGradient()])
+            opt, [gradproc.GlobalNormClip(10), gradproc.SummaryGradient()])
 def get_config():
......
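For context, `gradproc.GlobalNormClip(10)` (the corrected name) clips all gradients jointly so that their global norm does not exceed 10 before the Adam optimizer applies them. Below is a minimal sketch of the equivalent logic in plain TensorFlow 1.x, not tensorpack's actual implementation; the helper name `clipped_minimize` is made up for illustration:

```python
import tensorflow as tf

def clipped_minimize(opt, loss, clip_norm=10.0):
    """Apply `opt` to `loss` with global-norm gradient clipping."""
    # Compute raw gradients, dropping variables that receive no gradient.
    grads_and_vars = [(g, v) for g, v in opt.compute_gradients(loss)
                      if g is not None]
    grads, variables = zip(*grads_and_vars)
    # Rescale all gradients together if their combined norm exceeds clip_norm,
    # which is roughly what GlobalNormClip does via tf.clip_by_global_norm.
    clipped, _ = tf.clip_by_global_norm(grads, clip_norm)
    return opt.apply_gradients(list(zip(clipped, variables)))
```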
@@ -19,12 +19,12 @@ Claimed performance in the paper can be reproduced, on several games I've tested
 ![DQN](curve-breakout.png)
-DQN typically took 2 days of training to reach a score of 400 on the Breakout game.
+DQN typically took 2 days of training to reach a score of 400 on the Breakout game (same as the paper).
 My Batch-A3C implementation only took <2 hours.
 Both were trained on one GPU with an extra GPU for simulation.
 The x-axis is the number of iterations, not wall time.
-D-DQN is faster at the beginning but will converge to 12 batches/s (768 frames/s) due to exploration annealing.
+Double-DQN is faster at the beginning but will converge to 12 batches/s (768 frames/s) due to exploration annealing.
 ## How to use
......
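As a side note on the throughput numbers quoted above: 12 batches/s matching 768 frames/s implies 64 frames per batch, which is presumably the training batch size (an assumption; the batch size itself is not shown in this diff).

```python
# Sanity check of the quoted throughput: 768 frames/s at 12 batches/s
# implies 64 frames per batch (assumed batch size, not shown in this diff).
frames_per_sec = 768
batches_per_sec = 12
print(frames_per_sec // batches_per_sec)  # 64
```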