Commit ba4fcf2a authored by Yuxin Wu

update docs

parent f8969459
@@ -13,11 +13,14 @@ Claimed performance in the paper can be reproduced, on several games I've tested with.
![DQN](curve-breakout.png)
DQN was trained on 1 GPU and it typically took 2~3 days of training to reach a score of 400 on the Breakout game.
My Batch-A3C implementation only took <2 hours with 2 GPUs (one for training and one for simulation).
DQN typically took 2 days of training to reach a score of 400 on the Breakout game.
My Batch-A3C implementation only took <2 hours.
Both were trained on one GPU with an extra GPU for simulation.
This is probably the fastest RL trainer you'd find.
The x-axis is the number of iterations, not wall time, but iteration speed is about 7.8it/s for both models.
The x-axis is the number of iterations, not wall time.
Iteration speed on Tesla M40 is about 10.7it/s for B-A3C.
D-DQN is faster at the beginning but converges to 12it/s due to exploration annealing.
A demo trained with Double-DQN on breakout is available at [youtube](https://youtu.be/o21mddZtE5Y).
@@ -30,8 +33,6 @@ To train:
```
./DQN.py --rom breakout.bin --gpu 0
```
Training speed is about 7.3 iteration/s on 1 Tesla M40
(faster than this at the beginning, but will slow down due to exploration annealing).
To visualize the agent:
```
......
@@ -142,9 +142,9 @@ class HumanHyperParamSetter(HyperParamSetter):
            ret = dic[self.param.readable_name]
            return ret
        except:
            logger.warn(
                "Failed to find {} in {}".format(
                    self.param.readable_name, self.file_name))
            #logger.warn(
            #    "Cannot find {} in {}".format(
            #        self.param.readable_name, self.file_name))
            return None
class ScheduledHyperParamSetter(HyperParamSetter):
......
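The hunk above silences the per-check warning in `HumanHyperParamSetter`, which lets you adjust a hyperparameter during training by editing a plain-text file. Below is a minimal sketch of that file-reading pattern; the file name `hyper.txt`, the "name:value" line format, and the helper function are assumptions for illustration, not tensorpack's exact API.

```
# Minimal sketch of the "read a hyperparameter from a human-edited file"
# pattern shown in the diff above. File name, line format and this helper
# are illustrative assumptions, not tensorpack's exact API.
def read_param_from_file(file_name, param_name):
    """Return the value for param_name from file_name, or None if unavailable."""
    try:
        dic = {}
        with open(file_name) as f:
            for line in f:
                # each line looks like "learning_rate:0.0003"
                if ':' in line:
                    k, v = line.split(':', 1)
                    dic[k.strip()] = float(v)
        return dic[param_name]
    except (IOError, KeyError, ValueError):
        # mirror the commit: return None silently instead of warning on every check
        return None

# e.g. add a line "learning_rate:0.0003" to hyper.txt while training runs
print(read_param_from_file('hyper.txt', 'learning_rate'))
```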
@@ -3,6 +3,7 @@
# File: remote.py
# Author: Yuxin Wu <ppwwyyxxc@gmail.com>
from ..utils import logger
try:
    import zmq
except ImportError:
......
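The remote.py hunk adds a logger import before the guarded `import zmq`; the body of the `except ImportError` branch is collapsed here. A minimal sketch of the optional-dependency pattern it implies, with the warning message and the fallback helper as assumptions rather than the actual tensorpack code:

```
# Sketch of the optional-import pattern used in remote.py above. The except
# body is collapsed in the diff, so the warning text and the guard helper
# below are assumptions for illustration.
import logging
logger = logging.getLogger(__name__)

try:
    import zmq
except ImportError:
    zmq = None
    logger.warning("zmq is not installed; remote dataflow features are disabled.")

def _require_zmq():
    # hypothetical guard a remote feature could call before touching zmq
    if zmq is None:
        raise ImportError("This feature requires pyzmq (pip install pyzmq).")
```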