Commit ba4fcf2a authored by Yuxin Wu's avatar Yuxin Wu

update docs

parent f8969459
...@@ -13,11 +13,14 @@ Claimed performance in the paper can be reproduced, on several games I've tested ...@@ -13,11 +13,14 @@ Claimed performance in the paper can be reproduced, on several games I've tested
![DQN](curve-breakout.png) ![DQN](curve-breakout.png)
DQN was trained on 1 GPU and it typically took 2~3 days of training to reach a score of 400 on breakout game. DQN typically took 2 days of training to reach a score of 400 on breakout game.
My Batch-A3C implementation only took <2 hours with 2 GPUs (one for training and one for simulation). My Batch-A3C implementation only took <2 hours (one for training and one for simulation).
Both were trained on one GPU with an extra GPU for simulation.
This is probably the fastest RL trainer you'd find. This is probably the fastest RL trainer you'd find.
The x-axis is the number of iterations not wall time, but iteration speed is about 7.8it/s for both models. The x-axis is the number of iterations, not wall time.
Iteration speed on Tesla M40 is about 10.7it/s for B-A3C.
D-DQN is faster at the beginning but will converge to 12it/s due of exploration annealing.
A demo trained with Double-DQN on breakout is available at [youtube](https://youtu.be/o21mddZtE5Y). A demo trained with Double-DQN on breakout is available at [youtube](https://youtu.be/o21mddZtE5Y).
...@@ -30,8 +33,6 @@ To train: ...@@ -30,8 +33,6 @@ To train:
``` ```
./DQN.py --rom breakout.bin --gpu 0 ./DQN.py --rom breakout.bin --gpu 0
``` ```
Training speed is about 7.3 iteration/s on 1 Tesla M40
(faster than this at the beginning, but will slow down due to exploration annealing).
To visualize the agent: To visualize the agent:
``` ```
......
...@@ -142,9 +142,9 @@ class HumanHyperParamSetter(HyperParamSetter): ...@@ -142,9 +142,9 @@ class HumanHyperParamSetter(HyperParamSetter):
ret = dic[self.param.readable_name] ret = dic[self.param.readable_name]
return ret return ret
except: except:
logger.warn( #logger.warn(
"Failed to find {} in {}".format( #"Cannot find {} in {}".format(
self.param.readable_name, self.file_name)) #self.param.readable_name, self.file_name))
return None return None
class ScheduledHyperParamSetter(HyperParamSetter): class ScheduledHyperParamSetter(HyperParamSetter):
......
...@@ -3,6 +3,7 @@ ...@@ -3,6 +3,7 @@
# File: remote.py # File: remote.py
# Author: Yuxin Wu <ppwwyyxxc@gmail.com> # Author: Yuxin Wu <ppwwyyxxc@gmail.com>
from ..utils import logger
try: try:
import zmq import zmq
except ImportError: except ImportError:
......
Markdown is supported
0% or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment