Shashank Suhas / seminar-breakout · Commits

Commit ba4fcf2a authored Jul 17, 2016 by Yuxin Wu
update docs
parent f8969459
Showing 3 changed files with 10 additions and 8 deletions:
examples/Atari2600/README.md   (+6 −5)
tensorpack/callbacks/param.py  (+3 −3)
tensorpack/dataflow/remote.py  (+1 −0)
examples/Atari2600/README.md

@@ -13,11 +13,14 @@ Claimed performance in the paper can be reproduced, on several games I've tested
 
-DQN was trained on 1 GPU and it typically took 2~3 days of training to reach a score of 400 on breakout game.
-My Batch-A3C implementation only took <2 hours with 2 GPUs (one for training and one for simulation).
+DQN typically took 2 days of training to reach a score of 400 on breakout game.
+My Batch-A3C implementation only took <2 hours (one for training and one for simulation).
+Both were trained on one GPU with an extra GPU for simulation.
 This is probably the fastest RL trainer you'd find.
-The x-axis is the number of iterations not wall time, but iteration speed is about 7.8it/s for both models.
+The x-axis is the number of iterations, not wall time.
+Iteration speed on Tesla M40 is about 10.7it/s for B-A3C.
+D-DQN is faster at the beginning but will converge to 12it/s due to exploration annealing.
 
 A demo trained with Double-DQN on breakout is available at [youtube](https://youtu.be/o21mddZtE5Y).
@@ -30,8 +33,6 @@ To train:
 ```
 ./DQN.py --rom breakout.bin --gpu 0
 ```
-Training speed is about 7.3 iteration/s on 1 Tesla M40
-(faster than this at the beginning, but will slow down due to exploration annealing).
 To visualize the agent:
 ```
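For readers skimming the diff, the "exploration annealing" mentioned in both hunks is the ε-greedy schedule used in DQN-style training: the probability ε of taking a random action is decayed over time. Random actions skip the network forward pass, so training iterates faster while ε is high and slows as it anneals down, which is why the quoted iteration speeds change over a run. A minimal sketch of such a schedule, with hypothetical names and values (tensorpack's actual schedule may differ):

```python
import random

def annealed_epsilon(step, eps_start=1.0, eps_end=0.1, anneal_steps=10**6):
    """Linearly decay epsilon from eps_start to eps_end over anneal_steps."""
    frac = min(step / float(anneal_steps), 1.0)
    return eps_start + frac * (eps_end - eps_start)

def select_action(step, state, num_actions, q_func):
    """Epsilon-greedy: act randomly with probability epsilon, else greedily."""
    if random.random() < annealed_epsilon(step):
        return random.randrange(num_actions)   # explore: no forward pass
    q_values = q_func(state)                   # exploit: needs a forward pass
    return max(range(num_actions), key=lambda a: q_values[a])
```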
tensorpack/callbacks/param.py

@@ -142,9 +142,9 @@ class HumanHyperParamSetter(HyperParamSetter):
             ret = dic[self.param.readable_name]
             return ret
         except:
-            logger.warn(
-                "Failed to find {} in {}".format(
-                    self.param.readable_name, self.file_name))
+            #logger.warn(
+                #"Cannot find {} in {}".format(
+                #self.param.readable_name, self.file_name))
             return None
 
 class ScheduledHyperParamSetter(HyperParamSetter):
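For context on this hunk: HumanHyperParamSetter polls a human-editable text file of `name:value` lines and looks up one parameter by its readable name, so a missing entry makes `dic[self.param.readable_name]` raise KeyError and fall into the `except` branch whose warning the commit comments out. A standalone sketch of that lookup logic, assuming the `name:value` file format (this is not tensorpack's exact code, and the function name is hypothetical):

```python
def read_param_from_file(file_name, readable_name):
    """Sketch: parse `name:value` lines and return the requested value."""
    try:
        with open(file_name) as f:
            dic = {k.strip(): float(v)
                   for k, v in (line.strip().split(':')
                                for line in f if ':' in line)}
        return dic[readable_name]           # KeyError if the name is absent
    except (IOError, KeyError, ValueError):
        # Missing file, missing name, or malformed line: fall back to None,
        # mirroring the `return None` in the hunk above.
        return None
```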
tensorpack/dataflow/remote.py

@@ -3,6 +3,7 @@
 # File: remote.py
 # Author: Yuxin Wu <ppwwyyxxc@gmail.com>
 
+from ..utils import logger
 try:
     import zmq
 except ImportError:
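The remote.py change is a one-line addition: `zmq` is an optional dependency guarded by `try/except ImportError`, and the hunk adds the `from ..utils import logger` import above that guard. The except body is cut off here, but the usual point of such an import (and presumably the reason for this one) is that the handler can log the missing dependency instead of raising a NameError of its own. A self-contained sketch of the pattern, using the standard logging module as a stand-in for tensorpack's `..utils` logger:

```python
import logging

logger = logging.getLogger(__name__)  # stand-in for `from ..utils import logger`

try:
    import zmq  # optional dependency; only needed for remote dataflow
except ImportError:
    # Without a logger already in scope, this handler couldn't report anything.
    logger.warning("zmq not installed; remote dataflow features are disabled")
    zmq = None  # callers can test `zmq is None` before using the feature
```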