Shashank Suhas / seminar-breakout

Commit cdbbbc5d authored Jul 12, 2016 by Yuxin Wu
a3c curve

parent e750306b
Showing 4 changed files with 14 additions and 8 deletions (+14 −8)
examples/Atari2600/DoubleDQN-breakout.png   +0 −0
examples/Atari2600/README.md                +13 −7
examples/Atari2600/curve-breakout.png       +0 −0
tensorpack/utils/loadcaffe.py               +1 −1
examples/Atari2600/DoubleDQN-breakout.png   deleted (100644 → 0), 79.4 KB
examples/Atari2600/README.md
-Reproduce DQN in:
+Reproduce the following methods:
++ Nature-DQN in:
 [Human-level Control Through Deep Reinforcement Learning](http://www.nature.com/nature/journal/v518/n7540/full/nature14236.html)
-and Double-DQN in:
++ Double-DQN in:
 [Deep Reinforcement Learning with Double Q-learning](http://arxiv.org/abs/1509.06461)
-Can reproduce the claimed performance, on several games I've tested with.
++ A3C in [Asynchronous Methods for Deep Reinforcement Learning](http://arxiv.org/abs/1602.01783). (I used a modified version where each batch contains transitions from different simulators, which I called "Batch A3C".)
+Claimed performance in the paper can be reproduced, on several games I've tested with.

-A demo trained with Double-DQN on breakout game is available at [youtube](https://youtu.be/o21mddZtE5Y).
+A demo trained with Double-DQN on breakout is available at [youtube](https://youtu.be/o21mddZtE5Y).
+
+DQN would typically take 2~3 days of training to reach a score of 400 on breakout, but it only takes <4 hours on 1 GPU with my A3C implementation.
+This is probably the fastest RL trainer you'd find.
## How to use
...
...
@@ -30,4 +36,4 @@ To visualize the agent:
 ./DQN.py --rom breakout.bin --task play --load pretrained.model
 ```
-A3C code and curve will be available soon. It learns much faster.
+A3C code will be released at the end of August.
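The "Batch A3C" remark in the README diff above (each training batch contains transitions from different simulators) is the commit's key algorithmic note. A minimal sketch of that batching idea follows; `Simulator`, `Transition`, and `make_batch` are illustrative names, not tensorpack's actual API, and a real trainer would run the policy network instead of fabricating transitions.

```python
# Hypothetical sketch of the "Batch A3C" batching scheme described above:
# many simulators run concurrently, and each training batch draws its
# transitions from *different* simulators.  Names here (Simulator,
# make_batch) are illustrative, not tensorpack's actual API.
import random
from collections import namedtuple

Transition = namedtuple('Transition', ['state', 'action', 'reward', 'is_over'])

class Simulator:
    """A stand-in for one Atari emulator producing transitions."""
    def __init__(self, seed):
        self.rng = random.Random(seed)

    def step(self):
        # A real simulator would run the current policy and return the
        # observed transition; here we fabricate one for illustration.
        return Transition(state=self.rng.random(),
                          action=self.rng.randrange(4),
                          reward=self.rng.choice([0.0, 1.0]),
                          is_over=False)

def make_batch(simulators, batch_size):
    # Take each batch element from a distinct simulator, so one gradient
    # step sees decorrelated transitions rather than a single trajectory --
    # the core of the "Batch A3C" modification.
    chosen = random.sample(simulators, batch_size)
    return [sim.step() for sim in chosen]

sims = [Simulator(seed=i) for i in range(16)]
batch = make_batch(sims, batch_size=8)
print(len(batch), "transitions from 8 distinct simulators")
```

Mixing simulators within a batch decorrelates the gradient signal, playing roughly the role that experience replay plays for DQN, but without storing stale off-policy data.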
examples/Atari2600/curve-breakout.png   new file (0 → 100644), 109 KB
tensorpack/utils/loadcaffe.py
...
...
@@ -15,7 +15,7 @@ from .utils import change_env, get_dataset_dir
 from .fs import download
 from . import logger

-__all__ = ['load_caffe']
+__all__ = ['load_caffe', 'get_caffe_pb']

 CAFFE_PROTO_URL = "https://github.com/BVLC/caffe/raw/master/src/caffe/proto/caffe.proto"
...
...
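The one-line change above exports `get_caffe_pb` alongside `load_caffe`. Judging only from the `download` import and the `CAFFE_PROTO_URL` constant visible in this hunk, such a helper presumably fetches `caffe.proto` and compiles it into an importable Python module. The following is a hedged sketch under that assumption, not tensorpack's actual implementation; it additionally assumes `protoc` is available on `PATH`.

```python
# Hypothetical sketch of what a get_caffe_pb() helper could do, based only
# on the CAFFE_PROTO_URL constant and download() import visible in the hunk
# above -- not the actual tensorpack implementation.
import importlib
import os
import subprocess
import sys
import urllib.request

CAFFE_PROTO_URL = "https://github.com/BVLC/caffe/raw/master/src/caffe/proto/caffe.proto"

def get_caffe_pb(cache_dir="/tmp/caffe_pb"):
    """Download caffe.proto, compile it with protoc, and import the result."""
    os.makedirs(cache_dir, exist_ok=True)
    proto_path = os.path.join(cache_dir, "caffe.proto")
    if not os.path.isfile(proto_path):
        urllib.request.urlretrieve(CAFFE_PROTO_URL, proto_path)
    # protoc writes the generated caffe_pb2.py next to the .proto file.
    subprocess.check_call(
        ["protoc", "--python_out=" + cache_dir, "caffe.proto"],
        cwd=cache_dir)
    sys.path.insert(0, cache_dir)
    return importlib.import_module("caffe_pb2")
```

The returned module gives access to the Caffe protobuf message classes (e.g. its `NetParameter`), which is what a Caffe model loader needs in order to parse `.caffemodel` files without a full Caffe installation.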