Commit 4bc639da
authored Jun 23, 2016 by Yuxin Wu

update readme

parent 3270acb8

Showing 5 changed files with 36 additions and 12 deletions (+36 -12)
README.md  (+8 -4)
examples/Atari2600/DoubleDQN-breakout.png  (+0 -0)
examples/Atari2600/README.md  (+21 -5)
examples/ResNet/README.md  (+4 -2)
tensorpack/RL/atari.py  (+3 -1)
README.md

@@ -12,8 +12,12 @@ See some interesting [examples](https://github.com/ppwwyyxx/tensorpack/tree/mast

## Features:

-Focused on modularity:
-+ Models has scoped abstraction of common models.
-+ Dataflow defines data preprocessing in pure Python.
-+ Callbacks systems controls training behavior.
+Focused on modularity. Just have to define the three components in training:
+
+1. The model, or the graph. Define its input and output. `models/` has some scoped abstraction of common models.
+2. The data. All data producer has a unified `DataFlow` interface, and this interface can be chained
+   to perform complex preprocessing. It uses multiprocess to avoid performance bottleneck.
+3. The callbacks. They include everything you want to do besides the training iterations:
+   change hyperparameters, save model, print logs, run validation, and more.
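To make the three-component description above more concrete, here is a minimal, framework-agnostic sketch of how a model, a chained data producer, and callbacks fit around a training loop. All names below (`model`, `raw_samples`, `batched`, `log_callback`) are hypothetical illustrations for this commit page, not tensorpack's actual API.

```
# Illustrative only: the three components the README describes, wired into a toy loop.
import random


def model(x, w, b):
    """The model/graph: maps input to output (here a 1-D linear predictor)."""
    return w * x + b


def raw_samples(n=1000):
    """Base data producer: yields (x, y) pairs."""
    for _ in range(n):
        x = random.uniform(-1, 1)
        yield x, 3.0 * x + 0.5


def batched(stream, batch_size=32):
    """A chained preprocessing stage, in the spirit of chaining DataFlows."""
    batch = []
    for sample in stream:
        batch.append(sample)
        if len(batch) == batch_size:
            yield batch
            batch = []


def log_callback(step, loss):
    """Callback: everything besides the training iteration itself (logging here)."""
    if step % 10 == 0:
        print(f"step {step}: loss={loss:.4f}")


# Wire the three components together in a toy SGD loop.
w, b, lr = 0.0, 0.0, 0.1
for step, batch in enumerate(batched(raw_samples(), 32)):
    loss = sum((model(x, w, b) - y) ** 2 for x, y in batch) / len(batch)
    grad_w = sum(2 * (model(x, w, b) - y) * x for x, y in batch) / len(batch)
    grad_b = sum(2 * (model(x, w, b) - y) for x, y in batch) / len(batch)
    w, b = w - lr * grad_w, b - lr * grad_b
    log_callback(step, loss)
```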
examples/Atari2600/DoubleDQN-breakout.png  (new file, mode 0 → 100644, 79.4 KB)
examples/Atari2600/README.md

-Implement DQN in:
+Reproduce DQN in:

**Human-level Control Through Deep Reinforcement Learning**

@@ -6,10 +6,26 @@ and Double-DQN in:

**Deep Reinforcement Learning with Double Q-learning**

-To run:
-```
-./DQN.py --rom breakout.rom --gpu 0
-```
-Can reproduce the claimed performance, on games I've tested with (curves will be available soon).
-
-A demo trained with Double-DQN on breakout is available at [youtube](https://youtu.be/o21mddZtE5Y).
+Can reproduce the claimed performance, on several games I've tested with.
+
+A demo trained with Double-DQN on breakout is available at [youtube](https://youtu.be/o21mddZtE5Y).
+
+## How to use
+
+Download [atari roms](https://github.com/openai/atari-py/tree/master/atari_py/atari_roms) to
+`$TENSORPACK_DATASET/atari_rom` (defaults to tensorpack/dataflow/dataset/atari_rom).
+
+To train:
+```
+./DQN.py --rom breakout.bin --gpu 0
+```
+Training speed is about 7.3 iteration/s on 1 Tesla M40. It takes days to learn well (see figure above).
+
+To play:
+```
+./DQN.py --rom breakout.bin --task play --load pretrained.model
+```
+
+A3C code and curve will be available soon. It learns much faster.
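As a convenience for the "How to use" step added above, here is a small sketch that fetches a ROM into `$TENSORPACK_DATASET/atari_rom` before running `./DQN.py`. The raw-file URL pattern for openai/atari-py and the helper name `fetch_rom` are assumptions for illustration; verify the URL or download the ROM manually if it does not resolve.

```
import os
import urllib.request

# Assumed raw-file layout of the openai/atari-py repository.
ROM_BASE = "https://raw.githubusercontent.com/openai/atari-py/master/atari_py/atari_roms"


def fetch_rom(name="breakout.bin"):
    # Mirror the README's convention: $TENSORPACK_DATASET/atari_rom,
    # falling back to a local directory if the variable is unset.
    rom_dir = os.path.join(os.environ.get("TENSORPACK_DATASET", "."), "atari_rom")
    os.makedirs(rom_dir, exist_ok=True)
    target = os.path.join(rom_dir, name)
    if not os.path.isfile(target):
        urllib.request.urlretrieve("{}/{}".format(ROM_BASE, name), target)
    return target


if __name__ == "__main__":
    print("ROM ready at", fetch_rom("breakout.bin"))
    # then: ./DQN.py --rom breakout.bin --gpu 0
```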
examples/ResNet/README.md

@@ -9,5 +9,7 @@ The validation error here is computed on test set.

-Download model:
-[Cifar10 n=18](https://drive.google.com/open?id=0B308TeQzmFDLeHpSaHAxWGV1WDg)
+<!--
+Download model:
+[Cifar10 n=18](https://drive.google.com/open?id=0B308TeQzmFDLeHpSaHAxWGV1WDg)
+-->
tensorpack/RL/atari.py

@@ -26,6 +26,7 @@ __all__ = ['AtariPlayer']

def log_once():
    logger.warn("https://github.com/mgbellemare/Arcade-Learning-Environment/pull/171 is not merged!")

+ROM_URL = "https://github.com/openai/atari-py/tree/master/atari_py/atari_roms"
_ALE_LOCK = threading.Lock()

class AtariPlayer(RLEnvironment):

@@ -51,7 +52,8 @@ class AtariPlayer(RLEnvironment):

        super(AtariPlayer, self).__init__()
        if not os.path.isfile(rom_file) and '/' not in rom_file:
            rom_file = os.path.join(get_dataset_dir('atari_rom'), rom_file)
-       assert os.path.isfile(rom_file), "rom {} not found".format(rom_file)
+       assert os.path.isfile(rom_file), \
+           "rom {} not found. Please download at {}".format(rom_file, ROM_URL)
        try:
            ALEInterface.setLoggerMode(ALEInterface.Logger.Warning)