Shashank Suhas
seminar-breakout
Commit 1a5d3f4f, authored Feb 10, 2017 by Yuxin Wu
gpu_nr_gpu() now really count the gpus
Parent: 1e790a93
Showing 5 changed files with 20 additions and 10 deletions.
docs/tutorial/efficient-data.md (+3 -2)
examples/A3C-Gym/README.md (+6 -4)
examples/A3C-Gym/train-atari.py (+2 -2)
examples/SpatialTransformer/mnist-addition.py (+1 -0)
tensorpack/utils/gpu.py (+8 -2)
docs/tutorial/efficient-data.md
@@ -59,9 +59,10 @@ there are ways to understand which one is the bottleneck:
 ### Load ImageNet efficiently
-We take ImageNet dataset as an example of how to optimize a DataFlow for speed.
+We take ImageNet dataset as an example of how to optimize a DataFlow.
 We use ILSVRC12 training set, which contains 1.28 million images.
-Following the [ResNet example](../examples/ResNet), our pre-processing need images in their original resolution, so we don't resize them.
+Following the [ResNet example](../examples/ResNet), our pre-processing need images in their original resolution, so we'll read the original
+dataset instead of a down-sampled version here.
 The average resolution is about 400x350 <sup>[[1]]</sup>.
 The original images (JPEG compressed) are 140G in total.
...
examples/A3C-Gym/README.md
 ### Code and models for Atari games in gym
-Implemented A3C in [Asynchronous Methods for Deep Reinforcement Learning](http://arxiv.org/abs/1602.01783).
+Implemented Multi-GPU version of the A3C algorithm in [Asynchronous Methods for Deep Reinforcement Learning](http://arxiv.org/abs/1602.01783).
 Results of the same code trained on 47 different Atari games were uploaded on OpenAI Gym.
 You can see them in [my gym page](https://gym.openai.com/users/ppwwyyxx).
@@ -8,14 +8,16 @@ Most of them are the best reproducible results on gym.
 ### To train on an Atari game:
-`./train-atari.py --env Breakout-v0 --gpu 0`
+`CUDA_VISIBLE_DEVICES=0 ./train-atari.py --env Breakout-v0`
 It should run at a speed of 6~10 iteration/s on 1 GPU plus 12+ CPU cores.
-Training with a significant slower speed (e.g. on CPU) will give bad performance,
+Training with a significant slower speed (e.g. on CPU) will result in very bad score,
 probably because of async issues.
 The pre-trained models are all trained with 4 GPUs for about 2 days.
+But note that multi-GPU doesn't give you obvious speedup here,
+because the bottleneck is not computation but data.
-Occasionally processes may not get terminated completely, therefore it is suggested to use systemd-run to run any
+Occasionally, processes may not get terminated completely, therefore it is suggested to use `systemd-run` to run any
 multiprocess Python program to get a cgroup dedicated for the task.
 ### To run a pretrained Atari model for 100 episodes:
...
examples/A3C-Gym/train-atari.py
@@ -254,8 +254,8 @@ if __name__ == '__main__':
     elif args.task == 'eval':
         eval_model_multithread(cfg, EVAL_EPISODE)
     else:
-        if args.gpu:
-            nr_gpu = get_nr_gpu()
+        nr_gpu = get_nr_gpu()
+        if nr_gpu > 0:
             if nr_gpu > 1:
                 predict_tower = range(nr_gpu)[-nr_gpu // 2:]
             else:
...
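The slice `range(nr_gpu)[-nr_gpu // 2:]` in the context lines above is easy to misread, so here is a small standalone sketch of what it evaluates to. The function name `prediction_towers` is made up for this illustration and is not part of train-atari.py. The sketch only reproduces the expression, not whatever the elided `else:` branch does.

```python
def prediction_towers(nr_gpu):
    """Return the indices selected by range(nr_gpu)[-nr_gpu // 2:].

    Hypothetical helper for illustration only; not part of the example script.
    """
    # Unary minus binds tighter than //, so this is (-nr_gpu) // 2.
    # Floor division rounds toward negative infinity, e.g. -3 // 2 == -2,
    # so odd counts keep the larger "half" of the indices.
    return list(range(nr_gpu))[-nr_gpu // 2:]


for n in (2, 3, 4):
    print(n, prediction_towers(n))
# 2 [1]
# 3 [1, 2]
# 4 [2, 3]
```

So with more than one GPU, the expression picks the last half of the device indices, rounding up when the count is odd.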
examples/SpatialTransformer/mnist-addition.py
@@ -7,6 +7,7 @@ import numpy as np
 import tensorflow as tf
 import os
 import sys
+import cv2
 import argparse
 from tensorpack import *
...
tensorpack/utils/gpu.py
@@ -5,6 +5,7 @@
 import os
 from .utils import change_env
+from . import logger
 __all__ = ['change_gpu', 'get_nr_gpu']
@@ -26,5 +27,10 @@ def get_nr_gpu():
         int: the number of GPU from ``CUDA_VISIBLE_DEVICES``.
     """
     env = os.environ.get('CUDA_VISIBLE_DEVICES', None)
-    assert env is not None, 'gpu not set!'  # TODO
-    return len(env.split(','))
+    if env is not None:
+        return len(env.split(','))
+    logger.info("Loading local devices by TensorFlow ...")
+    from tensorflow.python.client import device_lib
+    device_protos = device_lib.list_local_devices()
+    gpus = [x.name for x in device_protos if x.device_type == 'GPU']
+    return len(gpus)
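To make the new control flow in `get_nr_gpu()` easier to follow, here is a minimal standalone sketch of the same two-step lookup. It is an approximation under stated assumptions, not the library's code. The name `count_visible_gpus` and the `environ` / `list_devices` parameters are hypothetical hooks added so the logic can run without TensorFlow installed.

```python
import os


def count_visible_gpus(environ=None, list_devices=None):
    """Count GPUs following the two-step logic shown in the diff.

    `environ` and `list_devices` are hypothetical injection points for
    this sketch; tensorpack's get_nr_gpu() takes no arguments.
    """
    environ = os.environ if environ is None else environ
    env = environ.get('CUDA_VISIBLE_DEVICES', None)
    if env is not None:
        # Same rule as the diff: one GPU per comma-separated entry.
        return len(env.split(','))
    if list_devices is None:
        # Deferred import, as in the diff, so TensorFlow is only loaded
        # when the environment variable is unset.
        from tensorflow.python.client import device_lib
        list_devices = device_lib.list_local_devices
    return len([d for d in list_devices() if d.device_type == 'GPU'])


print(count_visible_gpus({'CUDA_VISIBLE_DEVICES': '0,1,2'}))
```

One property of the `split(',')` rule worth keeping in mind: an empty `CUDA_VISIBLE_DEVICES` value still counts as one entry in this sketch, because `''.split(',')` returns `['']`.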