Commit 1a5d3f4f, authored Feb 10, 2017 by Yuxin Wu
get_nr_gpu() now really counts the GPUs
Parent: 1e790a93
Showing 5 changed files with 20 additions and 10 deletions
docs/tutorial/efficient-data.md (+3 -2)
examples/A3C-Gym/README.md (+6 -4)
examples/A3C-Gym/train-atari.py (+2 -2)
examples/SpatialTransformer/mnist-addition.py (+1 -0)
tensorpack/utils/gpu.py (+8 -2)
docs/tutorial/efficient-data.md
...
...
@@ -59,9 +59,10 @@ there are ways to understand which one is the bottleneck:
 ### Load ImageNet efficiently
-We take ImageNet dataset as an example of how to optimize a DataFlow for speed.
+We take ImageNet dataset as an example of how to optimize a DataFlow.
 We use ILSVRC12 training set, which contains 1.28 million images.
-Following the [ResNet example](../examples/ResNet), our pre-processing need images in their original resolution, so we don't resize them.
+Following the [ResNet example](../examples/ResNet), our pre-processing need images in their original resolution, so we'll read the original
+dataset instead of a down-sampled version here.
 The average resolution is about 400x350 <sup>[[1]]</sup>.
 The original images (JPEG compressed) are 140G in total.
...
...
examples/A3C-Gym/README.md
 ### Code and models for Atari games in gym
-Implemented A3C in [Asynchronous Methods for Deep Reinforcement Learning](http://arxiv.org/abs/1602.01783).
+Implemented Multi-GPU version of the A3C algorithm in [Asynchronous Methods for Deep Reinforcement Learning](http://arxiv.org/abs/1602.01783).
 Results of the same code trained on 47 different Atari games were uploaded on OpenAI Gym.
 You can see them in [my gym page](https://gym.openai.com/users/ppwwyyxx).
...
...
@@ -8,14 +8,16 @@ Most of them are the best reproducible results on gym.
 ### To train on an Atari game:
-`./train-atari.py --env Breakout-v0 --gpu 0`
+`CUDA_VISIBLE_DEVICES=0 ./train-atari.py --env Breakout-v0`
 It should run at a speed of 6~10 iteration/s on 1 GPU plus 12+ CPU cores.
-Training with a significant slower speed (e.g. on CPU) will give bad performance,
+Training with a significant slower speed (e.g. on CPU) will result in very bad score,
 probably because of async issues.
 The pre-trained models are all trained with 4 GPUs for about 2 days.
+But note that multi-GPU doesn't give you obvious speedup here,
+because the bottleneck is not computation but data.
-Occasionally processes may not get terminated completely, therefore it is suggested to use systemd-run to run any
+Occasionally, processes may not get terminated completely, therefore it is suggested to use `systemd-run` to run any
 multiprocess Python program to get a cgroup dedicated for the task.
 ### To run a pretrained Atari model for 100 episodes:
...
...
examples/A3C-Gym/train-atari.py
...
...
@@ -254,8 +254,8 @@ if __name__ == '__main__':
     elif args.task == 'eval':
         eval_model_multithread(cfg, EVAL_EPISODE)
     else:
-        if args.gpu:
         nr_gpu = get_nr_gpu()
+        if nr_gpu > 0:
             if nr_gpu > 1:
                 predict_tower = range(nr_gpu)[-nr_gpu // 2:]
             else:
...
...
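To make the slice in this hunk concrete, here is a minimal sketch of how `range(nr_gpu)[-nr_gpu // 2:]` selects the predictor towers. The helper name `predict_towers` is hypothetical and not part of the commit. The `list(...)` wrapper is an addition so the result prints as a plain list on Python 3, where slicing a `range` yields another `range`.

```python
def predict_towers(nr_gpu):
    """Return the GPU indices selected by the slice used in train-atari.py.

    Hypothetical helper for illustration only.
    """
    # -nr_gpu // 2 parses as (-nr_gpu) // 2, and floor division rounds
    # toward negative infinity, so odd counts keep the larger half.
    return list(range(nr_gpu))[-nr_gpu // 2:]


for n in (2, 3, 4):
    print(n, predict_towers(n))
# 2 [1]
# 3 [1, 2]
# 4 [2, 3]
```

With an even GPU count the last half of the devices is selected; with an odd count the selection gets one device more than the remainder.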
examples/SpatialTransformer/mnist-addition.py
...
...
@@ -7,6 +7,7 @@ import numpy as np
 import tensorflow as tf
 import os
 import sys
+import cv2
 import argparse
 from tensorpack import *
...
...
tensorpack/utils/gpu.py
...
...
@@ -5,6 +5,7 @@
 import os
 from .utils import change_env
+from . import logger
 __all__ = ['change_gpu', 'get_nr_gpu']
...
...
@@ -26,5 +27,10 @@ def get_nr_gpu():
         int: the number of GPU from ``CUDA_VISIBLE_DEVICES``.
     """
     env = os.environ.get('CUDA_VISIBLE_DEVICES', None)
-    assert env is not None, 'gpu not set!'  # TODO
+    if env is not None:
         return len(env.split(','))
+    logger.info("Loading local devices by TensorFlow ...")
+    from tensorflow.python.client import device_lib
+    device_protos = device_lib.list_local_devices()
+    gpus = [x.name for x in device_protos if x.device_type == 'GPU']
+    return len(gpus)
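The behavior change in this hunk can be sketched as a standalone function. This is a minimal sketch and not the tensorpack implementation: the name `count_gpus` and the parameters `environ` and `list_devices` are hypothetical, added only so the logic can run without TensorFlow installed. The fallback import is the one that appears in the diff.

```python
import os


def count_gpus(environ=None, list_devices=None):
    """Count GPUs: prefer CUDA_VISIBLE_DEVICES, else enumerate local devices.

    `environ` and `list_devices` are hypothetical injection points
    for illustration; the commit reads os.environ and TensorFlow directly.
    """
    environ = os.environ if environ is None else environ
    env = environ.get('CUDA_VISIBLE_DEVICES', None)
    if env is not None:
        # Same rule as the diff: one device per comma-separated entry.
        return len(env.split(','))
    if list_devices is None:
        # Fallback from the diff, imported lazily so the environment
        # variable path never needs TensorFlow.
        from tensorflow.python.client import device_lib
        list_devices = device_lib.list_local_devices
    return len([d for d in list_devices() if d.device_type == 'GPU'])


print(count_gpus({'CUDA_VISIBLE_DEVICES': '0,1,3'}))  # 3
```

One consequence of the `split(',')` rule is worth noting: an empty `CUDA_VISIBLE_DEVICES=""` still counts as one device in this sketch, because `''.split(',')` returns `['']`.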