Shashank Suhas / seminar-breakout
Commit 44f603c0, authored Jul 13, 2017 by Yuxin Wu
work around device name problem with colocation (#329)
parent 61c113b8
Showing 1 changed file with 5 additions and 2 deletions
--- a/tensorpack/train/multigpu.py
+++ b/tensorpack/train/multigpu.py
@@ -131,10 +131,13 @@ class LeastLoadedDeviceSetter(object):
         self.ps_sizes = [0] * len(self.ps_devices)
 
     def __call__(self, op):
+        def sanitize_name(name):    # tensorflow/tensorflow#11484
+            return tf.DeviceSpec.from_string(name).to_string()
+
         if op.device:
             return op.device
         if op.type not in ['Variable', 'VariableV2']:
-            return self.worker_device
+            return sanitize_name(self.worker_device)
 
         device_index, _ = min(enumerate(
             self.ps_sizes), key=operator.itemgetter(1))
...
@@ -142,7 +145,7 @@ class LeastLoadedDeviceSetter(object):
         var_size = op.outputs[0].get_shape().num_elements()
         self.ps_sizes[device_index] += var_size
 
-        return device_name
+        return sanitize_name(device_name)
 
 
 class SyncMultiGPUTrainerParameterServer(MultiGPUTrainerBase, SingleCostFeedfreeTrainer):
...
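For readers unfamiliar with the placement logic this diff touches, the greedy "least loaded" choice can be sketched without TensorFlow. This is only an illustration under stated assumptions: the class `LeastLoadedPlacer`, its `place` method, and the `is_variable` / `num_elements` parameters are hypothetical stand-ins, the lookup of the device name by index is assumed (that line falls between the two hunks and is not shown), and the `sanitize_name` step added by this commit is left out because it needs `tf.DeviceSpec`.

```python
import operator


class LeastLoadedPlacer:
    """Hypothetical, TensorFlow-free stand-in for LeastLoadedDeviceSetter."""

    def __init__(self, worker_device, ps_devices):
        self.worker_device = worker_device
        self.ps_devices = list(ps_devices)
        # Running total of variable elements assigned to each PS device.
        self.ps_sizes = [0] * len(self.ps_devices)

    def place(self, is_variable, num_elements=0):
        # Non-variable ops stay on the worker device.
        if not is_variable:
            return self.worker_device
        # Pick the PS device with the smallest load so far (first one on ties).
        device_index, _ = min(enumerate(self.ps_sizes),
                              key=operator.itemgetter(1))
        self.ps_sizes[device_index] += num_elements
        # Assumption: the device name is looked up by index, as the
        # surrounding code in the diff suggests.
        return self.ps_devices[device_index]


placer = LeastLoadedPlacer("/gpu:0", ["/cpu:0", "/cpu:1"])
placements = [placer.place(True, n) for n in (100, 10, 50, 70)]
print(placements)
print(placer.ps_sizes)
```

In the real class, both return paths now pass the device string through `sanitize_name`, so the names handed back to TensorFlow are in the form `tf.DeviceSpec` produces rather than whatever string the caller supplied.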