Commit 353cd04f
authored Mar 17, 2019 by Yuxin Wu
Work around dilated conv bugs in tf.layers.Conv2D (#1110).

Another bug in tf.layers... maybe I should never switch to it.
parent 505e28eb

Showing 3 changed files with 34 additions and 8 deletions (+34, -8):
- docs/tutorial/save-load.md (+24, -1)
- docs/tutorial/symbolic.md (+5, -5)
- tensorpack/models/conv2d.py (+5, -2)
docs/tutorial/save-load.md

````diff
@@ -10,7 +10,7 @@ Both are necessary.
 `tf.train.NewCheckpointReader` is the official tool to parse a TensorFlow checkpoint.
 Read [TF docs](https://www.tensorflow.org/api_docs/python/tf/train/NewCheckpointReader) for details.
 Tensorpack also provides a small tool to load checkpoints, see
 [load_chkpt_vars](../modules/tfutils.html#tensorpack.tfutils.varmanip.load_chkpt_vars)
 for details.
````
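For context, inspecting a checkpoint with these two tools looks roughly like the sketch below (TF 1.x). The checkpoint path and the variable name `conv1/W` are made up for the example:

```python
import tensorflow as tf
from tensorpack.tfutils.varmanip import load_chkpt_vars

# Hypothetical checkpoint prefix; point this at your own train_log.
ckpt = '/path/to/train_log/model-10000'

# Official TF reader: enumerate variable names/shapes and fetch tensors.
reader = tf.train.NewCheckpointReader(ckpt)
for name, shape in sorted(reader.get_variable_to_shape_map().items()):
    print(name, shape)
weights = reader.get_tensor('conv1/W')  # a numpy array

# Tensorpack's helper loads the whole checkpoint into a {name: value} dict.
all_vars = load_chkpt_vars(ckpt)
```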
````diff
@@ -51,3 +51,26 @@ Therefore, transfer learning is trivial.
 If you want to load a pre-trained model, just use the same variable names.
 If you want to re-train some layer, just rename either the variables in the
 graph or the variables in your loader.
+
+## Resume Training
+
+"resume training" means "loading the last known checkpoint".
+Therefore you should refer to the
+[previous section](#load-a-model-to-a-session)
+on how to load a model.
+
+```eval_rst
+.. note:: **A checkpoint does not resume everything!**
+
+    The TensorFlow checkpoint only saves TensorFlow variables,
+    which means other Python state that is not a TensorFlow variable will not be
+    saved and resumed. This often includes:
+
+    1. The training epoch number. You can set it by providing a `starting_epoch` to
+       your resume job.
+    2. State in your callbacks. Certain callbacks maintain a state
+       (e.g., current best accuracy) in Python, which cannot be saved automatically.
+```
+The [AutoResumeTrainConfig](../modules/train.html#tensorpack.train.AutoResumeTrainConfig)
+is an alternative to `TrainConfig` which applies some heuristics to
+automatically resume both the checkpoint and the epoch number from your log directory.
````
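To illustrate the new section, resuming with `AutoResumeTrainConfig` might look like the sketch below; `MyModel`, `my_dataflow`, and `my_callbacks` are hypothetical placeholders for your own `ModelDesc`, `DataFlow`, and callback list:

```python
from tensorpack import (AutoResumeTrainConfig, SimpleTrainer,
                        launch_train_with_config)

# A drop-in alternative to TrainConfig: if the log directory already
# contains a checkpoint, training resumes from it, and the epoch number
# is recovered from the log directory as well.
config = AutoResumeTrainConfig(
    model=MyModel(),            # hypothetical ModelDesc
    dataflow=my_dataflow,       # hypothetical DataFlow
    callbacks=my_callbacks,     # hypothetical callback list
    steps_per_epoch=1000,
    max_epoch=100,
)
launch_train_with_config(config, SimpleTrainer())
```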
docs/tutorial/symbolic.md

````diff
@@ -7,14 +7,14 @@ However, tensorpack is model-agnostic, which means
 **you can skip this tutorial and do not need to use tensorpack's symbolic layers.**
 These layers were written only because there were no alternatives when tensorpack was first developed.
-Nowadays, these implementations actually call `tf.layers` directly.
+Nowadays, many of these implementations actually call `tf.layers` directly.
 __Tensorpack will not add any more layers__ into its core library because this is
 not the focus of tensorpack, and there are many other alternative symbolic
 libraries today.
 Today, you can just use `tf.layers` or any other symbolic libraries inside tensorpack.
 If you use the tensorpack implementations, you can also benefit from `argscope` and `LinearWrap` to
-simplify the code.
+simplify the code, and also benefit from fewer bugs than `tf.layers`.
 Note that to keep backward compatibility of code and pre-trained models, tensorpack layers
 have some small differences with `tf.layers`, including variable names and default options.
````
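To illustrate what `argscope` and `LinearWrap` buy you, here is a minimal sketch of a small network; the layer names and sizes are made up for the example:

```python
import tensorflow as tf
from tensorpack import Conv2D, FullyConnected, LinearWrap, MaxPooling, argscope

def build_logits(image):
    # argscope sets shared defaults for Conv2D once; LinearWrap chains
    # layers without repeating the input tensor at every call.
    with argscope(Conv2D, kernel_size=3, activation=tf.nn.relu):
        return (LinearWrap(image)
                .Conv2D('conv1', filters=32)
                .MaxPooling('pool1', 2)
                .Conv2D('conv2', filters=64)
                .MaxPooling('pool2', 2)
                .FullyConnected('fc0', 256, activation=tf.nn.relu)
                .FullyConnected('linear', 10, activation=tf.identity)())
```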
````diff
@@ -111,13 +111,13 @@ always creates new variable scope. See the [Keras example](../examples/keras) fo
 ```eval_rst
 .. note:: **It's best to not trust others' layers!**

     For non-standard layers that are not included in TensorFlow or Tensorpack, it's best to implement them yourself.
     Non-standard layers often do not have a mathematical definition that people
     all agree on, and different people can implement them differently.
     Also, deep learning models on github often have bugs, especially when there are
     no reproduced experiments with the code.
     For your own good, it's best to implement the layers yourself.
     This is also why Tensorpack does not contain non-standard layers.
 ```
````
tensorpack/models/conv2d.py

````diff
@@ -54,7 +54,10 @@ def Conv2D(
         kernel_initializer = tf.contrib.layers.variance_scaling_initializer(2.0)
     else:
         kernel_initializer = tf.keras.initializers.VarianceScaling(2.0, distribution='untruncated_normal')
-    if split == 1:
+    dilation_rate = shape2d(dilation_rate)
+
+    if split == 1 and dilation_rate == [1, 1]:
+        # tf.layers.Conv2D has bugs with dilations (https://github.com/tensorflow/tensorflow/issues/26797)
         with rename_get_variable({'kernel': 'W', 'bias': 'b'}):
             layer = tf.layers.Conv2D(
                 filters,
@@ -92,7 +95,7 @@ def Conv2D(
         out_channel = filters
         assert out_channel % split == 0
-        assert dilation_rate == (1, 1) or get_tf_version_tuple() >= (1, 5), 'TF>=1.5 required for group dilated conv'
+        assert dilation_rate == [1, 1] or get_tf_version_tuple() >= (1, 5), 'TF>=1.5 required for dilated conv.'
         kernel_shape = shape2d(kernel_size)
         filter_shape = kernel_shape + [in_channel / split, out_channel]
````
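For context on what the non-`tf.layers` branch does instead: a dilated convolution can be built directly on `tf.nn.conv2d`, whose `dilations` argument exists since TF 1.5 (hence the assert above). A minimal sketch of that idea, not the repository's exact code; shapes and names are made up:

```python
import tensorflow as tf

def dilated_conv2d(x, filters, kernel_size, dilation_rate, name):
    # x is NHWC. Build the kernel by hand and call tf.nn.conv2d directly,
    # bypassing tf.layers.Conv2D, which mishandles dilations
    # (https://github.com/tensorflow/tensorflow/issues/26797).
    in_channel = x.get_shape().as_list()[-1]
    with tf.variable_scope(name):
        W = tf.get_variable('W', [kernel_size, kernel_size, in_channel, filters],
                            initializer=tf.variance_scaling_initializer(2.0))
        b = tf.get_variable('b', [filters], initializer=tf.zeros_initializer())
        out = tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME',
                           dilations=[1, dilation_rate, dilation_rate, 1])
        return tf.nn.bias_add(out, b)
```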