Commit 353cd04f authored by Yuxin Wu

Workaround dilated conv bugs in tf.layers.Conv2D (#1110).

Another bug in tf.layers .. maybe I should never switch to it.
parent 505e28eb
@@ -51,3 +51,26 @@ Therefore, transfer learning is trivial.
 If you want to load a pre-trained model, just use the same variable names.
 If you want to re-train some layer, just rename either the variables in the
 graph or the variables in your loader.
+
+## Resume Training
+
+"Resume training" means "loading the last known checkpoint".
+Therefore you should refer to the [previous section](#load-a-model-to-a-session)
+on how to load a model.
+
+```eval_rst
+.. note:: **A checkpoint does not resume everything!**
+
+    The TensorFlow checkpoint only saves TensorFlow variables,
+    which means other Python state that is not stored in TensorFlow variables will not be saved
+    and resumed. This often includes:
+
+    1. The training epoch number. You can set it by providing a `starting_epoch` to
+       your resume job.
+    2. State in your callbacks. Certain callbacks maintain state
+       (e.g., the current best accuracy) in Python, which cannot be saved automatically.
+```
+
+The [AutoResumeTrainConfig](../modules/train.html#tensorpack.train.AutoResumeTrainConfig)
+is an alternative to `TrainConfig` which applies some heuristics to
+automatically resume both the checkpoint and the epoch number from your log directory.
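
For concreteness, a minimal sketch of both resume styles; `model` and `dataflow` are assumptions standing in for your own `ModelDesc` and `DataFlow`, and the checkpoint path is made up:

```python
# Minimal sketch, assuming `model` (a ModelDesc) and `dataflow` (a DataFlow)
# are defined elsewhere; the checkpoint path below is made up.
from tensorpack import SimpleTrainer, TrainConfig, launch_train_with_config
from tensorpack.tfutils.sessinit import SaverRestore
from tensorpack.train import AutoResumeTrainConfig

# Manual resume: load the checkpoint and set the epoch number yourself,
# because the epoch number is Python state and is not in the checkpoint.
config = TrainConfig(
    model=model,
    dataflow=dataflow,
    session_init=SaverRestore('train_log/run1/model-10000'),
    starting_epoch=31,
    max_epoch=100,
)

# Automatic resume: let AutoResumeTrainConfig find the latest checkpoint and
# the epoch number in the log directory using its heuristics.
config = AutoResumeTrainConfig(model=model, dataflow=dataflow, max_epoch=100)

launch_train_with_config(config, SimpleTrainer())
```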
@@ -7,14 +7,14 @@ However, tensorpack is model-agnostic, which means
 **you can skip this tutorial and do not need to use tensorpack's symbolic layers.**
 These layers were written only because there were no alternatives when tensorpack was first developed.
-Nowadays, these implementation actually call `tf.layers` directly.
+Nowadays, many of these implementations actually call `tf.layers` directly.
 
 __Tensorpack will not add any more layers__ into its core library because this is
 not the focus of tensorpack, and there are many other alternative symbolic
 libraries today.
 
 Today, you can just use `tf.layers` or any other symbolic libraries inside tensorpack.
 If you use the tensorpack implementations, you can also benefit from `argscope` and `LinearWrap` to
-simplify the code.
+simplify the code, and also run into fewer bugs than with `tf.layers`.
 
 Note that to keep backward compatibility of code and pre-trained models, tensorpack layers
 have some small differences with `tf.layers`, including variable names and default options.
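
To make the `argscope` / `LinearWrap` point concrete, a minimal sketch (the layer names and sizes here are made up, not from this commit):

```python
# Minimal sketch of argscope + LinearWrap; layer names and sizes are made up.
import tensorflow as tf
from tensorpack import argscope, LinearWrap
from tensorpack.models import Conv2D, FullyConnected, MaxPooling

def build_logits(image):
    # argscope sets default arguments for Conv2D within the block;
    # LinearWrap chains layers, feeding each output into the next layer.
    with argscope(Conv2D, kernel_size=3, activation=tf.nn.relu, filters=32):
        return (LinearWrap(image)
                .Conv2D('conv0')                # uses the argscope defaults
                .MaxPooling('pool0', 2)
                .Conv2D('conv1', filters=64)    # overrides one default
                .FullyConnected('fc0', 512, activation=tf.nn.relu)
                .FullyConnected('fc1', 10, activation=tf.identity)())
```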
@@ -54,7 +54,10 @@ def Conv2D(
         kernel_initializer = tf.contrib.layers.variance_scaling_initializer(2.0)
     else:
         kernel_initializer = tf.keras.initializers.VarianceScaling(2.0, distribution='untruncated_normal')
-    if split == 1:
+    dilation_rate = shape2d(dilation_rate)
+
+    if split == 1 and dilation_rate == [1, 1]:
+        # tf.layers.Conv2D has bugs with dilations (https://github.com/tensorflow/tensorflow/issues/26797)
         with rename_get_variable({'kernel': 'W', 'bias': 'b'}):
             layer = tf.layers.Conv2D(
                 filters,
@@ -92,7 +95,7 @@ def Conv2D(
 
         out_channel = filters
         assert out_channel % split == 0
-        assert dilation_rate == (1, 1) or get_tf_version_tuple() >= (1, 5), 'TF>=1.5 required for group dilated conv'
+        assert dilation_rate == [1, 1] or get_tf_version_tuple() >= (1, 5), 'TF>=1.5 required for dilated conv.'
 
         kernel_shape = shape2d(kernel_size)
         filter_shape = kernel_shape + [in_channel / split, out_channel]
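
With this workaround in place, a dilated convolution such as the following hypothetical call takes tensorpack's own convolution branch rather than `tf.layers.Conv2D` (`image` is an assumed NHWC tensor defined elsewhere):

```python
# Hypothetical usage sketch: a dilation_rate other than (1, 1) now goes through
# tensorpack's own convolution path instead of the buggy tf.layers.Conv2D one.
from tensorpack.models import Conv2D

out = Conv2D('conv_dilated', image, filters=64, kernel_size=3, dilation_rate=2)
```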