Commit 7f505225 authored by Yuxin Wu

Pre/Post processing in ImageNetModel

parent 30c7a97c
...@@ -22,29 +22,20 @@ You can use this predicate to choose a different code path in inference mode. ...@@ -22,29 +22,20 @@ You can use this predicate to choose a different code path in inference mode.
## Inference After Training ## Inference After Training
Tensorpack is a training interface -- __it doesn't care what happened after training__. Tensorpack is a training interface -- __it doesn't care what happened after training__.
You have everything you need for inference or model diagnosis after You already have everything you need for inference or model diagnosis after
training: training:
1. The trained weights: tensorpack saves them in standard TF checkpoint format. 1. The model (the graph): you've already written it yourself with TF symbolic functions.
2. The model: you've already written it yourself with TF symbolic functions. 2. The trained parameters: tensorpack saves them in standard TF checkpoint format.
Therefore, you can build the graph for inference, load the checkpoint, and apply Therefore, you can build the graph for inference, load the checkpoint, and apply
any processing or deployment TensorFlow supports. any processing or deployment TensorFlow supports.
And you'll need to read TF docs and __do it on your own__. These are unrelated to tensorpack, and you'll need to read TF docs and __do it on your own__.
### Don't Use Training Metagraph for Inference ### Step 1: build the model
Metagraph is the wrong abstraction for a "model". You can build a graph however you like, with pure TensorFlow. If your model is written with
It stores the entire graph which contains not only the mathematical model, but also all the tensorpack's `ModelDesc`, you can also build it like this:
training settings (queues, iterators, summaries, evaluations, multi-gpu replications).
Therefore it is usually wrong to import a training metagraph for inference.
It's also very common to change the graph for inference.
For example, you may need a different data layout for CPU inference,
or you may need placeholders in the inference graph (which may not even exist in
the training graph). However metagraph is not designed to be easily modified at all.
To do inference, it's best to recreate a clean graph (and save it if needed).
To construct a new graph, you can simply:
```python ```python
a, b = tf.placeholder(...), tf.placeholder(...) a, b = tf.placeholder(...), tf.placeholder(...)
# call ANY symbolic functions on a, b. e.g.: # call ANY symbolic functions on a, b. e.g.:
...@@ -52,11 +43,34 @@ with TowerContext('', is_training=False): ...@@ -52,11 +43,34 @@ with TowerContext('', is_training=False):
model.build_graph(a, b) model.build_graph(a, b)
``` ```
```eval_rst
.. note:: **Do not use metagraph for inference!**
Metagraph is the wrong abstraction for a "model".
It stores the entire graph which contains not only the mathematical model, but also all the
training settings (queues, iterators, summaries, evaluations, multi-gpu replications).
Therefore it is usually wrong to import a training metagraph for inference.
It's also very common to change the graph for inference.
For example, you may need a different data layout for CPU inference,
or you may need placeholders in the inference graph (which may not even exist in
the training graph). However metagraph is not designed to be easily modified at all.
To do inference, it's best to recreate a clean graph (and save it if needed) by yourself.
```
### Step 2: load the checkpoint
You can just use `tf.train.Saver` for all the work.
Alternatively, use tensorpack's `SaverRestore(path).init(tf.get_default_session())`.
### OfflinePredictor ### OfflinePredictor
The only tool tensorpack has for after-training inference is [OfflinePredictor](../modules/predict.html#tensorpack.predict.OfflinePredictor),
a simple function to build the graph and return a callable for you.
Check out examples and docs for its usage.
Tensorpack provides one tool [OfflinePredictor](../modules/predict.html#tensorpack.predict.OfflinePredictor),
to merge the above two steps together.
It has simple functionalities to build the graph, load the checkpoint, and return a callable for you.
Check out examples and docs for its usage.
OfflinePredictor is only for quick demo purposes. OfflinePredictor is only for quick demo purposes.
It runs inference on numpy arrays, therefore may not be the most efficient way. It runs inference on numpy arrays, therefore may not be the most efficient way.
......
...@@ -81,10 +81,15 @@ Note some __common problems__ when using these trainers: ...@@ -81,10 +81,15 @@ Note some __common problems__ when using these trainers:
all GPUs take tensors from the `InputSource`. all GPUs take tensors from the `InputSource`.
So the total batch size across all GPUs would become ``(batch size of InputSource) * #GPU``. So the total batch size across all GPUs would become ``(batch size of InputSource) * #GPU``.
Splitting a tensor for data-parallel training makes no sense at all. First, ```eval_rst
it wastes time because typically data is concatenated into batches by the user. .. note::
Second, this puts unnecessary shape constraints on the data.
By letting each GPU train on its own input tensors, they can train on inputs of different shapes simultaneously. Splitting a tensor for data-parallel training (as done by frameworks like Keras)
makes no sense at all.
First, it wastes time doing the split because typically data is first concatenated by the user.
Second, this puts unnecessary shape constraints on the data: the
inputs on each GPU need to have consistent shapes.
```
2. The tower function (your model code) will get called multiple times on each GPU. 2. The tower function (your model code) will get called multiple times on each GPU.
You must follow the abovementioned rules of tower function. You must follow the abovementioned rules of tower function.
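The batch-size rule in item 1 of the list above can be illustrated with a small arithmetic sketch. It is plain Python and does not touch tensorpack. The helper names `total_batch_size` and `per_gpu_batch_size` are hypothetical and exist only for this illustration.

```python
def total_batch_size(input_source_batch, num_gpus):
    """Each GPU takes its own batch from the InputSource, so the
    effective batch size is (batch size of InputSource) * #GPU."""
    return input_source_batch * num_gpus


def per_gpu_batch_size(total_batch, num_gpus):
    """Inverse direction: pick the InputSource batch size that yields
    a desired total batch size (hypothetical helper)."""
    if total_batch % num_gpus != 0:
        raise ValueError("total batch size must be divisible by #GPU")
    return total_batch // num_gpus


# With 8 GPUs, an InputSource batch of 64 trains on 512 samples per step.
effective = total_batch_size(64, 8)
per_gpu = per_gpu_batch_size(512, 8)
```

So a script that targets a fixed total batch size has to divide it by the number of GPUs before configuring its input pipeline.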
......
...@@ -168,18 +168,24 @@ class ImageNetModel(ModelDesc): ...@@ -168,18 +168,24 @@ class ImageNetModel(ModelDesc):
""" """
loss_scale = 1. loss_scale = 1.
"""
Label smoothing (See tf.losses.softmax_cross_entropy)
"""
label_smoothing = 0.
def inputs(self): def inputs(self):
return [tf.placeholder(self.image_dtype, [None, self.image_shape, self.image_shape, 3], 'input'), return [tf.placeholder(self.image_dtype, [None, self.image_shape, self.image_shape, 3], 'input'),
tf.placeholder(tf.int32, [None], 'label')] tf.placeholder(tf.int32, [None], 'label')]
def build_graph(self, image, label): def build_graph(self, image, label):
image = ImageNetModel.image_preprocess(image, bgr=self.image_bgr) image = self.image_preprocess(image)
assert self.data_format in ['NCHW', 'NHWC'] assert self.data_format in ['NCHW', 'NHWC']
if self.data_format == 'NCHW': if self.data_format == 'NCHW':
image = tf.transpose(image, [0, 3, 1, 2]) image = tf.transpose(image, [0, 3, 1, 2])
logits = self.get_logits(image) logits = self.get_logits(image)
loss = ImageNetModel.compute_loss_and_error(logits, label) loss = ImageNetModel.compute_loss_and_error(
logits, label, label_smoothing=self.label_smoothing)
if self.weight_decay > 0: if self.weight_decay > 0:
wd_loss = regularize_cost(self.weight_decay_pattern, wd_loss = regularize_cost(self.weight_decay_pattern,
...@@ -212,26 +218,29 @@ class ImageNetModel(ModelDesc): ...@@ -212,26 +218,29 @@ class ImageNetModel(ModelDesc):
tf.summary.scalar('learning_rate-summary', lr) tf.summary.scalar('learning_rate-summary', lr)
return tf.train.MomentumOptimizer(lr, 0.9, use_nesterov=True) return tf.train.MomentumOptimizer(lr, 0.9, use_nesterov=True)
@staticmethod def image_preprocess(self, image):
def image_preprocess(image, bgr=True):
with tf.name_scope('image_preprocess'): with tf.name_scope('image_preprocess'):
if image.dtype.base_dtype != tf.float32: if image.dtype.base_dtype != tf.float32:
image = tf.cast(image, tf.float32) image = tf.cast(image, tf.float32)
image = image * (1.0 / 255)
mean = [0.485, 0.456, 0.406] # rgb mean = [0.485, 0.456, 0.406] # rgb
std = [0.229, 0.224, 0.225] std = [0.229, 0.224, 0.225]
if bgr: if self.image_bgr:
mean = mean[::-1] mean = mean[::-1]
std = std[::-1] std = std[::-1]
image_mean = tf.constant(mean, dtype=tf.float32) image_mean = tf.constant(mean, dtype=tf.float32) * 255.
image_std = tf.constant(std, dtype=tf.float32) image_std = tf.constant(std, dtype=tf.float32) * 255.
image = (image - image_mean) / image_std image = (image - image_mean) / image_std
return image return image
@staticmethod @staticmethod
def compute_loss_and_error(logits, label): def compute_loss_and_error(logits, label, label_smoothing=0.):
loss = tf.nn.sparse_softmax_cross_entropy_with_logits(logits=logits, labels=label) if label_smoothing == 0.:
loss = tf.nn.sparse_softmax_cross_entropy_with_logits(logits=logits, labels=label)
else:
nclass = logits.shape[-1]
loss = tf.losses.softmax_cross_entropy(
tf.one_hot(label, nclass),
logits, label_smoothing=label_smoothing)
loss = tf.reduce_mean(loss, name='xentropy-loss') loss = tf.reduce_mean(loss, name='xentropy-loss')
def prediction_incorrect(logits, label, topk=1, name='incorrect_vector'): def prediction_incorrect(logits, label, topk=1, name='incorrect_vector'):
......
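To make the new `label_smoothing` option concrete, here is a minimal pure-Python sketch of what label smoothing does to the target distribution and the loss. It assumes the formula documented for `tf.losses.softmax_cross_entropy`, namely `onehot * (1 - label_smoothing) + label_smoothing / num_classes`. The helpers `smooth_one_hot` and `softmax_cross_entropy` are hypothetical stand-ins written for this illustration, not the TensorFlow implementation.

```python
import math


def smooth_one_hot(label, nclass, label_smoothing):
    """Smoothed target: onehot * (1 - eps) + eps / nclass (assumed formula)."""
    off_value = label_smoothing / nclass
    on_value = 1.0 - label_smoothing + off_value
    return [on_value if i == label else off_value for i in range(nclass)]


def softmax_cross_entropy(target, logits):
    """Cross entropy between a target distribution and softmax(logits)."""
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return -sum(t * (x - log_z) for t, x in zip(target, logits))


logits = [2.0, 0.5, -1.0, 0.0]
label = 0
smoothed = smooth_one_hot(label, 4, 0.1)
hard_loss = softmax_cross_entropy(smooth_one_hot(label, 4, 0.0), logits)
soft_loss = softmax_cross_entropy(smoothed, logits)
```

With `label_smoothing == 0.` the target is an ordinary one-hot vector, which is why the diff keeps the cheaper sparse cross-entropy op for that case.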
...@@ -16,7 +16,7 @@ from tensorpack.contrib.keras import KerasModel ...@@ -16,7 +16,7 @@ from tensorpack.contrib.keras import KerasModel
from tensorpack.callbacks import * from tensorpack.callbacks import *
from tensorflow.python.keras.layers import * from tensorflow.python.keras.layers import *
from imagenet_utils import get_imagenet_dataflow, fbresnet_augmentor, ImageNetModel from imagenet_utils import get_imagenet_dataflow, fbresnet_augmentor
TOTAL_BATCH_SIZE = 512 TOTAL_BATCH_SIZE = 512
...@@ -90,7 +90,11 @@ def resnet50(image): ...@@ -90,7 +90,11 @@ def resnet50(image):
input = Input(tensor=image) input = Input(tensor=image)
def image_preprocess(image): def image_preprocess(image):
image = ImageNetModel.image_preprocess(image) image = tf.cast(image, tf.float32)
image = image * (1.0 / 255)
mean = [0.485, 0.456, 0.406][::-1]
std = [0.229, 0.224, 0.225][::-1]
image = (image - tf.constant(mean, dtype=tf.float32)) / tf.constant(std, dtype=tf.float32)
image = tf.transpose(image, [0, 3, 1, 2]) image = tf.transpose(image, [0, 3, 1, 2])
return image return image
......
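The two `image_preprocess` variants in this commit should be numerically equivalent: the Keras example scales the image by `1.0 / 255`, while `ImageNetModel` now scales the mean and std by `255.` instead. The sketch below checks this on a single pixel in plain Python. The function names are hypothetical, and the pixel values are made up for illustration.

```python
MEAN_RGB = [0.485, 0.456, 0.406]
STD_RGB = [0.229, 0.224, 0.225]


def _stats(bgr):
    """Return (mean, std), reversed to BGR channel order if requested."""
    if bgr:
        return MEAN_RGB[::-1], STD_RGB[::-1]
    return MEAN_RGB, STD_RGB


def normalize_by_scaling_image(pixel, bgr=True):
    """Scale the pixel to [0, 1] first, then normalize."""
    mean, std = _stats(bgr)
    return [(p * (1.0 / 255) - m) / s for p, m, s in zip(pixel, mean, std)]


def normalize_by_scaling_stats(pixel, bgr=True):
    """Leave the pixel in [0, 255] and scale mean and std by 255 instead."""
    mean, std = _stats(bgr)
    return [(p - m * 255.0) / (s * 255.0) for p, m, s in zip(pixel, mean, std)]


pixel = [12.0, 128.0, 250.0]  # one made-up pixel in [0, 255]
out_image_scaled = normalize_by_scaling_image(pixel)
out_stats_scaled = normalize_by_scaling_stats(pixel)
max_diff = max(abs(a - b) for a, b in zip(out_image_scaled, out_stats_scaled))
```

Scaling the three-element mean and std constants avoids an extra elementwise multiply over the whole image tensor, which is presumably the motivation for the change.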