Commit 20a4204b authored by Yuxin Wu

update docs

parent 30ead05b
@@ -31,7 +31,7 @@ They may often be inefficient or lack functionalities you need.
 If you need anything more complicated, please
 learn what TensorFlow can do, and __do it on your own__ because Tensorpack
-is a training interface and doesn't care what happened after training.
+is a training interface and doesn't focus on what happens after training.
 ### OfflinePredictor
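For the simple cases it does cover, here is a minimal sketch of how `OfflinePredictor` is typically used (the model class, checkpoint path, and tensor names below are hypothetical placeholders for your own):

```python
import numpy as np
from tensorpack import PredictConfig, OfflinePredictor, get_model_loader

# A sketch, assuming MyModel is your own ModelDesc and that tensors
# named 'input' and 'prob' exist in the graph it builds (placeholders).
pred_config = PredictConfig(
    model=MyModel(),
    session_init=get_model_loader('train_log/checkpoint'),  # hypothetical path
    input_names=['input'],    # feed these tensors...
    output_names=['prob'])    # ...and fetch these
predictor = OfflinePredictor(pred_config)

prob, = predictor(np.zeros([1, 224, 224, 3], dtype=np.float32))
```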
@@ -107,7 +107,7 @@ It's safer to write the graph by yourself.
 ## Inference After Training: Do It Yourself
 Tensorpack is a training interface -- __it doesn't care what happens after training__.
-It already provides everything you need for inference or model diagnosis after
+During training it already provides everything you need for inference or model diagnosis after
 training:
 1. The model (the graph): you've already written it yourself with TF symbolic functions.
@@ -115,9 +115,9 @@ training:
    If you use tensorpack layers, they are mainly just wrappers around `tf.layers`.
 2. The trained parameters: tensorpack saves them in standard TF checkpoint format.
-   Nothing about it is related to tensorpack.
+   Nothing about the format is related to tensorpack.
-With the model and the weights, you can do inference with whatever approaches
+With the model and the trained parameters, you can do inference with whatever approaches
 TensorFlow supports. Usually it involves the following steps:
 ### Step 1: build the model (graph)
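Condensed into one sketch, the steps can look like this with vanilla TensorFlow (the tower function `build_graph`, tensor shapes, and checkpoint path are hypothetical placeholders):

```python
import tensorflow as tf
from tensorpack.tfutils.tower import TowerContext

# Step 1: build the model (graph) with your own symbolic function.
image = tf.placeholder(tf.float32, [None, 224, 224, 3], name='input')
with TowerContext('', is_training=False):   # needed if the graph queries the tower context
    logits = build_graph(image)             # hypothetical: your tower function

# Step 2: load the trained parameters from the standard TF checkpoint.
saver = tf.train.Saver()
with tf.Session() as sess:
    saver.restore(sess, 'train_log/model-10000')   # hypothetical path
    # Step 3: run inference with whatever TensorFlow supports.
    outputs = sess.run(logits, feed_dict={image: my_images})   # my_images: your data
```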
......
@@ -4,7 +4,7 @@
 This tutorial contains some general discussions on the topic of
 "how to read data efficiently to work with TensorFlow",
 and how tensorpack supports these methods.
-You don't have to read it because these are details under the tensorpack interface,
+As a beginner you can skip this tutorial, because these are details under the tensorpack interface,
 but knowing them could help you understand the efficiency and choose the best input pipeline for your task.
 ## Prepare Data in Parallel
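As a taste of what this section discusses, one common pattern is to run several copies of the pipeline in parallel processes (a sketch; `get_my_dataflow` is a hypothetical placeholder for your own DataFlow):

```python
from tensorpack.dataflow import BatchData, PrefetchDataZMQ

df = get_my_dataflow()               # hypothetical: your own DataFlow
df = PrefetchDataZMQ(df, nr_proc=8)  # fork 8 processes, each running a copy of df
df = BatchData(df, 128)
```

Note that `PrefetchDataZMQ` runs independent copies of the dataflow, so each copy should be properly randomized to avoid producing duplicated data.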
......
@@ -2,17 +2,18 @@
 # Symbolic Layers
 Tensorpack contains a small collection of common model primitives,
-such as conv/deconv, fc, bn, pooling layers. **You do not need to learn them.**
+such as conv/deconv, fc, bn, pooling layers.
+However, tensorpack is model-agnostic, which means
+**you can skip this tutorial and do not need to use tensorpack's symbolic layers.**
-These layers were written only because there were no alternatives when
-tensorpack was first developed.
+These layers were written only because there were no alternatives when tensorpack was first developed.
 Nowadays, these implementations actually call `tf.layers` directly.
 __Tensorpack will not add any more layers__ into its core library because this is
 not the focus of tensorpack, and there are many other alternative symbolic
 libraries today.
 Today, you can just use `tf.layers` or any other symbolic libraries inside tensorpack.
-Using the tensorpack implementations, you can also benefit from `argscope` and `LinearWrap` to
+If you use the tensorpack implementations, you can also benefit from `argscope` and `LinearWrap` to
 simplify the code.
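For illustration, a sketch of what `argscope` and `LinearWrap` buy you (this assumes the `tf.layers`-style argument names used by recent tensorpack versions; `image` is a placeholder tensor):

```python
import tensorflow as tf
from tensorpack import argscope, LinearWrap, Conv2D, MaxPooling, FullyConnected

image = tf.placeholder(tf.float32, [None, 32, 32, 3])
# argscope sets default arguments for the listed layers;
# LinearWrap chains layer calls on a tensor.
with argscope(Conv2D, kernel_size=3, activation=tf.nn.relu):
    logits = (LinearWrap(image)
              .Conv2D('conv1', filters=32)
              .MaxPooling('pool1', 2)
              .Conv2D('conv2', filters=64)
              .FullyConnected('fc', 10)())
```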
 Note that to keep backward compatibility of code and pre-trained models, tensorpack layers
......
@@ -52,7 +52,8 @@ The tower function needs to follow some rules:
    On the other hand, for a non-trainable variable, it may be desirable to not reuse it between towers.
    In this case, `tf.Variable` can be used to ensure creation of new variables in each tower even when `reuse=True`.
  * Do not modify the reuse option (e.g., by `scope.reuse_variables()`) of a variable
-   scope that is not created by you. This affects other's code.
+   scope that is not created by you. This affects others' code. You can always
+   open new scopes if you need the reuse option.
 4. It cannot create scopes or variables containing the name 'tower', as it is
    reserved for special use.
@@ -66,9 +67,10 @@ follow these conventions and will need some workarounds if used within tensorpack.
 [TowerContext](../modules/tfutils.html#tensorpack.tfutils.tower.BaseTowerContext),
 which can be accessed by [get_current_tower_context()](../modules/tfutils.html#tensorpack.tfutils.tower.get_current_tower_context).
 The context contains information about training/inference mode, scope name, etc.
+You can use the context to build a different graph under different modes.
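For example (a sketch; the dropout decision is just an illustration):

```python
import tensorflow as tf
from tensorpack.tfutils.tower import get_current_tower_context

def tower_func(image):
    ctx = get_current_tower_context()
    # Build a different graph depending on training/inference mode:
    keep_prob = 0.5 if ctx.is_training else 1.0
    return tf.nn.dropout(image, keep_prob)
```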
-### MultiGPU Trainers
+### Multi-GPU Trainers
 For data-parallel multi-GPU training, different [multi-GPU trainers](../modules/train.html)
 implement different distribution strategies.
@@ -79,7 +81,7 @@ It takes only one line of code change to use them, e.g. `trainer=SyncMultiGPUTra
 Note some __common problems__ when using these trainers:
-1. In each iteration, instead of taking one tensor for all GPUs and split,
+1. In each iteration, instead of taking one input tensor for all GPUs and splitting it,
    all GPUs take tensors from the `InputSource`.
    So the total batch size across all GPUs would become ``(batch size of InputSource) * #GPU``.
@@ -90,7 +92,7 @@ Note some __common problems__ when using these trainers:
    makes no sense at all.
    First, it wastes time doing the split because typically data is first concatenated by the user.
    Second, this puts unnecessary shape constraints on the data, because the
-   inputs on each GPU needs to have consistent shapes.
+   inputs on each GPU need to have compatible shapes.
    ```
 2. The tower function (your model code) will get called once on each GPU.
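To make the batch-size rule in problem 1 concrete, a sketch (assuming `config` is your `TrainConfig` whose `InputSource`/dataflow yields batches of 64):

```python
from tensorpack import SyncMultiGPUTrainerReplicated, launch_train_with_config

# With 4 GPUs, each GPU takes its own batch of 64 from the InputSource,
# so the effective batch size per iteration is 64 * 4 = 256.
launch_train_with_config(config, SyncMultiGPUTrainerReplicated(4))
```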
......
@@ -29,9 +29,12 @@ def fbresnet_augmentor(isTrain):
     """
     Augmentor used in fb.resnet.torch, for BGR images in range [0,255].
     """
+    interpolation = cv2.INTER_CUBIC
+    # linear seems to have more stable performance.
+    # but we keep cubic for compatibility with old models
     if isTrain:
         augmentors = [
-            imgaug.GoogleNetRandomCropAndResize(),
+            imgaug.GoogleNetRandomCropAndResize(interp=interpolation),
             # It's OK to remove the following augs if your CPU is not fast enough.
             # Removing brightness/contrast/saturation does not have a significant effect on accuracy.
             # Removing lighting leads to a tiny drop in accuracy.
@@ -53,7 +56,7 @@ def fbresnet_augmentor(isTrain):
         ]
     else:
         augmentors = [
-            imgaug.ResizeShortestEdge(256, cv2.INTER_CUBIC),
+            imgaug.ResizeShortestEdge(256, interp=interpolation),
             imgaug.CenterCrop((224, 224)),
         ]
     return augmentors
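For reference, such an augmentor list is typically consumed like this (a sketch; `ds` is a hypothetical dataflow yielding `[image, label]` datapoints):

```python
from tensorpack.dataflow import AugmentImageComponent

# Apply the augmentors to component 0 (the image) of each datapoint.
ds = AugmentImageComponent(ds, fbresnet_augmentor(isTrain=True), copy=False)
```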
......
@@ -22,7 +22,11 @@ if __name__ == '__main__':
     # this script does not need GPU
     os.environ['CUDA_VISIBLE_DEVICES'] = ''
-    tf.train.import_meta_graph(args.meta, clear_devices=True)
+    try:
+        tf.train.import_meta_graph(args.meta, clear_devices=True)
+    except KeyError:
+        print("If your graph contains non-standard ops, you need to import the relevant library first.")
+        raise
     # loading...
     if args.input.endswith('.npz'):
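The `KeyError` here is what `import_meta_graph` raises when the serialized graph references an op type not registered in the current process. The fix the message suggests is to import, before this call, whichever library registers those ops (a hedged sketch; which library depends on your graph):

```python
# Hypothetical workaround: import the library that registers the custom ops
# used by the graph, so import_meta_graph can resolve them.
import tensorpack  # example only; substitute the library your graph needs
import tensorflow as tf

tf.train.import_meta_graph('model.meta', clear_devices=True)  # hypothetical path
```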
......
@@ -98,7 +98,7 @@ class GoogleNetRandomCropAndResize(ImageAugmentor):
     """
     def __init__(self, crop_area_fraction=(0.08, 1.),
                  aspect_ratio_range=(0.75, 1.333),
-                 target_shape=224, interp=cv2.INTER_CUBIC):
+                 target_shape=224, interp=cv2.INTER_LINEAR):
         """
         Args:
             crop_area_fraction (tuple(float)): Defaults to crop 8%-100% area.
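Since this changes the default, callers who need the old cubic interpolation (e.g. to reproduce models trained with the previous default) can pass it explicitly, exactly as the updated `fbresnet_augmentor` above does:

```python
import cv2
from tensorpack.dataflow import imgaug

# Restore the previous default explicitly:
aug = imgaug.GoogleNetRandomCropAndResize(interp=cv2.INTER_CUBIC)
```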
......
@@ -96,24 +96,27 @@ class MultiThreadMapData(_ParallelMapData):
     This is useful when the mapping function is the bottleneck, but you don't
     want to start processes for the entire dataflow pipeline.
+    The semantics of this class is __identical__ to :class:`MapData` except for the ordering.
+    Threads run in parallel and can take different time to run the
+    mapping function. Therefore the order of datapoints won't be preserved.
+    When `strict=True`, `MultiThreadMapData(df, ...)`
+    is guaranteed to produce the exact set of data as `MapData(df, ...)`,
+    if both are iterated until `StopIteration`. But the produced data will have different ordering.
+    The behavior of strict mode is undefined if the given dataflow `df` is infinite.
+    When `strict=False`, the data that's produced by `MultiThreadMapData(df, ...)`
+    is a reordering of the data produced by `RepeatedData(MapData(df, ...), -1)`.
+    In other words, the first pass of `MultiThreadMapData.__iter__` may contain
+    datapoints from the second pass of `df.__iter__`.
     Note:
-        1. There is tiny communication overhead with threads, but you
-           should avoid starting many threads in your main process to reduce GIL contention.
+        1. You should avoid starting many threads in your main process to reduce GIL contention.
           The threads will only start in the process which calls :meth:`reset_state()`.
           Therefore you can use ``PrefetchDataZMQ(MultiThreadMapData(...), 1)``
           to reduce GIL contention.
-        2. Threads run in parallel and can take different time to run the
-           mapping function. Therefore the order of datapoints won't be
-           preserved, and datapoints from one pass of `df.__iter__()` might get
-           mixed with datapoints from the next pass.
-           You can use **strict mode**, where `MultiThreadMapData.__iter__()`
-           is guaranteed to produce the exact set which `df.__iter__()`
-           produces. Although the order of data still isn't preserved.
-           The behavior of strict mode is undefined if the dataflow is infinite.
     """
     class _Worker(StoppableThread):
         def __init__(self, inq, outq, evt, map_func):
@@ -209,17 +212,19 @@ class MultiProcessMapDataZMQ(_ParallelMapData, _MultiProcessZMQDataFlow):
     Same as :class:`MapData`, but start processes to run the mapping function,
     and communicate with ZeroMQ pipe.
-    Note:
-        1. Processes run in parallel and can take different time to run the
-           mapping function. Therefore the order of datapoints won't be
-           preserved, and datapoints from one pass of `df.__iter__()` might get
-           mixed with datapoints from the next pass.
-           You can use **strict mode**, where `MultiProcessMapData.__iter__()`
-           is guaranteed to produce the exact set which `df.__iter__()`
-           produces. Although the order of data still isn't preserved.
-           The behavior of strict mode is undefined if the dataflow is infinite.
+    The semantics of this class is __identical__ to :class:`MapData` except for the ordering.
+    Processes run in parallel and can take different time to run the
+    mapping function. Therefore the order of datapoints won't be preserved.
+    When `strict=True`, `MultiProcessMapData(df, ...)`
+    is guaranteed to produce the exact set of data as `MapData(df, ...)`,
+    if both are iterated until `StopIteration`. But the produced data will have different ordering.
+    The behavior of strict mode is undefined if the given dataflow `df` is infinite.
+    When `strict=False`, the data that's produced by `MultiProcessMapData(df, ...)`
+    is a reordering of the data produced by `RepeatedData(MapData(df, ...), -1)`.
+    In other words, the first pass of `MultiProcessMapData.__iter__` may contain
+    datapoints from the second pass of `df.__iter__`.
     """
     class _Worker(mp.Process):
         def __init__(self, identity, map_func, pipename, hwm):
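Following the note in the docstring above, a typical usage sketch (`load_and_decode` is a hypothetical mapping function; `df` is your source DataFlow):

```python
from tensorpack.dataflow import MultiThreadMapData, PrefetchDataZMQ

# Run the expensive mapping function in 16 threads. strict=True keeps the
# exact set of datapoints per epoch, though not their order.
df = MultiThreadMapData(df, nr_thread=16, map_func=load_and_decode,
                        buffer_size=1000, strict=True)
# Start those threads inside one forked process to reduce GIL contention
# in the main process.
df = PrefetchDataZMQ(df, nr_proc=1)
```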
......
@@ -182,8 +182,8 @@ class AutoResumeTrainConfig(TrainConfig):
     Note that the functionality requires the logging directory to obtain
     necessary information from a previous run.
     In some cases (e.g. when using Horovod), the directory is not
-    available or different for different workers and this class may not function
-    properly.
+    available, or the directories are different for different workers;
+    in that case this class may not function properly.
     """
     def __init__(self, always_resume=True, **kwargs):
         """
@@ -196,9 +196,9 @@ class AutoResumeTrainConfig(TrainConfig):
         Note:
             The main goal of this class is to let a training job resume
             without changing any line of code or command line arguments.
-            So it's useful to let resume take priority over user-provided arguments sometimes:
-            If your training starts from a pre-trained model,
+            So it's useful to let resume take priority over user-provided arguments sometimes.
+            For example: if your training starts from a pre-trained model,
             you would want it to use the user-provided model loader at the
             beginning, but a "resume" model loader when the job was
             interrupted and restarted.
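A sketch of that scenario (model, dataflow, and the pre-trained file are hypothetical placeholders):

```python
from tensorpack import (AutoResumeTrainConfig, SimpleTrainer,
                        get_model_loader, launch_train_with_config)

config = AutoResumeTrainConfig(
    always_resume=True,
    model=MyModel(),   # hypothetical ModelDesc
    dataflow=df,       # hypothetical DataFlow
    # Used on a fresh start; on restart after an interruption the
    # "resume" loader for the logging directory takes priority.
    session_init=get_model_loader('pretrained.npz'),
    max_epoch=100)
launch_train_with_config(config, SimpleTrainer())
```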
......