Commit 20a4204b authored by Yuxin Wu

update docs

parent 30ead05b
......@@ -31,7 +31,7 @@ They may often be inefficient or lack functionalities you need.
If you need anything more complicated, please
learn what TensorFlow can do, and __do it on your own__ because Tensorpack
is a training interface and doesn't focus on what happened after training.
### OfflinePredictor
......@@ -107,7 +107,7 @@ It's safer to write the graph by yourself.
## Inference After Training: Do It Yourself
Tensorpack is a training interface -- __it doesn't care what happened after training__.
During training, it already provides everything you need for inference or model diagnosis after
training:
1. The model (the graph): you've already written it yourself with TF symbolic functions.
......@@ -115,9 +115,9 @@ training:
If you use tensorpack layers, they are mainly just wrappers around `tf.layers`.
2. The trained parameters: tensorpack saves them in standard TF checkpoint format.
Nothing about the format is related to tensorpack.
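As an illustration that the saved parameters are framework-standard data: once a checkpoint is converted to `.npz` (tensorpack ships a dump script for this), the parameters are just a name-to-array mapping. A toy, numpy-only sketch that fabricates such a file in memory (the variable name `conv1/W` is made up):

```python
import io

import numpy as np

# Fabricate an in-memory .npz "parameter dump" instead of reading a real one.
buf = io.BytesIO()
np.savez(buf, **{'conv1/W': np.zeros((3, 3, 3, 64), np.float32)})
buf.seek(0)

# Loading it back is plain numpy: a dict from variable name to array.
with np.load(buf) as f:
    params = {name: f[name] for name in f.files}

print(sorted(params))             # ['conv1/W']
print(params['conv1/W'].shape)    # (3, 3, 3, 64)
```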
With the model and the trained parameters, you can do inference with whatever approaches
TensorFlow supports. Usually it involves the following steps:
### Step 1: build the model (graph)
......
......@@ -4,7 +4,7 @@
This tutorial contains some general discussions on the topic of
"how to read data efficiently to work with TensorFlow",
and how tensorpack supports these methods.
As a beginner you can skip this tutorial, because these are details under the tensorpack interface,
but knowing them could help you understand the efficiency and choose the best input pipeline for your task.
## Prepare Data in Parallel
......
......@@ -2,17 +2,18 @@
# Symbolic Layers
Tensorpack contains a small collection of common model primitives,
such as conv/deconv, fc, bn, pooling layers.
However, tensorpack is model-agnostic, which means
**you can skip this tutorial and do not need to use tensorpack's symbolic layers.**
These layers were written only because there were no alternatives when tensorpack was first developed.
Nowadays, these implementations actually call `tf.layers` directly.
__Tensorpack will not add any more layers__ into its core library because this is
not the focus of tensorpack, and there are many other alternative symbolic
libraries today.
Today, you can just use `tf.layers` or any other symbolic libraries inside tensorpack.
If you use the tensorpack implementations, you can also benefit from `argscope` and `LinearWrap` to
simplify the code.
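To see why a construct like `argscope` helps, here is a toy, stdlib-only sketch of the *idea* (a context manager that installs default keyword arguments for a layer function). This is not tensorpack's actual implementation, and `conv2d` below is a stand-in, not a real layer:

```python
import contextlib

_ARG_DEFAULTS = {}  # per-function default kwargs installed by argscope


@contextlib.contextmanager
def argscope(func_name, **defaults):
    """Install default kwargs for `func_name` inside the `with` block."""
    previous = _ARG_DEFAULTS.get(func_name)
    _ARG_DEFAULTS[func_name] = defaults
    try:
        yield
    finally:
        if previous is None:
            _ARG_DEFAULTS.pop(func_name, None)
        else:
            _ARG_DEFAULTS[func_name] = previous


def conv2d(name, channels, **kwargs):
    """Stand-in layer: merge scope defaults with explicit arguments."""
    opts = {**_ARG_DEFAULTS.get('conv2d', {}), **kwargs}
    return (name, channels, opts)


with argscope('conv2d', kernel_size=3, activation='relu'):
    layer = conv2d('conv1', 64)                    # inherits the defaults
    custom = conv2d('conv2', 128, kernel_size=1)   # explicit argument wins

print(layer)   # ('conv1', 64, {'kernel_size': 3, 'activation': 'relu'})
print(custom)  # ('conv2', 128, {'kernel_size': 1, 'activation': 'relu'})
```

The real `argscope` does the same bookkeeping for tensorpack layer functions, so repeated arguments (e.g. a shared kernel size or activation) are written once per scope instead of once per layer.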
Note that to keep backward compatibility of code and pre-trained models, tensorpack layers
......
......@@ -52,7 +52,8 @@ The tower function needs to follow some rules:
On the other hand, for a non-trainable variable, it may be desirable to not reuse it between towers.
In this case, `tf.Variable` can be used to ensure creation of new variables in each tower even when `reuse=True`.
* Do not modify the reuse option (e.g., by `scope.reuse_variables()`) of a variable
scope that is not created by you. This affects others' code. You can always
open new scopes if you need the reuse option.
4. It cannot create scopes or variables containing the name 'tower', as it is
reserved for special use.
......@@ -66,9 +67,10 @@ follow these conventions and will need some workarounds if used within tensorpac
[TowerContext](../modules/tfutils.html#tensorpack.tfutils.tower.BaseTowerContext),
which can be accessed by [get_current_tower_context()](../modules/tfutils.html#tensorpack.tfutils.tower.get_current_tower_context).
The context contains information about training/inference mode, scope name, etc.
You can use the context to build a different graph under different modes.
### Multi-GPU Trainers
For data-parallel multi-GPU training, different [multi-GPU trainers](../modules/train.html)
implement different distribution strategies.
......@@ -79,7 +81,7 @@ It takes only one line of code change to use them, e.g. `trainer=SyncMultiGPUTra
Note some __common problems__ when using these trainers:
1. In each iteration, instead of taking one input tensor for all GPUs and splitting it,
all GPUs take tensors from the `InputSource`.
So the total batch size across all GPUs would become ``(batch size of InputSource) * #GPU``.
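To make the arithmetic concrete, a tiny sketch with made-up numbers:

```python
# Hypothetical numbers: each GPU pulls its own batch from the InputSource,
# so the effective batch size grows with the number of GPUs.
batch_of_input_source = 32
num_gpus = 4

total_batch = batch_of_input_source * num_gpus
print(total_batch)  # 128
```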
......@@ -90,7 +92,7 @@ Note some __common problems__ when using these trainers:
makes no sense at all.
First, it wastes time doing the split because typically data is first concatenated by the user.
Second, it puts unnecessary shape constraints on the data: the
inputs on each GPU need to have compatible shapes.
```
2. The tower function (your model code) will get called once on each GPU.
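A plain-Python sketch of this point (no TF involved): because the trainer invokes your tower function once per GPU, any Python-level side effects inside it also run once per GPU.

```python
calls = []

def tower_func():       # stand-in for your model-building function
    calls.append(1)     # a Python-level side effect
    return "tower graph"

num_gpus = 2
# The trainer calls the tower function once per GPU (sketched as a loop here).
towers = [tower_func() for _ in range(num_gpus)]

print(len(calls))  # 2
```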
......
......@@ -29,9 +29,12 @@ def fbresnet_augmentor(isTrain):
"""
Augmentor used in fb.resnet.torch, for BGR images in range [0,255].
"""
interpolation = cv2.INTER_CUBIC
# linear seems to have more stable performance.
# but we keep cubic for compatibility with old models
if isTrain:
augmentors = [
imgaug.GoogleNetRandomCropAndResize(),
imgaug.GoogleNetRandomCropAndResize(interp=interpolation),
# It's OK to remove the following augs if your CPU is not fast enough.
# Removing brightness/contrast/saturation does not have a significant effect on accuracy.
# Removing lighting leads to a tiny drop in accuracy.
......@@ -53,7 +56,7 @@ def fbresnet_augmentor(isTrain):
        ]
    else:
        augmentors = [
            imgaug.ResizeShortestEdge(256, interp=interpolation),
            imgaug.CenterCrop((224, 224)),
        ]
    return augmentors
......
......@@ -22,7 +22,11 @@ if __name__ == '__main__':
    # this script does not need GPU
    os.environ['CUDA_VISIBLE_DEVICES'] = ''
    try:
        tf.train.import_meta_graph(args.meta, clear_devices=True)
    except KeyError:
        print("If your graph contains non-standard ops, you need to import the relevant library first.")
        raise
    # loading...
    if args.input.endswith('.npz'):
......
......@@ -98,7 +98,7 @@ class GoogleNetRandomCropAndResize(ImageAugmentor):
"""
def __init__(self, crop_area_fraction=(0.08, 1.),
aspect_ratio_range=(0.75, 1.333),
target_shape=224, interp=cv2.INTER_LINEAR):
"""
Args:
crop_area_fraction (tuple(float)): Defaults to crop 8%-100% area.
......
......@@ -96,24 +96,27 @@ class MultiThreadMapData(_ParallelMapData):
This is useful when the mapping function is the bottleneck, but you don't
want to start processes for the entire dataflow pipeline.
The semantics of this class is __identical__ to :class:`MapData` except for the ordering.
Threads run in parallel and can take different time to run the
mapping function. Therefore the order of datapoints won't be preserved.
When `strict=True`, `MultiThreadMapData(df, ...)`
is guaranteed to produce the exact set of data as `MapData(df, ...)`,
if both are iterated until `StopIteration`. But the produced data will have different ordering.
The behavior of strict mode is undefined if the given dataflow `df` is infinite.
When `strict=False`, the data that's produced by `MultiThreadMapData(df, ...)`
is a reordering of the data produced by `RepeatedData(MapData(df, ...), -1)`.
In other words, the first pass of `MultiThreadMapData.__iter__` may contain
datapoints from the second pass of `df.__iter__`.
Note:
1. You should avoid starting many threads in your main process to reduce GIL contention.
The threads will only start in the process which calls :meth:`reset_state()`.
Therefore you can use ``PrefetchDataZMQ(MultiThreadMapData(...), 1)``
to reduce GIL contention.
"""
class _Worker(StoppableThread):
def __init__(self, inq, outq, evt, map_func):
......@@ -209,17 +212,19 @@ class MultiProcessMapDataZMQ(_ParallelMapData, _MultiProcessZMQDataFlow):
Same as :class:`MapData`, but start processes to run the mapping function,
and communicate with ZeroMQ pipe.
The semantics of this class is __identical__ to :class:`MapData` except for the ordering.
Processes run in parallel and can take different time to run the
mapping function. Therefore the order of datapoints won't be preserved.
When `strict=True`, `MultiProcessMapData(df, ...)`
is guaranteed to produce the exact set of data as `MapData(df, ...)`,
if both are iterated until `StopIteration`. But the produced data will have different ordering.
The behavior of strict mode is undefined if the given dataflow `df` is infinite.
When `strict=False`, the data that's produced by `MultiProcessMapData(df, ...)`
is a reordering of the data produced by `RepeatedData(MapData(df, ...), -1)`.
In other words, the first pass of `MultiProcessMapData.__iter__` may contain
datapoints from the second pass of `df.__iter__`.
"""
class _Worker(mp.Process):
def __init__(self, identity, map_func, pipename, hwm):
......
......@@ -182,8 +182,8 @@ class AutoResumeTrainConfig(TrainConfig):
Note that the functionality requires the logging directory to obtain
necessary information from a previous run.
In some cases (e.g. when using Horovod), the directory is not
available, or the directories are different for different workers,
then this class may not function properly.
"""
def __init__(self, always_resume=True, **kwargs):
"""
......@@ -196,9 +196,9 @@ class AutoResumeTrainConfig(TrainConfig):
Note:
The main goal of this class is to let a training job resume
without changing any line of code or command line arguments.
So it's useful to let resume take priority over user-provided arguments sometimes.
For example: if your training starts from a pre-trained model,
you would want it to use the user-provided model loader at the
beginning, but a "resume" model loader when the job was
interrupted and restarted.
......