Commit 59801ff8 authored by Yuxin Wu

docs update

parent ac2fa1bc
......@@ -41,7 +41,7 @@ It's Yet Another TF wrapper, but different in:
+ Data-Parallel Multi-GPU training is off-the-shelf to use. It is as fast as Google's [benchmark code](https://github.com/tensorflow/benchmarks).
3. Focus on large datasets.
+ __DataFlow__ allows you to process large datasets such as ImageNet in pure Python without blocking the training.
+ It's painful to read/preprocess data from TF. Use __DataFlow__ to process large datasets such as ImageNet in pure Python.
+ DataFlow has a unified interface, so you can compose and reuse DataFlows to perform complex preprocessing.
4. Interface of extensible __Callbacks__.
......
......@@ -14,13 +14,13 @@ a numpy array of shape (64, 28, 28), and an array of shape (64,).
### Composition of DataFlow
One good thing about having a standard interface is the code reusability it enables.
There are a lot of existing modules in tensorpack which you can use to compose
complex DataFlow instances with a long pre-processing pipeline. A whole pipeline usually
There are a lot of existing modules in tensorpack, which you can use to compose
complex DataFlow with a long pre-processing pipeline. A whole pipeline usually
would __read from disk (or other sources), apply augmentations, group into batches,
prefetch data__, etc. A simple example is as follows:
```python
# a DataFlow you implement to produce [image,label] pairs from whatever sources:
# a DataFlow you implement to produce [tensor1, tensor2, ..] lists from whatever sources:
df = MyDataFlow(shuffle=True)
# resize the image component of each datapoint
df = AugmentImageComponent(df, [imgaug.Resize((225, 225))])
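# (illustrative continuation, not part of the original snippet: the usual next
#  steps are batching and prefetching with the existing tensorpack modules)
# group the datapoints into batches of 128:
df = BatchData(df, 128)
# run the pipeline in 3 parallel processes and prefetch the results:
df = PrefetchDataZMQ(df, 3)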
......@@ -35,8 +35,6 @@ with all the data preprocessing.
All these modules are written in Python,
so you can easily implement whatever operations/transformations you need,
without worrying about adding operators to TensorFlow.
In the meantime, thanks to the prefetching, it can still run fast enough for
tasks as large as ImageNet training.
Unless you are working with standard data types (image folders, LMDB, etc),
you would usually want to write your own DataFlow.
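Either way, a pure-Python transformation can be attached to any existing DataFlow with `MapData` (a minimal sketch; the normalization is only an illustration):

```python
import numpy as np
from tensorpack.dataflow import MapData

def normalize(dp):
    # dp is one datapoint, e.g. [image, label]; arbitrary Python/numpy code works here
    img, label = dp
    return [(img.astype(np.float32) - 128.0) / 128.0, label]

df = MapData(df, normalize)   # df is any existing DataFlow
```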
......@@ -70,3 +68,4 @@ training: we only need data to be __fast enough__.
DataFlow is fast enough for problems up to the scale of multi-GPU ImageNet training.
See [efficient dataflow tutorial](http://tensorpack.readthedocs.io/en/latest/tutorial/efficient-dataflow.html)
for details.
Therefore, for most use cases, writing format conversion/preprocessing code with TensorFlow operators doesn't help you at all.
......@@ -11,7 +11,7 @@ If you have such a mapping function `f` already, you can simply use `imgaug.MapI
augmentor, or use `MapDataComponent(df, f, index)` as the DataFlow.
In other words, for simple mapping you do not need to write an augmentor.
An augmentor does something more than applying the mapping. The interface you will need to implement
An augmentor may do something more than applying a mapping. The interface you will need to implement
is:
```python
......@@ -28,8 +28,8 @@ It does the following extra things for you:
1. `self.rng` is a `np.random.RandomState` object,
guaranteed to have different seeds when you use multiprocess prefetch.
In multiprocess settings, you will always need it to generate random numbers.
In multiprocess settings, you have to use it to generate random numbers.
2. Random parameters and the actual augmentation is separated. This allows you to apply the
same random transformation to several images (with `AugmentImageComponents`),
2. Random parameter generation and the actual augmentation are separated. This allows you to apply the
same transformation to several images together (with `AugmentImageComponents`),
which is essential to tasks such as segmentation.
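For instance, separating the parameter generation from the application might look like this (a hedged sketch, assuming the usual `_get_augment_params` / `_augment` methods of `imgaug.ImageAugmentor`; tensorpack already ships similar brightness augmentors, this one is only for illustration):

```python
import numpy as np
from tensorpack.dataflow import imgaug

class MyBrightness(imgaug.ImageAugmentor):
    """ Add a random offset to the image (illustration only). """
    def __init__(self, max_delta=20):
        super(MyBrightness, self).__init__()
        self._init(locals())    # store the arguments as attributes

    def _get_augment_params(self, img):
        # draw the random parameter with self.rng, so each prefetch process differs
        return self.rng.uniform(-self.max_delta, self.max_delta)

    def _augment(self, img, delta):
        # deterministic given the parameter, so the same transform can be reused
        return np.clip(img.astype(np.float32) + delta, 0, 255).astype('uint8')
```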
......@@ -2,12 +2,8 @@
### Write a DataFlow
There are several existing DataFlow, e.g. ImageFromFile, DataFromList, which you can
use to read images or load data from a list.
However, in general, you will probably need to write a new DataFlow to produce data for your task.
DataFlow implementations for several well-known datasets are provided in the
[dataflow.dataset](http://tensorpack.readthedocs.io/en/latest/modules/tensorpack.dataflow.dataset.html)
module, you can take them as a reference.
use if your data format is simple.
However, in general, you will probably need to write a new DataFlow to produce data for your task.
Usually, you just need to implement the `get_data()` method, which yields one datapoint at a time.
```python
......@@ -21,12 +17,17 @@ class MyDataFlow(DataFlow):
Optionally, DataFlow can implement the following two methods:
+ `size()`. Return the number of elements the generator can produce. Certain modules might require this.
+ `size()`. Return the number of elements the generator can produce. Certain tensorpack features might require this.
+ `reset_state()`. It is guaranteed that the actual process which runs a DataFlow will invoke this method before using it.
So if this DataFlow needs to something after a `fork()`, you should put it here.
So if this DataFlow needs to do something after a `fork()`, you should put it here.
A typical situation is when your DataFlow uses a random number generator (RNG). Then you would need to reset the RNG here.
Otherwise, child processes will have the same random seed. The `RNGDataFlow` class does this already.
Otherwise, child processes will have the same random seed. The `RNGDataFlow` base class does this for you.
With a "low-level" DataFlow defined, you can then compose it with existing modules (e.g. batching, prefetching, ...).
DataFlow implementations for several well-known datasets are provided in the
[dataflow.dataset](http://tensorpack.readthedocs.io/en/latest/modules/tensorpack.dataflow.dataset.html)
module; you can take them as a reference.
With a "low-level" DataFlow defined, you can then compose it with existing modules.
......@@ -2,9 +2,9 @@
## Implement a layer
Symbolic functions should be nothing new to you.
Using symbolic functions is not special in tensorpack: you can use any symbolic functions you have
made or seen elsewhere with tensorpack layers.
You can use symbolic functions from slim/tflearn/tensorlayer, and even Keras ([with some tricks](../../examples/mnist-keras.py)).
Using symbolic functions in tensorpack is the same as in TensorFlow: you can use any symbolic functions you have
made or seen elsewhere together with tensorpack layers.
You can use symbolic functions from slim/tflearn/tensorlayer, and even Keras/sonnet ([with some tricks](../../examples/mnist-keras.py)).
So you never **have to** implement a tensorpack layer.
If you would like, you can make a symbolic function become a "layer" by following some simple rules, and then gain benefits from the framework.
......@@ -19,7 +19,7 @@ def Conv2D(x, out_channel, kernel_shape,
           nl=tf.nn.relu, split=1, use_bias=True):
```
Basically, a layer is a symbolic function with the following rules:
Basically, a tensorpack layer is just a symbolic function, but with the following rules:
+ It is decorated by `@layer_register`.
+ The first argument is its "input". It must be a **tensor or a list of tensors**.
......@@ -31,7 +31,7 @@ By making a symbolic function a "layer", the following things will happen:
Everything happening in this function will be under the variable scope 'conv0'.
You can register the layer with `use_scope=False` to disable this feature.
+ Static shapes of input/output will be printed to screen.
+ `argscope` will then work for all its arguments except the input tensor(s).
+ `argscope` will work for all its arguments except the input tensor(s).
+ It will work with `LinearWrap`: you can use it if the output of one layer matches the input of the next layer.
There are also some (non-layer) symbolic functions in the `tfutils.symbolic_functions` module.
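To make the rules above concrete, a minimal registered layer might look like this (a sketch; the scaling layer itself is only for illustration):

```python
import tensorflow as tf
from tensorpack import layer_register

@layer_register()
def MyScale(x, init_gamma=1.0):
    # the first argument is the input tensor; the variable below is created
    # under the layer's variable scope (the name you pass when calling it)
    gamma = tf.get_variable('gamma', shape=[],
                            initializer=tf.constant_initializer(init_gamma))
    return tf.multiply(x, gamma, name='output')
```

It can then be called like the built-in layers, e.g. `MyScale('scale1', some_tensor)`, and works with `argscope` and `LinearWrap` as described above.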
......
......@@ -5,13 +5,14 @@
The library tries to __support__ everything, but it could not really __include__ everything.
For your XYZ, you can either implement them or use any existing python code and wrap it
with tensorpack interface. See [Extend Tensorpack](http://tensorpack.readthedocs.io/en/latest/tutorial/index.html#extend-tensorpack)
The interface tries to be flexible enough so you can put any XYZ on it.
You can either implement your XYZ under the interface or simply wrap some existing Python code.
See [Extend Tensorpack](http://tensorpack.readthedocs.io/en/latest/tutorial/index.html#extend-tensorpack)
for more details.
If you think:
1. The framework has a limitation in its interface so your XYZ cannot be supported, OR
2. Your XYZ is very common, or very well-defined, so it would be nice to include it.
2. Your XYZ is very common / very well-defined, so it would be nice to include it.
Then it is a good time to open an issue.
......@@ -34,8 +35,8 @@ The script expects a metagraph file which is also saved by `ModelSaver`.
All model loading (in either training or testing) is through the `session_init` option
in `TrainConfig` or `PredictConfig`.
It accepts a `SessionInit` instance, where the common options are `SaverRestore` which restores
TF checkpoint, or `DictRestore` which restores a dict. `get_model_loader` is a small helper to
decide which one to use from a file name.
TF checkpoint, or `DictRestore` which restores a dict. (`get_model_loader` is a small helper to
decide which one to use from a file name.)
Doing transfer learning is straightforward. Variable restoring is completely based on name match between
the current graph and the `SessionInit` initializer.
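For example (a sketch with hypothetical file names):

```python
from tensorpack import SaverRestore, DictRestore, get_model_loader

# either restore a TF checkpoint saved by ModelSaver ...
init = SaverRestore('train_log/run1/model-10000')
# ... or let the helper pick SaverRestore/DictRestore from the file name
init = get_model_loader('alexnet.npy')
# then pass it to TrainConfig(..., session_init=init) or PredictConfig(..., session_init=init)
```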
......
......@@ -2,7 +2,7 @@
# Input Sources
This tutorial covers how data goes from DataFlow or other sources to TensorFlow graph.
You don't have to know it, but it may help with efficiency.
You don't have to read this, because these are details hidden under the tensorpack interface, but knowing them could help you understand the efficiency.
`InputSource` is an abstract interface in tensorpack describing where the inputs come from and how they enter the graph.
For example,
......@@ -18,7 +18,7 @@ to customize your `InputSource`.
## Use Prefetch
In general, `feed_dict` is slow and should never appear in your critical loop.
In general, `feed_dict` is slow and should never appear in training loops.
i.e., when you use TensorFlow without any wrappers, you should avoid loops like this:
```python
while True:
......@@ -26,9 +26,9 @@ while True:
    minimize_op.run(feed_dict={'X': X, 'y': y})
```
However, when you need to load data from the Python side, this is the only interface available in frameworks such as Keras or tflearn.
This is part of the reason why [tensorpack is faster](https://gist.github.com/ppwwyyxx/8d95da79f8d97036a7d67c2416c851b6) than examples from other frameworks.
This is part of the reason why [tensorpack is faster](https://gist.github.com/ppwwyyxx/8d95da79f8d97036a7d67c2416c851b6).
You should use something like this instead, to prefetch data into the graph in one thread and hide the copy latency:
You could use something like this instead, to prefetch data into the graph in one thread and hide the copy latency:
```python
# Thread 1:
while True:
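    # (illustrative continuation; 'get_some_data' and 'enqueue_op' are hypothetical names)
    X, y = get_some_data()                      # read data in pure Python
    enqueue_op.run(feed_dict={'X': X, 'y': y})  # push it into a TensorFlow queue

# Thread 2:
while True:
    minimize_op.run()   # the training op dequeues its input from the queue; no feed_dict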
......
......@@ -18,7 +18,7 @@ class MyModel(ModelDesc):
Basically, `_get_inputs` should define the metainfo of all the possible placeholders your graph may need.
`_build_graph` should add tensors/operations to the graph, where
the argument `input_tensors` is the list of input tensors matching `_get_inputs`.
the argument `inputs` is the list of input tensors matching `_get_inputs`.
You can use any symbolic functions in `_build_graph`, including TensorFlow core library
functions and other symbolic libraries (see below).
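For example, a minimal `ModelDesc` might look like this (a sketch; the `InputDesc` specs and the tiny linear model are placeholders):

```python
import tensorflow as tf
from tensorpack import ModelDesc, InputDesc

class MyModel(ModelDesc):
    def _get_inputs(self):
        # metainfo (dtype, shape, name) of every placeholder the graph may need
        return [InputDesc(tf.float32, (None, 28, 28), 'input'),
                InputDesc(tf.int32, (None,), 'label')]

    def _build_graph(self, inputs):
        image, label = inputs
        image = tf.reshape(image, [-1, 28 * 28])
        logits = tf.layers.dense(image, 10, name='fc')
        self.cost = tf.reduce_mean(
            tf.nn.sparse_softmax_cross_entropy_with_logits(logits=logits, labels=label),
            name='total_cost')
```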
......
......@@ -9,6 +9,7 @@ import glob
from .base import Callback
from ..utils import logger
from ..tfutils.common import get_tf_version_number
__all__ = ['ModelSaver', 'MinSaver', 'MaxSaver']
......@@ -43,11 +44,19 @@ class ModelSaver(Callback):
        for key in self.var_collections:
            vars.extend(tf.get_collection(key))
        self.path = os.path.join(self.checkpoint_dir, 'model')
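        # only pass save_relative_paths when the installed TF is new enough to accept it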
        if get_tf_version_number() <= 1.1:
            self.saver = tf.train.Saver(
                var_list=vars,
                max_to_keep=self.keep_recent,
                keep_checkpoint_every_n_hours=self.keep_freq,
                write_version=tf.train.SaverDef.V2)
        else:
            self.saver = tf.train.Saver(
                var_list=vars,
                max_to_keep=self.keep_recent,
                keep_checkpoint_every_n_hours=self.keep_freq,
                write_version=tf.train.SaverDef.V2,
                save_relative_paths=True)
        self.meta_graph_written = False

    def _trigger(self):
......
......@@ -15,6 +15,7 @@ __all__ = ['get_default_sess_config',
           'get_op_tensor_name',
           'get_tensors_by_names',
           'get_op_or_tensor_by_name',
           'get_tf_version_number',
           ]
......@@ -134,3 +135,10 @@ def get_op_or_tensor_by_name(name):
        return f(name)
    else:
        return list(map(f, name))


def get_tf_version_number():
    """
    Return a float (for comparison), indicating tensorflow version.
    """
    return float('.'.join(tf.VERSION.split('.')[:2]))
......@@ -12,6 +12,7 @@ from six.moves import zip, range
from ..utils import logger
from ..utils.naming import TOWER_FREEZE_KEYS
from ..utils.concurrency import LoopThread
from ..tfutils.common import get_tf_version_number
from ..tfutils.tower import TowerContext
from ..tfutils.collection import backup_collection, restore_collection
from ..tfutils.gradproc import FilterNoneGrad, ScaleGradient
......@@ -28,8 +29,8 @@ __all__ = ['MultiGPUTrainerBase', 'SyncMultiGPUTrainer',
def _check_tf_version():
    ver = float('.'.join(tf.VERSION.split('.')[:2]))
    assert ver >= 1.1, "TF version {} is too old to run multi GPU training!".format(tf.VERSION)
    assert get_tf_version_number() >= 1.1, \
        "TF version {} is too old to run multi GPU training!".format(tf.VERSION)


def apply_prefetch_policy(config, use_stage=True):
......