Commit 59801ff8 authored by Yuxin Wu

docs update

parent ac2fa1bc
...@@ -41,7 +41,7 @@ It's Yet Another TF wrapper, but different in:
+ Data-parallel multi-GPU training is available off-the-shelf. It is as fast as Google's [benchmark code](https://github.com/tensorflow/benchmarks).
3. Focus on large datasets.
+ It's painful to read/preprocess data from TF. Use __DataFlow__ to process large datasets such as ImageNet in pure Python.
+ DataFlows have a unified interface, so you can compose and reuse them to perform complex preprocessing.
4. An interface of extensible __Callbacks__.
...
...@@ -14,13 +14,13 @@ a numpy array of shape (64, 28, 28), and an array of shape (64,).
### Composition of DataFlow
One good thing about having a standard interface is that it enables
the greatest code reusability.
There are a lot of existing modules in tensorpack which you can use to compose
a complex DataFlow with a long pre-processing pipeline. A whole pipeline usually
would __read from disk (or other sources), apply augmentations, group into batches,
prefetch data__, etc. A simple example is as follows:
```python
# a DataFlow you implement to produce [tensor1, tensor2, ..] lists from whatever sources:
df = MyDataFlow(shuffle=True)
# resize the image component of each datapoint
df = AugmentImageComponent(df, [imgaug.Resize((225, 225))])
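# -- illustrative continuation of the pipeline (a sketch; BatchData and
# -- PrefetchDataZMQ are assumed to be the standard tensorpack dataflow modules) --
# group the datapoints into batches of 128
df = BatchData(df, 128)
# run the whole pipeline in 3 separate processes and prefetch the produced batches
df = PrefetchDataZMQ(df, 3)
```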
...@@ -35,8 +35,6 @@ with all the data preprocessing.
All these modules are written in Python,
so you can easily implement whatever operations/transformations you need,
without worrying about adding operators to TensorFlow.
In the meantime, thanks to the prefetching, it can still run fast enough for
tasks as large as ImageNet training.
Unless you are working with standard data types (image folders, LMDB, etc.),
you would usually want to write your own DataFlow.
...@@ -70,3 +68,4 @@ training: we only need data to be __fast enough__.
DataFlow is fast enough for problems up to the scale of multi-GPU ImageNet training.
See the [efficient dataflow tutorial](http://tensorpack.readthedocs.io/en/latest/tutorial/efficient-dataflow.html)
for details.
Therefore, for most use cases, writing format conversion/preprocessing code with TensorFlow operators doesn't help you at all.
...@@ -11,7 +11,7 @@ If you have such a mapping function `f` already, you can simply use `imgaug.MapI
augmentor, or use `MapDataComponent(df, f, index)` as the DataFlow.
In other words, for simple mapping you do not need to write an augmentor.
An augmentor may do something more than applying a mapping. The interface you will need to implement
is:
```python
...@@ -28,8 +28,8 @@ It does the following extra things for you:
1. `self.rng` is a `np.random.RandomState` object,
guaranteed to have different seeds when you use multiprocess prefetch.
In multiprocess settings, you have to use it to generate random numbers.
2. Random parameter generation and the actual augmentation are separated. This allows you to apply the
same transformation to several images together (with `AugmentImageComponents`),
which is essential to tasks such as segmentation (a sketch follows below).
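To make this concrete, here is a minimal sketch of a custom augmentor. It assumes the interface elided above consists of `_get_augment_params(img)` (draw the random parameters with `self.rng`) and `_augment(img, params)` (apply them); the brightness augmentor itself is a made-up example:
```python
import numpy as np
from tensorpack.dataflow import imgaug

class MyBrightness(imgaug.ImageAugmentor):
    def __init__(self, delta):
        super(MyBrightness, self).__init__()
        self._init(locals())    # remember the constructor arguments as attributes

    def _get_augment_params(self, img):
        # draw parameters with self.rng, NOT np.random, so every prefetch
        # process gets a different seed
        return self.rng.uniform(-self.delta, self.delta)

    def _augment(self, img, delta):
        # deterministic, given the parameters drawn above
        return np.clip(img.astype('float32') + delta, 0, 255).astype(img.dtype)
```
It could then be applied with `AugmentImageComponent(df, [MyBrightness(20)])`, or to several images together with `AugmentImageComponents`.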
...@@ -2,12 +2,8 @@
### Write a DataFlow
There are several existing DataFlows, e.g. `ImageFromFile`, `DataFromList`, which you can
use if your data format is simple.
However, in general, you will probably need to write a new DataFlow to produce data for your task.
Usually, you just need to implement the `get_data()` method which yields a datapoint every time.
```python
...@@ -21,12 +17,17 @@ class MyDataFlow(DataFlow):
Optionally, a DataFlow can implement the following two methods:
+ `size()`. Return the number of elements the generator can produce. Certain tensorpack features might require this.
+ `reset_state()`. It is guaranteed that the actual process which runs a DataFlow will invoke this method before using it.
So if this DataFlow needs to do something after a `fork()`, you should put it here.
A typical situation is when your DataFlow uses a random number generator (RNG); you would then need to reset the RNG here.
Otherwise, child processes will have the same random seed. The `RNGDataFlow` base class does this for you (see the sketch below).
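Putting the pieces together, here is a minimal, hypothetical DataFlow built on `RNGDataFlow`; the in-memory data source and shapes are made up:
```python
import numpy as np
from tensorpack.dataflow import RNGDataFlow

class MyDataFlow(RNGDataFlow):
    def __init__(self, shuffle=True):
        self.shuffle = shuffle
        self.data = np.random.rand(1000, 28, 28)   # pretend this was loaded from disk

    def size(self):
        return len(self.data)

    def get_data(self):
        idxs = np.arange(self.size())
        if self.shuffle:
            # self.rng is (re-)seeded by RNGDataFlow.reset_state() after fork()
            self.rng.shuffle(idxs)
        for i in idxs:
            yield [self.data[i]]   # a datapoint is a list of components
```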
With a "low-level" DataFlow defined, you can then compose it with existing modules (e.g. batching, prefetching, ...).
DataFlow implementations for several well-known datasets are provided in the
[dataflow.dataset](http://tensorpack.readthedocs.io/en/latest/modules/tensorpack.dataflow.dataset.html)
module, you can take them as a reference.
With a "low-level" DataFlow defined, you can then compose it with existing modules.
...@@ -2,9 +2,9 @@
## Implement a layer
Symbolic functions should be nothing new to you.
Using symbolic functions in tensorpack is the same as in TensorFlow: you can use any symbolic functions you have
made or seen elsewhere together with tensorpack layers.
You can use symbolic functions from slim/tflearn/tensorlayer, and even Keras/Sonnet ([with some tricks](../../examples/mnist-keras.py)).
So you never **have to** implement a tensorpack layer.
If you would like, you can make a symbolic function become a "layer" by following some simple rules, and then gain benefits from the framework.
...@@ -19,7 +19,7 @@ def Conv2D(x, out_channel, kernel_shape,
nl=tf.nn.relu, split=1, use_bias=True):
```
Basically, a tensorpack layer is just a symbolic function, but with the following rules:
+ It is decorated by `@layer_register`.
+ The first argument is its "input". It must be a **tensor or a list of tensors**.
...@@ -31,7 +31,7 @@ By making a symbolic function a "layer", the following things will happen:
Everything happening in this function will be under the variable scope 'conv0'.
You can register the layer with `use_scope=False` to disable this feature.
+ Static shapes of input/output will be printed to screen.
+ `argscope` will work for all its arguments except the input tensor(s).
+ It will work with `LinearWrap`: you can use it if the output of one layer matches the input of the next layer.
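As an illustration, here is a hypothetical layer written against these rules (the import path of `layer_register` is assumed; check the layout of your tensorpack version):
```python
import tensorflow as tf
from tensorpack.models import layer_register

@layer_register()
def Scale(x, gamma_init=1.0):
    """A made-up 'layer': multiply the input by a single trainable scalar."""
    gamma = tf.get_variable('gamma', [], initializer=tf.constant_initializer(gamma_init))
    return x * gamma

# Because it is registered, the first positional argument at call time is the
# scope name, and the decorated function works with argscope / LinearWrap:
# out = Scale('scale0', some_tensor, gamma_init=0.5)
```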
There are also some (non-layer) symbolic functions in the `tfutils.symbolic_functions` module.
...
...@@ -5,13 +5,14 @@
The library tries to __support__ everything, but it cannot really __include__ everything.
The interface tries to be flexible enough so you can put any XYZ on it.
You can either implement it under the interface or simply wrap some existing Python code.
See [Extend Tensorpack](http://tensorpack.readthedocs.io/en/latest/tutorial/index.html#extend-tensorpack)
for more details.
If you think:
1. The framework has a limitation in its interface so your XYZ cannot be supported, OR
2. Your XYZ is very common or very well-defined, so it would be nice to include it.
Then it is a good time to open an issue.
...@@ -34,8 +35,8 @@ The script expects a metagraph file which is also saved by `ModelSaver`.
All model loading (in either training or testing) is through the `session_init` option
in `TrainConfig` or `PredictConfig`.
It accepts a `SessionInit` instance, where the common options are `SaverRestore`, which restores
a TF checkpoint, or `DictRestore`, which restores a dict. (`get_model_loader` is a small helper to
decide which one to use from a file name.)
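For example (a minimal sketch; the import paths, checkpoint path, file name and input/output names below are assumed, and `model` is taken to be defined elsewhere):
```python
from tensorpack import PredictConfig, SaverRestore, get_model_loader

# restore a TF checkpoint explicitly ...
config = PredictConfig(
    model=model,
    session_init=SaverRestore('train_log/run1/model-10000'),
    input_names=['input'],
    output_names=['prob'],
)

# ... or let the helper choose between SaverRestore and DictRestore from the file name
config.session_init = get_model_loader('converted_params.npy')
```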
Doing transfer learning is straightforward. Variable restoring is completely based on name match between
the current graph and the `SessionInit` initializer.
...
...@@ -2,7 +2,7 @@
# Input Sources
This tutorial covers how data goes from DataFlow or other sources to the TensorFlow graph.
You don't have to read it, because these are details hidden under the tensorpack interface, but knowing them can help you understand the efficiency.
`InputSource` is an abstract interface in tensorpack describing where the inputs come from and how they enter the graph.
For example,
...@@ -18,7 +18,7 @@ to customize your `InputSource`.
## Use Prefetch
In general, `feed_dict` is slow and should never appear in training loops.
i.e., when you use TensorFlow without any wrappers, you should avoid loops like this:
```python
while True:
...@@ -26,9 +26,9 @@ while True:
minimize_op.run(feed_dict={'X': X, 'y': y})
```
However, when you need to load data from the Python side, this is the only available interface in frameworks such as Keras or tflearn.
This is part of the reason why [tensorpack is faster](https://gist.github.com/ppwwyyxx/8d95da79f8d97036a7d67c2416c851b6) than examples from other frameworks.
You could use something like this instead, to prefetch data into the graph in one thread and hide the copy latency:
```python
# Thread 1:
while True:
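    # -- illustrative continuation (a sketch; the loader and enqueue op names are made up) --
    X, y = get_some_data()                         # hypothetical Python-side loader
    enqueue_op.run(feed_dict={'X': X, 'y': y})     # push the data into a TF queue

# Thread 2 (the training loop): the graph dequeues its own input,
# so no feed_dict is needed and the copy latency is hidden
while True:
    minimize_op.run()
```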
...
...@@ -18,7 +18,7 @@ class MyModel(ModelDesc):
Basically, `_get_inputs` should define the metainfo of all the possible placeholders your graph may need.
`_build_graph` should add tensors/operations to the graph, where
the argument `inputs` is the list of input tensors matching `_get_inputs`.
You can use any symbolic functions in `_build_graph`, including TensorFlow core library
functions and other symbolic libraries (see below).
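A minimal sketch of such a `ModelDesc` (the input names, shapes and the loss are made up, and `InputDesc` is assumed to be the metainfo class returned by `_get_inputs`):
```python
import tensorflow as tf
from tensorpack import ModelDesc, InputDesc

class MyModel(ModelDesc):
    def _get_inputs(self):
        # dtype, shape and name of every placeholder the graph may need
        return [InputDesc(tf.float32, (None, 28, 28), 'image'),
                InputDesc(tf.int32, (None,), 'label')]

    def _build_graph(self, inputs):
        image, label = inputs    # tensors, in the same order as _get_inputs
        logits = tf.layers.dense(tf.reshape(image, [-1, 28 * 28]), 10)
        self.cost = tf.reduce_mean(
            tf.nn.sparse_softmax_cross_entropy_with_logits(labels=label, logits=logits),
            name='cost')
```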
...
...@@ -9,6 +9,7 @@ import glob
from .base import Callback
from ..utils import logger
from ..tfutils.common import get_tf_version_number
__all__ = ['ModelSaver', 'MinSaver', 'MaxSaver']
...@@ -43,11 +44,19 @@ class ModelSaver(Callback):
for key in self.var_collections:
    vars.extend(tf.get_collection(key))
self.path = os.path.join(self.checkpoint_dir, 'model')
if get_tf_version_number() <= 1.1:
    self.saver = tf.train.Saver(
        var_list=vars,
        max_to_keep=self.keep_recent,
        keep_checkpoint_every_n_hours=self.keep_freq,
        write_version=tf.train.SaverDef.V2)
else:
    self.saver = tf.train.Saver(
        var_list=vars,
        max_to_keep=self.keep_recent,
        keep_checkpoint_every_n_hours=self.keep_freq,
        write_version=tf.train.SaverDef.V2,
        save_relative_paths=True)
self.meta_graph_written = False

def _trigger(self):
...
...@@ -15,6 +15,7 @@ __all__ = ['get_default_sess_config',
'get_op_tensor_name',
'get_tensors_by_names',
'get_op_or_tensor_by_name',
'get_tf_version_number',
]
...@@ -134,3 +135,10 @@ def get_op_or_tensor_by_name(name):
    return f(name)
else:
    return list(map(f, name))

def get_tf_version_number():
    """
    Return a float (for comparison), indicating tensorflow version.
    """
    return float('.'.join(tf.VERSION.split('.')[:2]))
...@@ -12,6 +12,7 @@ from six.moves import zip, range
from ..utils import logger
from ..utils.naming import TOWER_FREEZE_KEYS
from ..utils.concurrency import LoopThread
from ..tfutils.common import get_tf_version_number
from ..tfutils.tower import TowerContext
from ..tfutils.collection import backup_collection, restore_collection
from ..tfutils.gradproc import FilterNoneGrad, ScaleGradient
...@@ -28,8 +29,8 @@ __all__ = ['MultiGPUTrainerBase', 'SyncMultiGPUTrainer',
def _check_tf_version():
    assert get_tf_version_number() >= 1.1, \
        "TF version {} is too old to run multi GPU training!".format(tf.VERSION)

def apply_prefetch_policy(config, use_stage=True):
...