Commit e8c6cbf8 authored by Yuxin Wu

update docs about models

parent f3923789
......@@ -8,35 +8,39 @@ which yields a `datapoint` when called.
A datapoint must be a **list** of Python objects, which are called the `components` of this datapoint.
For example, to train on the MNIST dataset, you can define a Dataflow
that produces datapoints of two elements: a numpy array of shape (64, 28, 28), and an array of shape (64,).
### Composition of DataFlow
One good thing about having a standard interface is that it enables a great deal of code reuse.
There are a lot of existing modules in tensorpack which you can use to compose
complex Dataflow instances with a long pre-processing pipeline. A whole pipeline usually
would __read from disk (or other sources), apply augmentations, group into batches,
prefetch data__, etc. An example is as follows:
````python
# define a Dataflow which produces image-label pairs from a caffe lmdb database
df = CaffeLMDB('/path/to/caffe/lmdb', shuffle=False)
# resize the image component of each datapoint
df = AugmentImageComponent(df, [imgaug.Resize((225, 225))])
# group data into batches of size 128
df = BatchData(df, 128)
# start 3 processes to run the dataflow in parallel, and transfer the data with ZeroMQ
df = PrefetchDataZMQ(df, 3)
````
Another complicated example is the [ResNet training script](../examples/ResNet/imagenet-resnet.py)
with all the data preprocessing.
All these modules are written in Python,
so you can easily implement whatever operations/transformations you need,
without worrying about adding operators to TensorFlow.
Meanwhile, thanks to prefetching, it can still run fast enough for
tasks as large as ImageNet training.
<!--
- TODO mention RL, distributed data, and zmq operator in the future.
-->
### Reuse in other frameworks
Another good thing about Dataflow is that it is independent of
tensorpack internals. You can just use it as an efficient data processing pipeline,
......@@ -66,6 +70,6 @@ Optionally, Dataflow can implement the following two methods:
+ `reset_state()`. It is necessary if your Dataflow uses RNG. This
method should reset the internal state of this Dataflow (including its RNG). It gets called after a fork, so that different child
processes will have different random seeds. You can also inherit `RNGDataFlow`, which does this for `self.rng` already.
With this "low-level" Dataflow defined, you can then compose it with existing modules.
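As an illustration, here is a minimal sketch of such a low-level Dataflow (the class name and the random data are made up for this example; it assumes the `get_data()`/`size()` interface described above and the `self.rng` provided by `RNGDataFlow`):

````python
from tensorpack.dataflow import RNGDataFlow

class FakeMNIST(RNGDataFlow):
    """Yield `num` datapoints of the form [image, label], purely for illustration."""
    def __init__(self, num=100):
        self.num = num

    def size(self):
        return self.num

    def get_data(self):
        for _ in range(self.num):
            img = self.rng.uniform(size=(28, 28))   # self.rng is set up by RNGDataFlow.reset_state()
            label = self.rng.randint(10)
            yield [img, label]
````

Such a Dataflow can then be wrapped with `BatchData`, `PrefetchDataZMQ`, etc., just like the built-in ones.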
......@@ -3,32 +3,6 @@
The following guide introduces some core concepts of TensorPack. In contrast to several other libraries, TensorPack consists of several modules that let you build complex deep learning algorithms and train models with high accuracy and high speed.
### Layers and Architectures
The library also contains several pre-implemented neural network modules and layers:
- Convolution, Deconvolution
- FullyConnected
- nonlinearities such as ReLU, leakyReLU, tanh and sigmoid
- pooling operations
- regularization operations
- batchnorm
We also support tfSlim out of the box. A LeNet architecture for MNIST would look like:
````python
logits = (LinearWrap(image)  # the starting brace is only for line-breaking
          .Conv2D('conv0')
          .MaxPooling('pool0', 2)
          .Conv2D('conv1', padding='SAME')
          .Conv2D('conv2')
          .MaxPooling('pool1', 2)
          .Conv2D('conv3')
          .FullyConnected('fc0', 512, nl=tf.nn.relu)
          .Dropout('dropout', 0.5)
          .FullyConnected('fc1', out_dim=10, nl=tf.identity)())
````
You should build your model within the `ModelDesc` class.
### Training
Given TensorFlow's optimizers, this library provides several training protocols, including efficient multi-GPU setups. There is support for training on a single GPU, training on one machine with multiple GPUs (synchronously or asynchronously), training of Generative Adversarial Networks, and reinforcement learning.
......
# Model
To define a model (i.e. the computation graph) that will be used for training,
you'll need to subclass `ModelDesc` and implement several methods:
```python
class MyModel(ModelDesc):
    def _get_input_vars(self):
        return [InputVar(...), InputVar(...)]

    def _build_graph(self, input_tensors):
        # build the graph
```
Basically, `_get_input_vars` should define the metadata (type, shape, name) of the inputs
of the model. It should match what is produced by the data you're training with.
`_build_graph` should add tensors/operations to the graph, where
the argument `input_tensors` is the list of input tensors matching the return value of
`_get_input_vars`.
You can use any symbolic functions in `_build_graph`, including TensorFlow core library
functions, TensorFlow slim layers, or functions in other packages such as tflearn and tensorlayer.
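For instance, a minimal sketch for MNIST might look like the following (the class name, shapes, and the `self.cost` convention are illustrative; check the examples for the exact interface expected by the trainer):

```python
class MNISTModel(ModelDesc):
    def _get_input_vars(self):
        # type, shape and name of each input; the order must match the Dataflow output
        return [InputVar(tf.float32, [None, 28, 28], 'input'),
                InputVar(tf.int32, [None], 'label')]

    def _build_graph(self, input_tensors):
        image, label = input_tensors
        image = tf.expand_dims(image, 3)    # add a channel dimension
        l = Conv2D('conv0', image, 32, 3, nl=tf.nn.relu)
        l = MaxPooling('pool0', l, 2)
        l = FullyConnected('fc0', l, 512, nl=tf.nn.relu)
        logits = FullyConnected('fc1', l, 10, nl=tf.identity)
        cost = tf.nn.sparse_softmax_cross_entropy_with_logits(logits=logits, labels=label)
        self.cost = tf.reduce_mean(cost, name='cost')   # assumed: the trainer minimizes self.cost
```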
tensorpack also contains a small collection of common model primitives,
such as conv/deconv, fc, pooling layers, nonlinearities, and some custom loss functions.
Using the tensorpack implementations, you can also benefit from `argscope` and `LinearWrap` to
simplify the code.
## argscope and LinearWrap
`argscope` gives you a context with default arguments.
`LinearWrap` allows you to simplify "linear structure" models by
adding the layers one by one.
The following code:
```python
with argscope(Conv2D, out_channel=32, kernel_shape=3, nl=tf.nn.relu):
    l = (LinearWrap(image)  # the starting brace is only for line-breaking
         .Conv2D('conv0')
         .MaxPooling('pool0', 2)
         .Conv2D('conv1', padding='SAME')
         .Conv2D('conv2', kernel_shape=5)
         .FullyConnected('fc0', 512, nl=tf.nn.relu)
         .Dropout('dropout', 0.5)
         .FullyConnected('fc1', out_dim=10, nl=tf.identity)())
```
is equivalent to:
```python
l = Conv2D('conv0', image, 32, 3, nl=tf.nn.relu)
l = MaxPooling('pool0', l, 2)
l = Conv2D('conv1', l, 32, 3, padding='SAME')
l = Conv2D('conv2', l, 32, 5)
l = FullyConnected('fc0', l, 512, nl=tf.nn.relu)
l = Dropout('dropout', l, 0.5)
l = FullyConnected('fc1', l, 10, nl=tf.identity)
```
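For functions that are not registered layers, you can splice them into the chain as well, assuming the `apply` method shown in the `LinearWrap` code further below (the wrapped tensor is passed as the function's first argument). A hedged sketch:

```python
l = (LinearWrap(image)
     .Conv2D('conv0', 32, 3)
     .apply(tf.nn.dropout, 0.5)   # equivalent to l = tf.nn.dropout(l, 0.5)
     .FullyConnected('fc0', 512, nl=tf.nn.relu)())
```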
## Implement a layer
Writing a simple symbolic function is nothing special in tensorpack.
However, by following some very simple rules, you can make a symbolic function become a "layer" and gain extra benefits from the framework.
Take a look at the [Convolutional Layer](../tensorpack/models/conv2d.py#L14) implementation for an example of how to define a
model primitive:
```python
......@@ -13,19 +71,20 @@ def Conv2D(x, out_channel, kernel_shape,
           nl=tf.nn.relu, split=1, use_bias=True):
```
A primitive has the same interface as a TensorFlow symbolic function: it takes a symbolic input `x` with
some parameters, and returns some symbolic outputs.
Basically, a layer is a symbolic function with the following rules:
+ It is decorated by `@layer_register`.
+ The first argument is its "input". It must be a tensor or a list of tensors.
+ It returns either a tensor or a list of tensors as its "output".
By making a symbolic function a "layer", the following will happen:
+ You will call the function with a scope argument, e.g. `Conv2D('conv0', x, 32, 3)`.
  Everything happening in this function will be under the variable scope 'conv0'.
+ Static shapes of input/output will be logged.
+ It will work with `argscope` to easily define default arguments. `argscope` works for all
  the arguments except the input.
+ It will work with `LinearWrap` if the output of the previous layer matches the input of the next layer.
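For example, a toy layer following these rules might look like this (the name `Scale` and its argument are made up purely for illustration):

```python
@layer_register()
def Scale(x, gamma=1.0):
    """Multiply the input tensor by a constant; a toy example of a custom layer."""
    return tf.multiply(x, gamma, name='output')

# the first argument at call time is the name scope, the second is the input tensor
out = Scale('scale0', some_tensor, gamma=2.0)
```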
Take a look at the [Inception example](../examples/Inception/inception-bn.py#L36) to see how a complicated model can be described with these primitives.
......@@ -29,7 +29,6 @@ Learning rate may need a different schedule for different number of GPUs (becaus
class Model(ModelDesc):
    def _get_input_vars(self):
        return [InputVar(tf.float32, [None, INPUT_SHAPE, INPUT_SHAPE, 3], 'input'),
                InputVar(tf.int32, [None], 'label')]
......@@ -62,19 +61,21 @@ class Model(ModelDesc):
                return tf.concat(outs, 3, name='concat')

        with argscope(Conv2D, nl=BNReLU, use_bias=False):
            l = (LinearWrap(image)
                 .Conv2D('conv0', 64, 7, stride=2)
                 .MaxPooling('pool0', 3, 2, padding='SAME')
                 .Conv2D('conv1', 64, 1)
                 .Conv2D('conv2', 192, 3)
                 .MaxPooling('pool2', 3, 2, padding='SAME')())
            # 28
            l = inception('incep3a', l, 64, 64, 64, 64, 96, 32, 'avg')
            l = inception('incep3b', l, 64, 64, 96, 64, 96, 64, 'avg')
            l = inception('incep3c', l, 0, 128, 160, 64, 96, 0, 'max')

            br1 = (LinearWrap(l)
                   .Conv2D('loss1conv', 128, 1)
                   .FullyConnected('loss1fc', 1024, nl=tf.nn.relu)
                   .FullyConnected('loss1logit', 1000, nl=tf.identity)())
            loss1 = tf.nn.sparse_softmax_cross_entropy_with_logits(logits=br1, labels=label)
            loss1 = tf.reduce_mean(loss1, name='loss1')
......
......@@ -35,7 +35,6 @@ INPUT_SHAPE = 299
class Model(ModelDesc):
    def _get_input_vars(self):
        return [InputVar(tf.float32, [None, INPUT_SHAPE, INPUT_SHAPE, 3], 'input'),
                InputVar(tf.int32, [None], 'label')]
......@@ -209,7 +208,6 @@ def get_data(train_or_test):
    if isTrain:
        class Resize(imgaug.ImageAugmentor):
            def __init__(self):
                self._init(locals())
......
......@@ -23,7 +23,6 @@ USE_SLIM = False
class Model(ModelDesc):
    def _get_input_vars(self):
        """Define all the input variables (with type, shape, name) that'll be
        fed into the graph to produce a cost. """
......
......@@ -91,6 +91,17 @@ class LinearWrap(object):
        ret = func(self._t, *args, **kwargs)
        return LinearWrap(ret)

    def apply2(self, func, *args, **kwargs):
        """
        Apply a function on the wrapped tensor. The tensor
        will be the second argument of func.

        Returns:
            LinearWrap: ``LinearWrap(func(args[0], self.tensor(), *args[1:], **kwargs))``.
        """
        ret = func(args[0], self._t, *(args[1:]), **kwargs)
        return LinearWrap(ret)

    def __call__(self):
        """
        Returns:
......