Commit e8c6cbf8 authored by Yuxin Wu

update docs about models

parent f3923789
......@@ -8,35 +8,39 @@ which yields a `datapoint` when called.
A datapoint must be a **list** of Python objects, which are called the `components` of this datapoint.
For example, to train on the MNIST dataset, you can define a Dataflow
that produces datapoints of two elements: a numpy array of shape (64, 28, 28), and an array of shape (64,).
### Composition of DataFlow
One good thing about having a standard interface is that it enables a great deal of code reuse.
There are a lot of existing modules in tensorpack which you can use to compose
complex Dataflow instances with a long pre-processing pipeline. A whole pipeline usually
would __read from disk (or other sources), apply augmentations, group into batches,
prefetch data__, etc. An example is as follows:
````python
# define a Dataflow which produces image-label pairs from a caffe lmdb database
df = CaffeLMDB('/path/to/caffe/lmdb', shuffle=False)
# resize the image component of each datapoint
df = AugmentImageComponent(df, [imgaug.Resize((225, 225))])
# group data into batches of size 128
df = BatchData(df, 128)
# start 3 processes to run the dataflow in parallel, and transfer the data with ZeroMQ
df = PrefetchDataZMQ(df, 3)
````
Another complicated example is the [ResNet training script](../examples/ResNet/imagenet-resnet.py)
with all the data preprocessing.
All these modules are written in Python,
so you can easily implement whatever operations/transformations you need,
without worrying about adding operators to TensorFlow.
Meanwhile, thanks to prefetching, it can still run fast enough for
tasks as large as ImageNet training.
<!--
- TODO mention RL, distributed data, and zmq operator in the future.
-->
### Reuse in other frameworks
Another good thing about Dataflow is that it is independent of
tensorpack internals. You can just use it as an efficient data processing pipeline,
......@@ -66,6 +70,6 @@ Optionally, Dataflow can implement the following two methods:
+ `reset_state()`. It is necessary if your Dataflow uses RNG. This
method should reset the internal state of this Dataflow (including its RNG). It gets called after a fork, so that different child
processes will have different random seeds. You can also inherit `RNGDataFlow`, which does this for `self.rng` already.
With this "low-level" Dataflow defined, you can then compose it with existing modules.
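As an illustration, here is a minimal sketch of such a low-level Dataflow (the class name and the random data are made up for this example; it assumes the `get_data()`/`size()` interface described above and the `self.rng` provided by `RNGDataFlow`):

````python
from tensorpack.dataflow import RNGDataFlow

class FakeMNIST(RNGDataFlow):
    """Yield `num` datapoints of the form [image, label], purely for illustration."""
    def __init__(self, num=100):
        self.num = num

    def size(self):
        return self.num

    def get_data(self):
        for _ in range(self.num):
            img = self.rng.uniform(size=(28, 28))   # self.rng is set up by RNGDataFlow.reset_state()
            label = self.rng.randint(10)
            yield [img, label]
````

Such a Dataflow can then be wrapped with `BatchData`, `PrefetchDataZMQ`, etc., just like the built-in ones.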
......@@ -3,32 +3,6 @@
The following guide introduces some core concepts of TensorPack. In contrast to several other libraries, TensorPack consists of several modules that let you build complex deep learning algorithms and train models with high accuracy and high speed.
### Layers and Architectures
The library also contains several pre-implemented neural network modules and layers:
- Convolution, Deconvolution
- FullyConnected
- nonlinearities such as ReLU, leakyReLU, tanh and sigmoid
- pooling operations
- regularization operations
- batchnorm
We also support tfSlim out of the box. A LeNet architecture for MNIST would look like:
````python
logits = (LinearWrap(image)  # the starting brace is only for line-breaking
          .Conv2D('conv0')
          .MaxPooling('pool0', 2)
          .Conv2D('conv1', padding='SAME')
          .Conv2D('conv2')
          .MaxPooling('pool1', 2)
          .Conv2D('conv3')
          .FullyConnected('fc0', 512, nl=tf.nn.relu)
          .Dropout('dropout', 0.5)
          .FullyConnected('fc1', out_dim=10, nl=tf.identity)())
````
You should build your model within the `ModelDesc` class.
### Training
Given TensorFlow's optimizers, this library provides several training protocols, including efficient multi-GPU setups. There is support for training on a single GPU, training on one machine with multiple GPUs (synchronously or asynchronously), training of Generative Adversarial Networks, and reinforcement learning.
......
# Model
To define a model (i.e. the computation graph) that will be used for training,
you'll need to subclass `ModelDesc` and implement several methods:
```python
class MyModel(ModelDesc):
    def _get_input_vars(self):
        return [InputVar(...), InputVar(...)]

    def _build_graph(self, input_tensors):
        # build the graph
```
Basically, `_get_input_vars` should define the metadata (type, shape, name) of the inputs
of the model. It should match what is produced by the data you're training with.
`_build_graph` should add tensors/operations to the graph, where
the argument `input_tensors` is the list of input tensors matching the return value of
`_get_input_vars`.
You can use any symbolic functions in `_build_graph`, including TensorFlow core library
functions, TensorFlow slim layers, or functions in other packages such as tflearn and tensorlayer.
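For instance, a minimal sketch for MNIST might look like the following (the class name, shapes, and the `self.cost` convention are illustrative; check the examples for the exact interface expected by the trainer):

```python
class MNISTModel(ModelDesc):
    def _get_input_vars(self):
        # type, shape and name of each input; the order must match the Dataflow output
        return [InputVar(tf.float32, [None, 28, 28], 'input'),
                InputVar(tf.int32, [None], 'label')]

    def _build_graph(self, input_tensors):
        image, label = input_tensors
        image = tf.expand_dims(image, 3)    # add a channel dimension
        l = Conv2D('conv0', image, 32, 3, nl=tf.nn.relu)
        l = MaxPooling('pool0', l, 2)
        l = FullyConnected('fc0', l, 512, nl=tf.nn.relu)
        logits = FullyConnected('fc1', l, 10, nl=tf.identity)
        cost = tf.nn.sparse_softmax_cross_entropy_with_logits(logits=logits, labels=label)
        self.cost = tf.reduce_mean(cost, name='cost')   # assumed: the trainer minimizes self.cost
```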
tensorpack also contains a small collection of common model primitives,
such as conv/deconv, fc, pooling layers, nonlinearities, and some custom loss functions.
Using the tensorpack implementations, you can also benefit from `argscope` and `LinearWrap` to
simplify the code.
## argscope and LinearWrap
`argscope` gives you a context with default arguments.
`LinearWrap` allows you to simplify "linear structure" models by
adding the layers one by one.
The following code:
```python
with argscope(Conv2D, out_channel=32, kernel_shape=3, nl=tf.nn.relu):
    l = (LinearWrap(image)  # the starting brace is only for line-breaking
         .Conv2D('conv0')
         .MaxPooling('pool0', 2)
         .Conv2D('conv1', padding='SAME')
         .Conv2D('conv2', kernel_shape=5)
         .FullyConnected('fc0', 512, nl=tf.nn.relu)
         .Dropout('dropout', 0.5)
         .FullyConnected('fc1', out_dim=10, nl=tf.identity)())
```
is equivalent to:
```python
l = Conv2D('conv0', image, 32, 3, nl=tf.nn.relu)
l = MaxPooling('pool0', l, 2)
l = Conv2D('conv1', l, 32, 3, padding='SAME')
l = Conv2D('conv2', l, 32, 5)
l = FullyConnected('fc0', l, 512, nl=tf.nn.relu)
l = Dropout('dropout', l, 0.5)
l = FullyConnected('fc1', l, 10, nl=tf.identity)
```
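For functions that are not registered layers, you can splice them into the chain as well, assuming the `apply` method shown in the `LinearWrap` code further below (the wrapped tensor is passed as the function's first argument). A hedged sketch:

```python
l = (LinearWrap(image)
     .Conv2D('conv0', 32, 3)
     .apply(tf.nn.dropout, 0.5)   # equivalent to l = tf.nn.dropout(l, 0.5)
     .FullyConnected('fc0', 512, nl=tf.nn.relu)())
```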
## Implement a layer
Writing a simple symbolic function is nothing special in tensorpack.
However, by following some very simple rules, you can make a symbolic function become a "layer" and gain extra benefits from the framework.
Take a look at the [Convolutional Layer](../tensorpack/models/conv2d.py#L14) implementation for an example of how to define a
model primitive:
```python
......@@ -13,19 +71,20 @@ def Conv2D(x, out_channel, kernel_shape,
           nl=tf.nn.relu, split=1, use_bias=True):
```
A primitive has the same interface as a TensorFlow symbolic function: it takes a symbolic input `x` with
some parameters, and returns some symbolic outputs.
Basically, a layer is a symbolic function with the following rules:
+ It is decorated by `@layer_register`.
+ The first argument is its "input". It must be a tensor or a list of tensors.
+ It returns either a tensor or a list of tensors as its "output".
By making a symbolic function a "layer", the following will happen:
+ You will call the function with a scope argument, e.g. `Conv2D('conv0', x, 32, 3)`.
  Everything happening in this function will be under the variable scope 'conv0'.
+ Static shapes of input/output will be logged.
+ It will work with `argscope` to easily define default arguments. `argscope` works for all
  the arguments except the input.
+ It will work with `LinearWrap` if the output of the previous layer matches the input of the next layer.
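For example, a toy layer following these rules might look like this (the name `Scale` and its argument are made up purely for illustration):

```python
@layer_register()
def Scale(x, gamma=1.0):
    """Multiply the input tensor by a constant; a toy example of a custom layer."""
    return tf.multiply(x, gamma, name='output')

# the first argument at call time is the name scope, the second is the input tensor
out = Scale('scale0', some_tensor, gamma=2.0)
```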
Take a look at the [Inception example](../examples/Inception/inception-bn.py#L36) to see how a complicated model can be described with these primitives.
......@@ -29,7 +29,6 @@ Learning rate may need a different schedule for different number of GPUs (becaus
class Model(ModelDesc):
    def _get_input_vars(self):
        return [InputVar(tf.float32, [None, INPUT_SHAPE, INPUT_SHAPE, 3], 'input'),
                InputVar(tf.int32, [None], 'label')]
......@@ -62,19 +61,21 @@ class Model(ModelDesc):
                return tf.concat(outs, 3, name='concat')

        with argscope(Conv2D, nl=BNReLU, use_bias=False):
            l = (LinearWrap(image)
                 .Conv2D('conv0', 64, 7, stride=2)
                 .MaxPooling('pool0', 3, 2, padding='SAME')
                 .Conv2D('conv1', 64, 1)
                 .Conv2D('conv2', 192, 3)
                 .MaxPooling('pool2', 3, 2, padding='SAME')())
            # 28
            l = inception('incep3a', l, 64, 64, 64, 64, 96, 32, 'avg')
            l = inception('incep3b', l, 64, 64, 96, 64, 96, 64, 'avg')
            l = inception('incep3c', l, 0, 128, 160, 64, 96, 0, 'max')

            br1 = (LinearWrap(l)
                   .Conv2D('loss1conv', 128, 1)
                   .FullyConnected('loss1fc', 1024, nl=tf.nn.relu)
                   .FullyConnected('loss1logit', 1000, nl=tf.identity)())
            loss1 = tf.nn.sparse_softmax_cross_entropy_with_logits(logits=br1, labels=label)
            loss1 = tf.reduce_mean(loss1, name='loss1')
......
......@@ -35,7 +35,6 @@ INPUT_SHAPE = 299
class Model(ModelDesc):
    def _get_input_vars(self):
        return [InputVar(tf.float32, [None, INPUT_SHAPE, INPUT_SHAPE, 3], 'input'),
                InputVar(tf.int32, [None], 'label')]
......@@ -209,7 +208,6 @@ def get_data(train_or_test):
    if isTrain:
        class Resize(imgaug.ImageAugmentor):
            def __init__(self):
                self._init(locals())
......
......@@ -23,7 +23,6 @@ USE_SLIM = False
class Model(ModelDesc):
    def _get_input_vars(self):
        """Define all the input variables (with type, shape, name) that'll be
        fed into the graph to produce a cost. """
......
......@@ -91,6 +91,17 @@ class LinearWrap(object):
        ret = func(self._t, *args, **kwargs)
        return LinearWrap(ret)

    def apply2(self, func, *args, **kwargs):
        """
        Apply a function on the wrapped tensor. The tensor
        will be the second argument of func.

        Returns:
            LinearWrap: ``LinearWrap(func(args[0], self.tensor(), *args[1:], **kwargs))``.
        """
        ret = func(args[0], self._t, *(args[1:]), **kwargs)
        return LinearWrap(ret)

    def __call__(self):
        """
        Returns:
......