Commit dbfa9982 authored by Yuxin Wu

tutorial update

parent fe2b4f97
...@@ -36,10 +36,10 @@ Describe your training task with three components:
1. __DataFlow__. Process data in Python, with ease and speed.
   + Allows you to process data in Python without blocking the training, via multiprocess prefetch & TF Queue prefetch.
   + All data producers have a unified interface, so they can be composed and reused to perform complex preprocessing.
2. __Callbacks__, customizable, like `tf.train.SessionRunHook` but more than that. They cover everything you want to do apart from the training iterations, such as:
   + Change hyperparameters during training
   + Print some tensors of interest
   + Run inference on a test dataset
...@@ -47,11 +47,11 @@ Describe your training task with three components:
   + Send loss to your phone
3. __Model__, or graph. `models/` has some scoped abstraction of common models, but you can just use
   symbolic functions in TensorFlow or slim/tflearn/tensorlayer/etc.
   `LinearWrap` and `argscope` simplify large models (e.g. [vgg example](https://github.com/ppwwyyxx/tensorpack/blob/master/examples/load-vgg16.py)).

With the above components defined, the tensorpack trainer runs the training iterations for you.
The trainer was written with performance in mind:
even on a small CNN example, the training runs [2x faster](https://gist.github.com/ppwwyyxx/8d95da79f8d97036a7d67c2416c851b6) than the equivalent Keras code.
Multi-GPU training is off-the-shelf by simply switching the trainer.
......
...@@ -39,11 +39,13 @@ TrainConfig(
                -d body={val-error-top1} > /dev/null 2>&1',
                'val-error-top1')
    ],
    extra_callbacks=[    # these callbacks are already enabled by default
        # maintain and summarize moving average of some tensors (e.g. training loss, training error)
        MovingAverageSummary(),
        # draw a nice progress bar
        ProgressBar(),
        # run `tf.summary.merge_all` and save results every epoch
        MergeAllSummaries(),
        # print all the statistics I've created, and scalar tensors I've summarized
        StatPrinter(),
    ]
......
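Writing your own callback follows the same pattern: subclass `Callback` and override one of its hooks. Below is a minimal sketch; the hook name `_trigger_epoch` and the `self.epoch_num` attribute are assumptions about the `Callback` interface of this tensorpack version, and the callback itself is only illustrative.

```python
from tensorpack.callbacks import Callback

class PrintEpoch(Callback):
    """A toy callback: run some Python code at the end of every epoch."""
    def _trigger_epoch(self):
        # anything can go here: change a hyperparameter, run inference,
        # send a message to your phone, ...
        print("epoch {} finished".format(self.epoch_num))
```

An instance of it would then go into the `callbacks=` list of `TrainConfig`, next to the built-in ones above.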
...@@ -62,9 +62,9 @@ use to read images or load data from a list.
But in general, you'll probably need to write a new DataFlow to produce data for your task.
DataFlow implementations for several well-known datasets are provided in the
[dataflow.dataset](http://tensorpack.readthedocs.io/en/latest/modules/tensorpack.dataflow.dataset.html)
module, which you can take as a reference.

Usually you just need to implement the `get_data()` method, which yields a datapoint every time it is called.
```python
class MyDataFlow(DataFlow):
  def get_data(self):
...@@ -79,7 +79,7 @@ Optionally, Dataflow can implement the following two methods:
+ `size()`. Return the number of elements the generator can produce. Certain modules might require this.
  For example, only DataFlows with the same number of elements can be joined together.
+ `reset_state()`. It's guaranteed that the actual process which runs a DataFlow will invoke this method before using it.
  So if this DataFlow needs to do something after a `fork()`, you should put it here.
  A typical situation is when your DataFlow uses a random number generator (RNG). Then you'd need to reset the RNG here,
......
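To make this concrete, here is a minimal sketch of a DataFlow implementing all three methods. The dataset itself is made up for illustration; only the `get_data()`/`size()`/`reset_state()` interface comes from the tutorial above.

```python
import numpy as np
from tensorpack.dataflow import DataFlow

class RandomImageFlow(DataFlow):
    """A toy DataFlow yielding [image, label] datapoints (illustrative only)."""
    def __init__(self, num, shape=(28, 28)):
        self._num = num
        self._shape = shape

    def get_data(self):
        for _ in range(self._num):
            img = self.rng.uniform(size=self._shape)
            label = self.rng.randint(10)
            yield [img, label]

    def size(self):
        return self._num

    def reset_state(self):
        # called in the process that actually uses this DataFlow,
        # so each (forked) process gets its own RNG
        self.rng = np.random.RandomState()
```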
...@@ -7,7 +7,6 @@ Test.
.. toctree::
  :maxdepth: 1

  glance
  dataflow
  tf-queue
  efficient-dataflow
......
...@@ -11,19 +11,20 @@ class MyModel(ModelDesc):
    def _build_graph(self, inputs):
        # build the graph

    def _get_optimizer(self):
        return tf.train.GradientDescentOptimizer(0.1)
```

Basically, `_get_inputs` should define the metainfo of all the possible placeholders your graph may need.
It should match what is produced by the data you're training with.
`_build_graph` should add tensors/operations to the graph, where
the argument `inputs` is the list of input tensors matching `_get_inputs`.

You can use any symbolic functions in `_build_graph`, including TensorFlow core library
functions, TensorFlow slim layers, or functions in other packages such as tflearn, tensorlayer.
tensorpack also contains a small collection of common model primitives,
such as conv/deconv, fc, batch normalization, pooling layers, and some custom loss functions.
Using the tensorpack implementations, you can also benefit from `argscope` and `LinearWrap` to
simplify the code.
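Putting these pieces together, a minimal complete `ModelDesc` might look like the sketch below. The shapes, tensor names and the `InputDesc` import are assumptions for illustration, not code from this commit.

```python
import tensorflow as tf
from tensorpack import ModelDesc, InputDesc   # import path is an assumption

class MyModel(ModelDesc):
    def _get_inputs(self):
        # metainfo of the placeholders the graph needs; must match
        # the datapoints your DataFlow produces
        return [InputDesc(tf.float32, (None, 28, 28), 'input'),
                InputDesc(tf.int32, (None,), 'label')]

    def _build_graph(self, inputs):
        image, label = inputs
        feat = tf.reshape(image, [-1, 28 * 28])
        logits = tf.layers.dense(feat, 10, name='fc')
        # single-cost trainers minimize `self.cost` by default
        self.cost = tf.reduce_mean(
            tf.nn.sparse_softmax_cross_entropy_with_logits(
                logits=logits, labels=label),
            name='cost')

    def _get_optimizer(self):
        return tf.train.GradientDescentOptimizer(0.1)
```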
...@@ -48,8 +49,8 @@ is equivalent to:
```
l = Conv2D('conv0', image, 32, 3, nl=tf.nn.relu)
l = MaxPooling('pool0', l, 2)
l = Conv2D('conv1', l, 32, 3, padding='SAME', nl=tf.nn.relu)
l = Conv2D('conv2', l, 32, 5, nl=tf.nn.relu)
l = FullyConnected('fc0', l, 512, nl=tf.nn.relu)
l = Dropout('dropout', l, 0.5)
l = FullyConnected('fc1', l, 10, nl=tf.identity)
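
# For reference, a sketch of the compact argscope/LinearWrap form that the
# lines above expand to. The argument names (out_channel, kernel_shape) follow
# the signatures used in this listing; this sketch is not part of the original diff.
with argscope(Conv2D, kernel_shape=3, nl=tf.nn.relu, out_channel=32):
    l = (LinearWrap(image)
         .Conv2D('conv0')
         .MaxPooling('pool0', 2)
         .Conv2D('conv1', padding='SAME')
         .Conv2D('conv2', kernel_shape=5)
         .FullyConnected('fc0', 512, nl=tf.nn.relu)
         .Dropout('dropout', 0.5)
         .FullyConnected('fc1', 10, nl=tf.identity)())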
...@@ -60,8 +61,7 @@ l = FullyConnected('fc1', l, 10, nl=tf.identity)

Symbolic functions should be nothing new to you, and writing a simple symbolic function is nothing special in tensorpack.
But you can turn a symbolic function into a "layer" by following some very simple rules, and then gain benefits from the framework.

Take a look at the [Convolutional Layer](../tensorpack/models/conv2d.py#L14) implementation for an example of how to define a layer:
```python
@layer_register()
...@@ -78,18 +78,17 @@ Basically, a layer is a symbolic function with the following rules:
+ It returns either a tensor or a list of tensors as its "output".

By making a symbolic function a "layer", the following things will happen:
+ You will call the function with a scope argument, e.g. `Conv2D('conv0', x, 32, 3)`.
  Everything happening in this function will be under the variable scope 'conv0'. You can register
  the layer with `use_scope=False` to disable this feature.
+ Static shapes of input/output will be logged.
+ `argscope` will then work for all its arguments except the first one (the input).
+ It will work with `LinearWrap`: you can use it when the output of one layer is the input of the next.

Take a look at the [Inception example](../examples/Inception/inception-bn.py#L36) to see how a complicated model can be described with these primitives.
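As a hedged sketch of these rules in action (the import path and the toy layer itself are assumptions for illustration, not code from this repository):

```python
import tensorflow as tf
from tensorpack.models import layer_register   # import path is an assumption

@layer_register()
def ScaleLayer(x, init_scale=1.0):
    """A toy layer: multiply the input by one trainable scalar."""
    # x is the input tensor; everything below lives under the variable
    # scope given at call time
    scale = tf.get_variable('scale', shape=[],
                            initializer=tf.constant_initializer(init_scale))
    return x * scale

x = tf.placeholder(tf.float32, [None, 32])
y = ScaleLayer('scale0', x, init_scale=2.0)   # variables created under scope 'scale0'
```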
There are also a number of (non-layer) symbolic functions in the `tfutils.symbolic_functions` module.
There isn't a rule about what kind of symbolic functions should be made a layer -- they're quite
similar anyway. But in general I define the following kinds of symbolic functions as layers:
+ Functions which contain variables. A variable scope is almost always helpful for such a function.
......
...@@ -29,16 +29,16 @@ while True:
    minimize_op.run()   # minimize_op was built from dequeued tensors
```

This is now handled automatically by tensorpack trainers,
see [Trainer](trainer.md) for details.

TensorFlow provides a staging interface which will further improve the speed in the future. This is
tracked in [issue#140](https://github.com/ppwwyyxx/tensorpack/issues/140).

You can also avoid `feed_dict` by using TensorFlow native operators to read data, which is also
supported in tensorpack.
It probably allows you to reach the best performance, but at the cost of implementing the
reading / preprocessing ops in C++ if there isn't one for your task.
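For a sense of what that looks like, here is a sketch using plain TF 1.x input ops (not tensorpack code; the file pattern, feature names and shapes are made up):

```python
import tensorflow as tf

# read TFRecord files with native TF ops (TF 1.x queue-based input pipeline)
filename_queue = tf.train.string_input_producer(tf.gfile.Glob('data/*.tfrecords'))
reader = tf.TFRecordReader()
_, serialized = reader.read(filename_queue)
example = tf.parse_single_example(serialized, features={
    'image': tf.FixedLenFeature([], tf.string),
    'label': tf.FixedLenFeature([], tf.int64),
})
image = tf.reshape(tf.decode_raw(example['image'], tf.uint8), [28, 28])
image_batch, label_batch = tf.train.shuffle_batch(
    [image, example['label']], batch_size=128,
    capacity=10000, min_after_dequeue=1000)
# at runtime you'd also need tf.train.start_queue_runners() to feed these queues
```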
## Figure out the bottleneck

...@@ -49,9 +49,8 @@ So the overall throughput will appear to be the slower one.

There isn't a way to accurately benchmark the two threads while they are running, without introducing overhead. But
there are ways to understand which one is the bottleneck:

1. Use the average occupancy (size) of the queue. This information is summarized after every epoch.
   If the queue is nearly empty, then the data thread is the bottleneck.
2. Benchmark them separately. You can use `TestDataSpeed` to benchmark a DataFlow, and
   use `FakeData` as a fast replacement in a dry run, to benchmark the training iterations (see the sketch below).
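For example (a sketch; `my_dataflow` stands for whatever DataFlow you want to test, and the exact method and argument names are assumptions about this tensorpack version):

```python
from tensorpack.dataflow import TestDataSpeed, FakeData

# benchmark the DataFlow alone: iterate it and log the speed
TestDataSpeed(my_dataflow, size=5000).start()   # older versions: start_test()

# benchmark the training iterations alone: feed fake data of the right shapes,
# e.g. datapoints of [64 images of 224x224x3, 64 labels]
fake_df = FakeData([[64, 224, 224, 3], [64]], size=1000)
```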
...@@ -9,7 +9,7 @@ Most neural network training tasks are single-cost optimization.

Tensorpack provides some trainer implementations for such tasks.
These trainers will by default minimize `ModelDesc.cost`,
so you can use them as long as you set `self.cost` in `ModelDesc._build_graph()`,
as most examples do.

Most existing trainers were implemented with a TensorFlow queue to prefetch and buffer
training data, which is faster than a naive `sess.run(..., feed_dict={...})`.
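At the call site this looks roughly like the sketch below (`config` is a `TrainConfig` as shown earlier; the trainer names follow the pre-1.0 interface and should be treated as assumptions):

```python
from tensorpack import QueueInputTrainer, SyncMultiGPUTrainer

# pick ONE trainer for a given TrainConfig:
QueueInputTrainer(config).train()        # prefetches data through a TF queue
# SyncMultiGPUTrainer(config).train()    # multi-GPU: just switch the trainer
```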
......