Commit a9cba3c6 authored by Yuxin Wu

update docs

parent fde338ea
@@ -3,7 +3,7 @@
 ### ModelDesc
-`ModelDesc` is an abstraction over the most common type of models people train:
+`ModelDesc` is an abstraction over the most common type of models people train.
 It assumes:
 1. Training is a single-cost optimized by a single `tf.train.Optimizer`.
@@ -18,8 +18,9 @@ class MyModel(ModelDesc):
         return [InputDesc(...), InputDesc(...)]
     def _build_graph(self, inputs):
         tensorA, tensorB = inputs
         # build the graph
+        self.cost = xxx  # define the cost tensor
     def _get_optimizer(self):
         return tf.train.GradientDescentOptimizer(0.1)
......
@@ -7,7 +7,7 @@ such as conv/deconv, fc, batch normalization, pooling layers, and some custom lo
 Using the tensorpack implementations, you can also benefit from `argscope` and `LinearWrap` to
 simplify the code.
-Note that the layers are written because there are no other alternatives back at that time.
+Note that the layers were written because there are no other alternatives back at that time.
 In the future we may shift to `tf.layers` because they will be better maintained.
 ### argscope and LinearWrap
......
@@ -2,21 +2,18 @@
 # Trainer
 Training is **running something again and again**.
-Tensorpack base trainer implements the logic of __running the iteration__,
-and derived trainers implement __what the iteration is__.
+Tensorpack base trainer implements the logic of __running the iteration__.
+Users or derived trainers should implement __what the iteration is__.
 Most neural network training tasks are single-cost optimization.
-Tensorpack provides some trainer implementations for such tasks.
-These trainers will by default minimizes `ModelDesc.cost`.
-Therefore, you can use these trainers as long as you set `self.cost` in `ModelDesc._build_graph()`,
-as most examples did.
+Tensorpack provides some trainer implementations for such tasks:
+These trainers will build the graph based on the given `ModelDesc`, and minimizes `ModelDesc.cost`.
 Existing trainers were implemented with certain prefetch mechanism,
 which will run significantly faster than a naive `sess.run(..., feed_dict={...})`.
 There are also Multi-GPU trainers which include the logic of data-parallel Multi-GPU training.
-You can enable them by just changing one line, and all the necessary logic to achieve the best
-performance was baked into the trainers already.
+You can enable them by just one line, and all the necessary logic to achieve the best performance was baked into the trainers already.
 For example, SyncMultiGPUTrainer can train ResNet50 as fast as the [official tensorflow benchmark](https://github.com/tensorflow/benchmarks).
 To use trainers, pass a `TrainConfig` to configure them:
......
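The Trainer text in this commit separates running the iteration (the base trainer) from defining what the iteration is (users or derived trainers). A minimal sketch of that split in plain Python may help. It does not use tensorpack or TensorFlow, and every name in it (`ToyBaseTrainer`, `ToyGradientDescentTrainer`, `run_step`, the quadratic cost) is hypothetical and chosen only for illustration, not taken from the tensorpack API.

```python
class ToyBaseTrainer:
    """Hypothetical base class: owns the loop that runs a step again and again."""

    def __init__(self, steps_per_epoch, max_epoch):
        self.steps_per_epoch = steps_per_epoch
        self.max_epoch = max_epoch
        self.global_step = 0

    def run_step(self):
        # Derived trainers define what one iteration is.
        raise NotImplementedError

    def train(self):
        # The base class only knows how to repeat the iteration.
        for _ in range(self.max_epoch):
            for _ in range(self.steps_per_epoch):
                self.run_step()
                self.global_step += 1


class ToyGradientDescentTrainer(ToyBaseTrainer):
    """Hypothetical single-cost trainer: minimizes cost(w) = (w - 3)^2."""

    def __init__(self, learning_rate=0.1, **kwargs):
        super().__init__(**kwargs)
        self.learning_rate = learning_rate
        self.w = 0.0  # the single trainable parameter

    def cost(self):
        return (self.w - 3.0) ** 2

    def run_step(self):
        # One iteration = one gradient-descent update on the single cost.
        grad = 2.0 * (self.w - 3.0)
        self.w -= self.learning_rate * grad


trainer = ToyGradientDescentTrainer(learning_rate=0.1, steps_per_epoch=10, max_epoch=5)
trainer.train()
```

The loop in `train()` never changes. Swapping in a different notion of "one iteration" only requires overriding `run_step`, which is the design the documentation describes for derived trainers.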