Commit a9cba3c6 authored by Yuxin Wu

update docs

parent fde338ea
......@@ -3,7 +3,7 @@
### ModelDesc
`ModelDesc` is an abstraction over the most common type of models people train:
`ModelDesc` is an abstraction over the most common type of models people train.
It assumes:
1. Training is a single-cost optimization performed by a single `tf.train.Optimizer`.
......@@ -18,8 +18,9 @@ class MyModel(ModelDesc):
        return [InputDesc(...), InputDesc(...)]
    def _build_graph(self, inputs):
        tensorA, tensorB = inputs
        tensorA, tensorB = inputs
        # build the graph
        self.cost = xxx  # define the cost tensor
    def _get_optimizer(self):
        return tf.train.GradientDescentOptimizer(0.1)
......
......@@ -7,7 +7,7 @@ such as conv/deconv, fc, batch normalization, pooling layers, and some custom lo
Using the tensorpack implementations, you can also benefit from `argscope` and `LinearWrap` to
simplify the code.
Note that the layers are written because there are no other alternatives back at that time.
Note that the layers were written because there were no alternatives at that time.
In the future we may shift to `tf.layers` because they will be better maintained.
### argscope and LinearWrap
......
......@@ -2,21 +2,18 @@
# Trainer
Training is **running something again and again**.
Tensorpack base trainer implements the logic of __running the iteration__,
and derived trainers implement __what the iteration is__.
Tensorpack base trainer implements the logic of __running the iteration__.
Users or derived trainers should implement __what the iteration is__.
Most neural network training tasks are single-cost optimization.
Tensorpack provides some trainer implementations for such tasks.
These trainers will by default minimize `ModelDesc.cost`.
Therefore, you can use these trainers as long as you set `self.cost` in `ModelDesc._build_graph()`,
as most examples did.
Tensorpack provides some trainer implementations for such tasks:
These trainers will build the graph based on the given `ModelDesc`, and minimize `ModelDesc.cost`.
Existing trainers were implemented with certain prefetch mechanism,
which will run significantly faster than a naive `sess.run(..., feed_dict={...})`.
There are also Multi-GPU trainers which include the logic of data-parallel Multi-GPU training.
You can enable them by just changing one line, and all the necessary logic to achieve the best
performance was baked into the trainers already.
You can enable them by changing just one line; all the necessary logic to achieve the best performance is already baked into the trainers.
For example, SyncMultiGPUTrainer can train ResNet50 as fast as the [official tensorflow benchmark](https://github.com/tensorflow/benchmarks).
To use trainers, pass a `TrainConfig` to configure them:
......
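The split described above, where a base trainer runs the iteration and a derived trainer defines what the iteration is, can be sketched in plain Python. This is a minimal illustration under assumed names, not tensorpack's actual trainer code: `ToyTrainer`, `ToySingleCostTrainer`, `run_step`, and the quadratic cost are all hypothetical.

```python
class ToyTrainer:
    """Hypothetical base class: owns the loop, not the content of a step."""

    def __init__(self, steps_per_epoch, max_epoch):
        self.steps_per_epoch = steps_per_epoch
        self.max_epoch = max_epoch
        self.global_step = 0

    def run_step(self):
        # Derived trainers define what one iteration does.
        raise NotImplementedError

    def train(self):
        # "Running something again and again."
        for _ in range(self.max_epoch):
            for _ in range(self.steps_per_epoch):
                self.run_step()
                self.global_step += 1


class ToySingleCostTrainer(ToyTrainer):
    """Hypothetical derived class: each step minimizes a single cost."""

    def __init__(self, learning_rate, **kwargs):
        super().__init__(**kwargs)
        self.learning_rate = learning_rate
        self.w = 0.0

    def cost(self):
        # Assumed toy cost with its minimum at w = 3.
        return (self.w - 3.0) ** 2

    def run_step(self):
        grad = 2.0 * (self.w - 3.0)  # derivative of the cost with respect to w
        self.w -= self.learning_rate * grad


trainer = ToySingleCostTrainer(learning_rate=0.1, steps_per_epoch=10, max_epoch=5)
trainer.train()
```

Keeping the loop in the base class means a different kind of iteration only requires overriding `run_step`.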