Commit a8e9aa95 authored by Yuxin Wu

update docs

parent da062119
......@@ -27,6 +27,8 @@ Then it is a good time to open an issue.
3. The [ProgressBar](../modules/callbacks.html#tensorpack.callbacks.ProgressBar)
   callback can print some scalar statistics, though not enabled by default (see the sketch after this list).
4. Read [Summary and Logging](summary.html) for more options on logging.
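For the `ProgressBar` option above, a minimal sketch (the tensor name `total_cost` is hypothetical; use a name that actually exists in your graph):

```python
from tensorpack.callbacks import ProgressBar

# Display the scalar tensor named 'total_cost' on the training progress bar.
callbacks = [ProgressBar(names=['total_cost'])]
```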
## How to freeze some variables in training
1. You can simply use `tf.stop_gradient` in your model code in some situations (e.g. to freeze the first several layers).
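A minimal sketch of this approach, assuming a TF 1.x graph-style model (layer names and shapes are made up):

```python
import tensorflow as tf

def build_graph(image):
    # Layers before tf.stop_gradient receive no gradient, i.e. they are frozen.
    net = tf.layers.conv2d(image, 32, 3, activation=tf.nn.relu, name='conv1')
    net = tf.layers.conv2d(net, 32, 3, activation=tf.nn.relu, name='conv2')
    net = tf.stop_gradient(net)  # cut the gradient flow here
    logits = tf.layers.dense(tf.layers.flatten(net), 10, name='fc')
    return logits
```

Note that `tf.stop_gradient` only stops gradients flowing through this path; it does not stop other updates (e.g. BatchNorm moving averages) to the frozen layers.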
......
......@@ -30,8 +30,8 @@ or you may need placeholders in the inference graph, or the training graph conta
which you want to remove.
In this case, you can always create a new graph with pure TensorFlow.
The only thing tensorpack has for this purpose is `OfflinePredictor`,
a simple function to build the graph and a callable for you.
It is mainly for quick demo purpose.
The only tool tensorpack has for after-training inference is `OfflinePredictor`,
a simple function to build the graph and return a callable for you.
It is mainly for quick demo purposes.
It only runs inference on Python data, so it may not be the most efficient approach.
Check out some examples for its usage.
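For instance, a sketch of typical usage (the model class, tensor names, and checkpoint path below are placeholders):

```python
import numpy as np
from tensorpack import PredictConfig, OfflinePredictor
from tensorpack.tfutils.sessinit import get_model_loader

# Build a standalone inference graph and load a trained checkpoint.
predictor = OfflinePredictor(PredictConfig(
    model=MyModel(),                          # your ModelDesc subclass
    session_init=get_model_loader('/path/to/checkpoint'),
    input_names=['input'],                    # names of input tensors in the graph
    output_names=['prob']))                   # names of output tensors to fetch

images = np.zeros((1, 28, 28, 3), dtype=np.float32)  # dummy Python data
outputs = predictor(images)                  # returns a list of output arrays
```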
......@@ -2,10 +2,12 @@
# Performance Tuning
__We do not know why your training is slow__ (and most of the time it's not a tensorpack problem).
Performance is different on every machine. So you need to figure out most parts by your own.
Performance is different across machines and tasks.
So you need to figure out most parts on your own.
Here's a list of things you can do when your training is slow.
If you're going to open an issue about slow training, PLEASE do them and include your findings.
If you're going to open an issue about slow training,
PLEASE do them and include your findings.
## Figure out the bottleneck
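One quick first check, as a sketch (assuming `df` is your DataFlow): benchmark the input pipeline alone. If the dataflow by itself cannot reach your target iteration speed, the bottleneck is in data loading rather than in the graph.

```python
from tensorpack.dataflow import TestDataSpeed

# Iterate over the dataflow 1000 times and report its speed, without any GPU work.
TestDataSpeed(df, size=1000).start()
```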
......@@ -75,4 +77,4 @@ If you're unable to scale to multiple GPUs almost linearly:
There isn't a simple answer to this.
You may try a different multi-GPU trainer; the speed can vary a lot sometimes.
Note that scalability is always measured with the same "batch size per GPU", not the same total equivalent batch size.
Note that scalability is always measured by keeping "batch size per GPU" constant.
# Summary and Logging
During training, everything other than the training iterations themselves are executed through callbacks, including all the logging.
During training, everything other than the iterations is executed through callbacks.
This tutorial will explain how summaries and logging are handled in callbacks and how you can customize them.
The default logging behavior should be good enough for normal use cases, so you may skip this tutorial.
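As a taste of what customization looks like, a sketch (the directory name is made up, and `cost` is assumed to be a scalar tensor defined in your tower function):

```python
from tensorpack.utils import logger
from tensorpack.tfutils.summary import add_moving_summary

# Write logs and TensorBoard summaries under a custom directory.
logger.set_logger_dir('train_log/mymodel')

# Inside the tower function: record a scalar with a moving average.
add_moving_summary(cost)
```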
......
......@@ -23,6 +23,7 @@ The concept of tower is used mainly to support:
2. Automatically building the graph for inference, where a replicate is built under inference mode.
You'll specify a tower function when you use `TowerTrainer`.
If you use `ModelDesc`, the `build_graph` method is the tower function.
The function needs to follow some conventions:
1. It will always be called under a :class:`TowerContext`.
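For illustration, a sketch of a tower function written as a `ModelDesc` (shapes and layers are made up, and the exact `inputs()` API differs across tensorpack versions):

```python
import tensorflow as tf
from tensorpack import ModelDesc

class MyModel(ModelDesc):
    def inputs(self):
        # Declare the input signature of the tower function.
        return [tf.placeholder(tf.float32, (None, 28, 28, 1), 'input'),
                tf.placeholder(tf.int32, (None,), 'label')]

    def build_graph(self, image, label):
        # The tower function: always called under a TowerContext,
        # and possibly multiple times (e.g. once per GPU).
        logits = tf.layers.dense(tf.layers.flatten(image), 10)
        cost = tf.losses.sparse_softmax_cross_entropy(labels=label, logits=logits)
        return tf.identity(cost, name='total_cost')

    def optimizer(self):
        return tf.train.AdamOptimizer(1e-3)
```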
......@@ -41,7 +42,7 @@ in the efficient way and all reach the same performance as the
[official TF benchmarks](https://www.tensorflow.org/performance/benchmarks).
It takes only one line of code change to use them.
Note some common problems when using these trainers:
Note some __common problems__ when using these trainers:
1. In each iteration all GPUs (all replicates of the model) will take tensors from the `InputSource`,
instead of taking one batch and splitting it among them.
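For example, a sketch of launching one of these trainers (`config` is assumed to be an ordinary `TrainConfig`); note that, as a consequence of the above, the dataflow batch size is the per-GPU batch size:

```python
from tensorpack import launch_train_with_config, SyncMultiGPUTrainerReplicated

# With 2 GPUs, each replicate pulls its own batch from the InputSource,
# so the total effective batch size is 2x the dataflow batch size.
launch_train_with_config(config, SyncMultiGPUTrainerReplicated(2))
```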
......