Commit edbcf18a authored by Yuxin Wu

docs update

parent 03c16776
@@ -37,13 +37,17 @@ TrainConfig(
    # schedule the learning rate based on epoch number
    ScheduledHyperParamSetter('learning_rate',
                              [(30, 1e-2), (60, 1e-3), (85, 1e-4), (95, 1e-5)]),
    # can manually set the learning rate during training
    HumanHyperParamSetter('learning_rate'),
    # send validation error to my phone through pushbullet
    SendStat('curl -u your_id_xxx: https://api.pushbullet.com/v2/pushes \\
             -d type=note -d title="validation error" \\
             -d body={val-error-top1} > /dev/null 2>&1',
             'val-error-top1'),
    # record GPU utilizations during training
    GPUUtilizationTracker(),
    # can pause the training and start a debug shell, to observe what's going on
    InjectShell(shell='ipython')
],
extra_callbacks=[ # these callbacks are enabled by default already
    # maintain and summarize moving average of some tensors defined in the model (e.g. training loss, training error)
@@ -69,6 +73,8 @@ TrainConfig(
Notice that callbacks cover every detail of training, ranging from graph operations to the progress bar.
This means you can customize every part of the training to your preference, e.g. display something
different in the progress bar, or evaluate part of the summaries at a different frequency.
These features may not always be useful, but think about how messy the main loop would look if you
were to write all this logic inside the loop yourself.
See [Write a callback](http://tensorpack.readthedocs.io/en/latest/tutorial/extend/callback.html)
on how to implement a callback.
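To make the idea concrete, here is a toy sketch in plain Python (not tensorpack's actual trainer) of why callbacks keep the main loop clean: the loop only announces events, and all extra logic lives in callback objects. The class and method names (`Callback`, `trigger_epoch`, `LearningRateSetter`) only loosely mirror tensorpack's API and are hypothetical.

```python
# Toy illustration of a callback-driven training loop (not tensorpack code).
class Callback:
    def before_train(self): pass
    def trigger_epoch(self, epoch): pass
    def after_train(self): pass

class LearningRateSetter(Callback):
    """Hypothetical stand-in for ScheduledHyperParamSetter."""
    def __init__(self, schedule):
        self.schedule = dict(schedule)  # {epoch: lr}
        self.lr = None
    def trigger_epoch(self, epoch):
        if epoch in self.schedule:
            self.lr = self.schedule[epoch]

def train(num_epochs, callbacks):
    # The main loop stays tiny: it only fires events at well-defined points.
    for cb in callbacks:
        cb.before_train()
    for epoch in range(1, num_epochs + 1):
        # ... one epoch of actual training would run here ...
        for cb in callbacks:
            cb.trigger_epoch(epoch)
    for cb in callbacks:
        cb.after_train()

lr_setter = LearningRateSetter([(3, 1e-3), (5, 1e-4)])
train(6, [lr_setter])
print(lr_setter.lr)  # 0.0001 -- the epoch-5 schedule entry fired last
```

Adding a new behaviour (sending stats, pausing for a shell) then means writing one more `Callback` subclass, never touching the loop.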
@@ -126,7 +126,7 @@ If you identify this as a bottleneck, you can also use:
Let's summarize what the above dataflow does:
1. One thread iterates over a shuffled list of (filename, label) pairs, and puts them into a queue of size 1000.
2. 25 worker threads take pairs and make them into (preprocessed image, label) pairs.
3. Both 1 and 2 happen in one separate process, and the results are sent back to the main process through ZeroMQ.
4. The main process makes batches, and other tensorpack modules will then take care of how they should go into the graph.
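The four steps above can be sketched in plain Python with threads and queues, in place of tensorpack's dataflow classes (the separate-process and ZeroMQ details of steps 3 are omitted; all names and sizes here are illustrative):

```python
# Rough sketch of the reader -> workers -> batching pipeline described above.
import queue
import threading

# Pretend dataset: (filename, label) pairs; imagine this list is shuffled.
filenames = [("img%d.jpg" % i, i % 10) for i in range(100)]

q1 = queue.Queue(maxsize=1000)   # step 1: queue of (filename, label)
q2 = queue.Queue()               # step 2: queue of (preprocessed image, label)
NUM_WORKERS = 25

def reader():
    for pair in filenames:
        q1.put(pair)
    for _ in range(NUM_WORKERS):
        q1.put(None)             # one poison pill per worker

def worker():
    while True:
        pair = q1.get()
        if pair is None:
            break
        fname, label = pair
        # stand-in for real JPEG decoding + augmentation
        q2.put(("preprocessed:" + fname, label))

threads = [threading.Thread(target=reader)]
threads += [threading.Thread(target=worker) for _ in range(NUM_WORKERS)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# step 4: the main thread makes batches of 32
items = [q2.get() for _ in range(q2.qsize())]
batches = [items[i:i + 32] for i in range(0, len(items), 32)]
print(len(batches))  # 4 batches: 32 + 32 + 32 + 4 items
```

In the real pipeline the reader and workers live in another process and the batches are what finally enter the graph.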
......
@@ -32,15 +32,17 @@ The script expects a metagraph file which is also saved by `ModelSaver`.
## How to load a model / do transfer learning
All model loading (in either training or testing) is through the `session_init` initializer
in `TrainConfig` or `PredictConfig`.
The common choices for this option are `SaverRestore`, which restores a
TF checkpoint, or `DictRestore`, which restores a dict. (`get_model_loader` is a small helper to
decide which one to use from a file name.)
Doing transfer learning is trivial.
Variable restoring is completely based on name match between
the current graph and the `SessionInit` initializer.
Therefore, if you want to load some model, just use the same variable name
so the old value will be loaded into the variable.
If you want to re-train some layer, just rename it.
Unmatched variables on both sides will be printed as a warning.
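The name-matching rule above can be sketched in plain Python (this is an illustration of the rule, not tensorpack's actual restore code; the function and variable names are hypothetical):

```python
# Sketch of name-based variable restoring: exact name matches get the old
# value, unmatched names on either side produce a warning.
def restore_by_name(graph_vars, checkpoint):
    restored = {}
    for name, init_value in graph_vars.items():
        if name in checkpoint:
            restored[name] = checkpoint[name]   # old value loaded
        else:
            print("WARN: %s in graph but not in checkpoint" % name)
            restored[name] = init_value         # keeps its fresh initialization
    for name in checkpoint:
        if name not in graph_vars:
            print("WARN: %s in checkpoint but not in graph" % name)
    return restored

ckpt = {"conv1/W": "old_conv_w", "fc/W": "old_fc_w"}
# Renaming fc -> fc_new means it no longer matches, so it gets re-trained
# from a fresh initialization, exactly as described above.
graph = {"conv1/W": "fresh_init", "fc_new/W": "fresh_init"}
print(restore_by_name(graph, ckpt))
```

Matching is purely by name, which is why renaming a layer is all it takes to opt it out of restoring.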
......
@@ -32,10 +32,10 @@ the argument `inputs` is the list of input tensors matching `_get_inputs`.
You can use any symbolic functions in `_build_graph`, including TensorFlow core library
functions and other symbolic libraries.
**How does it work**: Most tensorpack trainers expect a `ModelDesc`.
The trainers will use `_get_inputs` to connect `InputSource` to the graph,
use `_build_graph` to create the backbone model and minimization op, and so on.
Note that data-parallel multi-GPU trainers will call `_build_graph` __multiple times__ on each GPU.
A trainer may also make __extra calls__ to `_build_graph` for inference, if used by some callbacks.
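A toy illustration of the multiple-call behaviour (plain Python, not tensorpack's real trainer; class and method names only mirror the `ModelDesc` interface described above):

```python
# Toy data-parallel build: the trainer calls _build_graph once per GPU,
# so the method must be safe to run multiple times.
class ModelDesc:
    def __init__(self):
        self.build_calls = 0
    def _get_inputs(self):
        return ["image", "label"]            # placeholder input specs
    def _build_graph(self, inputs, gpu):
        self.build_calls += 1
        # stand-in for building one model replica ("tower") on this GPU
        return "tower%d(%s)" % (gpu, ",".join(inputs))

def data_parallel_build(model, num_gpus):
    inputs = model._get_inputs()
    return [model._build_graph(inputs, g) for g in range(num_gpus)]

m = ModelDesc()
towers = data_parallel_build(m, 4)
print(m.build_calls)   # 4 -- one tower per GPU
```

This is why `_build_graph` should avoid side effects that break when it runs more than once (e.g. creating variables without proper scoping).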
### Build It Manually
@@ -43,4 +43,5 @@ A trainer may also make __extra calls__ to `_build_graph` for inference, if used
When you need to deal with a complicated graph, it may be easier to build the graph manually.
You are free to do so as long as you tell the trainer what to do in each step.
Check out [Write a Trainer](http://tensorpack.readthedocs.io/en/latest/tutorial/extend/trainer.html)
for using a custom graph with a trainer.
@@ -31,10 +31,10 @@ class SendStat(Callback):
class InjectShell(Callback):
    """
    Allow users to create a specific file as a signal to pause
    and iteratively debug the training.
    When triggered, it detects whether the file exists, and opens an
    IPython/pdb shell if yes.
    """
    def __init__(self, file='INJECT_SHELL.tmp', shell='ipython'):
......
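The trigger logic the docstring describes can be sketched in a few lines of plain Python: poll for a user-created file, and open a debug shell if it appears (here we only report the decision instead of actually launching IPython/pdb; the helper name is hypothetical):

```python
# Sketch of InjectShell's trigger condition: the user pauses training by
# creating a sentinel file (e.g. `touch INJECT_SHELL.tmp`).
import os
import tempfile

def should_open_shell(trigger_file):
    """True iff the user has created the trigger file."""
    return os.path.isfile(trigger_file)

tmpdir = tempfile.mkdtemp()
trigger = os.path.join(tmpdir, "INJECT_SHELL.tmp")

print(should_open_shell(trigger))   # False: file not created yet, keep training
open(trigger, "w").close()          # user runs: touch INJECT_SHELL.tmp
print(should_open_shell(trigger))   # True: pause and drop into a shell
os.remove(trigger)                  # removing the file resumes training
```

Because the check runs between training steps, the user can pause at any point without modifying the training script.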