Commit 7b33a43c authored by Yuxin Wu

update docs

parent 480bc6e6
...@@ -38,6 +38,8 @@ For example, CPU/GPU utilization, output images, tensorboard curves, if relevant
+ Tensorpack version: `python -c 'import tensorpack; print(tensorpack.__version__)'`.
You can install Tensorpack master by `pip install -U git+https://github.com/ppwwyyxx/tensorpack.git`
and see if your issue is already solved.
+ If you're not using tensorpack under a normal command line shell (e.g.,
using an IDE or jupyter notebook), please retry under a normal command line shell.
+ Hardware information, e.g. number of GPUs used.
About efficiency issues, PLEASE first read http://tensorpack.readthedocs.io/en/latest/tutorial/performance-tuning.html
...@@ -60,7 +60,7 @@ Not training with 8 GPUs may result in different performance from the table below.
### Inference:
To predict on an image (needs DISPLAY to show the outputs):
```
./train.py --predict input.jpg --load /path/to/model --config SAME-AS-TRAINING
```
......
...@@ -411,6 +411,8 @@ def predict(pred_func, input_file):
    results = detect_one_image(img, pred_func)
    final = draw_final_outputs(img, results)
    viz = np.concatenate((img, final), axis=1)
    cv2.imwrite("output.png", viz)
    logger.info("Inference output written to output.png")
    tpviz.interactive_imshow(viz)
......
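For readers running this example headless, the added lines above follow a simple pattern: always write the side-by-side visualization to disk, and only open a window when a display is available. Below is a minimal sketch of that pattern; `detect_one_image`, `draw_final_outputs`, and `tpviz` come from this example, while the function name, output path, and the DISPLAY check are illustrative assumptions, not part of the actual script.

```python
# Minimal sketch (not the actual train.py): keep a copy of the visualization on
# disk so inference also works on machines without a DISPLAY.
import os
import cv2
import numpy as np

def predict_and_save(img_path, pred_func, out_path="output.png"):
    img = cv2.imread(img_path, cv2.IMREAD_COLOR)
    results = detect_one_image(img, pred_func)   # helper defined in this example
    final = draw_final_outputs(img, results)     # helper defined in this example
    viz = np.concatenate((img, final), axis=1)   # input | detections, side by side
    cv2.imwrite(out_path, viz)                   # always saved, works headless
    if os.environ.get("DISPLAY"):                # only pop a window when possible
        tpviz.interactive_imshow(viz)            # tpviz = tensorpack.utils.viz
```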
...@@ -9,10 +9,10 @@ __Training__ code of three variants of ResNet on ImageNet:
The training follows the __exact__ recipe used by the [Training ImageNet in 1 Hour paper](https://arxiv.org/abs/1706.02677)
and gets the same performance.
Models trained with 8 GPUs and a total batch size of 256 are listed in the table below.
This recipe has better performance than most open source implementations.
In fact, many papers that claim to "improve" ResNet by 0.5% only compete with a lower
baseline, and they actually cannot beat this ResNet recipe.
| Model | Top 5 Error | Top 1 Error | Download |
...@@ -42,7 +42,7 @@ See the [tutorial](http://tensorpack.readthedocs.io/en/latest/tutorial/efficient
This script only converts and runs ImageNet-ResNet{50,101,152} Caffe models [released by MSRA](https://github.com/KaimingHe/deep-residual-networks).
Note that the architecture is different from the `imagenet-resnet.py` script and the models are not compatible.
ResNets have evolved; generally you should not cite these old numbers as baselines in your paper.
Usage:
```bash
...@@ -53,7 +53,8 @@ python -m tensorpack.utils.loadcaffe PATH/TO/{ResNet-101-deploy.prototxt,ResNet-
```
The converted models are verified on ILSVRC12 validation set.
The per-pixel mean used here is slightly different from the original, but has negligible effect.
| Model | Top 5 Error | Top 1 Error |
|:-------------------|-------------|------------:|
...@@ -77,7 +78,8 @@ Reproduce the mixup pre-act ResNet-18 CIFAR10 experiment, in the paper:
* [mixup: Beyond Empirical Risk Minimization](https://arxiv.org/abs/1710.09412).
This implementation follows the exact settings from the [author's code](https://github.com/hongyi-zhang/mixup).
Note that the architecture is different from the official preact-ResNet18 in the ResNet paper.
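For reference, here is a rough NumPy sketch of the mixup operation itself: draw λ from Beta(α, α) and mix both the images and their one-hot labels with a random partner from the same batch. The function name and the default α below are illustrative, not taken from this implementation.

```python
# Rough sketch of mixup on a batch (NumPy); alpha and naming are illustrative.
import numpy as np

def mixup_batch(images, onehot_labels, alpha=1.0):
    images = images.astype("float32")
    lam = np.random.beta(alpha, alpha)          # mixing coefficient lambda ~ Beta(alpha, alpha)
    perm = np.random.permutation(len(images))   # random partner for every sample
    mixed_images = lam * images + (1.0 - lam) * images[perm]
    mixed_labels = lam * onehot_labels + (1.0 - lam) * onehot_labels[perm]
    return mixed_images, mixed_labels
```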
Usage:
```bash
......
...@@ -53,6 +53,7 @@ class GPUUtilizationTracker(Callback):
        assert len(self._devices), "[GPUUtilizationTracker] No GPU device given!"
    def _before_train(self):
        assert tf.test.is_gpu_available()
        self._evt = mp.Event()
        self._stop_evt = mp.Event()
        self._queue = mp.Queue()
...@@ -212,6 +213,7 @@ class PeakMemoryTracker(Callback):
        self._fetches = tf.train.SessionRunArgs(fetches=ops)
    def _before_run(self, _):
        assert tf.test.is_gpu_available(), "PeakMemoryTracker only supports GPU!"
        if self.local_step == self.trainer.steps_per_epoch - 1:
            return self._fetches
        return None
......
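The two asserts added above share one idea: GPU-only callbacks should fail fast, before training starts, if TensorFlow cannot see a GPU. Below is a minimal sketch of that pattern as a standalone callback; the class name is made up, and only `tf.test.is_gpu_available()` and the `_before_train` hook come from the code above.

```python
# Illustrative sketch only; not a tensorpack built-in callback.
import tensorflow as tf
from tensorpack.callbacks import Callback

class RequireGPU(Callback):
    """Abort before the first training step if TensorFlow sees no GPU."""
    def _before_train(self):
        # Failing here is clearer than a later crash inside per-step fetches.
        assert tf.test.is_gpu_available(), "No GPU is visible to TensorFlow!"
```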
...@@ -430,6 +430,8 @@ class HorovodTrainer(SingleCostTrainer):
        except AttributeError:  # old horovod does not have local_size
            pass
        super(HorovodTrainer, self).initialize(session_creator, session_init)
        if not tf.test.is_gpu_available():
            logger.error("tf.test.is_gpu_available() == False")
        # This broadcast belongs to the "initialize" stage.
        # It should not be delayed to the "before_train" stage.
......
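For context, the comment above refers to Horovod's variable broadcast, which must run right after variables are initialized so every rank starts from identical weights. Here is a rough sketch of that ordering with Horovod's TF1-style graph API; model construction is elided and the surrounding structure is an illustrative assumption, not the trainer's actual code.

```python
# Rough sketch of the initialize-then-broadcast ordering (Horovod TF1 graph API).
import tensorflow as tf
import horovod.tensorflow as hvd

hvd.init()
# ... build model, loss, and optimizer here ...
bcast_op = hvd.broadcast_global_variables(0)   # sync every rank to rank 0's weights

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(bcast_op)   # part of initialization, not delayed to before_train
    # ... training loop ...
```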