Commit 02eb02e2 authored by Yuxin Wu's avatar Yuxin Wu

update docs

parent 5df194d9
PLEASE finish reading to show some respect to the authors.
An issue has to be one of the following:
- Unexpected Problems / Potential Bugs
- Feature Requests
- Questions on Using/Understanding Tensorpack
## For any unexpected problems, __PLEASE ALWAYS INCLUDE__:
1. What you did:
+ If you're using examples:
+ What's the command you run:
+ Have you made any changes to code? Paste them if any:
+ If not, tell us what you did that may be relevant.
  But we may not investigate it if there is no reproducible code.
+ It's better to paste what you did than to describe it.
2. What you observed, including but not limited to the __entire__ logs.
+ It's better to paste what you observed than to describe it.
3. What you expected, if not obvious.
4. Your environment:
+ Python version.
+ TF version: `python -c 'import tensorflow as tf; print(tf.GIT_VERSION, tf.VERSION)'`.
+ Tensorpack version: `python -c 'import tensorpack; print(tensorpack.__version__)'`.
You can install Tensorpack master by `pip install -U git+https://github.com/ppwwyyxx/tensorpack.git`.
+ Hardware information, if relevant.
For efficiency issues, PLEASE first read http://tensorpack.readthedocs.io/en/latest/tutorial/performance-tuning.html
## Feature Requests:
+ You can implement a lot of features by extending Tensorpack
(See http://tensorpack.readthedocs.io/en/latest/tutorial/index.html#extend-tensorpack).
It does not have to be added to Tensorpack unless you have a good reason.
+ "Could you improve/implement an example/paper ?"
-- The answer is: we have no plans to do so. We don't take feature requests for
examples or implement a paper for you. If you don't know how to do it, you may ask a usage question.
## Usage Questions:
+ Read the [tutorials](http://tensorpack.readthedocs.io/en/latest/tutorial/index.html#user-tutorials) first.
+ We answer "HOW to do X with Tensorpack" for a well-defined X.
We also answer "HOW/WHY Tensorpack does X" for some X that Tensorpack or its examples are doing.
We don't answer general machine learning questions, such as "why my training doesn't converge", "what networks to use" or "I don't understand the paper".
You can also use gitter (https://gitter.im/tensorpack/users) for more casual discussions.
---
name: Feature Requests
about: Suggest an idea for Tensorpack
---
+ Note that you can implement a lot of features by extending Tensorpack
(See http://tensorpack.readthedocs.io/en/latest/tutorial/index.html#extend-tensorpack).
It does not have to be added to Tensorpack unless you have a good reason.
+ "Could you improve/implement an example/paper?"
-- The answer is: we have no plans to do so. We don't take feature requests for
examples or implement a paper for you. If you don't know how to do it, you may ask a usage question.
---
name: Unexpected Problems / Bugs
about: Report unexpected problems about Tensorpack or its examples.
---
---
about: More general questions about Tensorpack.
---
+ Your question is probably answered in the [tutorials](http://tensorpack.readthedocs.io/en/latest/tutorial/index.html#user-tutorials). Read them first.
+ We answer "HOW to do X with Tensorpack" for a well-defined X. + We answer "HOW to do X with Tensorpack" for a well-defined X.
We also answer "HOW/WHY Tensorpack does X" for some X that Tensorpack or its examples are doing. We also answer "HOW/WHY Tensorpack does X" for some X that Tensorpack or its examples are doing.
Model:
<p align="center"> <img src="https://user-images.githubusercontent.com/1381301/31527740-2f1b38ce-af84-11e7-8de1-628e90089826.png"> </p> <p align="center"> <img src="https://user-images.githubusercontent.com/1381301/31527740-2f1b38ce-af84-11e7-8de1-628e90089826.png"> </p>
2. We use ROIAlign, and `tf.image.crop_and_resize` is __NOT__ ROIAlign.
3. We currently only support a single image per GPU.
4. Because of (3), BatchNorm statistics are supposed to be frozen during fine-tuning.
5. An alternative to freezing BatchNorm is to sync BatchNorm statistics across
   GPUs (the `BACKBONE.NORM=SyncBN` option). This would require [my bugfix](https://github.com/tensorflow/tensorflow/pull/20360)
   which is available since TF 1.10. You can manually apply the patch to use it.
   For now the total batch size is at most 8, so this option does not improve the model by much.
6. Another alternative to BatchNorm is GroupNorm (`BACKBONE.NORM=GN`), which has better performance.
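The normalization options above differ mainly in where statistics come from. As a rough illustration (a minimal NumPy sketch, not Tensorpack's actual implementation, and omitting the learned scale/shift parameters), GroupNorm computes statistics per sample over channel groups, so nothing depends on the batch and there is nothing to freeze or sync across GPUs:

```python
import numpy as np

def group_norm(x, num_groups=32, eps=1e-5):
    """Normalize an NCHW tensor per (sample, channel-group).

    Statistics are computed over each group's channels and all spatial
    positions of a single sample, so unlike BatchNorm they do not depend
    on the batch.
    """
    n, c, h, w = x.shape
    assert c % num_groups == 0, "channels must be divisible by num_groups"
    g = x.reshape(n, num_groups, c // num_groups, h, w)
    mean = g.mean(axis=(2, 3, 4), keepdims=True)
    var = g.var(axis=(2, 3, 4), keepdims=True)
    g = (g - mean) / np.sqrt(var + eps)
    return g.reshape(n, c, h, w)
```

Frozen BatchNorm, by contrast, simply applies fixed pre-trained mean/variance instead of computing them from the (too small) batch.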
Speed:
1. The training will start very slowly due to convolution warmup, until about 10k
   steps to reach a maximum speed.
   You can disable warmup by `export TF_CUDNN_USE_AUTOTUNE=0`, which makes the
   training faster at the beginning, but perhaps not in the end.
1. After warmup the training speed will slowly decrease due to more accurate proposals.
1. This implementation is about 10% slower than detectron,
   probably due to the lack of specialized ops (e.g. AffineChannel, ROIAlign) in TensorFlow.
   It's certainly faster than other TF implementations.
1. The code should have around 70% GPU utilization on V100s, and 85%~90% scaling
   efficiency from 1 V100 to 8 V100s.
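The scaling-efficiency figure above is the achieved speedup divided by the ideal linear speedup. A small illustrative helper (the throughput numbers in the comment are hypothetical, not measurements from this repo):

```python
def scaling_efficiency(throughput_1, throughput_n, n):
    """Fraction of the ideal n-fold speedup achieved when scaling to n GPUs."""
    return throughput_n / (n * throughput_1)

# Hypothetical numbers: 100 img/s on 1 GPU and 720 img/s on 8 GPUs
# -> 720 / (8 * 100) = 0.9, i.e. 90% scaling efficiency.
```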
Possible Future Enhancements:
1. Define an interface to load a custom dataset.
1. Support batch>1 per GPU.
1. Use dedicated ops to improve speed. (e.g. a TF implementation of ROIAlign op
   can be found in [light-head RCNN](https://github.com/zengarden/light_head_rcnn/tree/master/lib/lib_kernel))
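The claim above that `tf.image.crop_and_resize` is not ROIAlign comes down to the sampling grid: crop_and_resize aligns the box corners with the outermost samples, while ROIAlign splits the box into bins and samples each bin at its center. A NumPy sketch of the two 1-D coordinate conventions (illustrative only, not the actual kernels, which also do bilinear interpolation and per-bin averaging):

```python
import numpy as np

def crop_and_resize_coords(x1, x2, k):
    # Corner-aligned grid: the box endpoints map exactly onto the first
    # and last of the k samples (tf.image.crop_and_resize's convention).
    return x1 + (x2 - x1) * np.arange(k) / (k - 1)

def roi_align_coords(x1, x2, k):
    # Bin-center grid: the box is split into k equal bins and each bin
    # is sampled at its center (ROIAlign's convention).
    return x1 + (x2 - x1) * (np.arange(k) + 0.5) / k
```

For a box spanning [0, 4] with 4 samples, the first grid is [0, 1.33, 2.67, 4] while the second is [0.5, 1.5, 2.5, 3.5]; the mismatch is why one cannot substitute for the other.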
MaskRCNN results contain both box and mask mAP.
## Notes
[NOTES.md](NOTES.md) has some notes about implementation details & speed.
In `class EvalCallback(Callback)`:

    if len(self.epochs_to_eval) < 15:
        logger.info("[EvalCallback] Will evaluate at epoch " + str(sorted(self.epochs_to_eval)))
    else:
        if cfg.TRAINER == 'horovod':
            logger.warn("[EvalCallback] Evaluation is single-GPU only and quite slow under horovod mode.")
        logger.info("[EvalCallback] Will evaluate every {} epochs".format(interval))

    def _eval(self):
        ...