Commit ed6cc793 authored by Yuxin Wu's avatar Yuxin Wu

update docs

parent bec166dd
@@ -8,8 +8,8 @@ about: Suggest an idea for Tensorpack
 (See http://tensorpack.readthedocs.io/tutorial/index.html#extend-tensorpack).
 It does not have to be added to Tensorpack unless you have a good reason.
-+ "Could you improve/implement an example/paper ?"
-  -- The answer is: we have no plans to do so. We don't consider feature
-  requests for examples or implement a paper for you, unless it demonstrates
-  some Tensorpack features not yet demonstrated in the existing examples.
++ "Could you implement a paper / other variants of a paper ?"
-- The answer is: we have no plans to do so.
+  We don't implement papers or variants of a paper for you,
+  unless it demonstrates some Tensorpack features not yet demonstrated in the existing examples.
 If you don't know how to do something yourself, you may ask a usage question.
@@ -76,13 +76,6 @@ Efficiency:
    set in `train.py`; (2) reduce `buffer_size` or `NUM_WORKERS` in `data.py`
    (which may negatively impact your throughput). The training needs <10G RAM if `NUM_WORKERS=0`.
-1. Inference is unoptimized. Tensorpack is a training interface, therefore it
-   does not help you on optimized inference.
-1. To reduce RAM usage on host: (1) make sure you're using the "spawn" method as
-   set in `train.py`; (2) reduce `buffer_size` or `NUM_WORKERS` in `data.py`
-   (which may negatively impact your throughput). The training needs <10G RAM if `NUM_WORKERS=0`.
 1. Inference is unoptimized. Tensorpack is a training interface, therefore it
    does not help you on optimized inference. In fact, the current implementation
    uses some slow numpy operations in inference (in `eval.py:_paste_mask`).
...
@@ -92,7 +92,7 @@ Performance in [Detectron](https://github.com/facebookresearch/Detectron/) can b
 | R50-C4 | 36.2;31.8 [:arrow_down:][R50C41x] | 35.8;31.4 | 23.5h | <details><summary>standard</summary>this is the default </details> |
 | R50-FPN | 38.2;34.8 | 37.7;33.9 | 13.5h | <details><summary>standard</summary>`MODE_FPN=True` </details> |
 | R50-FPN | 38.9;35.4 [:arrow_down:][R50FPN2x] | 38.6;34.5 | 25h | <details><summary>2x</summary>`MODE_FPN=True`<br/>`TRAIN.LR_SCHEDULE=[240000,320000,360000]` </details> |
-| R50-FPN-GN | 40.4;36.3 [:arrow_down:][R50FPN2xGN] | 40.3;35.7 | 31h | <details><summary>2x+GN</summary>`MODE_FPN=True`<br/>`FPN.NORM=GN BACKBONE.NORM=GN`<br/>`FPN.FRCNN_HEAD_FUNC=fastrcnn_4conv1fc_gn_head`<br/>`FPN.MRCNN_HEAD_FUNC=maskrcnn_up4conv_gn_head` </details> |
+| R50-FPN-GN | 40.4;36.3 [:arrow_down:][R50FPN2xGN] | 40.3;35.7 | 31h | <details><summary>2x+GN</summary>`MODE_FPN=True`<br/>`FPN.NORM=GN BACKBONE.NORM=GN`<br/>`FPN.FRCNN_HEAD_FUNC=fastrcnn_4conv1fc_gn_head`<br/>`FPN.MRCNN_HEAD_FUNC=maskrcnn_up4conv_gn_head`<br/>`TRAIN.LR_SCHEDULE=[240000,320000,360000]` </details> |
 | R50-FPN | 41.7;36.2 | | 17h | <details><summary>+Cascade</summary>`MODE_FPN=True FPN.CASCADE=True` </details> |
 | R101-C4 | 40.1;34.6 [:arrow_down:][R101C41x] | | 28h | <details><summary>standard</summary>`BACKBONE.RESNET_NUM_BLOCKS=[3,4,23,3]` </details> |
 | R101-FPN | 40.7;36.8 [:arrow_down:][R101FPN1x] | 40.0;35.9 | 18h | <details><summary>standard</summary>`MODE_FPN=True`<br/>`BACKBONE.RESNET_NUM_BLOCKS=[3,4,23,3]` </details> |
...
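The flags in the last column are config overrides passed to the training script on the command line. A hypothetical invocation following that convention (the paths are placeholders, adapt them to your setup):

```shell
# Train with an FPN backbone and the 2x learning-rate schedule.
# DATA.BASEDIR and BACKBONE.WEIGHTS are illustrative placeholders.
./train.py --config \
    MODE_FPN=True \
    TRAIN.LR_SCHEDULE=[240000,320000,360000] \
    DATA.BASEDIR=/path/to/COCO \
    BACKBONE.WEIGHTS=/path/to/ImageNet-R50-AlignPadding.npz
```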
@@ -18,10 +18,10 @@ from tensorpack.input_source import QueueInput, StagingInput
 from tensorpack.models import regularize_cost, l2_regularizer
 from tensorpack.predict import FeedfreePredictor, PredictConfig
 from tensorpack.tfutils.summary import add_moving_summary
+from tensorpack.tfutils.optimizer import AccumGradOptimizer
 from tensorpack.utils import logger
 from tensorpack.utils.stats import RatioCounter

 """
 ====== DataFlow =======
 """
@@ -329,6 +329,11 @@ class ImageNetModel(ModelDesc):
     """
     label_smoothing = 0.

+    """
+    Accumulate gradients across several steps (by default 1, which means no accumulation across steps).
+    """
+    accum_grad = 1

     def inputs(self):
         return [tf.TensorSpec([None, self.image_shape, self.image_shape, 3], self.image_dtype, 'input'),
                 tf.TensorSpec([None], tf.int32, 'label')]
@@ -372,7 +377,10 @@ class ImageNetModel(ModelDesc):
     def optimizer(self):
         lr = tf.get_variable('learning_rate', initializer=0.1, trainable=False)
         tf.summary.scalar('learning_rate-summary', lr)
-        return tf.train.MomentumOptimizer(lr, 0.9, use_nesterov=True)
+        opt = tf.train.MomentumOptimizer(lr, 0.9, use_nesterov=True)
+        if self.accum_grad != 1:
+            opt = AccumGradOptimizer(opt, self.accum_grad)
+        return opt

     def image_preprocess(self, image):
         with tf.name_scope('image_preprocess'):
...
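The `AccumGradOptimizer` wiring above applies a parameter update only once every `accum_grad` steps, using the gradients averaged over those steps. A minimal sketch of that behavior on a scalar toy problem (hypothetical names, not Tensorpack's actual implementation):

```python
def train_loop(steps, accum_grad, lr=0.1):
    """Sketch of gradient accumulation: apply the averaged gradient
    once every `accum_grad` steps instead of on every step."""
    w = 0.0        # a single scalar "parameter"
    buf = 0.0      # gradient accumulator
    for step in range(steps):
        g = 1.0    # stand-in for the gradient computed at this step
        buf += g
        if (step + 1) % accum_grad == 0:
            w -= lr * (buf / accum_grad)  # update with the averaged gradient
            buf = 0.0                     # reset the accumulator
    return w

# accum_grad=1 updates on every step; accum_grad=4 updates a quarter as
# often, each time with the average of 4 gradients (larger effective batch).
```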