Commit 220303dc authored by Yuxin Wu

update docs

parent eb2492c4
@@ -49,3 +49,6 @@ Speed:
 3. With a large roi batch size (e.g. >= 256), GPU utilization should stay above 90%.
+4. This implementation is about 14% slower than Detectron,
+   probably due to the lack of specialized ops (e.g. AffineChannel, ROIAlign) in TensorFlow.
+   It's certainly faster than other TF implementations.
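(Editor's note: AffineChannel, mentioned above, is Detectron's per-channel affine transform, i.e. a frozen BatchNorm. A minimal sketch of what it computes, composed from elementary TF broadcast ops rather than one fused kernel; the function name and NCHW layout are illustrative, not this repo's actual code:)

```python
import tensorflow as tf

def affine_channel(x, scale, bias):
    """Per-channel affine y = x * scale + bias, assuming NCHW layout.

    Each broadcast multiply/add is a separate pass over memory,
    which is why a single fused op (as in Caffe2) would be faster.
    """
    scale = tf.reshape(scale, [1, -1, 1, 1])
    bias = tf.reshape(bias, [1, -1, 1, 1])
    return x * scale + bias
```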
@@ -40,7 +40,7 @@ Predict on an image (and show output in a window):
 ./train.py --predict input.jpg --load /path/to/model
 ```

-Evaluate the performance of a model and save to json.
+Evaluate the performance of a model on COCO, and save results to json.
 (Pretrained models can be downloaded in [model zoo](http://models.tensorpack.com/FasterRCNN)):
 ```
 ./train.py --evaluate output.json --load /path/to/model
@@ -61,11 +61,10 @@ MaskRCNN results contain both bbox and segm mAP.
 |R-101 |512 |(800, 1333)|280k |40.1/34.4 |70h on 8 P100s|
 |R-101 |512 |(800, 1333)|360k |40.8/35.1 |63h on 8 V100s|

-The two 360k models have identical configurations with
-`R50-C4-2x` configuration in
+The two R-50 360k models have the same configuration __and mAP__
+as the `R50-C4-2x` entries in
 [Detectron Model Zoo](https://github.com/facebookresearch/Detectron/blob/master/MODEL_ZOO.md#end-to-end-faster--mask-r-cnn-baselines).
-They get the __same performance__ with the official models, and are about 14% slower than the official implementation,
-probably due to the lack of specialized ops (e.g. AffineChannel, ROIAlign) in TensorFlow.
+So far this seems to be the only open source re-implementation that can reproduce mAP in Detectron.

 ## Notes
...
@@ -334,13 +334,15 @@ class EvalCallback(Callback):

 if __name__ == '__main__':
     parser = argparse.ArgumentParser()
-    parser.add_argument('--gpu', help='comma separated list of GPU(s) to use.')
-    parser.add_argument('--load', help='load model')
-    parser.add_argument('--logdir', help='logdir', default='train_log/maskrcnn')
+    parser.add_argument('--gpu', help='comma separated list of GPU(s) to use. Defaults to all available ones')
+    parser.add_argument('--load', help='load model for evaluation or training')
+    parser.add_argument('--logdir', help='log directory', default='train_log/maskrcnn')
     parser.add_argument('--datadir', help='override config.BASEDIR')
     parser.add_argument('--visualize', action='store_true')
-    parser.add_argument('--evaluate', help='path to the output json eval file')
-    parser.add_argument('--predict', help='path to the input image file')
+    parser.add_argument('--evaluate', help="Run evaluation on COCO. "
+                                           "This option is the path to the output json evaluation file")
+    parser.add_argument('--predict', help="Run prediction on a given image. "
+                                          "This argument is the path to the input image file")
     args = parser.parse_args()
     if args.datadir:
         config.BASEDIR = args.datadir
...
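(Editor's note: `--evaluate` and `--predict` select mutually exclusive modes, and argparse can enforce that directly. A minimal sketch of the idea, not the repo's actual code:)

```python
import argparse

parser = argparse.ArgumentParser()
# Only one of the two modes may be given on the command line.
mode = parser.add_mutually_exclusive_group()
mode.add_argument('--evaluate', help='path to the output json evaluation file')
mode.add_argument('--predict', help='path to the input image file')

parser.parse_args(['--evaluate', 'out.json'])  # accepted
# parser.parse_args(['--evaluate', 'out.json', '--predict', 'x.jpg'])
# would exit with "argument --predict: not allowed with argument --evaluate"
```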
@@ -4,13 +4,13 @@
 Training examples with __reproducible performance__.

 __The word "reproduce" should always mean reproduce performance__.
-With the magic of SGD, wrong deep learning code often appears to still work,
-especially if you try it on toy datasets.
+With the magic of SGD, wrong deep learning code often appears to work, especially if you try it on toy datasets.
+GitHub is full of such deep learning code that "implements" but does not "reproduce" methods.
 See [Unawareness of Deep Learning Mistakes](https://medium.com/@ppwwyyxx/unawareness-of-deep-learning-mistakes-d5b5774da0ba).

 We refuse toy examples.
 Instead of showing you 10 arbitrary networks trained on toy datasets with random final performance,
-tensorpack examples try to faithfully replicate experiments and performance in the paper as much as possible,
+tensorpack examples try to faithfully replicate experiments and performance in the paper,
 so you're confident that they are correct.
@@ -19,7 +19,7 @@ These are all the toy examples in tensorpack. They are supposed to be just demos.
 + [An illustrative MNIST example with explanation of the framework](basics/mnist-convnet.py)
 + Tensorpack supports any symbolic libraries. See the same MNIST example written with [tf.layers](basics/mnist-tflayers.py), [tf-slim](basics/mnist-tfslim.py), and [with weights visualizations](basics/mnist-visualizations.py)
 + A tiny [Cifar ConvNet](basics/cifar-convnet.py) and [SVHN ConvNet](basics/svhn-digit-convnet.py)
-+ If you've used Keras, check out [Keras examples](keras)
++ If you've used Keras, check out [Keras+Tensorpack examples](keras)
 + [A boilerplate file to start with, for your own tasks](boilerplate.py)

 ## Vision:
...
@@ -11,18 +11,23 @@ The training follows the __exact__ recipe used by the [Training ImageNet in 1 Hour paper]
 and gets the same performance.
 Models can be [downloaded here](http://models.tensorpack.com/ResNet/).

+This recipe has better performance than most open source implementations.
+In fact, many papers that claim to "improve" ResNet only compete with a lower
+baseline and they actually cannot beat this ResNet recipe.
+
 | Model       | Top 5 Error | Top 1 Error |
 |:------------|-------------|------------:|
 | ResNet18    | 10.50%      | 29.66%      |
 | ResNet34    | 8.56%       | 26.17%      |
 | ResNet50    | 6.85%       | 23.61%      |
 | ResNet50-SE | 6.24%       | 22.64%      |
 | ResNet101   | 6.04%       | 21.95%      |
 | ResNet152   | 5.78%       | 21.51%      |

 To train, first decompress ImageNet data into [this structure](http://tensorpack.readthedocs.io/en/latest/modules/dataflow.dataset.html#tensorpack.dataflow.dataset.ILSVRC12), then:
 ```bash
-./imagenet-resnet.py --data /path/to/original/ILSVRC --gpu 0,1,2,3 -d 50 [--mode resnet/preact/se]
+./imagenet-resnet.py --data /path/to/original/ILSVRC -d 50 [--mode resnet/preact/se]
+# See ./imagenet-resnet.py -h for other options.
 ```
 You should be able to see good GPU utilization (95%~99%), if your data is fast enough.
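(Editor's note: to confirm the data is in fact fast enough before suspecting the model, one can benchmark the input pipeline in isolation. A minimal sketch assuming tensorpack's dataflow utilities; the dataset path is a placeholder:)

```python
from tensorpack.dataflow import BatchData, TestDataSpeed, dataset

# Build just the input pipeline, with no model attached.
df = dataset.ILSVRC12('/path/to/original/ILSVRC', 'train', shuffle=True)
df = BatchData(df, 32)

# Iterates the dataflow alone and prints its throughput (samples/sec).
TestDataSpeed(df, size=500).start()
```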
@@ -37,6 +42,7 @@ See the [tutorial](http://tensorpack.readthedocs.io/en/latest/tutorial/efficient
 This script only converts and runs ImageNet-ResNet{50,101,152} Caffe models [released by MSRA](https://github.com/KaimingHe/deep-residual-networks).
 Note that the architecture is different from the `imagenet-resnet.py` script and the models are not compatible.
+ResNets have evolved; generally you should not cite these numbers as baselines in your paper.

 Usage:
 ```bash
@@ -76,7 +82,7 @@ Note that the architecture is different from the official preact-ResNet18.
 Usage:
 ```bash
 ./cifar10-preact18-mixup.py          # train without mixup
 ./cifar10-preact18-mixup.py --mixup  # with mixup
 ```
 Results of the reference code can be reproduced.
...
@@ -145,7 +145,7 @@ if __name__ == '__main__':
     parser.add_argument('-n', '--num_units',
                         help='number of units in each stage',
                         type=int, default=18)
-    parser.add_argument('--load', help='load model')
+    parser.add_argument('--load', help='load model for training')
     args = parser.parse_args()
     NUM_UNITS = args.num_units
...
@@ -63,6 +63,8 @@ def get_config(model, fake=False):
     batch = args.batch // nr_tower
     logger.info("Running on {} towers. Batch size per tower: {}".format(nr_tower, batch))
+    if batch < 32 or batch > 64:
+        logger.warn("Batch size per tower not in [32, 64]. This may lead to worse accuracy than reported.")
     if fake:
         data = QueueInput(FakeData(
             [[batch, 224, 224, 3], [batch]], 1000, random=False, dtype='uint8'))
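(Editor's note: a quick worked example of how the per-tower batch check added above behaves; all values here are illustrative:)

```python
# total batch size, number of towers -> per-tower batch and whether it warns
for total, nr_tower in [(256, 8), (256, 2), (64, 4)]:
    batch = total // nr_tower
    warn = batch < 32 or batch > 64
    print("total={} towers={} per-tower={} warn={}".format(
        total, nr_tower, batch, warn))
# total=256 towers=8 per-tower=32  warn=False
# total=256 towers=2 per-tower=128 warn=True
# total=64  towers=4 per-tower=16  warn=True
```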
@@ -106,17 +108,18 @@ def get_config(model, fake=False):

 if __name__ == '__main__':
     parser = argparse.ArgumentParser()
-    parser.add_argument('--gpu', help='comma separated list of GPU(s) to use.')
+    parser.add_argument('--gpu', help='comma separated list of GPU(s) to use. Defaults to all available ones')
     parser.add_argument('--data', help='ILSVRC dataset dir')
-    parser.add_argument('--load', help='load model')
-    parser.add_argument('--fake', help='use fakedata to test or benchmark this model', action='store_true')
-    parser.add_argument('--data_format', help='specify NCHW or NHWC',
-                        type=str, default='NCHW')
-    parser.add_argument('-d', '--depth', help='resnet depth',
+    parser.add_argument('--load', help='load a model for training or evaluation')
+    parser.add_argument('--fake', help='use FakeData to debug or benchmark this model', action='store_true')
+    parser.add_argument('--data_format', help='image data format',
+                        default='NCHW', choices=['NCHW', 'NHWC'])
+    parser.add_argument('-d', '--depth', help='ResNet depth',
                         type=int, default=50, choices=[18, 34, 50, 101, 152])
-    parser.add_argument('--eval', action='store_true')
-    parser.add_argument('--batch', default=256, type=int,
-                        help='total batch size. 32 per GPU gives best accuracy, higher values should be similarly good')
+    parser.add_argument('--eval', action='store_true', help='run offline evaluation instead of training')
+    parser.add_argument('--batch', default=256, type=int,
+                        help="total batch size. "
+                             "Note that it's best to keep per-GPU batch size in [32, 64] to obtain the best accuracy.")
     parser.add_argument('--mode', choices=['resnet', 'preact', 'se'],
                         help='variants of resnet to use', default='resnet')
     args = parser.parse_args()
...