update fasterrcnn readme

Commit 0f0a9ad7 authored Oct 12, 2017 by Yuxin Wu

Showing with 38 additions and 15 deletions

examples/FasterRCNN/NOTES.md +30 -0

examples/FasterRCNN/README.md +8 -15

--- a/examples/FasterRCNN/NOTES.md
+++ b/examples/FasterRCNN/NOTES.md
+
+### File Structure
+This is a minimal implementation that simply contains these files:
+ coco.py: load COCO data
+ data.py: prepare data for training
+ common.py: common data preparation utilities
+ basemodel.py: implement resnet
+ model.py: implement rpn/faster-rcnn
+ train.py: main training script
+ utils/: third-party helper functions
+ eval.py: evaluation utilities
+ viz.py: visualization utilities
+
+### Implementation Notes
+
+1. You can easily add more augmentations such as rotation, but be careful about how each box is
+	 transformed. The code currently always takes the minimal axis-aligned bounding box of the
+	 4 transformed corners, which is probably not optimal.
+
+2. Floating-point boxes are defined like this:
+
+<p align="center"> <img src="https://user-images.githubusercontent.com/1381301/31527740-2f1b38ce-af84-11e7-8de1-628e90089826.png"> </p>
+
+3. Inference is not very fast, because you either disable convolution autotune and end up with
+	 a slow convolution algorithm, or spend extra time on autotune.
+	 This is a general problem in TensorFlow when running on variable-sized inputs.
+
+4. In Faster-RCNN, BatchNorm statistics are not supposed to be updated during fine-tuning.
+	 This specific kind of BatchNorm needs [my kernel](https://github.com/tensorflow/tensorflow/pull/12580),
+	 which has been included since TF 1.4. With an earlier version of TF, it will be either slow or wrong.
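
Review note on implementation note 1 in NOTES.md: the "minimal axis-aligned bounding box of the 4 corners" rule can be sketched roughly as below. This is an illustrative NumPy sketch, not the code in `common.py`; the function name `rotate_box`, the `(x1, y1, x2, y2)` box layout, and the rotation-about-a-center convention are all assumptions made for the example.

```python
import numpy as np


def rotate_box(box, angle_deg, center):
    """Rotate a box and return the minimal axis-aligned box around its corners.

    box: (x1, y1, x2, y2) in floating-point coordinates (assumed layout).
    angle_deg: rotation angle in degrees.
    center: (cx, cy) point to rotate around.
    """
    x1, y1, x2, y2 = box
    # The 4 corners of the original axis-aligned box.
    corners = np.array([[x1, y1], [x2, y1], [x2, y2], [x1, y2]], dtype=np.float64)
    theta = np.deg2rad(angle_deg)
    rot = np.array([[np.cos(theta), -np.sin(theta)],
                    [np.sin(theta), np.cos(theta)]])
    c = np.asarray(center, dtype=np.float64)
    # Rotate every corner around the center.
    rotated = (corners - c) @ rot.T + c
    # Minimal axis-aligned box enclosing the rotated corners.
    xmin, ymin = rotated.min(axis=0)
    xmax, ymax = rotated.max(axis=0)
    return np.array([xmin, ymin, xmax, ymax])
```

Rotating a square by 45 degrees with this rule grows the box by a factor of about 1.41 per side even though the object itself has not grown, which illustrates why the note calls the approach probably not optimal.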
--- a/examples/FasterRCNN/README.md
+++ b/examples/FasterRCNN/README.md
 # Faster-RCNN on COCO
-This example aimes to provide a minimal (<1000 lines) Multi-GPU implementation of ResNet50-Faster-RCNN on COCO.
+This example aims to provide a minimal (<1000 lines) multi-GPU implementation of ResNet50-Faster-RCNN on COCO.

 ## Dependencies
-+ TensorFlow nightly.
+ TensorFlow >= 1.4.0rc0
 + Install [pycocotools](https://github.com/pdollar/coco/tree/master/PythonAPI/pycocotools), OpenCV.
 + Pre-trained [ResNet50 model](https://goo.gl/6XjK9V) from tensorpack model zoo.
 + COCO data. It assumes the following directory structure:
@@ -29,7 +29,8 @@ To train:
 ```
 ./train.py --load /path/to/ImageNet-ResNet50.npz
 ```
-The code is written for training with __8 GPUs__. Otherwise the performance won't be as good.
+The code is written for training with __8 GPUs__.
+To run on fewer GPUs, you probably need different hyperparameters to reach the same performance.

 To predict on an image (and show output in a window):
 ```
@@ -39,18 +40,10 @@ To predict on an image (and show output in a window):
 ## Results

 + trainval35k/minival, FASTRCNN_BATCH=256: 32.9
-+ trainval35k/minival, FASTRCNN_BATCH=64: 31.7. Takes less than one day on 8 Maxwell TitanX.
+ trainval35k/minival, FASTRCNN_BATCH=64: 31.6. Takes less than one day on 8 Maxwell TitanX.

 The hyperparameters are not carefully tuned. You can probably get better performance by e.g. training longer.

-## Files
-This is a minimal implementation that simply contains these files:
-+ coco.py: load COCO data
-+ data.py: prepare data for training
-+ common.py: common data preparation utilities
-+ basemodel.py: implement resnet
-+ model.py: implement rpn/faster-rcnn
-+ train.py: main training script
-+ utils/: third-party helper functions
-+ eval.py: evaluation utilities
-+ viz.py: visualization utilities
+## Notes
+
+See [Notes on This Implementation](NOTES.md)
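
Review note on implementation note 4 in NOTES.md: "BatchNorm statistics are not updated during fine-tuning" means the layer reduces to a fixed per-channel affine transform. The NumPy sketch below shows that arithmetic only. It is not the TensorFlow kernel from the linked pull request and not the code in `basemodel.py`; the function name `frozen_batchnorm`, the channels-last layout, and the `eps` default are assumptions for illustration.

```python
import numpy as np


def frozen_batchnorm(x, gamma, beta, moving_mean, moving_var, eps=1e-5):
    """Apply BatchNorm using fixed (pre-trained) statistics.

    x: activations with channels on the last axis (assumed layout).
    gamma, beta, moving_mean, moving_var: per-channel vectors.
    The statistics are only read here, never updated from the batch.
    """
    # Fold normalization and the learned affine into one scale and shift.
    scale = gamma / np.sqrt(moving_var + eps)
    shift = beta - moving_mean * scale
    return x * scale + shift
```

Because no batch statistics are computed, the output for one image does not depend on the other images in the batch.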