Commit 0f0a9ad7 authored by Yuxin Wu

update fasterrcnn readme

parent b5adbb62
### File Structure
This is a minimal implementation that simply contains these files:
+ coco.py: load COCO data
+ data.py: prepare data for training
+ common.py: common data preparation utilities
+ basemodel.py: implement resnet
+ model.py: implement rpn/faster-rcnn
+ train.py: main training script
+ utils/: third-party helper functions
+ eval.py: evaluation utilities
+ viz.py: visualization utilities
### Implementation Notes
1. You can easily add more augmentations such as rotation, but be careful about how the boxes are
augmented. The code currently always uses the minimal axis-aligned bounding box of the 4 transformed corners,
which is probably not optimal.
2. Floating-point boxes are defined like this:
<p align="center"> <img src="https://user-images.githubusercontent.com/1381301/31527740-2f1b38ce-af84-11e7-8de1-628e90089826.png"> </p>
3. Inference is not very fast, because you either disable convolution autotune and end up with
a slow convolution algorithm, or spend more time on autotuning.
This is a general problem in TensorFlow when running on variable-sized inputs.
4. In Faster-RCNN, BatchNorm statistics are not supposed to be updated during fine-tuning.
This specific kind of BatchNorm needs [my kernel](https://github.com/tensorflow/tensorflow/pull/12580),
which has been included since TF 1.4. With an earlier version of TF, it will be either slow or wrong.
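
To illustrate note 1, here is a minimal NumPy sketch of what "the minimal axis-aligned bounding box of the 4 corners" could look like for a rotation. It is not taken from this repository: the function name `rotate_box`, the `(x1, y1, x2, y2)` box layout, and rotation about an explicit center are assumptions made for the example.

```python
import numpy as np

def rotate_box(box, angle_deg, center):
    """Rotate a box and return the minimal axis-aligned box around its corners.

    box: (x1, y1, x2, y2) floats (assumed layout).
    angle_deg: rotation angle in degrees.
    center: (cx, cy) point to rotate about.
    """
    x1, y1, x2, y2 = box
    # The 4 corners of the original box.
    corners = np.array([[x1, y1], [x2, y1], [x2, y2], [x1, y2]], dtype=np.float64)
    theta = np.deg2rad(angle_deg)
    rot = np.array([[np.cos(theta), -np.sin(theta)],
                    [np.sin(theta), np.cos(theta)]])
    center = np.asarray(center, dtype=np.float64)
    # Rotate each corner about the center.
    rotated = (corners - center) @ rot.T + center
    # Enclose the rotated corners in an axis-aligned box.
    xmin, ymin = rotated.min(axis=0)
    xmax, ymax = rotated.max(axis=0)
    return np.array([xmin, ymin, xmax, ymax])
```

For example, rotating the box `(0, 0, 2, 2)` by 45 degrees about its own center doubles the enclosed area, which shows why this choice is probably not optimal: the resulting box can be much looser than the object it contains.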
# Faster-RCNN on COCO
This example aimes to provide a minimal (<1000 lines) Multi-GPU implementation of ResNet50-Faster-RCNN on COCO.
This example aims to provide a minimal (<1000 lines) multi-GPU implementation of ResNet50-Faster-RCNN on COCO.
## Dependencies
+ TensorFlow nightly.
+ TensorFlow >= 1.4.0rc0
+ Install [pycocotools](https://github.com/pdollar/coco/tree/master/PythonAPI/pycocotools), OpenCV.
+ Pre-trained [ResNet50 model](https://goo.gl/6XjK9V) from tensorpack model zoo.
+ COCO data. It assumes the following directory structure:
......@@ -29,7 +29,8 @@ To train:
```
./train.py --load /path/to/ImageNet-ResNet50.npz
```
The code is written for training with __8 GPUs__. Otherwise the performance won't be as good.
The code is written for training with __8 GPUs__.
To run on fewer GPUs, you will probably need different hyperparameters to reach the same performance.
To predict on an image (and show output in a window):
```
......@@ -39,18 +40,10 @@ To predict on an image (and show output in a window):
## Results
+ trainval35k/minival, FASTRCNN_BATCH=256: 32.9
+ trainval35k/minival, FASTRCNN_BATCH=64: 31.7. Takes less than one day on 8 Maxwell TitanX.
+ trainval35k/minival, FASTRCNN_BATCH=64: 31.6. Takes less than one day on 8 Maxwell TitanX.
The hyperparameters are not carefully tuned. You can probably get better performance by e.g. training longer.
## Files
This is a minimal implementation that simply contains these files:
+ coco.py: load COCO data
+ data.py: prepare data for training
+ common.py: common data preparation utilities
+ basemodel.py: implement resnet
+ model.py: implement rpn/faster-rcnn
+ train.py: main training script
+ utils/: third-party helper functions
+ eval.py: evaluation utilities
+ viz.py: visualization utilities
## Notes
See [Notes on This Implementation](NOTES.md)