Commit 8132387f authored by Yuxin Wu

[MaskRCNN] from-scratch results

parent 0fa90c41
...@@ -10,6 +10,7 @@ with the support of:
+ Multi-GPU / distributed training
+ Cross-GPU BatchNorm (from [MegDet: A Large Mini-Batch Object Detector](https://arxiv.org/abs/1711.07240))
+ [Group Normalization](https://arxiv.org/abs/1803.08494)
+ Training from scratch (from [Rethinking ImageNet Pre-training](https://arxiv.org/abs/1811.08883))
## Dependencies
+ Python 3.3+; OpenCV.
...@@ -76,32 +77,37 @@ prediction will need to be run with the corresponding training configs.
## Results

These models are trained on trainval35k and evaluated on minival2014 using mAP@IoU=0.50:0.95.
All models are fine-tuned from ImageNet pre-trained R50/R101 models in the
[tensorpack model zoo](http://models.tensorpack.com/FasterRCNN/), unless otherwise noted.
All models are trained with 8 NVIDIA V100s, unless otherwise noted.
Performance in [Detectron](https://github.com/facebookresearch/Detectron/) can be roughly reproduced.
Mask R-CNN results contain both box and mask mAP.
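
The box/mask numbers below are standard COCO AP, averaged over IoU thresholds 0.50:0.95. As a reference for how this metric is computed, here is a minimal sketch using `pycocotools` (the annotation and detection file paths are placeholders, not files provided by this repo):

```python
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

coco_gt = COCO("annotations/instances_minival2014.json")  # ground-truth annotations (placeholder path)
coco_dt = coco_gt.loadRes("detections.json")              # model predictions in COCO result format
coco_eval = COCOeval(coco_gt, coco_dt, iouType="bbox")    # use iouType="segm" for mask mAP
coco_eval.evaluate()
coco_eval.accumulate()
coco_eval.summarize()  # the first printed line is AP @[ IoU=0.50:0.95 | area=all | maxDets=100 ]
```
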
| Backbone | mAP<br/>(box;mask) | Detectron mAP <sup>[1](#ft1)</sup><br/> (box;mask) | Time (on 8 V100s) | Configurations <br/> (click to expand) |
| - | - | - | - | - |
| R50-C4 | 33.1 | | 18h | <details><summary>super quick</summary>`MODE_MASK=False FRCNN.BATCH_PER_IM=64`<br/>`PREPROC.SHORT_EDGE_SIZE=600 PREPROC.MAX_SIZE=1024`<br/>`TRAIN.LR_SCHEDULE=[150000,230000,280000]` </details> |
| R50-C4 | 36.6 | 36.5 | 44h | <details><summary>standard</summary>`MODE_MASK=False` </details> |
| R50-FPN | 37.4 | 37.9 | 29h | <details><summary>standard</summary>`MODE_MASK=False MODE_FPN=True` </details> |
| R50-C4 | 38.2;33.3 [:arrow_down:](http://models.tensorpack.com/FasterRCNN/COCO-R50C4-MaskRCNN-Standard.npz) | 37.8;32.8 | 49h | <details><summary>standard</summary>this is the default </details> |
| R50-FPN | 38.5;35.2 [:arrow_down:](http://models.tensorpack.com/FasterRCNN/COCO-R50FPN-MaskRCNN-Standard.npz) | 38.6;34.5 | 30h | <details><summary>standard</summary>`MODE_FPN=True` </details> |
| R50-FPN | 42.0;36.3 | | 41h | <details><summary>+Cascade</summary>`MODE_FPN=True FPN.CASCADE=True` </details> |
| R50-FPN | 39.5;35.2 | 39.5;34.4<sup>[2](#ft2)</sup> | 33h | <details><summary>+ConvGNHead</summary>`MODE_FPN=True`<br/>`FPN.FRCNN_HEAD_FUNC=fastrcnn_4conv1fc_gn_head` </details> |
| R50-FPN | 40.0;36.2 [:arrow_down:](http://models.tensorpack.com/FasterRCNN/COCO-R50FPN-MaskRCNN-StandardGN.npz) | 40.3;35.7 | 40h | <details><summary>+GN</summary>`MODE_FPN=True`<br/>`FPN.NORM=GN BACKBONE.NORM=GN`<br/>`FPN.FRCNN_HEAD_FUNC=fastrcnn_4conv1fc_gn_head`<br/>`FPN.MRCNN_HEAD_FUNC=maskrcnn_up4conv_gn_head` </details> |
| R101-C4 | 41.4;35.2 [:arrow_down:](http://models.tensorpack.com/FasterRCNN/COCO-R101C4-MaskRCNN-Standard.npz) | | 60h | <details><summary>standard</summary>`BACKBONE.RESNET_NUM_BLOCKS=[3,4,23,3]` </details> |
| R101-FPN | 40.4;36.6 [:arrow_down:](http://models.tensorpack.com/FasterRCNN/COCO-R101FPN-MaskRCNN-Standard.npz) | 40.9;36.4 | 38h | <details><summary>standard</summary>`MODE_FPN=True`<br/>`BACKBONE.RESNET_NUM_BLOCKS=[3,4,23,3]` </details> |
| R101-FPN | 46.5;40.1 [:arrow_down:](http://models.tensorpack.com/FasterRCNN/COCO-R101FPN-MaskRCNN-BetterParams.npz) <sup>[3](#ft3)</sup> | | 73h | <details><summary>3x+Cascade+TrainAug</summary>`MODE_FPN=True FPN.CASCADE=True`<br/>`BACKBONE.RESNET_NUM_BLOCKS=[3,4,23,3]`<br/>`TEST.RESULT_SCORE_THRESH=1e-4`<br/>`PREPROC.TRAIN_SHORT_EDGE_SIZE=[640,800]`<br/>`TRAIN.LR_SCHEDULE=[420000,500000,540000]` </details> |
| R101-FPN<br/>(From Scratch) | 47.5;41.2 [:arrow_down:](http://models.tensorpack.com/FasterRCNN/COCO-R101FPN-MaskRCNN-ScratchGN.npz) | 47.4;40.5<sup>[4](#ft4)</sup> | 45h (on 48 V100s) | <details><summary>9x+GN+Cascade+TrainAug</summary>`MODE_FPN=True FPN.CASCADE=True`<br/>`BACKBONE.RESNET_NUM_BLOCKS=[3,4,23,3]`<br/>`FPN.NORM=GN BACKBONE.NORM=GN`<br/>`FPN.FRCNN_HEAD_FUNC=fastrcnn_4conv1fc_gn_head`<br/>`PREPROC.TRAIN_SHORT_EDGE_SIZE=[640,800]`<br/>`TRAIN.LR_SCHEDULE=[1500000,1580000,1620000]`<br/>`BACKBONE.FREEZE_AT=0`</details> |
<a id="ft1">1</a>: Here we comapre models that have identical training & inference cost between the two implementation. However their numbers are different due to many small implementation details.
<a id="ft1">1</a>: Numbers taken from [Detectron Model Zoo](https://github.com/facebookresearch/Detectron/blob/master/MODEL_ZOO.md).
<a id="ft2">2</a>: Numbers taken from [Group Normalization](https://arxiv.org/abs/1803.08494) We comapre models that have identical training & inference cost between the two implementation. However their numbers can be different due to many small implementation details.
<a id="ft2">2</a>: Numbers taken from Table 5 in [Group Normalization](https://arxiv.org/abs/1803.08494)
<a id="ft3">3</a>: Our mAP is __10+ point__ better than the official model in [matterport/Mask_RCNN](https://github.com/matterport/Mask_RCNN/releases/tag/v2.0) with the same R101-FPN backbone. <a id="ft3">3</a>: Our mAP is __10+ point__ better than the official model in [matterport/Mask_RCNN](https://github.com/matterport/Mask_RCNN/releases/tag/v2.0) with the same R101-FPN backbone.
<a id="ft4">4</a>: This entry does not use ImageNet pre-training. Detectron numbers are taken from Fig. 5 in [Rethinking ImageNet Pre-training](https://arxiv.org/abs/1811.08883).
Note that our training strategy is slightly different: we enable cascade throughout the entire training.
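
The `KEY=VALUE` strings in the Configurations column above are overrides applied on top of the defaults in `config.py`: dotted keys address nested config entries, and values are parsed as Python literals where possible. A rough sketch of this pattern (the `apply_overrides` helper below is illustrative only, not the repo's actual API):

```python
import ast

def apply_overrides(cfg, overrides):
    """Hypothetical helper: apply dotted KEY=VALUE overrides onto a nested config dict."""
    for kv in overrides:
        keys, value = kv.split("=", 1)
        *parents, leaf = keys.split(".")
        node = cfg
        for key in parents:
            node = node.setdefault(key, {})
        try:
            node[leaf] = ast.literal_eval(value)  # numbers, lists, booleans
        except (ValueError, SyntaxError):
            node[leaf] = value                    # bare strings such as GN
    return cfg

cfg = {"MODE_MASK": True, "MODE_FPN": False}
apply_overrides(cfg, ["MODE_FPN=True",
                      "FPN.NORM=GN",
                      "TRAIN.LR_SCHEDULE=[150000,230000,280000]"])
print(cfg)
```
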
## Notes
[NOTES.md](NOTES.md) has some notes about implementation details & speed.
...@@ -116,6 +116,8 @@ class MultiThreadMapData(_ParallelMapData):
    You can use **strict mode**, where `MultiThreadMapData.__iter__()`
    is guaranteed to produce the exact set which `df.__iter__()`
    produces, although the order of data is not preserved.
    The behavior of strict mode is undefined if the dataflow is infinite.
""" """
class _Worker(StoppableThread): class _Worker(StoppableThread):
def __init__(self, inq, outq, evt, map_func): def __init__(self, inq, outq, evt, map_func):
...@@ -219,6 +221,8 @@ class MultiProcessMapDataZMQ(_ParallelMapData, _MultiProcessZMQDataFlow):
    You can use **strict mode**, where `MultiProcessMapData.__iter__()`
    is guaranteed to produce the exact set which `df.__iter__()`
    produces, although the order of data is not preserved.
    The behavior of strict mode is undefined if the dataflow is infinite.
""" """
class _Worker(mp.Process): class _Worker(mp.Process):
def __init__(self, identity, map_func, pipename, hwm): def __init__(self, identity, map_func, pipename, hwm):
...@@ -405,8 +409,10 @@ if __name__ == '__main__':
        return x

    ds = Zero(100)
    ds = MultiThreadMapData(ds, 50, f, buffer_size=50, strict=True)
    ds.reset_state()
    for idx, k in enumerate(ds):
        print("Bang!", k)
        if idx == 100:
            break
    print("END!")