<aid="ft1">1</a>: Here we comapre models that have identical training & inference cost between the two implementation. However their numbers are different due to many small implementation details.
<aid="ft1">1</a>: Numbers taken from [Detectron Model Zoo](https://github.com/facebookresearch/Detectron/blob/master/MODEL_ZOO.md).
<aid="ft2">2</a>: Numbers taken from [Group Normalization](https://arxiv.org/abs/1803.08494)
We comapre models that have identical training & inference cost between the two implementation. However their numbers can be different due to many small implementation details.
<aid="ft2">2</a>: Numbers taken from Table 5 in [Group Normalization](https://arxiv.org/abs/1803.08494)
<aid="ft3">3</a>: Our mAP is __10+ point__ better than the official model in [matterport/Mask_RCNN](https://github.com/matterport/Mask_RCNN/releases/tag/v2.0) with the same R101-FPN backbone.
<aid="ft3">3</a>: Our mAP is __10+ point__ better than the official model in [matterport/Mask_RCNN](https://github.com/matterport/Mask_RCNN/releases/tag/v2.0) with the same R101-FPN backbone.
<aid="ft4">4</a>: This entry does not use ImageNet pre-training. Detectron numbers are taken from Fig. 5 in [Rethinking ImageNet Pre-training](https://arxiv.org/abs/1811.08883).
Note that our training strategy is slightly different: we enable cascade throughout the entire training.
## Notes
[NOTES.md](NOTES.md) has some notes about implementation details & speed.