| Backbone | RoIs per image | Resolution | Schedule | mAP (bbox/mask) | Training time |
| - | - | - | - | - | - |
| R50 | 512 | (800, 1333) | 280k | 35.6 | 55h on 8 P100 |
| R50 | 512 | (800, 1333) | 360k | 36.7 | 49h on 8 V100 |
| R50 | 256 | (800, 1333) | 280k | 36.9/32.3 | 39h on 8 P100 |
| R50 | 512 | (800, 1333) | 360k | 37.7/33.0 | 72h on 8 P100 |
| R101 | 512 | (800, 1333) | 280k | 40.1/34.4 | 70h on 8 P100 |
These models are trained with different configurations.
The two 360k models use the same configuration as `R50-C4-2x` in the
[Detectron Model Zoo](https://github.com/facebookresearch/Detectron/blob/master/MODEL_ZOO.md#end-to-end-faster--mask-r-cnn-baselines0).
They get the __same performance__ as the official models, and are about 14% slower than the official implementation due to the lack of specialized ops.
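
For reference, the hyperparameters varied across the rows above can be summarized in a small table in code. This is a minimal sketch in Python; the key names (`backbone`, `rois_per_image`, etc.) are illustrative only and do not mirror this repository's actual configuration schema:

```python
# Hypothetical summary of the knobs varied in the model zoo table above.
# Key names are illustrative; they are not the repo's real config options.
CONFIGS = {
    "R50-280k":      {"backbone": "R50",  "rois_per_image": 512, "resolution": (800, 1333), "iterations": 280_000},
    "R50-360k":      {"backbone": "R50",  "rois_per_image": 512, "resolution": (800, 1333), "iterations": 360_000},
    "R50-280k-mask": {"backbone": "R50",  "rois_per_image": 256, "resolution": (800, 1333), "iterations": 280_000},
    "R50-360k-mask": {"backbone": "R50",  "rois_per_image": 512, "resolution": (800, 1333), "iterations": 360_000},
    "R101-280k":     {"backbone": "R101", "rois_per_image": 512, "resolution": (800, 1333), "iterations": 280_000},
}

def describe(name):
    """Print the training hyperparameters for one row of the table."""
    cfg = CONFIGS[name]
    print(f"{name}: {cfg['backbone']} backbone, {cfg['rois_per_image']} RoIs/image, "
          f"short side {cfg['resolution'][0]}, {cfg['iterations']:,} iterations")

describe("R50-360k")  # -> R50-360k: R50 backbone, 512 RoIs/image, short side 800, 360,000 iterations
```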