Commit 72674731 authored by Yuxin Wu

update docs

parent 101d7aa5
@@ -3,7 +3,7 @@ An issue has to be one of the following:
- [ ] Feature Requests
- [ ] Questions on Using/Understanding Tensorpack
For any unexpected problems, __PLEASE ALWAYS INCLUDE__:
## For any unexpected problems, __PLEASE ALWAYS INCLUDE__:
1. What you did:
+ If you're using examples:
+ What's the command you run:
@@ -21,18 +21,17 @@ For any unexpected problems, __PLEASE ALWAYS INCLUDE__:
You can install Tensorpack master by `pip install -U git+https://github.com/ppwwyyxx/tensorpack.git`.
+ Hardware information, if relevant.
For efficiency issues, PLEASE first read http://tensorpack.readthedocs.io/en/latest/tutorial/performance-tuning.html
About efficiency issues, PLEASE first read http://tensorpack.readthedocs.io/en/latest/tutorial/performance-tuning.html
Feature Requests:
## Feature Requests:
+ You can implement a lot of features by extending Tensorpack
(See http://tensorpack.readthedocs.io/en/latest/tutorial/index.html#extend-tensorpack).
It does not have to be added to Tensorpack unless you have a good reason.
+ "Could you improve/implement an example/paper ?"
-- the answer is: we don't know, and we don't take feature requests for
examples. You should do it yourself with Tensorpack. If you don't know how to
do it, you may ask a usage question.
-- the answer is: we have no plans to do so and we don't take feature requests for
examples. If you don't know how to do it, you may ask a usage question.
Usage Questions:
## Usage Questions:
+ Read the [tutorials](http://tensorpack.readthedocs.io/en/latest/tutorial/index.html#user-tutorials) first.
+ We answer "HOW to do X with Tensorpack" for a well-defined X.
......
@@ -13,14 +13,14 @@ Alternative link to this page: [http://dorefa.net](http://dorefa.net)
This is a good set of baselines for research in model quantization.
These quantization techniques, when applied to AlexNet, achieve the following ImageNet performance in this implementation:
| Model | Bit Width <br/> (weights, activations, gradients) | Top 1 Validation Error <sup>[1](#ft1)</sup>|
|:----------------------------------:|:-------------------------------------------------:|:----------------------:|
| Model | Bit Width <br/> (weights, activations, gradients) | Top 1 Validation Error <sup>[1](#ft1)</sup> |
|:----------------------------------:|:-------------------------------------------------:|:-----------------------------------------------------------------------------:|
| Full Precision<sup>[2](#ft2)</sup> | 32,32,32 | 40.3% |
| TTQ | t,32,32 | 42.0% |
| BWN | 1,32,32 | 44.6% |
| BNN | 1,1,32 | 51.9% |
| DoReFa | 1,2,32 | 46.6% |
| DoReFa | 1,2,6 | 46.8% |
| DoReFa | 1,2,6 | 46.8% [:arrow_down:](http://models.tensorpack.com/DoReFa-Net/alexnet-126.npz) |
| DoReFa | 1,2,4 | 54.0% |
<a id="ft1">1</a>: These numbers were obtained by training on 8 GPUs with a total batch size of 256.
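The bit widths in the table above refer to how many bits are used to represent weights, activations, and gradients. As a rough illustration only (this is not the code from the example directory), uniform k-bit quantization of a value already scaled to [0, 1] can be sketched as below. The function name `quantize_k` and the sample inputs are assumptions made for this sketch.

```python
def quantize_k(x, k):
    """Snap x (assumed to lie in [0, 1]) to the nearest of 2**k evenly spaced levels."""
    n = 2 ** k - 1                  # number of steps between 0 and 1
    x = min(max(x, 0.0), 1.0)       # clamp, in case the caller passes an out-of-range value
    return round(x * n) / n


# Hypothetical sample values, chosen only for illustration.
inputs = [0.0, 0.2, 0.4, 0.8, 1.0]
two_bit = [quantize_k(v, 2) for v in inputs]

# A 2-bit quantizer can only ever produce 4 distinct outputs.
levels = sorted({quantize_k(i / 100, 2) for i in range(101)})
print(two_bit)
print(levels)
```

With `k=2` every input collapses onto one of the four levels 0, 1/3, 2/3, 1, which is why lower bit widths in the table trade accuracy for a smaller representation.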
......
@@ -11,10 +11,11 @@ with the support of:
+ [Group Normalization](https://arxiv.org/abs/1803.08494)
## Dependencies
+ Python 3; TensorFlow >= 1.6 (1.4 or 1.5 can run but may crash due to a TF bug);
+ [pycocotools](https://github.com/cocodataset/cocoapi/tree/master/PythonAPI/), OpenCV.
+ Python 3; OpenCV.
+ TensorFlow >= 1.6 (1.4 or 1.5 can run but may crash due to a TF bug);
+ pycocotools: `pip install 'git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI'`
+ Pre-trained [ImageNet ResNet model](http://models.tensorpack.com/FasterRCNN/)
from tensorpack model zoo. Use the models with "-AlignPadding".
from tensorpack model zoo.
+ COCO data. It needs to have the following directory structure:
```
COCO/DIR/
@@ -57,7 +58,7 @@ To predict on an image (and show output in a window):
```
Evaluate the performance of a model on COCO.
(Several trained models can be downloaded in [model zoo](http://models.tensorpack.com/FasterRCNN):
(Several trained models can be downloaded in [model zoo](http://models.tensorpack.com/FasterRCNN)):
```
./train.py --evaluate output.json --load /path/to/COCO-R50C4-MaskRCNN-Standard.npz \
--config MODE_MASK=True DATA.BASEDIR=/path/to/COCO/DIR
@@ -69,19 +70,19 @@ Evaluation or prediction will need the same `--config` used during training.
These models are trained with different configurations on trainval35k and evaluated on minival using mAP@IoU=0.50:0.95.
MaskRCNN results contain both box and mask mAP.
| Backbone | mAP<br/>(box;mask) | Detectron mAP <br/> (box;mask) | Time | Configurations <br/> (click to expand) |
| Backbone | mAP<br/>(box;mask) | Detectron mAP <br/> (box;mask) | Time on 8 V100s | Configurations <br/> (click to expand) |
| - | - | - | - | - |
| R50-C4 | 33.1 | | 18h on 8 V100s | <details><summary>super quick</summary>`MODE_MASK=False FRCNN.BATCH_PER_IM=64`<br/>`PREPROC.SHORT_EDGE_SIZE=600 PREPROC.MAX_SIZE=1024`<br/>`TRAIN.LR_SCHEDULE=[150000,230000,280000]` </details> |
| R50-C4 | 36.6 | 36.5 | 44h on 8 V100s | <details><summary>standard</summary>`MODE_MASK=False` </details> |
| R50-FPN | 37.5 | 37.9<sup>[1](#ft1)</sup> | 28h on 8 V100s | <details><summary>standard</summary>`MODE_MASK=False MODE_FPN=True` </details> |
| R50-C4 | 36.8;32.1 | | 39h on 8 P100s | <details><summary>quick</summary>`MODE_MASK=True FRCNN.BATCH_PER_IM=256`<br/>`TRAIN.LR_SCHEDULE=[150000,230000,280000]` </details> |
| R50-C4 | 37.8;33.1 | 37.8;32.8 | 49h on 8 V100s | <details><summary>standard</summary>`MODE_MASK=True` </details> |
| R50-FPN | 38.2;34.9 | 38.6;34.5<sup>[1](#ft1)</sup> | 32h on 8 V100s | <details><summary>standard</summary>`MODE_MASK=True MODE_FPN=True` </details> |
| R50-FPN | 38.5;34.8 | 38.6;34.2<sup>[2](#ft2)</sup> | 34h on 8 V100s | <details><summary>standard+ConvHead</summary>`MODE_MASK=True MODE_FPN=True`<br/>`FPN.FRCNN_HEAD_FUNC=fastrcnn_4conv1fc_head` </details> |
| R50-FPN | 39.5;35.2 | 39.5;34.4<sup>[2](#ft2)</sup> | 34h on 8 V100s | <details><summary>standard+ConvGNHead</summary>`MODE_MASK=True MODE_FPN=True`<br/>`FPN.FRCNN_HEAD_FUNC=fastrcnn_4conv1fc_gn_head` </details> |
| R101-C4 | 40.8;35.1 | | 63h on 8 V100s | <details><summary>standard</summary>`MODE_MASK=True `<br/>`BACKBONE.RESNET_NUM_BLOCK=[3,4,23,3]` </details> |
<a id="ft1">1</a>: Slightly different configurations.
| R50-C4 | 33.1 | | 18h | <details><summary>super quick</summary>`MODE_MASK=False FRCNN.BATCH_PER_IM=64`<br/>`PREPROC.SHORT_EDGE_SIZE=600 PREPROC.MAX_SIZE=1024`<br/>`TRAIN.LR_SCHEDULE=[150000,230000,280000]` </details> |
| R50-C4 | 36.6 | 36.5 | 44h | <details><summary>standard</summary>`MODE_MASK=False` </details> |
| R50-FPN | 37.4 | 37.9<sup>[1](#ft1)</sup> | 30h | <details><summary>standard</summary>`MODE_MASK=False MODE_FPN=True` </details> |
| R50-C4 | 37.8;33.1 [:arrow_down:](http://models.tensorpack.com/FasterRCNN/COCO-R50C4-MaskRCNN-Standard.npz) | 37.8;32.8 | 49h | <details><summary>standard</summary>`MODE_MASK=True` </details> |
| R50-FPN | 38.2;34.9 [:arrow_down:](http://models.tensorpack.com/FasterRCNN/COCO-R50FPN-MaskRCNN-Standard.npz) | 38.6;34.5<sup>[1](#ft1)</sup> | 32h | <details><summary>standard</summary>`MODE_MASK=True MODE_FPN=True` </details> |
| R50-FPN | 38.5;34.8 | 38.6;34.2<sup>[2](#ft2)</sup> | 34h | <details><summary>standard+ConvHead</summary>`MODE_MASK=True MODE_FPN=True`<br/>`FPN.FRCNN_HEAD_FUNC=fastrcnn_4conv1fc_head` </details> |
| R50-FPN | 39.5;35.2 | 39.5;34.4<sup>[2](#ft2)</sup> | 34h | <details><summary>standard+ConvGNHead</summary>`MODE_MASK=True MODE_FPN=True`<br/>`FPN.FRCNN_HEAD_FUNC=fastrcnn_4conv1fc_gn_head` </details> |
| R50-FPN | 40.0;36.1 [:arrow_down:](http://models.tensorpack.com/FasterRCNN/COCO-R50FPN-MaskRCNN-StandardGN.npz) | 40.3;35.7 | 44h | <details><summary>standard+GN</summary>`MODE_MASK=True MODE_FPN=True`<br/>`FPN.NORM=GN BACKBONE.NORM=GN`<br/>`FPN.FRCNN_HEAD_FUNC=fastrcnn_4conv1fc_gn_head`<br/>`FPN.MRCNN_HEAD_FUNC=maskrcnn_up4conv_gn_head` </details> |
| R101-C4 | 40.8;35.1 [:arrow_down:](http://models.tensorpack.com/FasterRCNN/COCO-R101C4-MaskRCNN-Standard.npz) | | 63h | <details><summary>standard</summary>`MODE_MASK=True `<br/>`BACKBONE.RESNET_NUM_BLOCK=[3,4,23,3]` </details> |
<a id="ft1">1</a>: This implementation has slightly different configurations from detectron (e.g. batch size).
<a id="ft2">2</a>: Numbers taken from [Group Normalization](https://arxiv.org/abs/1803.08494)
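The metric `mAP@IoU=0.50:0.95` used in the table averages AP over ten IoU thresholds, from 0.50 to 0.95 in steps of 0.05. The sketch below only illustrates the IoU computation and the threshold list. The helper `box_iou`, the `(x1, y1, x2, y2)` box format, and the sample boxes are assumptions for illustration; the real evaluation is performed by pycocotools.

```python
def box_iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Width and height of the overlap are clamped at zero for disjoint boxes.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


# The ten thresholds 0.50, 0.55, ..., 0.95.
thresholds = [round(0.50 + 0.05 * i, 2) for i in range(10)]

# Hypothetical prediction and ground-truth boxes.
iou = box_iou((0, 0, 10, 10), (2, 0, 12, 10))

# Thresholds at which this prediction would still count as a match.
matched_at = [t for t in thresholds if iou >= t]
print(iou, matched_at)
```

A prediction that overlaps its ground truth only loosely counts as a match at the low thresholds but not at the high ones, so the averaged metric rewards tight localization.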
......
@@ -15,14 +15,14 @@ This recipe has better performance than most open source implementations.
In fact, many papers that claim to "improve" ResNet only compete with a lower
baseline and they actually cannot beat this ResNet recipe.
| Model | Top 5 Error | Top 1 Error |
|:-------------------|-------------|------------:|
| ResNet18 | 10.50% | 29.66% |
| ResNet34 | 8.56% | 26.17% |
| ResNet50 | 6.85% | 23.61% |
| ResNet50-SE | 6.24% | 22.64% |
| ResNet101 | 6.04% | 21.95% |
| ResNet152 | 5.78% | 21.51% |
| Model | Top 5 Error | Top 1 Error | Download |
|:------------|:------------|:-----------:|:----------------------------------------------------------------------------:|
| ResNet18 | 10.50% | 29.66% | [:arrow_down:](http://models.tensorpack.com/ResNet/ImageNet-ResNet18.npz) |
| ResNet34 | 8.56% | 26.17% | [:arrow_down:](http://models.tensorpack.com/ResNet/ImageNet-ResNet34.npz) |
| ResNet50 | 6.85% | 23.61% | [:arrow_down:](http://models.tensorpack.com/ResNet/ImageNet-ResNet50.npz) |
| ResNet50-SE | 6.24% | 22.64% | [:arrow_down:](http://models.tensorpack.com/ResNet/ImageNet-ResNet50-SE.npz) |
| ResNet101 | 6.04% | 21.95% | [:arrow_down:](http://models.tensorpack.com/ResNet/ImageNet-ResNet101.npz) |
| ResNet152 | 5.78% | 21.51% | [:arrow_down:](http://models.tensorpack.com/ResNet/ImageNet-ResNet152.npz) |
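The download links point to NumPy `.npz` archives, so their contents can be inspected with NumPy alone. A minimal sketch, assuming the archive maps variable names to arrays: the file `demo-weights.npz` and the variable names in it are placeholders created here so that the snippet runs on its own. They are not the contents of a real model file.

```python
import os
import tempfile

import numpy as np

# Build a tiny stand-in archive; point `path` at a downloaded file instead.
path = os.path.join(tempfile.mkdtemp(), "demo-weights.npz")
np.savez(path, **{"conv0/W": np.zeros((3, 3, 3, 8)), "linear/b": np.zeros(10)})

# np.load on an .npz returns a lazy, dict-like object; `files` lists the keys.
with np.load(path) as params:
    shapes = {name: params[name].shape for name in params.files}

for name, shape in sorted(shapes.items()):
    print(name, shape)
```

Listing names and shapes this way is a quick check that a downloaded file matches the architecture you intend to load it into.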
To train, first decompress ImageNet data into [this structure](http://tensorpack.readthedocs.io/en/latest/modules/dataflow.dataset.html#tensorpack.dataflow.dataset.ILSVRC12), then:
```bash
......
@@ -13,7 +13,7 @@ from tensorflow.python.client import timeline
from .base import Callback
from ..utils import logger
from ..utils.concurrency import ensure_proc_terminate, start_proc_mask_signal
from ..utils.gpu import get_nr_gpu
from ..utils.gpu import get_num_gpu
from ..utils.nvml import NVMLContext
__all__ = ['GPUUtilizationTracker', 'GraphProfiler', 'PeakMemoryTracker']
@@ -40,7 +40,7 @@ class GPUUtilizationTracker(Callback):
if devices is None:
env = os.environ.get('CUDA_VISIBLE_DEVICES')
if env is None:
self._devices = list(range(get_nr_gpu()))
self._devices = list(range(get_num_gpu()))
logger.warn("[GPUUtilizationTracker] Both devices and CUDA_VISIBLE_DEVICES are None! "
"Will monitor all {} visible GPUs!".format(len(self._devices)))
else:
......
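The hunk above shows only the branch where neither `devices` nor `CUDA_VISIBLE_DEVICES` is set. As a hedged sketch of the overall resolution order, the function below is hypothetical: `resolve_devices`, its `num_gpu` argument (standing in for the result of `get_num_gpu()`), and in particular the parsing of the environment variable are assumptions, since that branch is not visible in the diff.

```python
def resolve_devices(devices=None, env=None, num_gpu=0):
    """Pick the GPU ids to monitor (illustrative only, not the library code)."""
    if devices is not None:
        # An explicit list from the caller wins.
        return list(devices)
    if env is None:
        # Nothing specified anywhere: fall back to every visible GPU.
        return list(range(num_gpu))
    # Assumed format: comma-separated integer ids such as "0,2".
    return [int(tok) for tok in env.split(",") if tok.strip()]


print(resolve_devices(None, None, num_gpu=4))
print(resolve_devices(None, "0,2"))
print(resolve_devices([1], "0,2"))
```

In real use `env` would come from `os.environ.get('CUDA_VISIBLE_DEVICES')`, as in the hunk.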
@@ -221,8 +221,8 @@ def BatchNorm(inputs, axis=None, training=None, momentum=0.9, epsilon=1e-5,
if sync_statistics == 'nccl':
if six.PY3 and TF_version <= (1, 9) and ctx.is_main_training_tower:
logger.warn("A TensorFlow bug will cause cross-GPU BatchNorm to fail. "
"Apply this patch: https://github.com/tensorflow/tensorflow/pull/20360")
logger.warn("A bug in TensorFlow<=1.9 will cause cross-GPU BatchNorm to fail. "
"Upgrade or apply this patch manually: https://github.com/tensorflow/tensorflow/pull/20360")
from tensorflow.contrib.nccl.ops import gen_nccl_ops
shared_name = re.sub('tower[0-9]+/', '', tf.get_variable_scope().name)
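The `re.sub` call at the end of this hunk removes the per-tower prefix from the variable scope name, presumably so that every GPU derives the same `shared_name`. A small stand-alone illustration follows. The helper name `strip_tower_prefix` and the scope strings are made up for this example and are not taken from a real graph.

```python
import re


def strip_tower_prefix(scope_name):
    """Drop every 'towerN/' component from a scope name."""
    return re.sub('tower[0-9]+/', '', scope_name)


# Hypothetical scope names from different towers, plus one without a prefix.
names = ['tower0/conv1/bn', 'tower3/conv1/bn', 'conv1/bn']
shared = [strip_tower_prefix(n) for n in names]
print(shared)
```

All three inputs map to the same string, which is the property a cross-GPU operation needs when it identifies peers by name.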
......