Commit 72674731 authored by Yuxin Wu

update docs

parent 101d7aa5
@@ -3,7 +3,7 @@ An issue has to be one of the following:
- [ ] Feature Requests - [ ] Feature Requests
- [ ] Questions on Using/Understanding Tensorpack - [ ] Questions on Using/Understanding Tensorpack
For any unexpected problems, __PLEASE ALWAYS INCLUDE__: ## For any unexpected problems, __PLEASE ALWAYS INCLUDE__:
1. What you did: 1. What you did:
+ If you're using examples: + If you're using examples:
+ What's the command you run: + What's the command you run:
@@ -21,18 +21,17 @@ For any unexpected problems, __PLEASE ALWAYS INCLUDE__:
You can install Tensorpack master by `pip install -U git+https://github.com/ppwwyyxx/tensorpack.git`.: You can install Tensorpack master by `pip install -U git+https://github.com/ppwwyyxx/tensorpack.git`.:
+ Hardware information, if relevant. + Hardware information, if relevant.
For efficiency issues, PLEASE first read http://tensorpack.readthedocs.io/en/latest/tutorial/performance-tuning.html About efficiency issues, PLEASE first read http://tensorpack.readthedocs.io/en/latest/tutorial/performance-tuning.html
Feature Requests: ## Feature Requests:
+ You can implement a lot of features by extending Tensorpack + You can implement a lot of features by extending Tensorpack
(See http://tensorpack.readthedocs.io/en/latest/tutorial/index.html#extend-tensorpack). (See http://tensorpack.readthedocs.io/en/latest/tutorial/index.html#extend-tensorpack).
It does not have to be added to Tensorpack unless you have a good reason. It does not have to be added to Tensorpack unless you have a good reason.
+ "Could you improve/implement an example/paper ?" + "Could you improve/implement an example/paper ?"
-- the answer is: we don't know, and we don't take feature requests for -- the answer is: we have no plans to do so and we don't take feature requests for
examples. You should do it yourself with Tensorpack. If you don't know how to examples. If you don't know how to do it, you may ask a usage question.
do it, you may ask a usage question.
Usage Questions: ## Usage Questions:
+ Read the [tutorials](http://tensorpack.readthedocs.io/en/latest/tutorial/index.html#user-tutorials) first. + Read the [tutorials](http://tensorpack.readthedocs.io/en/latest/tutorial/index.html#user-tutorials) first.
+ We answer "HOW to do X with Tensorpack" for a well-defined X. + We answer "HOW to do X with Tensorpack" for a well-defined X.
......
@@ -13,14 +13,14 @@ Alternative link to this page: [http://dorefa.net](http://dorefa.net)
This is a good set of baselines for research in model quantization. This is a good set of baselines for research in model quantization.
These quantization techniques, when applied on AlexNet, achieves the following ImageNet performance in this implementation: These quantization techniques, when applied on AlexNet, achieves the following ImageNet performance in this implementation:
| Model | Bit Width <br/> (weights, activations, gradients) | Top 1 Validation Error <sup>[1](#ft1)</sup>| | Model | Bit Width <br/> (weights, activations, gradients) | Top 1 Validation Error <sup>[1](#ft1)</sup> |
|:----------------------------------:|:-------------------------------------------------:|:----------------------:| |:----------------------------------:|:-------------------------------------------------:|:-----------------------------------------------------------------------------:|
| Full Precision<sup>[2](#ft2)</sup> | 32,32,32 | 40.3% | | Full Precision<sup>[2](#ft2)</sup> | 32,32,32 | 40.3% |
| TTQ | t,32,32 | 42.0% | | TTQ | t,32,32 | 42.0% |
| BWN | 1,32,32 | 44.6% | | BWN | 1,32,32 | 44.6% |
| BNN | 1,1,32 | 51.9% | | BNN | 1,1,32 | 51.9% |
| DoReFa | 1,2,32 | 46.6% | | DoReFa | 1,2,32 | 46.6% |
| DoReFa | 1,2,6 | 46.8% | | DoReFa | 1,2,6 | 46.8% [:arrow_down:](http://models.tensorpack.com/DoReFa-Net/alexnet-126.npz) |
| DoReFa | 1,2,4 | 54.0% | | DoReFa | 1,2,4 | 54.0% |
<a id="ft1">1</a>: These numbers were obtained by training on 8 GPUs with a total batch size of 256. <a id="ft1">1</a>: These numbers were obtained by training on 8 GPUs with a total batch size of 256.
......
@@ -11,10 +11,11 @@ with the support of:
+ [Group Normalization](https://arxiv.org/abs/1803.08494) + [Group Normalization](https://arxiv.org/abs/1803.08494)
## Dependencies ## Dependencies
+ Python 3; TensorFlow >= 1.6 (1.4 or 1.5 can run but may crash due to a TF bug); + Python 3; OpenCV.
+ [pycocotools](https://github.com/cocodataset/cocoapi/tree/master/PythonAPI/), OpenCV. + TensorFlow >= 1.6 (1.4 or 1.5 can run but may crash due to a TF bug);
+ pycocotools: `pip install 'git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI'`
+ Pre-trained [ImageNet ResNet model](http://models.tensorpack.com/FasterRCNN/) + Pre-trained [ImageNet ResNet model](http://models.tensorpack.com/FasterRCNN/)
from tensorpack model zoo. Use the models with "-AlignPadding". from tensorpack model zoo.
+ COCO data. It needs to have the following directory structure: + COCO data. It needs to have the following directory structure:
``` ```
COCO/DIR/ COCO/DIR/
@@ -57,7 +58,7 @@ To predict on an image (and show output in a window):
``` ```
Evaluate the performance of a model on COCO. Evaluate the performance of a model on COCO.
(Several trained models can be downloaded in [model zoo](http://models.tensorpack.com/FasterRCNN): (Several trained models can be downloaded in [model zoo](http://models.tensorpack.com/FasterRCNN)):
``` ```
./train.py --evaluate output.json --load /path/to/COCO-R50C4-MaskRCNN-Standard.npz \ ./train.py --evaluate output.json --load /path/to/COCO-R50C4-MaskRCNN-Standard.npz \
--config MODE_MASK=True DATA.BASEDIR=/path/to/COCO/DIR --config MODE_MASK=True DATA.BASEDIR=/path/to/COCO/DIR
@@ -69,19 +70,19 @@ Evaluation or prediction will need the same `--config` used during training.
These models are trained with different configurations on trainval35k and evaluated on minival using mAP@IoU=0.50:0.95. These models are trained with different configurations on trainval35k and evaluated on minival using mAP@IoU=0.50:0.95.
MaskRCNN results contain both box and mask mAP. MaskRCNN results contain both box and mask mAP.
| Backbone | mAP<br/>(box;mask) | Detectron mAP <br/> (box;mask) | Time | Configurations <br/> (click to expand) | | Backbone | mAP<br/>(box;mask) | Detectron mAP <br/> (box;mask) | Time on 8 V100s | Configurations <br/> (click to expand) |
| - | - | - | - | - | | - | - | - | - | - |
| R50-C4 | 33.1 | | 18h on 8 V100s | <details><summary>super quick</summary>`MODE_MASK=False FRCNN.BATCH_PER_IM=64`<br/>`PREPROC.SHORT_EDGE_SIZE=600 PREPROC.MAX_SIZE=1024`<br/>`TRAIN.LR_SCHEDULE=[150000,230000,280000]` </details> | | R50-C4 | 33.1 | | 18h | <details><summary>super quick</summary>`MODE_MASK=False FRCNN.BATCH_PER_IM=64`<br/>`PREPROC.SHORT_EDGE_SIZE=600 PREPROC.MAX_SIZE=1024`<br/>`TRAIN.LR_SCHEDULE=[150000,230000,280000]` </details> |
| R50-C4 | 36.6 | 36.5 | 44h on 8 V100s | <details><summary>standard</summary>`MODE_MASK=False` </details> | | R50-C4 | 36.6 | 36.5 | 44h | <details><summary>standard</summary>`MODE_MASK=False` </details> |
| R50-FPN | 37.5 | 37.9<sup>[1](#ft1)</sup> | 28h on 8 V100s | <details><summary>standard</summary>`MODE_MASK=False MODE_FPN=True` </details> | | R50-FPN | 37.4 | 37.9<sup>[1](#ft1)</sup> | 30h | <details><summary>standard</summary>`MODE_MASK=False MODE_FPN=True` </details> |
| R50-C4 | 36.8;32.1 | | 39h on 8 P100s | <details><summary>quick</summary>`MODE_MASK=True FRCNN.BATCH_PER_IM=256`<br/>`TRAIN.LR_SCHEDULE=[150000,230000,280000]` </details> | | R50-C4 | 37.8;33.1 [:arrow_down:](http://models.tensorpack.com/FasterRCNN/COCO-R50C4-MaskRCNN-Standard.npz) | 37.8;32.8 | 49h | <details><summary>standard</summary>`MODE_MASK=True` </details> |
| R50-C4 | 37.8;33.1 | 37.8;32.8 | 49h on 8 V100s | <details><summary>standard</summary>`MODE_MASK=True` </details> | | R50-FPN | 38.2;34.9 [:arrow_down:](http://models.tensorpack.com/FasterRCNN/COCO-R50FPN-MaskRCNN-Standard.npz) | 38.6;34.5<sup>[1](#ft1)</sup> | 32h | <details><summary>standard</summary>`MODE_MASK=True MODE_FPN=True` </details> |
| R50-FPN | 38.2;34.9 | 38.6;34.5<sup>[1](#ft1)</sup> | 32h on 8 V100s | <details><summary>standard</summary>`MODE_MASK=True MODE_FPN=True` </details> | | R50-FPN | 38.5;34.8 | 38.6;34.2<sup>[2](#ft2)</sup> | 34h | <details><summary>standard+ConvHead</summary>`MODE_MASK=True MODE_FPN=True`<br/>`FPN.FRCNN_HEAD_FUNC=fastrcnn_4conv1fc_head` </details> |
| R50-FPN | 38.5;34.8 | 38.6;34.2<sup>[2](#ft2)</sup> | 34h on 8 V100s | <details><summary>standard+ConvHead</summary>`MODE_MASK=True MODE_FPN=True`<br/>`FPN.FRCNN_HEAD_FUNC=fastrcnn_4conv1fc_head` </details> | | R50-FPN | 39.5;35.2 | 39.5;34.4<sup>[2](#ft2)</sup> | 34h | <details><summary>standard+ConvGNHead</summary>`MODE_MASK=True MODE_FPN=True`<br/>`FPN.FRCNN_HEAD_FUNC=fastrcnn_4conv1fc_gn_head` </details> |
| R50-FPN | 39.5;35.2 | 39.5;34.4<sup>[2](#ft2)</sup> | 34h on 8 V100s | <details><summary>standard+ConvGNHead</summary>`MODE_MASK=True MODE_FPN=True`<br/>`FPN.FRCNN_HEAD_FUNC=fastrcnn_4conv1fc_gn_head` </details> | | R50-FPN | 40.0;36.1 [:arrow_down:](http://models.tensorpack.com/FasterRCNN/COCO-R50FPN-MaskRCNN-StandardGN.npz) | 40.3;35.7 | 44h | <details><summary>standard+GN</summary>`MODE_MASK=True MODE_FPN=True`<br/>`FPN.NORM=GN BACKBONE.NORM=GN`<br/>`FPN.FRCNN_HEAD_FUNC=fastrcnn_4conv1fc_gn_head`<br/>`FPN.MRCNN_HEAD_FUNC=maskrcnn_up4conv_gn_head |
| R101-C4 | 40.8;35.1 | | 63h on 8 V100s | <details><summary>standard</summary>`MODE_MASK=True `<br/>`BACKBONE.RESNET_NUM_BLOCK=[3,4,23,3]` </details> | | R101-C4 | 40.8;35.1 [:arrow_down:](http://models.tensorpack.com/FasterRCNN/COCO-R101C4-MaskRCNN-Standard.npz) | | 63h | <details><summary>standard</summary>`MODE_MASK=True `<br/>`BACKBONE.RESNET_NUM_BLOCK=[3,4,23,3]` </details> |
<a id="ft1">1</a>: Slightly different configurations. <a id="ft1">1</a>: This implementation has slightly different configurations from detectron (e.g. batch size).
<a id="ft2">2</a>: Numbers taken from [Group Normalization](https://arxiv.org/abs/1803.08494) <a id="ft2">2</a>: Numbers taken from [Group Normalization](https://arxiv.org/abs/1803.08494)
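The `mAP@IoU=0.50:0.95` metric quoted for the table above averages precision over a grid of IoU thresholds from 0.50 to 0.95 in steps of 0.05. The following is a minimal sketch of that grid and of the box IoU it is applied to, not the evaluation code used by this example (which relies on pycocotools). The helper names `box_iou` and `matched_thresholds` and the `(x1, y1, x2, y2)` box convention are assumptions made for illustration.

```python
def box_iou(a, b):
    """Intersection over union of two boxes.

    Assumption: boxes are (x1, y1, x2, y2) tuples with x2 > x1 and y2 > y1.
    """
    # Corners of the overlapping rectangle (may be empty).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


# The ten thresholds 0.50, 0.55, ..., 0.95 implied by "IoU=0.50:0.95".
IOU_THRESHOLDS = [round(0.50 + 0.05 * i, 2) for i in range(10)]


def matched_thresholds(pred, gt):
    """Return the thresholds at which `pred` counts as a match for `gt`."""
    iou = box_iou(pred, gt)
    return [t for t in IOU_THRESHOLDS if iou >= t]
```

A detection that overlaps its ground-truth box with IoU 0.72 therefore counts as correct at the five loosest thresholds only. The full metric additionally computes average precision per category at each threshold and averages the results, which this sketch leaves out.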
......
@@ -15,14 +15,14 @@ This recipe has better performance than most open source implementations.
In fact, many papers that claim to "improve" ResNet only compete with a lower In fact, many papers that claim to "improve" ResNet only compete with a lower
baseline and they actually cannot beat this ResNet recipe. baseline and they actually cannot beat this ResNet recipe.
| Model | Top 5 Error | Top 1 Error | | Model | Top 5 Error | Top 1 Error | Download |
|:-------------------|-------------|------------:| |:------------|:------------|:-----------:|:----------------------------------------------------------------------------:|
| ResNet18 | 10.50% | 29.66% | | ResNet18 | 10.50% | 29.66% | [:arrow_down:](http://models.tensorpack.com/ResNet/ImageNet-ResNet18.npz) |
| ResNet34 | 8.56% | 26.17% | | ResNet34 | 8.56% | 26.17% | [:arrow_down:](http://models.tensorpack.com/ResNet/ImageNet-ResNet34.npz) |
| ResNet50 | 6.85% | 23.61% | | ResNet50 | 6.85% | 23.61% | [:arrow_down:](http://models.tensorpack.com/ResNet/ImageNet-ResNet50.npz) |
| ResNet50-SE | 6.24% | 22.64% | | ResNet50-SE | 6.24% | 22.64% | [:arrow_down:](http://models.tensorpack.com/ResNet/ImageNet-ResNet50-SE.npz) |
| ResNet101 | 6.04% | 21.95% | | ResNet101 | 6.04% | 21.95% | [:arrow_down:](http://models.tensorpack.com/ResNet/ImageNet-ResNet101.npz) |
| ResNet152 | 5.78% | 21.51% | | ResNet152 | 5.78% | 21.51% | [:arrow_down:](http://models.tensorpack.com/ResNet/ImageNet-ResNet152.npz) |
To train, first decompress ImageNet data into [this structure](http://tensorpack.readthedocs.io/en/latest/modules/dataflow.dataset.html#tensorpack.dataflow.dataset.ILSVRC12), then: To train, first decompress ImageNet data into [this structure](http://tensorpack.readthedocs.io/en/latest/modules/dataflow.dataset.html#tensorpack.dataflow.dataset.ILSVRC12), then:
```bash ```bash
......
@@ -13,7 +13,7 @@ from tensorflow.python.client import timeline
from .base import Callback from .base import Callback
from ..utils import logger from ..utils import logger
from ..utils.concurrency import ensure_proc_terminate, start_proc_mask_signal from ..utils.concurrency import ensure_proc_terminate, start_proc_mask_signal
from ..utils.gpu import get_nr_gpu from ..utils.gpu import get_num_gpu
from ..utils.nvml import NVMLContext from ..utils.nvml import NVMLContext
__all__ = ['GPUUtilizationTracker', 'GraphProfiler', 'PeakMemoryTracker'] __all__ = ['GPUUtilizationTracker', 'GraphProfiler', 'PeakMemoryTracker']
@@ -40,7 +40,7 @@ class GPUUtilizationTracker(Callback):
if devices is None: if devices is None:
env = os.environ.get('CUDA_VISIBLE_DEVICES') env = os.environ.get('CUDA_VISIBLE_DEVICES')
if env is None: if env is None:
self._devices = list(range(get_nr_gpu())) self._devices = list(range(get_num_gpu()))
logger.warn("[GPUUtilizationTracker] Both devices and CUDA_VISIBLE_DEVICES are None! " logger.warn("[GPUUtilizationTracker] Both devices and CUDA_VISIBLE_DEVICES are None! "
"Will monitor all {} visible GPUs!".format(len(self._devices))) "Will monitor all {} visible GPUs!".format(len(self._devices)))
else: else:
......
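The `GPUUtilizationTracker` hunk above picks the devices to monitor: an explicit `devices` argument wins, otherwise `CUDA_VISIBLE_DEVICES` is consulted, and if that is unset every visible GPU is monitored. A minimal standalone sketch of that fallback order follows. The function `resolve_devices` and its parameters are hypothetical, and the handling of a set `CUDA_VISIBLE_DEVICES` is an assumption, since the `else:` branch is not shown in the diff.

```python
import os


def resolve_devices(devices=None, num_visible_gpus=0, environ=None):
    """Hypothetical helper mirroring the fallback order shown in the diff.

    `num_visible_gpus` stands in for the result of `get_num_gpu()`.
    """
    if devices is not None:
        # An explicit device list always takes priority.
        return list(devices)
    environ = os.environ if environ is None else environ
    env = environ.get('CUDA_VISIBLE_DEVICES')
    if env is None:
        # Same fallback as in the diff: monitor every visible GPU.
        return list(range(num_visible_gpus))
    # Assumption: the variable holds comma-separated integer indices.
    # An empty string means no GPU is visible; GPU UUIDs are not handled.
    return [int(x) for x in env.split(',') if x.strip()]
```

For example, `resolve_devices(num_visible_gpus=4, environ={})` yields `[0, 1, 2, 3]`, while `environ={'CUDA_VISIBLE_DEVICES': '2,3'}` yields `[2, 3]`.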
@@ -221,8 +221,8 @@ def BatchNorm(inputs, axis=None, training=None, momentum=0.9, epsilon=1e-5,
if sync_statistics == 'nccl': if sync_statistics == 'nccl':
if six.PY3 and TF_version <= (1, 9) and ctx.is_main_training_tower: if six.PY3 and TF_version <= (1, 9) and ctx.is_main_training_tower:
logger.warn("A TensorFlow bug will cause cross-GPU BatchNorm to fail. " logger.warn("A bug in TensorFlow<=1.9 will cause cross-GPU BatchNorm to fail. "
"Apply this patch: https://github.com/tensorflow/tensorflow/pull/20360") "Upgrade or apply this patch manually: https://github.com/tensorflow/tensorflow/pull/20360")
from tensorflow.contrib.nccl.ops import gen_nccl_ops from tensorflow.contrib.nccl.ops import gen_nccl_ops
shared_name = re.sub('tower[0-9]+/', '', tf.get_variable_scope().name) shared_name = re.sub('tower[0-9]+/', '', tf.get_variable_scope().name)
......
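The `BatchNorm` hunk above relies on two small idioms: comparing the TensorFlow version as a tuple (`TF_version <= (1, 9)`) and stripping the per-GPU `towerN/` prefix from a variable scope so that all towers share one name. A minimal sketch of both follows. `parse_version` and `strip_tower_prefix` are hypothetical helpers, and how `TF_version` is actually computed is an assumption, since the diff does not show it.

```python
import re


def parse_version(version_string):
    """Turn a version string such as '1.9.0' into a (major, minor) tuple."""
    return tuple(int(part) for part in version_string.split('.')[:2])


def strip_tower_prefix(scope_name):
    """Remove 'towerN/' components, using the same regex as the diff."""
    return re.sub('tower[0-9]+/', '', scope_name)


def needs_nccl_patch_warning(version_string):
    """True for versions at or below 1.9, the range named in the warning."""
    return parse_version(version_string) <= (1, 9)
```

Tuples compare element by element as integers, so `(1, 10)` is correctly treated as newer than `(1, 9)`. Comparing the raw strings would get this wrong, because `'1.10' < '1.9'` lexicographically.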