Commit 0a141662 authored by Yuxin Wu

update docs

parent 140778d4
......@@ -53,13 +53,16 @@ We do not answer machine learning questions and it is your responsibility to
figure out how to make your models more accurate.
### 4. Your environment:
+ Paste the output of this command: `python -c 'import tensorpack.tfutils as u; print(u.collect_env_info())'`
If this command failed, tell us your version of Python/TF/tensorpack.
Paste the output of this command: `python -c 'import tensorpack.tfutils as u; print(u.collect_env_info())'`
If this command failed, tell us your version of Python/TF/tensorpack.
Note that:
+ You can install Tensorpack master by `pip install -U git+https://github.com/ppwwyyxx/tensorpack.git`
and see if your issue is already solved.
+ If you're not using tensorpack under a normal command line shell (e.g.,
using an IDE or jupyter notebook), please retry under a normal command line shell.
+ Include relevant hardware information, e.g. number of GPUs used for training, amount of RAM.
You may often want to provide extra information related to your issue, but
at a minimum, please provide the above information __accurately__ to save effort in the investigation.
......@@ -78,9 +78,9 @@ Efficiency:
set in `train.py`; (2) reduce `buffer_size` or `NUM_WORKERS` in `data.py`
(which may negatively impact your throughput). The training only needs <10G RAM if `NUM_WORKERS=0`.
1. Inference is unoptimized. Tensorpack is a training interface, therefore it
does not help you on optimized inference. In fact, the current implementation
uses some slow numpy operations in inference (in `eval.py:_paste_mask`).
1. Inference is unoptimized. Tensorpack is a training interface: it produces the trained weights
in a standard format, but it does not help you with optimized inference.
In fact, the current implementation uses some slow numpy operations in inference (in `eval.py:_paste_mask`).
Possible Future Enhancements:
......
......@@ -44,9 +44,13 @@ to `annotations/` as well.
## Usage
It is recommended to get familiar with the relevant papers listed above before using this code.
Otherwise, you may end up doing something unreasonable.
### Train:
To train on a single machine:
To train on a single machine (with 1 or more GPUs):
```
./train.py --config \
BACKBONE.WEIGHTS=/path/to/ImageNet-R50-AlignPadding.npz \
......@@ -54,7 +58,8 @@ To train on a single machine:
[OTHER-ARCHITECTURE-SETTINGS]
```
To run distributed training, set `TRAINER=horovod` and refer to [HorovodTrainer docs](http://tensorpack.readthedocs.io/modules/train.html#tensorpack.train.HorovodTrainer).
Alternatively, use `TRAINER=horovod`, which also supports distributed training but is less straightforward to run.
Refer to [HorovodTrainer docs](http://tensorpack.readthedocs.io/modules/train.html#tensorpack.train.HorovodTrainer) for details.
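As a minimal sketch (assuming Horovod's `horovodrun` launcher; the host names, GPU counts, and paths below are placeholders, and the `--config` overrides are the same ones used in the single-machine command above), a distributed launch could look like:
```
# hypothetical launch on 2 machines with 4 GPUs each (8 workers total)
horovodrun -np 8 -H machine1:4,machine2:4 \
    ./train.py --config \
    TRAINER=horovod \
    BACKBONE.WEIGHTS=/path/to/ImageNet-R50-AlignPadding.npz \
    [OTHER-ARCHITECTURE-SETTINGS]
```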
All options can be changed either on the command line or in the `config.py` file (recommended).
Some reasonable configurations are listed in the table below.
......@@ -74,7 +79,7 @@ To evaluate the performance of a model on COCO:
```
Several trained models can be downloaded from the table below. Evaluation and
prediction will need to be run with the corresponding configs used in training.
prediction have to be run with the corresponding configs used in training.
## Results
......
......@@ -43,15 +43,18 @@ See `./alexnet.py --help` for usage.
This VGG16 script, when trained with 8 GPUs and a batch size of 32 per GPU, reaches the following
validation error after 100 epochs (30h with 8 P100s). This reproduces the VGG
experiments in the paper [Group Normalization](https://arxiv.org/abs/1803.08494).
experiments in the paper [Group Normalization](https://arxiv.org/abs/1803.08494)
(more code about this paper can be found at [GroupNorm-reproduce](https://github.com/ppwwyyxx/GroupNorm-reproduce)).
See `./vgg16.py --help` for usage.
| No Normalization | Batch Normalization | Group Normalization |
|:------------------------------------------|:-------------------:|:-------------------:|
| 29~30% (large variation with random seed) | 28% | 27.6% |
Note that the purpose of this experiment in the paper is not to claim GroupNorm
has better performance than BatchNorm.
<sup> Note that this single experiment does not constitute a valid claim that GroupNorm
has better performance than BatchNorm.</sup>
### Inception-BN
......
......@@ -31,12 +31,13 @@ These are the only toy examples in tensorpack. They are supposed to be just demo
## Vision:
| Name | Performance |
| --- | --- |
| Train [ResNet](ResNet), [ShuffleNet and other models](ImageNetModels) on ImageNet | reproduce paper |
| [Train Mask/Faster R-CNN on COCO](FasterRCNN) | reproduce paper |
| [Generative Adversarial Network(GAN) variants](GAN), including DCGAN, InfoGAN, <br/> Conditional GAN, WGAN, BEGAN, DiscoGAN, Image to Image, CycleGAN | visually reproduce |
| [DoReFa-Net: training binary / low-bitwidth CNN on ImageNet](DoReFa-Net) | reproduce paper |
| Train [ResNet](ResNet), [ShuffleNet and other models](ImageNetModels) on ImageNet | reproduce 10 papers |
| [Train Mask/Faster R-CNN on COCO](FasterRCNN) | reproduce 7 papers |
| [Generative Adversarial Network(GAN) variants](GAN), including DCGAN, InfoGAN, <br/> Conditional GAN, WGAN, BEGAN, DiscoGAN, Image to Image, CycleGAN | visually reproduce 8 papers |
| [DoReFa-Net: training binary / low-bitwidth CNN on ImageNet](DoReFa-Net) | reproduce 4 papers |
| [Adversarial training with state-of-the-art robustness](https://github.com/facebookresearch/ImageNet-Adversarial-Training) | official code for the paper |
| [Fully-convolutional Network for Holistically-Nested Edge Detection(HED)](HED) | visually reproduce |
| [Spatial Transformer Networks on MNIST addition](SpatialTransformer) | reproduce paper |
| [Spatial Transformer Networks on MNIST addition](SpatialTransformer) | reproduce the paper |
| [Visualize CNN saliency maps](Saliency) | visually reproduce |
| [Similarity learning on MNIST](SimilarityLearning) | |
| Single-image super-resolution using [EnhanceNet](SuperResolution) | |
......@@ -47,12 +48,12 @@ These are the only toy examples in tensorpack. They are supposed to be just demo
## Reinforcement Learning:
| Name | Performance |
| --- | --- |
| [Deep Q-Network(DQN) variants on Atari games](DeepQNetwork), including <br/> DQN, DoubleDQN, DuelingDQN. | reproduce paper |
| [Asynchronous Advantage Actor-Critic(A3C) on Atari games](A3C-Gym) | reproduce paper |
| [Deep Q-Network(DQN) variants on Atari games](DeepQNetwork), including <br/> DQN, DoubleDQN, DuelingDQN. | reproduce the paper |
| [Asynchronous Advantage Actor-Critic(A3C) on Atari games](A3C-Gym) | reproduce the paper |
## Speech / NLP:
| Name | Performance |
| --- | --- |
| [LSTM-CTC for speech recognition](CTC-TIMIT) | reproduce paper |
| [LSTM-CTC for speech recognition](CTC-TIMIT) | reproduce the paper |
| [char-rnn for fun](Char-RNN) | fun |
| [LSTM language model on PennTreebank](PennTreebank) | reproduce reference code |
......@@ -26,7 +26,7 @@ baseline and they actually cannot beat this standard ResNet recipe.
| ResNeXt101-32x4d | 5.73% | 21.05% | [:arrow_down:](http://models.tensorpack.com/ResNet/ImageNet-ResNeXt101-32x4d.npz) |
| ResNet152 | 5.78% | 21.51% | [:arrow_down:](http://models.tensorpack.com/ResNet/ImageNet-ResNet152.npz) |
To reproduce training or evaluation,
To reproduce training or evaluation in the above table,
first decompress ImageNet data into [this structure](http://tensorpack.readthedocs.io/modules/dataflow.dataset.html#tensorpack.dataflow.dataset.ILSVRC12), then:
```bash
./imagenet-resnet.py --data /directory/of/ILSVRC -d 50 --batch 512
......
......@@ -19,9 +19,10 @@ class RandomCrop(ImageAugmentor):
def __init__(self, crop_shape):
"""
Args:
crop_shape: (h, w) tuple or a int
crop_shape: an int, or an (h, w) tuple of ints
"""
crop_shape = shape2d(crop_shape)
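# normalize whatever shape2d returned into a plain tuple of two Python ints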
crop_shape = (int(crop_shape[0]), int(crop_shape[1]))
super(RandomCrop, self).__init__()
self._init(locals())
......@@ -30,9 +31,9 @@ class RandomCrop(ImageAugmentor):
assert orig_shape[0] >= self.crop_shape[0] \
and orig_shape[1] >= self.crop_shape[1], orig_shape
diffh = orig_shape[0] - self.crop_shape[0]
h0 = 0 if diffh == 0 else self.rng.randint(diffh)
h0 = self.rng.randint(diffh + 1)
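# randint(diffh + 1) draws the vertical offset uniformly from [0, diffh] inclusive; the old expression could never return diffh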
diffw = orig_shape[1] - self.crop_shape[1]
w0 = 0 if diffw == 0 else self.rng.randint(diffw)
w0 = self.rng.randint(diffw + 1)
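# likewise, the horizontal offset is drawn from [0, diffw] inclusive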
return CropTransform(h0, w0, self.crop_shape[0], self.crop_shape[1])
......