Commit 964f5d03 authored by Yuxin Wu

update docs; add BN WD in resnet

parent 274c7544
@@ -34,9 +34,9 @@ See [tutorials and documentations](http://tensorpack.readthedocs.io/tutorial/ind
 ## Examples:
-We refuse toy examples.
-Instead of showing you 10 arbitrary networks trained on toy datasets,
-[Tensorpack examples](examples) faithfully replicate papers and care about reproducing numbers,
+We refuse toy examples. We refuse low-quality implementations.
+Unlike most open source repos which only __implement__ papers,
+[Tensorpack examples](examples) faithfully __reproduce__ papers,
 demonstrating its __flexibility__ for actual research.
 ### Vision:
...
@@ -20,9 +20,9 @@ Claimed performance in the paper can be reproduced, on several games I've tested
 ![DQN](curve-breakout.png)
-On one (Maxwell) TitanX, Double-DQN took ~18 hours of training to reach a score of 400 on breakout.
-Double-DQN with nature paper setting runs at 60 batches (3840 trained frames, 240 seen frames, 960 game frames) per second on TitanX.
+On one GTX 1080Ti, the ALE version took ~3 hours of training to reach 21 (maximum) score on
+Pong, ~15 hours of training to reach 400 score on Breakout.
+It runs at 50 batches (~3.2k trained frames, 200 seen frames, 800 game frames) per second on GTX 1080Ti.
 ## How to use
...
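A quick check on the numbers above: assuming the standard settings in this example (batch size 64, frame skip 4, one training step every 4 seen frames), 50 batches/s works out to 50 × 64 = 3200 trained frames, 50 × 4 = 200 seen frames, and 200 × 4 = 800 game frames per second.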
@@ -423,7 +423,8 @@ if __name__ == '__main__':
     if args.visualize or args.evaluate or args.predict:
         if not tf.test.is_gpu_available():
             from tensorflow.python.framework import test_util
-            assert test_util.IsMklEnabled(), "Inference requires either GPU support or MKL support!"
+            assert get_tf_version_tuple() >= (1, 7) and test_util.IsMklEnabled(), \
+                "Inference requires either GPU support or MKL support!"
         assert args.load
     finalize_configs(is_training=False)
...
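The guard added above needs the version check because `test_util.IsMklEnabled()` only exists in TF >= 1.7. A standalone sketch of the same logic, using a hypothetical stand-in for tensorpack's `get_tf_version_tuple`:

```python
import tensorflow as tf
from tensorflow.python.framework import test_util

def get_tf_version_tuple():
    # hypothetical stand-in for tensorpack's helper:
    # parse a version string like "1.13.1" into (1, 13) for easy comparison
    return tuple(map(int, tf.__version__.split('.')[:2]))

if not tf.test.is_gpu_available():
    # IsMklEnabled() is only available in TF >= 1.7, hence the version guard
    assert get_tf_version_tuple() >= (1, 7) and test_util.IsMklEnabled(), \
        "Inference requires either GPU support or MKL support!"
```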
@@ -9,14 +9,15 @@ Github is full of deep learning code that "implements" but does not "reproduce"
 methods, and you'll not know whether the implementation is actually correct.
 See [Unawareness of Deep Learning Mistakes](https://medium.com/@ppwwyyxx/unawareness-of-deep-learning-mistakes-d5b5774da0ba).
-We refuse toy examples.
-Instead of showing you 10 arbitrary networks trained on toy datasets with random final performance,
-tensorpack examples try to faithfully replicate experiments and performance in the paper,
+We refuse toy examples. We refuse low-quality implementations.
+Unlike most open source repos which only __implement__ methods,
+[Tensorpack examples](examples) faithfully __reproduce__
+experiments and performance in the paper,
 so you're confident that they are correct.
 ## Getting Started:
-These are all the toy examples in tensorpack. They are supposed to be just demos.
+These are the only toy examples in tensorpack. They are supposed to be just demos.
 + [An illustrative MNIST example with explanation of the framework](basics/mnist-convnet.py)
 + Tensorpack supports any symbolic libraries. See the same MNIST example written with [tf.layers](basics/mnist-tflayers.py), [tf-slim](basics/mnist-tfslim.py), and [with weights visualizations](basics/mnist-visualizations.py)
 + A tiny [Cifar ConvNet](basics/cifar-convnet.py) and [SVHN ConvNet](basics/svhn-digit-convnet.py)
@@ -27,7 +28,7 @@ These are all the toy examples in tensorpack. They are supposed to be just demos
 | Name | Performance |
 | --- | --- |
 | Train [ResNet](ResNet), [ShuffleNet and other models](ImageNetModels) on ImageNet | reproduce paper |
 | [Train Mask/Faster R-CNN on COCO](FasterRCNN) | reproduce paper |
 | [Generative Adversarial Network(GAN) variants](GAN), including DCGAN, InfoGAN, <br/> Conditional GAN, WGAN, BEGAN, DiscoGAN, Image to Image, CycleGAN | visually reproduce |
 | [DoReFa-Net: training binary / low-bitwidth CNN on ImageNet](DoReFa-Net) | reproduce paper |
 | [Fully-convolutional Network for Holistically-Nested Edge Detection(HED)](HED) | visually reproduce |
@@ -37,7 +38,7 @@ These are all the toy examples in tensorpack. They are supposed to be just demos
 | Single-image super-resolution using [EnhanceNet](SuperResolution) | |
 | Learn steering filters with [Dynamic Filter Networks](DynamicFilterNetwork) | visually reproduce |
 | Load a pre-trained [AlexNet, VGG, or Convolutional Pose Machines](CaffeModels) | |
 | Load a pre-trained [FlowNet2-S, FlowNet2-C, FlowNet2](OpticalFlow) | |
 ## Reinforcement Learning:
 | Name | Performance |
...
@@ -103,16 +103,25 @@ def get_config(model):
 if __name__ == '__main__':
     parser = argparse.ArgumentParser()
+    # generic:
     parser.add_argument('--gpu', help='comma separated list of GPU(s) to use. Default to use all available ones')
-    parser.add_argument('--data', help='ILSVRC dataset dir')
+    parser.add_argument('--eval', action='store_true', help='run offline evaluation instead of training')
     parser.add_argument('--load', help='load a model for training or evaluation')
+
+    # data:
+    parser.add_argument('--data', help='ILSVRC dataset dir')
     parser.add_argument('--fake', help='use FakeData to debug or benchmark this model', action='store_true')
     parser.add_argument('--symbolic', help='use symbolic data loader', action='store_true')
-    parser.add_argument('--data-format', help='image data format',
+
+    # model:
+    parser.add_argument('--data-format', help='the image data layout used by the model',
                         default='NCHW', choices=['NCHW', 'NHWC'])
     parser.add_argument('-d', '--depth', help='ResNet depth',
                         type=int, default=50, choices=[18, 34, 50, 101, 152])
-    parser.add_argument('--eval', action='store_true', help='run offline evaluation instead of training')
+    parser.add_argument('--weight-decay-norm', action='store_true',
+                        help="apply weight decay on normalization layers (gamma & beta). "
+                             "This is used in torch/pytorch, and slightly "
+                             "improves validation accuracy of large models.")
     parser.add_argument('--batch', default=256, type=int,
                         help="total batch size. "
                         "Note that it's best to keep per-GPU batch size in [32, 64] to obtain the best accuracy."
@@ -126,6 +135,9 @@ if __name__ == '__main__':
     model = Model(args.depth, args.mode)
     model.data_format = args.data_format
+    if args.weight_decay_norm:
+        model.weight_decay_pattern = ".*/W|.*/gamma|.*/beta"
+
     if args.eval:
         batch = 128    # something that can run on one gpu
         ds = get_imagenet_dataflow(args.data, 'val', batch)
...
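The new flag works by widening the regex that selects variables for L2 regularization, from just the conv/FC weights (`.*/W`) to also the `gamma` and `beta` of normalization layers. A minimal sketch of regex-based weight decay (a hypothetical helper, not tensorpack's own `regularize_cost`):

```python
import re
import tensorflow as tf

def l2_cost_by_pattern(pattern, weight_decay):
    # collect trainable variables whose names match the pattern,
    # e.g. ".*/W|.*/gamma|.*/beta", and build a single L2 cost from them
    regex = re.compile(pattern)
    matched = [v for v in tf.trainable_variables() if regex.match(v.name)]
    return tf.multiply(weight_decay,
                       tf.add_n([tf.nn.l2_loss(v) for v in matched]),
                       name='l2_regularize_cost')
```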
@@ -198,7 +198,7 @@ def Conv2DTranspose(
     else:
         # Our own implementation, to avoid Keras bugs. https://github.com/tensorflow/tensorflow/issues/25946
         assert kernel_regularizer is None and bias_regularizer is None and activity_regularizer is None, \
-            "Unsupported arguments due to bug in TensorFlow 1.13"
+            "Unsupported arguments due to Keras bug in TensorFlow 1.13"
         data_format = get_data_format(data_format, keras_mode=False)
         shape_dyn = tf.shape(inputs)
         strides2d = shape2d(strides)
...
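The workaround mentioned in the comment computes `conv2d_transpose`'s output shape from the runtime `tf.shape(inputs)` rather than from static shapes, so unknown batch or spatial sizes still work. A minimal sketch of that idea, assuming NHWC layout and 'SAME' padding (the names here are illustrative, not tensorpack's):

```python
import tensorflow as tf

def conv2d_transpose_dynamic(inputs, kernel, stride):
    # kernel shape: [kh, kw, out_channels, in_channels]
    shape_dyn = tf.shape(inputs)  # usable even when shapes are unknown at graph-build time
    out_shape = tf.stack([shape_dyn[0],             # batch
                          shape_dyn[1] * stride,    # height
                          shape_dyn[2] * stride,    # width
                          int(kernel.shape[2])])    # output channels
    return tf.nn.conv2d_transpose(inputs, kernel, out_shape,
                                  strides=[1, stride, stride, 1],
                                  padding='SAME')
```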
@@ -226,7 +226,7 @@ def is_training_name(name):
         return True
     if name.endswith('/Adagrad'):
         return True
-    if name.startswith('EMA/'):  # all the moving average summaries
+    if name.startswith('EMA/') or '/EMA/' in name:  # all the moving average summaries
         return True
     if name.startswith('AccumGrad') or name.endswith('/AccumGrad'):
         return True
...
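The broadened check above also catches EMA summaries created inside a name scope (e.g. under a training tower), not only those at the top level. A quick hypothetical sanity check of the predicate:

```python
def is_ema_name(name):
    # same predicate as in the diff above
    return name.startswith('EMA/') or '/EMA/' in name

assert is_ema_name('EMA/loss')          # matched before and after this change
assert is_ema_name('tower0/EMA/loss')   # only matched after this change
assert not is_ema_name('EMAx/loss')     # no full 'EMA' path segment, so not matched
```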
@@ -56,7 +56,7 @@ def launch_train_with_config(config, trainer):
     2. Call `trainer.setup_graph` with the input as well as `config.model`.
     3. Call `trainer.train` with rest of the attributes of config.
-    See tutorial at
+    See the `related tutorial
     <https://tensorpack.readthedocs.io/tutorial/training-interface.html#with-modeldesc-and-trainconfig>`_
     to learn more.
...
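For reference, a minimal usage sketch of `launch_train_with_config`, following the three steps in the docstring (assuming `MyModel` is a `ModelDesc` subclass defined elsewhere):

```python
from tensorpack import TrainConfig, SimpleTrainer, launch_train_with_config
from tensorpack.dataflow import FakeData

# FakeData stands in for a real dataflow; MyModel is hypothetical
config = TrainConfig(
    model=MyModel(),
    dataflow=FakeData([[64, 224, 224, 3], [64]], size=100),
    max_epoch=1,
)
launch_train_with_config(config, SimpleTrainer())
```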