update readme

ae2bd873 · Yuxin Wu · d8b4b4d7
Commit ae2bd873 authored Dec 02, 2017 by Yuxin Wu
5 changed files
--- a/.gitignore
+++ b/.gitignore
 # tensorpack-specific stuff
 train_log
 tensorpack/user_ops/obj
 *.npy
-*.bin
+*.npz
+*.caffemodel
 *.tfmodel
 *.meta
 *.log*
-model-*
+*.bin
-.gitignore
-*.caffemodel
 *.png
 *.jpg
 checkpoint
 *.json
 *.prototxt
 *.txt
+*.tgz
+*.gz
 # my personal stuff
 snippet
 examples/private
+examples-old
 TODO.md
+.gitignore
+.vimrc.local
-*.gz
 # Byte-compiled / optimized / DLL files
 __pycache__/

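As a rough illustration of what the newly introduced patterns in the `.gitignore` hunk above would catch, the sketch below approximates gitignore matching with Python's `fnmatch`. It is only an approximation: real gitignore semantics (negation, anchoring with `/`, directory-only patterns) are not modeled. The helper name `is_ignored` and the example paths are hypothetical.

```python
from fnmatch import fnmatchcase

# Patterns that appear for the first time in this commit's .gitignore hunk.
NEW_PATTERNS = ["*.npz", "*.tgz", "examples-old", ".vimrc.local"]


def is_ignored(path, patterns=NEW_PATTERNS):
    """Approximate gitignore check for slash-free patterns.

    A slash-free pattern may match any path component, so a matching
    directory name also covers everything beneath it.
    """
    parts = path.split("/")
    return any(fnmatchcase(part, pat) for part in parts for pat in patterns)


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    for p in ["train_log/weights.npz", "examples-old/GAN/DCGAN.py",
              "examples/GAN/DCGAN.py"]:
        print(p, is_ignored(p))
```

For an authoritative answer on a real checkout, `git check-ignore -v <path>` reports which pattern, if any, applies.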
--- a/examples/FasterRCNN/utils/README.md
+++ b/examples/FasterRCNN/utils/README.md
@@ -2,5 +2,5 @@
 # Some third-party helper functions
 + generate_anchors.py: copied from [py-faster-rcnn](https://github.com/rbgirshick/py-faster-rcnn/blob/master/lib/rpn/generate_anchors.py).
-+ box_ops.py: modified from [TF object detection API](https://github.com/tensorflow/models/blob/master/object_detection/core/box_list_ops.py).
++ box_ops.py: modified from [TF object detection API](https://github.com/tensorflow/models/blob/master/research/object_detection/core/box_list_ops.py).
--- a/examples/GAN/README.md
+++ b/examples/GAN/README.md
@@ -12,7 +12,7 @@ Reproduce the following GAN-related methods, 100~200 lines each:
 + [Wasserstein GAN](https://arxiv.org/abs/1701.07875)
-+ Improved Wasserstein GAN ([Improved Training of Wasserstein GANs](https://arxiv.org/abs/1704.00028))
++ Improved Wasserstein GAN, i.e. WGAN-GP ([Improved Training of Wasserstein GANs](https://arxiv.org/abs/1704.00028))
 + DiscoGAN ([Learning to Discover Cross-Domain Relations with Generative Adversarial Networks](https://arxiv.org/abs/1703.05192))
@@ -23,7 +23,7 @@ Reproduce the following GAN-related methods, 100~200 lines each:
 Please see the __docstring__ in each script for detailed usage and pretrained models. MultiGPU training is supported.
-## DCGAN.py
+## [DCGAN.py](DCGAN.py)
 Reproduce DCGAN following the setup in [dcgan.torch](https://github.com/soumith/dcgan.torch).
@@ -35,7 +35,7 @@ Reproduce DCGAN following the setup in [dcgan.torch](https://github.com/soumith/
 ![vec](demo/DCGAN-CelebA-vec.jpg)
-## Image2Image.py
+## [Image2Image.py](Image2Image.py)
 Image-to-Image translation following the setup in [pix2pix](https://github.com/phillipi/pix2pix).
@@ -45,7 +45,7 @@ For example, with the cityscapes dataset, it learns to generate semantic segment
 This is a visualization from tensorboard. Left to right: original, ground truth, model output.
-## InfoGAN-mnist.py
+## [InfoGAN-mnist.py](InfoGAN-mnist.py)
 Reproduce the mnist experiement in InfoGAN.
 It assumes 10 latent variables corresponding to a categorical distribution, 2 latent variables corresponding to a uniform distribution.
@@ -57,18 +57,18 @@ It then maximizes mutual information between these latent variables and the imag
 * Middle: 1 continuous latent variable controlled the rotation.
 * Right: another continuous latent variable controlled the thickness.
-## ConditionalGAN-mnist.py
+## [ConditionalGAN-mnist.py](ConditionalGAN-mnist.py)
 Train a simple GAN on mnist, conditioned on the class labels.
-## WGAN.py, Improved-WGAN.py, BEGAN.py
+## [WGAN.py](WGAN.py), [Improved-WGAN.py](Improved-WGAN.py), [BEGAN.py](BEGAN.py)
 These variants are implemented by some small modifications on top of DCGAN.py.
 Some BEGAN samples:
 ![began-sample](demo/BEGAN-CelebA-samples.jpg)
-## CycleGAN.py, DiscoGAN-CelebA.py
+## [CycleGAN.py](CycleGAN.py), [DiscoGAN-CelebA.py](DiscoGAN-CelebA.py)
 Reproduce CycleGAN with the original datasets, and DiscoGAN on CelebA. They are pretty much the same idea with different architecture.
 CycleGAN horse-to-zebra in tensorboard:

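The InfoGAN section of the README above describes a latent code made of a 10-way categorical part and 2 uniform variables. The following is a minimal sketch of sampling such a code, not the code in `InfoGAN-mnist.py`. The function name `sample_latent_code` is hypothetical, and the uniform range of [-1, 1] is an assumption, since the README does not state it.

```python
import random

NUM_CLASS = 10    # size of the categorical code, as stated in the README
NUM_UNIFORM = 2   # number of uniform latent variables, as stated in the README


def sample_latent_code(rng=random):
    """Return a 12-element list: a one-hot category, then the uniform codes."""
    category = rng.randrange(NUM_CLASS)
    one_hot = [1.0 if i == category else 0.0 for i in range(NUM_CLASS)]
    # Assumption: uniform codes are drawn from [-1, 1].
    uniform = [rng.uniform(-1.0, 1.0) for _ in range(NUM_UNIFORM)]
    return one_hot + uniform


if __name__ == "__main__":
    print(sample_latent_code())
```

In an InfoGAN-style setup this code would typically be concatenated with a noise vector before being fed to the generator; that part is omitted here.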
--- a/examples/ResNet/README.md
+++ b/examples/ResNet/README.md
-## imagenet-resnet.py
+## [imagenet-resnet.py](imagenet-resnet.py)
 __Training__ code of three variants of ResNet on ImageNet:
@@ -30,7 +30,7 @@ See the [tutorial](http://tensorpack.readthedocs.io/en/latest/tutorial/efficient
 ![imagenet](imagenet-resnet.png)
-## load-resnet.py
+## [load-resnet.py](load-resnet.py)
 This script only converts and runs ImageNet-ResNet{50,101,152} Caffe models [released by MSRA](https://github.com/KaimingHe/deep-residual-networks).
 Note that the architecture is different from the `imagenet-resnet.py` script and the models are not compatible.
@@ -52,7 +52,7 @@ The per-pixel mean used here is slightly different from the original.
 | ResNet 101         |      7.11%  |      23.54% |
 | ResNet 152         |      6.71%  |      23.21% |
-## cifar10-resnet.py
+## [cifar10-resnet.py](cifar10-resnet.py)
 Reproduce pre-activation ResNet on CIFAR10.
@@ -61,12 +61,15 @@ Reproduce pre-activation ResNet on CIFAR10.
 Also see a [DenseNet implementation](https://github.com/YixuanLi/densenet-tensorflow) of the paper [Densely Connected Convolutional Networks](https://arxiv.org/abs/1608.06993).
-## cifar10-preact18-mixup.py
+## [cifar10-preact18-mixup.py](cifar10-preact18-mixup.py)
-Reproduce mixup pre-activation ResNet18 on CIFAR10.
+Reproduce the mixup pre-act ResNet-18 CIFAR10 experiment, in the paper:
-Please notice that this preact18 architecture is
+* [mixup: Beyond Empirical Risk Minimization](https://arxiv.org/abs/1710.09412).
+Please note that this preact18 architecture is
 [different](https://github.com/kuangliu/pytorch-cifar/blob/master/models/preact_resnet.py)
-as the [mixup paper](https://arxiv.org/abs/1710.09412) said.
+from `cifar10-resnet18.py`.
 Usage:
 ```bash
@@ -75,5 +78,6 @@ Usage:
 ```
 Validation error with the original LR schedule (100-150-200): __5.0%__ without mixup, __3.8%__ with mixup.
+This matches the number in the paper.
 With 2x LR schedule: 4.7% without mixup, and 3.2% with mixup.
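For readers unfamiliar with mixup, the README text above compares results with and without it. As described in the linked paper, mixup trains on convex combinations of pairs of examples and their labels, with the mixing weight drawn from a Beta(alpha, alpha) distribution. The sketch below shows that idea in plain Python on flat lists. It is not the code in `cifar10-preact18-mixup.py`; the function name `mixup_pair` and the default `alpha=1.0` are assumptions for illustration.

```python
import random


def mixup_pair(x1, y1, x2, y2, alpha=1.0, rng=random):
    """Mix two (input, one-hot label) pairs with weight lam ~ Beta(alpha, alpha).

    x1, x2: flat lists of floats of equal length (e.g. flattened images).
    y1, y2: one-hot (or soft) label lists of equal length.
    Returns the mixed input, the mixed label, and the sampled weight.
    """
    lam = rng.betavariate(alpha, alpha)  # assumption: alpha=1.0 by default
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y1, y2)]
    return x, y, lam


if __name__ == "__main__":
    # Hypothetical toy data: two 4-pixel "images" from classes 0 and 2.
    x, y, lam = mixup_pair([0.0, 0.0, 1.0, 1.0], [1.0, 0.0, 0.0],
                           [1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0])
    print(lam, x, y)
```

Because the labels are mixed with the same weight as the inputs, the mixed label still sums to 1 and can be used directly with a soft-label cross-entropy loss.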
--- a/examples/ResNet/load-resnet.py
+++ b/examples/ResNet/load-resnet.py
 #!/usr/bin/env python
 # -*- coding: UTF-8 -*-
 # File: load-resnet.py
-# Author: Eric Yujia Huang yujiah1@andrew.cmu.edu
+# Author: Eric Yujia Huang <yujiah1@andrew.cmu.edu>
 #         Yuxin Wu <ppwwyyxx@gmail.com>
 import cv2
@@ -11,7 +11,6 @@ import argparse
 import re
 import numpy as np
 import six
-from tensorflow.contrib.layers import variance_scaling_initializer
 from tensorpack import *
 from tensorpack.utils import logger
@@ -45,8 +44,7 @@ class Model(ModelDesc):
        image = tf.transpose(image, [0, 3, 1, 2])
        with argscope([Conv2D, MaxPooling, GlobalAvgPooling, BatchNorm],
                      data_format='NCHW'), \
-                argscope(Conv2D, nl=tf.identity, use_bias=False,
+                argscope(Conv2D, nl=tf.identity, use_bias=False):
-                         W_init=variance_scaling_initializer(mode='FAN_OUT')):
            logits = (LinearWrap(image)
                      .Conv2D('conv0', 64, 7, stride=2, nl=BNReLU, padding='VALID')
                      .MaxPooling('pool0', shape=3, stride=2, padding='SAME')