Commit ae2bd873 authored by Yuxin Wu

update readme

parent d8b4b4d7
# tensorpack-specific stuff
train_log
tensorpack/user_ops/obj
*.npy
*.bin
*.npz
*.caffemodel
*.tfmodel
*.meta
*.log*
model-*
.gitignore
*.caffemodel
*.bin
*.png
*.jpg
checkpoint
*.json
*.prototxt
*.txt
*.tgz
*.gz
# my personal stuff
snippet
examples/private
examples-old
TODO.md
.gitignore
.vimrc.local
*.gz
# Byte-compiled / optimized / DLL files
__pycache__/
......
......@@ -2,5 +2,5 @@
# Some third-party helper functions
+ generate_anchors.py: copied from [py-faster-rcnn](https://github.com/rbgirshick/py-faster-rcnn/blob/master/lib/rpn/generate_anchors.py).
+ box_ops.py: modified from [TF object detection API](https://github.com/tensorflow/models/blob/master/object_detection/core/box_list_ops.py).
+ box_ops.py: modified from [TF object detection API](https://github.com/tensorflow/models/blob/master/research/object_detection/core/box_list_ops.py).
......@@ -12,7 +12,7 @@ Reproduce the following GAN-related methods, 100~200 lines each:
+ [Wasserstein GAN](https://arxiv.org/abs/1701.07875)
+ Improved Wasserstein GAN ([Improved Training of Wasserstein GANs](https://arxiv.org/abs/1704.00028))
+ Improved Wasserstein GAN, i.e. WGAN-GP ([Improved Training of Wasserstein GANs](https://arxiv.org/abs/1704.00028))
+ DiscoGAN ([Learning to Discover Cross-Domain Relations with Generative Adversarial Networks](https://arxiv.org/abs/1703.05192))
......@@ -23,7 +23,7 @@ Reproduce the following GAN-related methods, 100~200 lines each:
Please see the __docstring__ in each script for detailed usage and pretrained models. MultiGPU training is supported.
## DCGAN.py
## [DCGAN.py](DCGAN.py)
Reproduce DCGAN following the setup in [dcgan.torch](https://github.com/soumith/dcgan.torch).
......@@ -35,7 +35,7 @@ Reproduce DCGAN following the setup in [dcgan.torch](https://github.com/soumith/
![vec](demo/DCGAN-CelebA-vec.jpg)
## Image2Image.py
## [Image2Image.py](Image2Image.py)
Image-to-Image translation following the setup in [pix2pix](https://github.com/phillipi/pix2pix).
......@@ -45,7 +45,7 @@ For example, with the cityscapes dataset, it learns to generate semantic segment
This is a visualization from tensorboard. Left to right: original, ground truth, model output.
## InfoGAN-mnist.py
## [InfoGAN-mnist.py](InfoGAN-mnist.py)
Reproduce the mnist experiment in InfoGAN.
It assumes 10 latent variables corresponding to a categorical distribution, and 2 latent variables corresponding to a uniform distribution.
......@@ -57,18 +57,18 @@ It then maximizes mutual information between these latent variables and the imag
* Middle: 1 continuous latent variable controlled the rotation.
* Right: another continuous latent variable controlled the thickness.
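As a rough illustration of the latent code described above (10 categorical variables plus 2 uniform ones), the sketch below samples one such code using only the Python standard library. The function name `sample_latent_code`, the constants, and the uniform range of [-1, 1] are assumptions made for this example; they are not taken from `InfoGAN-mnist.py`.

```python
import random

NUM_CLASSES = 10   # size of the categorical code (from the description above)
NUM_UNIFORM = 2    # number of continuous codes (from the description above)


def sample_latent_code(rng=random):
    """Return one latent code as a flat list of 12 floats.

    The first 10 entries are a one-hot vector drawn from a uniform
    categorical distribution; the last 2 are drawn from Uniform(-1, 1).
    The [-1, 1] range is an assumption made for this sketch.
    """
    k = rng.randrange(NUM_CLASSES)
    one_hot = [1.0 if i == k else 0.0 for i in range(NUM_CLASSES)]
    uniform = [rng.uniform(-1.0, 1.0) for _ in range(NUM_UNIFORM)]
    return one_hot + uniform


if __name__ == "__main__":
    print(sample_latent_code(random.Random(0)))
```

In an actual InfoGAN setup this code would be concatenated with a noise vector before being fed to the generator; that step is omitted here.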
## ConditionalGAN-mnist.py
## [ConditionalGAN-mnist.py](ConditionalGAN-mnist.py)
Train a simple GAN on mnist, conditioned on the class labels.
## WGAN.py, Improved-WGAN.py, BEGAN.py
## [WGAN.py](WGAN.py), [Improved-WGAN.py](Improved-WGAN.py), [BEGAN.py](BEGAN.py)
These variants are implemented as small modifications on top of DCGAN.py.
Some BEGAN samples:
![began-sample](demo/BEGAN-CelebA-samples.jpg)
## CycleGAN.py, DiscoGAN-CelebA.py
## [CycleGAN.py](CycleGAN.py), [DiscoGAN-CelebA.py](DiscoGAN-CelebA.py)
Reproduce CycleGAN with the original datasets, and DiscoGAN on CelebA. The two methods share essentially the same idea but use different architectures.
CycleGAN horse-to-zebra in tensorboard:
......
## imagenet-resnet.py
## [imagenet-resnet.py](imagenet-resnet.py)
__Training__ code of three variants of ResNet on ImageNet:
......@@ -30,7 +30,7 @@ See the [tutorial](http://tensorpack.readthedocs.io/en/latest/tutorial/efficient
![imagenet](imagenet-resnet.png)
## load-resnet.py
## [load-resnet.py](load-resnet.py)
This script only converts and runs ImageNet-ResNet{50,101,152} Caffe models [released by MSRA](https://github.com/KaimingHe/deep-residual-networks).
Note that the architecture is different from the `imagenet-resnet.py` script and the models are not compatible.
......@@ -52,7 +52,7 @@ The per-pixel mean used here is slightly different from the original.
| ResNet 101 | 7.11% | 23.54% |
| ResNet 152 | 6.71% | 23.21% |
## cifar10-resnet.py
## [cifar10-resnet.py](cifar10-resnet.py)
Reproduce pre-activation ResNet on CIFAR10.
......@@ -61,12 +61,15 @@ Reproduce pre-activation ResNet on CIFAR10.
Also see a [DenseNet implementation](https://github.com/YixuanLi/densenet-tensorflow) of the paper [Densely Connected Convolutional Networks](https://arxiv.org/abs/1608.06993).
## cifar10-preact18-mixup.py
## [cifar10-preact18-mixup.py](cifar10-preact18-mixup.py)
Reproduce mixup pre-activation ResNet18 on CIFAR10.
Please notice that this preact18 architecture is
Reproduce the mixup pre-act ResNet-18 CIFAR10 experiment, in the paper:
* [mixup: Beyond Empirical Risk Minimization](https://arxiv.org/abs/1710.09412).
Please note that this preact18 architecture is
[different](https://github.com/kuangliu/pytorch-cifar/blob/master/models/preact_resnet.py)
as the [mixup paper](https://arxiv.org/abs/1710.09412) said.
from `cifar10-resnet18.py`.
Usage:
```bash
......@@ -75,5 +78,6 @@ Usage:
```
Validation error with the original LR schedule (100-150-200): __5.0%__ without mixup, __3.8%__ with mixup.
This matches the number in the paper.
With 2x LR schedule: 4.7% without mixup, and 3.2% with mixup.
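For readers unfamiliar with mixup, here is a minimal sketch of the core idea from the paper: two training examples and their one-hot labels are blended with a weight drawn from a Beta(alpha, alpha) distribution. The function name `mixup_pair`, the flat-list data layout, and the default `alpha=1.0` are assumptions made for this illustration; they do not describe how `cifar10-preact18-mixup.py` is written.

```python
import random


def mixup_pair(x1, y1, x2, y2, alpha=1.0, rng=random):
    """Blend two examples and their one-hot labels.

    x1, x2: flat lists of floats (e.g. flattened images).
    y1, y2: one-hot label lists of equal length.
    alpha:  Beta distribution parameter (hypothetical default).
    Returns the mixed input, the mixed label, and the mixing weight.
    """
    lam = rng.betavariate(alpha, alpha)
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y1, y2)]
    return x, y, lam


if __name__ == "__main__":
    x, y, lam = mixup_pair([0.0, 1.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0])
    print(lam, x, y)
```

The mixed label is a soft label, so the training loss needs to accept probability vectors and not only integer class indices.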
#!/usr/bin/env python
# -*- coding: UTF-8 -*-
# File: load-resnet.py
# Author: Eric Yujia Huang yujiah1@andrew.cmu.edu
# Author: Eric Yujia Huang <yujiah1@andrew.cmu.edu>
# Yuxin Wu <ppwwyyxx@gmail.com>
import cv2
......@@ -11,7 +11,6 @@ import argparse
import re
import numpy as np
import six
from tensorflow.contrib.layers import variance_scaling_initializer
from tensorpack import *
from tensorpack.utils import logger
......@@ -45,8 +44,7 @@ class Model(ModelDesc):
image = tf.transpose(image, [0, 3, 1, 2])
with argscope([Conv2D, MaxPooling, GlobalAvgPooling, BatchNorm],
data_format='NCHW'), \
argscope(Conv2D, nl=tf.identity, use_bias=False,
W_init=variance_scaling_initializer(mode='FAN_OUT')):
argscope(Conv2D, nl=tf.identity, use_bias=False):
logits = (LinearWrap(image)
.Conv2D('conv0', 64, 7, stride=2, nl=BNReLU, padding='VALID')
.MaxPooling('pool0', shape=3, stride=2, padding='SAME')
......