Commit 40711389 authored by Yuxin Wu

Move Caffe models together.

parent 0ba89131
Example code to convert, load, and run inference on some Caffe models.
Requires the Caffe Python bindings to be installed.
Converted models can also be found at [tensorpack model zoo](http://models.tensorpack.com).
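
As a quick sanity check on a converted model, the sketch below lists the parameter names and shapes stored in an `.npz` file. It assumes the converter writes a standard NumPy `.npz` archive; the helper `summarize_npz`, the demo file, and the key names `conv1/W` and `conv1/b` are illustrative and not taken from tensorpack.

```python
import os
import tempfile

import numpy as np


def summarize_npz(path):
    """Return {array name: shape} for every array stored in an .npz archive."""
    with np.load(path) as params:
        return {name: params[name].shape for name in params.files}


# Demo on a tiny stand-in archive. The key names and shapes are made up;
# point summarize_npz at a real converted file (e.g. alexnet.npz) instead.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.npz")
np.savez(
    demo_path,
    **{
        "conv1/W": np.zeros((11, 11, 3, 96), dtype=np.float32),
        "conv1/b": np.zeros((96,), dtype=np.float32),
    }
)

for name, shape in sorted(summarize_npz(demo_path).items()):
    print(name, shape)
```

If the printed names do not match the layer names used in the loading script, the weights will most likely not be restored as intended.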
## AlexNet:
Download: https://github.com/BVLC/caffe/tree/master/models/bvlc_alexnet
Convert: `python -m tensorpack.utils.loadcaffe PATH/TO/CAFFE/{deploy.prototxt,bvlc_alexnet.caffemodel} alexnet.npz`
Run: `./load-alexnet.py --load alexnet.npz --input cat.png`
## VGG
Download: https://gist.github.com/ksimonyan/211839e770f7b538e2d8
Convert:
```
python -m tensorpack.utils.loadcaffe \
PATH/TO/VGG/{VGG_ILSVRC_16_layers_deploy.prototxt,VGG_ILSVRC_16_layers.caffemodel} vgg16.npz
```
Run: `./load-vgg16.py --load vgg16.npz --input cat.png`
## ResNet
To load the Caffe version of ResNet, see the instructions in [ResNet examples](../ResNet).
## Convolutional Pose Machines
Download:
```
wget http://pearl.vasc.ri.cmu.edu/caffe_model_github/model/_trained_MPI/pose_iter_320000.caffemodel
wget https://github.com/shihenw/convolutional-pose-machines-release/raw/master/model/_trained_MPI/pose_deploy_resize.prototxt
```
Convert: `python -m tensorpack.utils.loadcaffe pose_deploy_resize.prototxt pose_iter_320000.caffemodel CPM-original.npz`
Run: `python load-cpm.py --load CPM-original.npz --input test.jpg`
The input image is resized to 368x368. Note that this CPM comes __without__ person detection, so the
person has to be in the center of the image (and not too small).
![demo](demo-cpm.jpg)
Also check out [Stereo Pose Machines](https://github.com/ppwwyyxx/Stereo-Pose-Machines), a real-time CPM application based on tensorpack.
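
Because the model has no person detector, it can help to crop the image around the person before running the script. The sketch below is one possible way to do that with NumPy only: it takes the largest centered square and resizes it to 368x368 with nearest-neighbor sampling. Both helpers are hypothetical and are not part of `load-cpm.py`, whose own preprocessing is not shown here.

```python
import numpy as np

CPM_INPUT_SIZE = 368  # input resolution stated in the README


def center_square_crop(img):
    """Crop the largest centered square from an HxWxC image array."""
    h, w = img.shape[:2]
    side = min(h, w)
    top = (h - side) // 2
    left = (w - side) // 2
    return img[top:top + side, left:left + side]


def resize_nearest(img, size):
    """Resize an HxWxC image to size x size using nearest-neighbor sampling."""
    h, w = img.shape[:2]
    rows = np.arange(size) * h // size  # source row for each output row
    cols = np.arange(size) * w // size  # source column for each output column
    return img[rows[:, None], cols]


# Stand-in for a decoded 640x480 RGB photo.
image = np.zeros((480, 640, 3), dtype=np.uint8)
prepared = resize_nearest(center_square_crop(image), CPM_INPUT_SIZE)
print(prepared.shape)
```

Nearest-neighbor sampling keeps the example dependency-free; an image library with proper interpolation would give smoother results.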
@@ -15,20 +15,6 @@ from tensorpack.tfutils.summary import *
 from tensorpack.dataflow.dataset import ILSVRCMeta
 import tensorflow as tf
-"""
-Usage:
-Download caffe models at https://github.com/BVLC/caffe/tree/master/models/bvlc_alexnet
-Install caffe python bindings.
-python -m tensorpack.utils.loadcaffe PATH/TO/CAFFE/{deploy.prototxt,bvlc_alexnet.caffemodel} alexnet.npz
-Or download a converted caffe model from http://models.tensorpack.com/caffe/
-Then, run it:
-./load-alexnet.py --load alexnet.npz --input cat.png
-"""
 def tower_func(image):
     # img: 227x227x3
......
@@ -15,22 +15,6 @@ from tensorpack.tfutils.symbolic_functions import *
 from tensorpack.tfutils.summary import *
 from tensorpack.dataflow.dataset import ILSVRCMeta
-"""
-Usage:
-Download original caffe models at:
-https://gist.github.com/ksimonyan/211839e770f7b538e2d8
-Install caffe python bindings.
-python -m tensorpack.utils.loadcaffe \
-PATH/TO/VGG/{VGG_ILSVRC_16_layers_deploy.prototxt,VGG_ILSVRC_16_layers.caffemodel} vgg16.npz
-Or download a converted caffe model from http://models.tensorpack.com/caffe/
-Then, run it:
-./load-vgg16.py --load vgg16.npz --input cat.png
-"""
 def tower_func(image):
     with argscope(Conv2D, kernel_shape=3, nl=tf.nn.relu):
......
@@ -14,16 +14,6 @@ from tensorpack.tfutils.symbolic_functions import *
 from tensorpack.tfutils.summary import *
 from tensorpack.dataflow.dataset import ILSVRCMeta
-"""
-Usage:
-python -m tensorpack.utils.loadcaffe \
-PATH/TO/VGG/{VGG_ILSVRC_19_layers_deploy.prototxt,VGG_ILSVRC_16_layers.caffemodel} vgg19.npz
-./load-vgg19.py --load vgg19.npz --input cat.png
-Or download a converted caffe model from http://models.tensorpack.com/caffe/
-./load-vgg19.py --load vgg19.npz --input cat.png
-"""
 def tower_func(image):
     with argscope(Conv2D, kernel_shape=3, nl=tf.nn.relu):
......
# Convolutional Pose Machines
Moved to [../CaffeModels](../CaffeModels).
A script to load and run pre-trained CPM model released by Shih-En. The original code in caffe is [here](https://github.com/shihenw/convolutional-pose-machines-release).
Reference paper: [Convolutional Pose Machines](https://arxiv.org/abs/1602.00134), Shih-En et al., CVPR16.
Also check out [Stereo Pose Machines](https://github.com/ppwwyyxx/Stereo-Pose-Machines), a __real-time__ CPM application based on tensorpack.
## Usage:
Prepare the model:
```
# download the released caffe model:
wget http://pearl.vasc.ri.cmu.edu/caffe_model_github/model/_trained_MPI/pose_iter_320000.caffemodel
wget https://github.com/shihenw/convolutional-pose-machines-release/raw/master/model/_trained_MPI/pose_deploy_resize.prototxt
# convert the model to a dict:
python -m tensorpack.utils.loadcaffe pose_deploy_resize.prototxt pose_iter_320000.caffemodel CPM-original.npz
```
Or you can download the converted model from [model zoo](http://models.tensorpack.com/caffe/).
Run it on an image, and produce `output.jpg`:
```
python load-cpm.py --load CPM-original.npz --input test.jpg
```
Input image will get resized to 368x368. Note that this CPM comes without person detection, so the
person has to be in the center of the image (and not too small).
![demo](demo.jpg)
@@ -22,6 +22,7 @@ NR_GPU = 6
 BATCH_SIZE = TOTAL_BATCH_SIZE // NR_GPU
 INPUT_SHAPE = 224
 class Model(ModelDesc):
     def _get_inputs(self):
         return [InputDesc(tf.float32, [None, INPUT_SHAPE, INPUT_SHAPE, 3], 'input'),
......
@@ -2,20 +2,13 @@
 # -*- coding: UTF-8 -*-
 # File: vgg16.py
-import sys
 import argparse
-import numpy as np
 import os
-from itertools import count
 import tensorflow as tf
 from tensorpack import *
-from tensorpack.models import *
-from tensorpack.callbacks import *
-from tensorpack.train import TrainConfig, SyncMultiGPUTrainerParameterServer
-from tensorpack.dataflow import imgaug
-from tensorpack.tfutils import argscope, get_model_loader, get_current_tower_context
+from tensorpack.tfutils import argscope, get_model_loader
 from tensorpack.tfutils.summary import *
 from tensorpack.utils.gpu import get_nr_gpu
@@ -37,7 +30,7 @@ class Model(ImageNetModel):
 def get_logits(self, image):
     with argscope(Conv2D, kernel_shape=3,
                   W_init=tf.variance_scaling_initializer(scale=2.)), \
             argscope([Conv2D, MaxPooling, BatchNorm], data_format='NCHW'):
         logits = (LinearWrap(image)
                   .apply(convnormrelu, 'conv1_1', 64)
                   .apply(convnormrelu, 'conv1_2', 64)
@@ -63,18 +56,19 @@ class Model(ImageNetModel):
                   .MaxPooling('pool5', 2)
                   # 7
                   .FullyConnected('fc6', 4096,
                                   W_init=tf.random_normal_initializer(stddev=0.001))
                   .tf.nn.relu(name='fc6_relu')
                   .Dropout('drop0', rate=0.5)
                   .FullyConnected('fc7', 4096,
                                   W_init=tf.random_normal_initializer(stddev=0.001))
                   .tf.nn.relu(name='fc7_relu')
                   .Dropout('drop1', rate=0.5)
                   .FullyConnected('fc8', 1000,
                                   W_init=tf.random_normal_initializer(stddev=0.01))())
         add_param_summary(('.*', ['histogram', 'rms']))
         return logits
 def get_data(name, batch):
     isTrain = name == 'train'
     global args
@@ -98,9 +92,11 @@ def get_config():
         ModelSaver(),
         GPUUtilizationTracker(),
         EstimatedTimeLeft(),
-        ScheduledHyperParamSetter('learning_rate',
+        ScheduledHyperParamSetter(
+            'learning_rate',
             [(0, 0.01), (3, max(BASE_LR, 0.01))], interp='linear'),
-        ScheduledHyperParamSetter('learning_rate',
+        ScheduledHyperParamSetter(
+            'learning_rate',
             [(30, BASE_LR * 1e-1), (60, BASE_LR * 1e-2), (80, BASE_LR * 1e-3)]),
         DataParallelInferenceRunner(
             dataset_val, infs, list(range(nr_tower))),
......
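
For readers unfamiliar with the schedule in the hunk above, the following plain-Python sketch approximates the learning rate the two `ScheduledHyperParamSetter` entries appear to describe: a linear warm-up from 0.01 to `max(BASE_LR, 0.01)` over the first 3 epochs, then step decays at epochs 30, 60 and 80. The function name `learning_rate` is hypothetical, `base_lr` stands in for `BASE_LR` (whose value is not shown in the diff), and the exact epoch at which tensorpack applies each value is an assumption.

```python
def learning_rate(epoch, base_lr):
    """Approximate learning rate at a given (possibly fractional) epoch."""
    warmup_start = 0.01
    warmup_end = max(base_lr, 0.01)

    # Linear warm-up between the schedule points (0, 0.01) and (3, warmup_end).
    if epoch < 3:
        return warmup_start + (warmup_end - warmup_start) * epoch / 3.0

    # After warm-up, hold the value until the first step-decay point.
    lr = warmup_end
    for start_epoch, factor in [(30, 1e-1), (60, 1e-2), (80, 1e-3)]:
        if epoch >= start_epoch:
            lr = base_lr * factor
    return lr


# Example with an assumed base learning rate of 0.1.
for e in [0, 1.5, 3, 29, 30, 60, 80]:
    print(e, learning_rate(e, base_lr=0.1))
```

The sketch only illustrates the shape of the schedule; it does not reproduce how the callbacks interact inside the trainer.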
@@ -20,18 +20,17 @@ These examples don't have meaningful performance numbers. They are supposed to b
 ## Vision:
 | Name | Performance |
 | --- | --- |
-| Train [ResNet](ResNet) and [ShuffleNet](ShuffleNet) on ImageNet | reproduce paper |
+| Train [ResNet](ResNet) and [ShuffleNet](ImageNetModels) on ImageNet | reproduce paper |
 | [Train Faster-RCNN / Mask-RCNN on COCO](FasterRCNN) | reproduce paper |
 | [DoReFa-Net: training binary / low-bitwidth CNN on ImageNet](DoReFa-Net) | reproduce paper |
 | [Generative Adversarial Network(GAN) variants](GAN), including DCGAN, InfoGAN, <br/> Conditional GAN, WGAN, BEGAN, DiscoGAN, Image to Image, CycleGAN | visually reproduce |
-| [Inception-BN](Inception) | reproduce reference code |
 | [Fully-convolutional Network for Holistically-Nested Edge Detection(HED)](HED) | visually reproduce |
 | [Spatial Transformer Networks on MNIST addition](SpatialTransformer) | reproduce paper |
 | [Visualize CNN saliency maps](Saliency) | visually reproduce |
 | [Similarity learning on MNIST](SimilarityLearning) | |
 | Single-image super-resolution using [EnhanceNet](SuperResolution) | visually reproduce |
 | Learn steering filters with [Dynamic Filter Networks](DynamicFilterNetwork) | visually reproduce |
-| Load a pre-trained [AlexNet](load-alexnet.py), [VGG16](load-vgg16.py), or [Convolutional Pose Machines](ConvolutionalPoseMachines/) | |
+| Load a pre-trained [AlexNet, VGG, or Convolutional Pose Machines](CaffeModels) | |
 ## Reinforcement Learning:
 | Name | Performance |
......
@@ -7,7 +7,7 @@ GIT_ARG="--git-dir ../.git --work-tree .."
 # find out modified python files, so that we ignored unstaged files
 # exclude ../docs
 MOD=$(git $GIT_ARG status -s | grep -E '\.py$' \
-    | grep -E '^\b+M\b+' | cut -c 4- | grep -v '../docs')
+    | grep -E '^\b+M\b+|^A' | cut -c 4- | grep -v '../docs')
 if [[ -n $MOD ]]; then
     flake8 $MOD
 fi