Commit cc1f50e5 authored by Yuxin Wu's avatar Yuxin Wu

disturb svhn

parent 4cc274cb
...@@ -3,16 +3,21 @@ ...@@ -3,16 +3,21 @@
I ran into the paper [DisturbLabel: Regularizing CNN on the Loss Layer](https://arxiv.org/abs/1605.00055) on CVPR16, I ran into the paper [DisturbLabel: Regularizing CNN on the Loss Layer](https://arxiv.org/abs/1605.00055) on CVPR16,
which basically said that noisy data gives you better performance. which basically said that noisy data gives you better performance.
As many, I didn't believe the method and the results at first. As many, I didn't believe the method and the results.
This is a simple mnist training script with DisturbLabel. It uses the architecture in the paper and This is a simple mnist training script with DisturbLabel. It uses the architecture in the paper and
hyperparameters in my original [mnist example](examples/mnist-convnet.py). The results surprised me: hyperparameters in my original [mnist example](../mnist-convnet.py). The results surprised me:
![mnist](mnist.png) ![mnist](mnist.png)
Experiements are repeated 15 times for p=0, 10 times for p=0.02 & 0.05, and 5 times for other values Experiements were repeated 15 times for p=0, 10 times for p=0.02 & 0.05, and 5 times for other values
of p. All experiements run for 100 epochs, with lr decay, which are enough for them to converge. of p. All experiements run for 100 epochs, with lr decay, which are enough for them to converge.
I suppose the disturb method works as a random noise to prevent SGD from getting stuck. I suppose the disturb method works as a random noise to prevent SGD from getting stuck.
It doesn't work for harder problems such as SVHN (details to follow). And I don't believe However it didn't work for harder problems such as SVHN:
it will work for ImageNet. ![svhn](svhn.png)
The SVHN experiement used the model & hyperparemeters as my original [svhn example](../svhn-digit-convnet.py).
Experiements were all repeated 10 times to get the error bar.
And I don't believe it will work for ImageNet.
#!/usr/bin/env python
# -*- coding: UTF-8 -*-
# File: svhn-disturb.py
# Author: Yuxin Wu <ppwwyyxx@gmail.com>
import tensorflow as tf
import argparse
import numpy as np
import os
from tensorpack import *
from tensorpack.tfutils.symbolic_functions import *
from tensorpack.tfutils.summary import *
from disturb import DisturbLabel
import imp
svhn_example = imp.load_source('svhn_example', '../svhn-digit-convnet.py')
Model = svhn_example.Model
get_config = svhn_example.get_config
def get_data():
d1 = dataset.SVHNDigit('train')
d2 = dataset.SVHNDigit('extra')
data_train = RandomMixData([d1, d2])
data_train = DisturbLabel(data_train, args.prob)
data_test = dataset.SVHNDigit('test')
augmentors = [
imgaug.Resize((40, 40)),
imgaug.Brightness(30),
imgaug.Contrast((0.5,1.5)),
]
data_train = AugmentImageComponent(data_train, augmentors)
data_train = BatchData(data_train, 128)
data_train = PrefetchData(data_train, 5, 5)
augmentors = [ imgaug.Resize((40, 40)) ]
data_test = AugmentImageComponent(data_test, augmentors)
data_test = BatchData(data_test, 128, remainder=True)
return data_train, data_test
svhn_example.get_data = get_data
if __name__ == '__main__':
parser = argparse.ArgumentParser()
parser.add_argument('--gpu', help='comma separated list of GPU(s) to use.') # nargs='*' in multi mode
parser.add_argument('--load', help='load model')
parser.add_argument('--prob', help='disturb prob', type=float, required=True)
args = parser.parse_args()
if args.gpu:
os.environ['CUDA_VISIBLE_DEVICES'] = args.gpu
else:
os.environ['CUDA_VISIBLE_DEVICES'] = '0'
config = get_config(args.prob)
if args.load:
config.session_init = SaverRestore(args.load)
if args.gpu:
config.nr_tower = len(args.gpu.split(','))
QueueInputTrainer(config).train()
Markdown is supported
0% or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment