Commit cc1f50e5 authored by Yuxin Wu

disturb svhn

parent 4cc274cb
I ran into the paper [DisturbLabel: Regularizing CNN on the Loss Layer](https://arxiv.org/abs/1605.00055) from CVPR16,
which basically says that noisy labels give you better performance.
Like many others, I didn't believe the method or the results.
This is a simple mnist training script with DisturbLabel. It uses the architecture in the paper and
hyperparameters from my original [mnist example](../mnist-convnet.py). The results surprised me:
![mnist](mnist.png)
Experiments were repeated 15 times for p=0, 10 times for p=0.02 & 0.05, and 5 times for other values
of p. All experiments ran for 100 epochs with lr decay, which is enough for them to converge.
I suppose the disturb method works as random noise that prevents SGD from getting stuck.
However, it didn't work for harder problems such as SVHN:
![svhn](svhn.png)
The SVHN experiments used the same model & hyperparameters as my original [svhn example](../svhn-digit-convnet.py).
Experiments were all repeated 10 times to get the error bars.
And I don't believe it will work for ImageNet.
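
For context, the DisturbLabel trick itself is tiny: with probability p, replace the ground-truth label of a datapoint with a class sampled uniformly at random. Below is a minimal sketch of such a wrapper, assuming tensorpack's `ProxyDataFlow` interface and a 10-class dataset; the actual `disturb.py` imported by these scripts may differ in details.

```python
import numpy as np
from tensorpack.dataflow import ProxyDataFlow

class DisturbLabel(ProxyDataFlow):
    """Wrap a dataflow of [image, label] datapoints; with probability `prob`,
    replace the label with one drawn uniformly from `num_classes` classes."""
    def __init__(self, ds, prob, num_classes=10):
        super(DisturbLabel, self).__init__(ds)
        self.prob = prob
        self.num_classes = num_classes
        self.rng = np.random.RandomState()

    def get_data(self):
        for img, label in self.ds.get_data():
            if self.rng.rand() < self.prob:
                # the disturbed label is uniform over all classes and may equal the true one
                label = self.rng.randint(self.num_classes)
            yield [img, label]
```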
#!/usr/bin/env python
# -*- coding: UTF-8 -*-
# File: svhn-disturb.py
# Author: Yuxin Wu <ppwwyyxx@gmail.com>
import tensorflow as tf
import argparse
import numpy as np
import os

from tensorpack import *
from tensorpack.tfutils.symbolic_functions import *
from tensorpack.tfutils.summary import *

from disturb import DisturbLabel

import imp
# Reuse the Model and training config from the plain SVHN example.
svhn_example = imp.load_source('svhn_example', '../svhn-digit-convnet.py')
Model = svhn_example.Model
get_config = svhn_example.get_config


def get_data():
    # Mix the 'train' and 'extra' SVHN splits, then wrap the stream with
    # DisturbLabel so each label is randomly replaced with probability args.prob.
    d1 = dataset.SVHNDigit('train')
    d2 = dataset.SVHNDigit('extra')
    data_train = RandomMixData([d1, d2])
    data_train = DisturbLabel(data_train, args.prob)
    data_test = dataset.SVHNDigit('test')

    augmentors = [
        imgaug.Resize((40, 40)),
        imgaug.Brightness(30),
        imgaug.Contrast((0.5, 1.5)),
    ]
    data_train = AugmentImageComponent(data_train, augmentors)
    data_train = BatchData(data_train, 128)
    data_train = PrefetchData(data_train, 5, 5)

    # Test data only needs the resize, no photometric augmentation.
    augmentors = [imgaug.Resize((40, 40))]
    data_test = AugmentImageComponent(data_test, augmentors)
    data_test = BatchData(data_test, 128, remainder=True)
    return data_train, data_test

# Replace the original example's get_data with the disturbed version.
svhn_example.get_data = get_data

if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('--gpu', help='comma separated list of GPU(s) to use.')  # nargs='*' in multi mode
    parser.add_argument('--load', help='load model')
    parser.add_argument('--prob', help='disturb prob', type=float, required=True)
    args = parser.parse_args()

    if args.gpu:
        os.environ['CUDA_VISIBLE_DEVICES'] = args.gpu
    else:
        os.environ['CUDA_VISIBLE_DEVICES'] = '0'

    config = get_config(args.prob)
    if args.load:
        config.session_init = SaverRestore(args.load)
    if args.gpu:
        config.nr_tower = len(args.gpu.split(','))
    QueueInputTrainer(config).train()
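
Since `--prob` is a required flag, the script would be invoked along the lines of `./svhn-disturb.py --prob 0.05 --gpu 0` (the exact probability value is up to the experiment).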