Commit 413059b1 authored by Yuxin Wu

update docs

parent 22375132
......@@ -171,6 +171,7 @@ class COCODetection(DatasetSplit):
# add the keys
img['boxes'] = boxes  # nx4
+assert cls.min() > 0, "Category id in COCO format must be > 0!"
img['class'] = cls  # n, always > 0
img['is_crowd'] = is_crowd  # n,
if add_mask:
......
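As a hedged illustration of the format this hunk documents (the values below are made up, not taken from the dataset), each image record carries nx4 float boxes, 1-based class ids, and a per-instance crowd flag; the new assert rejects 0-based ids:

```python
import numpy as np

# Hypothetical per-image dict in the shape COCODetection produces:
img = {
    'boxes': np.array([[10., 20., 50., 80.],
                       [30., 40., 90., 120.]], dtype=np.float32),  # nx4
    'class': np.array([1, 3], dtype=np.int32),    # n, category ids, always > 0
    'is_crowd': np.array([0, 1], dtype=np.int8),  # n,
}
assert img['class'].min() > 0, "Category id in COCO format must be > 0!"
```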
......@@ -25,8 +25,11 @@ except ImportError:
if __name__ == '__main__':
# "spawn/forkserver" is safer than the default "fork" method and
# produce more deterministic behavior & memory saving
# However its limitation is you cannot pass a lambda function to subprocesses.
import multiprocessing as mp
mp.set_start_method('spawn') # safer behavior & memory saving
mp.set_start_method('spawn')
parser = argparse.ArgumentParser()
parser.add_argument('--load', help='load a model to start training from. Can overwrite BACKBONE.WEIGHTS')
parser.add_argument('--logdir', help='log directory', default='train_log/maskrcnn')
......
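A minimal sketch of the limitation the new comment mentions (the `work` function here is hypothetical): under the "spawn" start method, a subprocess target must be picklable, so module-level functions work but lambdas do not.

```python
import multiprocessing as mp

def work(x):
    # picklable because it is defined at module level
    print(x * x)

if __name__ == '__main__':
    mp.set_start_method('spawn')
    p = mp.Process(target=work, args=(3,))
    p.start()
    p.join()
    # mp.Process(target=lambda: None).start()   # would fail under "spawn":
    # lambdas cannot be pickled, so they cannot be sent to the child process.
```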
......@@ -8,13 +8,13 @@ __Training__ code of 4 variants of ResNet on ImageNet:
* [Squeeze-and-Excitation ResNet](https://arxiv.org/abs/1709.01507)
* [ResNeXt](https://arxiv.org/abs/1611.05431)
-The training follows the exact recipe used by the [Training ImageNet in 1 Hour paper](https://arxiv.org/abs/1706.02677)
+The training follows the exact standard recipe used by the [Training ImageNet in 1 Hour paper](https://arxiv.org/abs/1706.02677)
and gets the same performance.
__Distributed training__ code & results can be found at [tensorpack/benchmarks](https://github.com/tensorpack/benchmarks/tree/master/ResNet-Horovod).
This recipe has better performance than most open source implementations.
In fact, many papers that claim to "improve" ResNet by .5% only compete with a lower
-baseline and they actually cannot beat this ResNet recipe.
+baseline and they actually cannot beat this standard ResNet recipe.
| Model | Top 5 Error | Top 1 Error | Download |
|:-----------------|:------------|:-----------:|:---------------------------------------------------------------------------------:|
......
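For context, a sketch of the recipe the README refers to, with constants taken from the 1-hour paper rather than from this repo's code: the base learning rate scales linearly with total batch size, warms up over the first 5 epochs, then drops 10x at epochs 30, 60, and 80.

```python
# Hedged sketch of the paper's linear-scaling LR schedule; the actual
# tensorpack training code may differ in details.
def learning_rate(epoch, total_batch_size):
    base = 0.1 * total_batch_size / 256              # linear scaling rule
    if epoch < 5:                                    # gradual warmup
        return base * (epoch + 1) / 5.0
    return base * 0.1 ** sum(epoch >= m for m in (30, 60, 80))
```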
......@@ -17,9 +17,10 @@ are the only two tools I know that can scale the training of a large Keras model
There are two flavors in which you can use a Keras model inside tensorpack:
-1. Write the tower function similar to a standard tensorpack program, but use some Keras layers in
+1. Write the tower function similar to a standard tensorpack program, but mix some Keras layers in
between. See [mnist-keras.py](mnist-keras.py) for how to do this.
-   It does not support all tensorpack trainers.
+   It does not support all tensorpack trainers, and can be brittle due to
+   incompatibilities between Keras and tensorpack.
2. The entire model to train is a Keras model (and there will be no `ModelDesc`, etc).
See [mnist-keras-v2.py](mnist-keras-v2.py).
......
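A minimal sketch of flavor 1 above, assuming TF1-style graph code; the layer choices are illustrative and not taken from mnist-keras.py. Keras layers are callable on tensors, so they can be interleaved with plain TF ops inside a tower function.

```python
import tensorflow as tf
from tensorflow import keras

KL = keras.layers

def tower_func(image):
    # Keras layers applied directly to TF tensors...
    x = KL.Conv2D(32, 3, activation='relu', padding='same')(image)
    x = KL.MaxPooling2D()(x)
    # ...mixed with a plain TF op in between (global average pooling)
    x = tf.reduce_mean(x, axis=[1, 2])
    logits = KL.Dense(10)(x)
    return logits
```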
......@@ -43,7 +43,6 @@ def get_keras_model():
with clear_tower0_name_scope():
M = keras.models.Sequential()
M.add(KL.Conv2D(32, 3, activation='relu', padding='same'))
-M.add(KL.BatchNormalization())
M.add(KL.MaxPooling2D())
M.add(KL.Conv2D(32, 3, activation='relu', padding='same'))
M.add(KL.Conv2D(32, 3, activation='relu', padding='same'))
......