Commit edb1f6c3 authored by Yuxin Wu's avatar Yuxin Wu

update docs

parent 72385a85
@@ -71,6 +71,6 @@ If you're unable to scale to multiple GPUs almost linearly:
2. Then note that your model may have a different communication-computation pattern or other
characteristics that affect efficiency.
There isn't a simple answer to this.
You may try a different multi-GPU trainer; the speed can sometimes vary a lot.
Note that scalability measurement always trains with the same "batch size per GPU", not the same total equivalent batch size.
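To make the "same batch size per GPU" convention concrete, here is a minimal sketch (the per-GPU batch size of 64 is an illustrative assumption, not a number from the docs): the effective total batch size grows with the number of GPUs, which is why scaling measurements keep the per-GPU size fixed.

```python
# Weak-scaling convention: the per-GPU batch size stays fixed,
# so the effective total batch size grows with the number of GPUs.
BATCH_PER_GPU = 64  # illustrative assumption

def total_batch_size(num_gpus, batch_per_gpu=BATCH_PER_GPU):
    """Effective batch size when each GPU processes its own batch."""
    return batch_per_gpu * num_gpus

for n in (1, 2, 4, 8):
    print(n, "GPUs -> total batch size", total_batch_size(n))
```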
@@ -33,21 +33,17 @@ This is our attempt to reproduce it on tensorpack & TensorFlow.
Accuracy:
Trained with 4 GPUs and (W,A,G)=(1,2,6), it can reach top-1 single-crop validation error of 47.6%,
after 70 epochs. This number is better than what's in the paper
due to more sophisticated augmentations.
With (W,A,G)=(32,32,32) -- full precision baseline, 41.4% error.
With (W,A,G)=(1,32,32) -- BWN
With (W,A,G)=(1,2,6), 47.6% error
With (W,A,G)=(1,2,4), 58.4% error
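As a rough sketch of what the (W,A,G) bit settings mean, DoReFa-style uniform quantization maps a value in [0, 1] onto 2^k - 1 levels for k bits. The helper below is an illustrative reimplementation in NumPy, not the actual code in alexnet-dorefa.py:

```python
import numpy as np

def quantize_k(x, k):
    """Uniformly quantize x in [0, 1] to k bits, i.e. 2**k - 1 levels,
    in the spirit of DoReFa-Net's quantizer. Illustrative sketch only."""
    n = float(2 ** k - 1)
    return np.round(x * n) / n

# e.g. with k=2 there are 3 levels: 0, 1/3, 2/3, 1
print(quantize_k(0.4, 2))
```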
Speed:
About 11 iteration/s on 4 P100s. (Each epoch is set to 10000 iterations)
Note that this code was written early without using NCHW format. You
should expect a speed up if the code is ported to NCHW format.
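For reference, moving from NHWC to NCHW data layout is an axis transpose of the image tensor; this minimal NumPy sketch shows the axis order involved (it is not the actual port of the example):

```python
import numpy as np

# NHWC (batch, height, width, channels) -> NCHW (batch, channels, height, width)
x_nhwc = np.random.rand(8, 224, 224, 3).astype(np.float32)
x_nchw = np.transpose(x_nhwc, (0, 3, 1, 2))
print(x_nchw.shape)  # (8, 3, 224, 224)
```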
To train, for example:
./alexnet-dorefa.py --dorefa 1,2,6 --data PATH --gpu 0,1