With the magic of SGD, wrong deep learning code often appears to still work,
especially if you try it on toy datasets.
See [Unawareness of Deep Learning Mistakes](https://medium.com/@ppwwyyxx/unawareness-of-deep-learning-mistakes-d5b5774da0ba).
We refuse toy examples.
Instead of showing you 10 arbitrary networks trained on toy datasets with random final performance,
tensorpack examples try to faithfully replicate experiments and performance in the paper as much as possible,
so you're confident that they are correct.

## Getting Started:

These are all the toy examples in tensorpack. They are supposed to be just demos.
+ [An illustrative MNIST example with explanation of the framework](basics/mnist-convnet.py) (a minimal sketch of this style of training script follows the list)
+ Tensorpack supports any symbolic library. See the same MNIST example written with [tf.layers](basics/mnist-tflayers.py), [tf-slim](basics/mnist-tfslim.py), and [with weights visualizations](basics/mnist-visualizations.py)
+ A tiny [Cifar ConvNet](basics/cifar-convnet.py) and [SVHN ConvNet](basics/svhn-digit-convnet.py)
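
For orientation, here is a minimal sketch of what a training script in this style looks like. It is not the code of the MNIST example linked above; the exact `ModelDesc` API used below (TensorSpec-based `inputs()`, `build_graph()` returning the cost) is an assumption about a recent TF1-era tensorpack release, so treat it as a sketch and read [basics/mnist-convnet.py](basics/mnist-convnet.py) for the real, commented version.

```python
# A minimal sketch of a tensorpack training script, in the spirit of
# basics/mnist-convnet.py. NOT that example's code; the ModelDesc details
# below (TensorSpec-based inputs(), build_graph() returning the cost,
# optimizer()) assume a recent TF1-era tensorpack release.
import tensorflow as tf
from tensorpack import ModelDesc, SimpleTrainer, TrainConfig, launch_train_with_config
from tensorpack.dataflow import BatchData, dataset


class Model(ModelDesc):
    def inputs(self):
        # Declare the symbolic inputs that build_graph() receives.
        return [tf.TensorSpec((None, 28, 28), tf.float32, 'input'),
                tf.TensorSpec((None,), tf.int32, 'label')]

    def build_graph(self, image, label):
        # A deliberately tiny ConvNet; return the total cost to minimize.
        image = tf.expand_dims(image, 3)  # NHWC
        x = tf.layers.conv2d(image, 32, 3, activation=tf.nn.relu)
        x = tf.layers.max_pooling2d(x, 2, 2)
        logits = tf.layers.dense(tf.layers.flatten(x), 10)
        return tf.reduce_mean(
            tf.nn.sparse_softmax_cross_entropy_with_logits(logits=logits, labels=label),
            name='cross_entropy_loss')

    def optimizer(self):
        return tf.train.AdamOptimizer(1e-3)


if __name__ == '__main__':
    df = BatchData(dataset.Mnist('train'), 128)  # dataflow yielding (image, label) batches
    launch_train_with_config(
        TrainConfig(model=Model(), dataflow=df, max_epoch=10),
        SimpleTrainer())
```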

...

Keras alone has various overheads. In particular, it is not efficient when working ...
The article [Towards Efficient Multi-GPU Training in Keras with TensorFlow](https://medium.com/rossum/towards-efficient-multi-gpu-training-in-keras-with-tensorflow-8a0091074fb2)
mentions some of these issues.

Even on a single GPU, tensorpack can run [1.2~2x faster](https://github.com/tensorpack/benchmarks/tree/master/other-wrappers)
than the equivalent Keras code. The gap becomes larger when you scale.
Tensorpack and [horovod](https://github.com/uber/horovod/blob/master/examples/keras_imagenet_resnet50.py)
are the only two tools I know that can scale the training of a large Keras model.
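
For context, the horovod example linked above follows roughly the generic horovod-with-Keras recipe sketched below. This is a hedged illustration of that pattern only; it is not the linked script's code and not how tensorpack itself integrates with Keras.

```python
# A hedged sketch of the generic horovod-with-Keras recipe that the linked
# example follows (standalone Keras + horovod.keras). Not the linked script's
# exact code, and not tensorpack's own Keras integration.
import keras
import tensorflow as tf
import horovod.keras as hvd

hvd.init()

# Pin each worker process to a single GPU.
config = tf.ConfigProto()
config.gpu_options.visible_device_list = str(hvd.local_rank())
keras.backend.set_session(tf.Session(config=config))

model = keras.applications.ResNet50(weights=None)

# Scale the learning rate with the number of workers, and wrap the optimizer
# so that gradients are averaged across workers via allreduce.
opt = hvd.DistributedOptimizer(keras.optimizers.SGD(lr=0.1 * hvd.size(), momentum=0.9))
model.compile(loss='categorical_crossentropy', optimizer=opt, metrics=['accuracy'])

callbacks = [
    # Start every worker from the same initial weights.
    hvd.callbacks.BroadcastGlobalVariablesCallback(0),
]
# model.fit(...) / model.fit_generator(...), launched under MPI / horovodrun.
```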

...

... reproduce exactly the same setting of [tensorpack ResNet example](../ResNet) on ...
It has:
+ ResNet-50 model modified from [keras.applications](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/keras/_impl/keras/applications/resnet50.py).
  (We put stride on the 3x3 conv in each bottleneck, which is different from certain other implementations; see the sketch after this list.)
+ Multi-GPU data-parallel __training and validation__ which scales
  + Finished 100 epochs in 19.5 hours on 8 V100s, with >90% GPU utilization.
  + Still slightly slower than native tensorpack examples.
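
To make the stride placement in the first bullet concrete, here is an illustrative bottleneck block with the stride on the 3x3 convolution. The `bottleneck` function and its arguments are invented for this sketch and are not the example's actual code; stock `keras.applications` of that era strided the first 1x1 conv instead.

```python
# Illustrative only: a ResNet bottleneck block with the spatial stride on the
# 3x3 conv. The function name and arguments are made up for this sketch; stock
# keras.applications of that era put the stride on the first 1x1 conv instead.
from keras import layers


def bottleneck(x, filters, stride=1):
    shortcut = x
    out = layers.Conv2D(filters, 1, use_bias=False)(x)  # 1x1, always stride 1
    out = layers.BatchNormalization()(out)
    out = layers.Activation('relu')(out)
    out = layers.Conv2D(filters, 3, strides=stride, padding='same',  # stride lives here
                        use_bias=False)(out)
    out = layers.BatchNormalization()(out)
    out = layers.Activation('relu')(out)
    out = layers.Conv2D(filters * 4, 1, use_bias=False)(out)  # expand channels
    out = layers.BatchNormalization()(out)
    # Project the shortcut whenever the spatial size or channel count changes.
    if stride != 1 or int(shortcut.shape[-1]) != filters * 4:
        shortcut = layers.Conv2D(filters * 4, 1, strides=stride, use_bias=False)(shortcut)
        shortcut = layers.BatchNormalization()(shortcut)
    return layers.Activation('relu')(layers.add([out, shortcut]))
```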