Commit 1f844978 authored by Yuxin Wu

update docs

parent f9bf5407
...@@ -12,10 +12,18 @@ TensorFlow itself also changes API and those are not listed here.
The concept of `InputDesc` was replaced by its equivalent in TF:
`tf.TensorSpec`. This may be a breaking change if you have customized
code that relies on internals of `InputDesc`.
To use `tf.TensorSpec` in your `ModelDesc`:
```python
    def inputs(self):
        return [tf.TensorSpec((None, 28, 28, 1), tf.float32, 'image'),
                tf.TensorSpec((None,), tf.int32, 'label')]
```
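For comparison, a hedged sketch of what the same method looked like with the old `InputDesc` API (argument order as it was commonly written, `InputDesc(type, shape, name)`; shown only as a migration reference):
```python
# Old style (before InputDesc was replaced by tf.TensorSpec); no longer valid after this change:
#     def inputs(self):
#         return [InputDesc(tf.float32, (None, 28, 28, 1), 'image'),
#                 InputDesc(tf.int32, (None,), 'label')]
```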
+ [2018/08/27] msgpack is used for "serialization to disk", because pyarrow
has no compatibility between versions. To use pyarrow instead, `export TENSORPACK_COMPATIBLE_SERIALIZE=pyarrow`.
+ [2018/04/05] <del>msgpack is replaced by pyarrow in favor of its speed. If you want old behavior,
`export TENSORPACK_SERIALIZE=msgpack`.</del>
It was later found that pyarrow is unstable and may lead to crashes,
so the default serialization was changed back to msgpack.
+ [2018/03/20] `ModelDesc` starts to use simplified interfaces (a minimal sketch follows after this list):
  + `_get_inputs()` renamed to `inputs()` and returns `tf.placeholder`s.
  + `build_graph(self, tensor1, tensor2)` returns the cost tensor directly.
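A minimal sketch of the simplified interface, put together here for illustration (the class name, layers, and shapes are made up; only the `inputs()` / `build_graph()` conventions reflect the documented change):
```python
import tensorflow as tf
from tensorpack import ModelDesc

class MyToyModel(ModelDesc):   # hypothetical model, for illustration only
    def inputs(self):
        # New style: a plain `inputs()` method returning placeholders.
        return [tf.placeholder(tf.float32, (None, 28, 28, 1), 'image'),
                tf.placeholder(tf.int32, (None,), 'label')]

    def build_graph(self, image, label):
        # New style: input tensors arrive as arguments; the cost tensor is returned directly.
        logits = tf.layers.dense(tf.layers.flatten(image), 10)
        cost = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=label, logits=logits)
        return tf.reduce_mean(cost, name='cost')
```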
...
## [imagenet-resnet.py](imagenet-resnet.py)
__Training__ code of 4 variants of ResNet on ImageNet:
* [Original ResNet](https://arxiv.org/abs/1512.03385)
* [Pre-activation ResNet](https://arxiv.org/abs/1603.05027)
* [Squeeze-and-Excitation ResNet](https://arxiv.org/abs/1709.01507)
* [ResNeXt](https://arxiv.org/abs/1611.05431)
The training follows the exact recipe used by the [Training ImageNet in 1 Hour paper](https://arxiv.org/abs/1706.02677)
and gets the same performance.
__Distributed training__ code & results can be found at [tensorpack/benchmarks](https://github.com/tensorpack/benchmarks/tree/master/ResNet-Horovod).
...@@ -16,7 +17,7 @@ In fact, many papers that claim to "improve" ResNet by .5% only compete with a l
baseline and they actually cannot beat this ResNet recipe.
| Model            | Top 5 Error | Top 1 Error | Download |
|:-----------------|:------------|:-----------:|:---------------------------------------------------------------------------------:|
| ResNet18         | 10.50%      | 29.66%      | [:arrow_down:](http://models.tensorpack.com/ResNet/ImageNet-ResNet18.npz) |
| ResNet34         | 8.56%       | 26.17%      | [:arrow_down:](http://models.tensorpack.com/ResNet/ImageNet-ResNet34.npz) |
| ResNet50         | 6.85%       | 23.61%      | [:arrow_down:](http://models.tensorpack.com/ResNet/ImageNet-ResNet50.npz) |
...@@ -25,15 +26,15 @@ baseline and they actually cannot beat this ResNet recipe.
| ResNeXt101-32x4d | 5.73%       | 21.05%      | [:arrow_down:](http://models.tensorpack.com/ResNet/ImageNet-ResNeXt101-32x4d.npz) |
| ResNet152        | 5.78%       | 21.51%      | [:arrow_down:](http://models.tensorpack.com/ResNet/ImageNet-ResNet152.npz) |
To reproduce the above results,
first decompress ImageNet data into [this structure](http://tensorpack.readthedocs.io/modules/dataflow.dataset.html#tensorpack.dataflow.dataset.ILSVRC12), then:
```bash
./imagenet-resnet.py --data /path/to/original/ILSVRC -d 50 --mode resnet --batch 512
# See ./imagenet-resnet.py -h for other options.
```
You should be able to see good GPU utilization (95%~99%) if your data is fast enough.
With batch=64x8, ResNet50 training can finish 100 epochs in 16 hours on AWS p3.16xlarge (8 V100s).
The default data pipeline is probably OK for machines with SSD & 20 CPU cores.
See the [tutorial](http://tensorpack.readthedocs.io/tutorial/efficient-dataflow.html) for other options to speed up your data pipeline.
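For reference, a hedged sketch of a multi-process input pipeline in the spirit of that tutorial (the dataflow classes named below are from this era's API, but the path, augmentor, process count, and batch size are illustrative assumptions to tune for your machine):
```python
from tensorpack.dataflow import dataset, imgaug, AugmentImageComponent, BatchData, PrefetchDataZMQ

# Read training images from the ILSVRC12 directory structure described above.
ds = dataset.ILSVRC12('/path/to/original/ILSVRC', 'train', shuffle=True)
# Apply an (illustrative) augmentation to the image component of each datapoint.
ds = AugmentImageComponent(ds, [imgaug.Resize(224)])
# Run reading + augmentation in parallel worker processes so the GPUs never wait on data.
ds = PrefetchDataZMQ(ds, nr_proc=20)
ds = BatchData(ds, 256)
```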
...
...@@ -11,7 +11,7 @@ from six.moves import range, zip
from ..compat import tfv1
from ..callbacks.base import Callback, CallbackFactory
from ..callbacks.graph import RunOp
from ..dataflow import DataFlow, MapData, RepeatedData, DataFlowTerminated
from ..tfutils.common import get_op_tensor_name
from ..tfutils.dependency import dependency_of_fetches
from ..tfutils.summary import add_moving_summary
...@@ -164,18 +164,19 @@ class EnqueueThread(ShareSessionThread):
                    self.op.run(feed_dict=feed)
            except (tf.errors.CancelledError, tf.errors.OutOfRangeError):
                pass
            except DataFlowTerminated:
                logger.info("[EnqueueThread] DataFlow has terminated.")
            except Exception as e:
                if isinstance(e, RuntimeError) and 'closed Session' in str(e):
                    pass
                else:
                    logger.exception("[EnqueueThread] Exception in thread {}:".format(self.name))
            finally:
                try:
                    self.close_op.run()
                except Exception:
                    pass
                logger.info("[EnqueueThread] Thread {} Exited.".format(self.name))

    def reinitialize_dataflow(self):
        self._itr = self.dataflow.__iter__()
...
...@@ -146,7 +146,19 @@ class AccumGradOptimizer(ProxyOptimizer):
    :math:`k` times larger learning rate, but uses much less memory.

    Note that this implementation may not support all models.
    E.g., it currently doesn't support sparse gradient update.

    This optimizer can be used in any TensorFlow code (with or without tensorpack).

    Example:

    .. code-block:: python

        from tensorpack.tfutils.optimizer import AccumGradOptimizer

        myopt = tf.train.GradientDescentOptimizer(0.01)
        myopt = AccumGradOptimizer(myopt, niter=5)
        train_op = myopt.minimize(loss)
    """

    def __init__(self, opt, niter):
...
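Within tensorpack itself, the natural place to apply this wrapper is a model's `optimizer()` method; a hedged sketch follows (the optimizer choice, learning rate, and `niter` are arbitrary illustrations, not a recommended configuration):
```python
import tensorflow as tf
from tensorpack import ModelDesc
from tensorpack.tfutils.optimizer import AccumGradOptimizer

class MyToyModel(ModelDesc):   # hypothetical model; inputs()/build_graph() omitted for brevity
    def optimizer(self):
        base_opt = tf.train.MomentumOptimizer(0.01, 0.9)
        # Accumulate gradients over 8 iterations before applying them,
        # which behaves like an 8x larger batch without the extra memory cost.
        return AccumGradOptimizer(base_opt, niter=8)
```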