Shashank Suhas / seminar-breakout · Commits

Commit 1f844978, authored Mar 31, 2019 by Yuxin Wu (parent f9bf5407)

update docs

Showing 4 changed files with 45 additions and 23 deletions (+45, -23):

- CHANGES.md (+11, -3)
- examples/ResNet/README.md (+16, -15)
- tensorpack/input_source/input_source.py (+5, -4)
- tensorpack/tfutils/optimizer.py (+13, -1)
CHANGES.md (+11, -3)

````diff
@@ -12,10 +12,18 @@ TensorFlow itself also changes API and those are not listed here.
   The concept of `InputDesc` was replaced by its equivalent in TF:
   `tf.TensorSpec`. This may be a breaking change if you have customized
   code that relies on internals of `InputDesc`.
-+ [2018/08/27] msgpack is used again for "serialization to disk", because pyarrow
+  To use `tf.TensorSpec` in your `ModelDesc`:
+  ```python
+  def inputs(self):
+      return [tf.TensorSpec((None, 28, 28, 1), tf.float32, 'image'),
+              tf.TensorSpec((None,), tf.int32, 'label')]
+  ```
++ [2018/08/27] msgpack is used for "serialization to disk", because pyarrow
   has no compatibility between versions. To use pyarrow instead,
   `export TENSORPACK_COMPATIBLE_SERIALIZE=pyarrow`.
-+ [2018/04/05] msgpack is replaced by pyarrow in favor of its speed. If you want old behavior,
-  `export TENSORPACK_SERIALIZE=msgpack`. It's later found that pyarrow is unstable and may lead to crash.
++ [2018/04/05] <del>msgpack is replaced by pyarrow in favor of its speed. If you want old behavior,
+  `export TENSORPACK_SERIALIZE=msgpack`.</del> It's later found that pyarrow is unstable and may lead to crash.
+  So the default serialization is changed back to msgpack.
 + [2018/03/20] `ModelDesc` starts to use simplified interfaces:
   + `_get_inputs()` renamed to `inputs()` and returns `tf.placeholder`s.
   + `build_graph(self, tensor1, tensor2)` returns the cost tensor directly.
````
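For readers migrating `ModelDesc` code, a minimal end-to-end sketch of the new interface might look like the following; the tiny network body, loss, and optimizer are illustrative placeholders (TF 1.x-era API assumed), not part of this commit:

```python
import tensorflow as tf
from tensorpack import ModelDesc

class MNISTExample(ModelDesc):
    """Hypothetical model showing the new tf.TensorSpec-based interface."""

    def inputs(self):
        # New-style interface: return tf.TensorSpec instead of InputDesc.
        return [tf.TensorSpec((None, 28, 28, 1), tf.float32, 'image'),
                tf.TensorSpec((None,), tf.int32, 'label')]

    def build_graph(self, image, label):
        # Placeholder network body; build_graph() returns the cost tensor directly.
        logits = tf.layers.dense(tf.layers.flatten(image), 10)
        return tf.reduce_mean(
            tf.nn.sparse_softmax_cross_entropy_with_logits(
                labels=label, logits=logits))

    def optimizer(self):
        return tf.train.AdamOptimizer(1e-3)
```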
examples/ResNet/README.md (+16, -15)

````diff
 ## [imagenet-resnet.py](imagenet-resnet.py)

-__Training__ code of three variants of ResNet on ImageNet:
+__Training__ code of 4 variants of ResNet on ImageNet:
 * [Original ResNet](https://arxiv.org/abs/1512.03385)
 * [Pre-activation ResNet](https://arxiv.org/abs/1603.05027)
 * [Squeeze-and-Excitation ResNet](https://arxiv.org/abs/1709.01507)
+* [ResNeXt](https://arxiv.org/abs/1611.05431)

-The training follows the __exact__ recipe used by the [Training ImageNet in 1 Hour paper](https://arxiv.org/abs/1706.02677)
+The training follows the exact recipe used by the [Training ImageNet in 1 Hour paper](https://arxiv.org/abs/1706.02677)
 and gets the same performance.
 __Distributed training__ code & results can be found at [tensorpack/benchmarks](https://github.com/tensorpack/benchmarks/tree/master/ResNet-Horovod).

@@ -15,25 +16,25 @@ This recipe has better performance than most open source implementations.
 In fact, many papers that claim to "improve" ResNet by .5% only compete with a lower
 baseline and they actually cannot beat this ResNet recipe.

 | Model            | Top 5 Error | Top 1 Error | Download |
-|:------------|:------------|:-----------:|:----------------------------------------------------------------------------:|
+|:-----------------|:------------|:-----------:|:---------------------------------------------------------------------------------:|
 | ResNet18         | 10.50%      | 29.66%      | [:arrow_down:](http://models.tensorpack.com/ResNet/ImageNet-ResNet18.npz) |
 | ResNet34         | 8.56%       | 26.17%      | [:arrow_down:](http://models.tensorpack.com/ResNet/ImageNet-ResNet34.npz) |
 | ResNet50         | 6.85%       | 23.61%      | [:arrow_down:](http://models.tensorpack.com/ResNet/ImageNet-ResNet50.npz) |
 | ResNet50-SE      | 6.24%       | 22.64%      | [:arrow_down:](http://models.tensorpack.com/ResNet/ImageNet-ResNet50-SE.npz) |
 | ResNet101        | 6.04%       | 21.95%      | [:arrow_down:](http://models.tensorpack.com/ResNet/ImageNet-ResNet101.npz) |
 | ResNeXt101-32x4d | 5.73%       | 21.05%      | [:arrow_down:](http://models.tensorpack.com/ResNet/ImageNet-ResNeXt101-32x4d.npz) |
 | ResNet152        | 5.78%       | 21.51%      | [:arrow_down:](http://models.tensorpack.com/ResNet/ImageNet-ResNet152.npz) |

-To reproduce the above results,
+To reproduce,
 first decompress ImageNet data into [this structure](http://tensorpack.readthedocs.io/modules/dataflow.dataset.html#tensorpack.dataflow.dataset.ILSVRC12), then:
 ```bash
-./imagenet-resnet.py --data /path/to/original/ILSVRC -d 50 [--mode resnet/preact/se] --batch 256
+./imagenet-resnet.py --data /path/to/original/ILSVRC -d 50 --mode resnet --batch 512
+# See ./imagenet-resnet.py -h for other options.
 ```

 You should be able to see good GPU utilization (95%~99%), if your data is fast enough.
-With batch=64x8, it can finish 100 epochs in 16 hours on AWS p3.16xlarge (8 V100s).
+With batch=64x8, ResNet50 training can finish 100 epochs in 16 hours on AWS p3.16xlarge (8 V100s).

 The default data pipeline is probably OK for machines with SSD & 20 CPU cores.
 See the [tutorial](http://tensorpack.readthedocs.io/tutorial/efficient-dataflow.html) on other options to speed up your data.
````
tensorpack/input_source/input_source.py (+5, -4)

```diff
@@ -11,7 +11,7 @@ from six.moves import range, zip
 from ..compat import tfv1
 from ..callbacks.base import Callback, CallbackFactory
 from ..callbacks.graph import RunOp
-from ..dataflow import DataFlow, MapData, RepeatedData
+from ..dataflow import DataFlow, MapData, RepeatedData, DataFlowTerminated
 from ..tfutils.common import get_op_tensor_name
 from ..tfutils.dependency import dependency_of_fetches
 from ..tfutils.summary import add_moving_summary

@@ -164,18 +164,19 @@ class EnqueueThread(ShareSessionThread):
                 self.op.run(feed_dict=feed)
         except (tf.errors.CancelledError, tf.errors.OutOfRangeError):
             pass
-            # logger.exception("Exception in {}:".format(self.name))
+        except DataFlowTerminated:
+            logger.info("[EnqueueThread] DataFlow has terminated.")
         except Exception as e:
             if isinstance(e, RuntimeError) and 'closed Session' in str(e):
                 pass
             else:
-                logger.exception("Exception in {}:".format(self.name))
+                logger.exception("[EnqueueThread] Exception in thread {}:".format(self.name))
         finally:
             try:
                 self.close_op.run()
             except Exception:
                 pass
-            logger.info("{} Exited.".format(self.name))
+            logger.info("[EnqueueThread] Thread {} Exited.".format(self.name))

     def reinitialize_dataflow(self):
         self._itr = self.dataflow.__iter__()
```
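The new `except DataFlowTerminated` branch separates an expected shutdown of the input pipeline from a real error: a dataflow raises `DataFlowTerminated` when its underlying source is irrecoverably gone, and the enqueue thread now logs that at info level instead of printing a stack trace. Below is a minimal sketch of the same pattern outside `EnqueueThread`; the `FlakySource` class and its termination trigger are hypothetical:

```python
from tensorpack.dataflow import DataFlow, DataFlowTerminated
from tensorpack.utils import logger

class FlakySource(DataFlow):
    """Hypothetical DataFlow whose upstream source dies mid-iteration."""
    def __iter__(self):
        yield [1]
        yield [2]
        raise DataFlowTerminated()  # e.g. a remote source went away

try:
    for dp in FlakySource():
        print(dp)
except DataFlowTerminated:
    # Same idea as the EnqueueThread change: a terminated dataflow is an
    # expected shutdown, logged as info rather than as an exception.
    logger.info("DataFlow has terminated.")
```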
tensorpack/tfutils/optimizer.py (+13, -1)

```diff
@@ -146,7 +146,19 @@ class AccumGradOptimizer(ProxyOptimizer):
     :math:`k` times larger learning rate, but uses much less memory.

     Note that this implementation may not support all models.
-    E.g., it doesn't support sparse gradient update.
+    E.g., it currently doesn't support sparse gradient update.
+
+    This optimizer can be used in any TensorFlow code (with or without tensorpack).
+
+    Example:
+
+    .. code-block:: python
+
+        from tensorpack.tfutils.optimizer import AccumGradOptimizer
+
+        myopt = tf.train.GradientDescentOptimizer(0.01)
+        myopt = AccumGradOptimizer(myopt, niter=5)
+        train_op = myopt.minimize(loss)
     """

     def __init__(self, opt, niter):
```
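The arithmetic behind the new docstring example: with `niter=5`, gradients from five consecutive `minimize()` runs are accumulated and applied together, so the variables move only on every fifth step — roughly a 5x larger batch with a 5x larger learning rate, as the docstring notes, at much lower memory cost. A small self-contained check, assuming TF 1.x graph mode; the toy variable and loss are made up for illustration:

```python
import tensorflow as tf
from tensorpack.tfutils.optimizer import AccumGradOptimizer

x = tf.get_variable('x', initializer=10.0)
loss = tf.square(x)  # toy loss; gradient w.r.t. x is 2x

opt = AccumGradOptimizer(tf.train.GradientDescentOptimizer(0.01), niter=5)
train_op = opt.minimize(loss)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for step in range(1, 11):
        sess.run(train_op)
        # x is expected to change only on steps 5 and 10, when the
        # accumulated gradients are actually applied.
        print(step, sess.run(x))
```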