seminar-breakout · Commits

Commit 4b25b6fa
authored Nov 19, 2018 by Yuxin Wu

    update docs

parent c842bf51

Showing 6 changed files with 42 additions and 16 deletions (+42 / -16)
README.md                                  +3   -3
docs/modules/dataflow.imgaug.rst           +1   -1
examples/FasterRCNN/train.py               +1   -1
tensorpack/dataflow/imgaug/external.py     +23  -2
tensorpack/dataflow/parallel.py            +9   -5
tensorpack/models/batch_norm.py            +5   -4
README.md (view file @ 4b25b6fa)

@@ -16,14 +16,14 @@ It's Yet Another TF high-level API, with __speed__, __readability__ and __flexibility__

    Your training can probably gets faster if written with Tensorpack.

    + Data-parallel multi-GPU/distributed training strategy is off-the-shelf to use.
      It scales as well as Google's [official benchmark](https://www.tensorflow.org/performance/benchmarks).

    + See [tensorpack/benchmarks](https://github.com/tensorpack/benchmarks) for some benchmark scripts.

 2. Focus on __large datasets__.
    + [You don't usually need `tf.data`](http://tensorpack.readthedocs.io/tutorial/input-source.html#tensorflow-reader-cons).
      Symbolic programming often makes data processing harder.
      Tensorpack helps you efficiently process large datasets (e.g. ImageNet) in __pure Python__ with autoparallelization.

 3. It's not a model wrapper.
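For context, the "pure Python with autoparallelization" claim above refers to the DataFlow pipeline idea. Below is a minimal, hedged sketch under assumed names: the file list "train.txt", the resize size, and the exact pipeline are hypothetical and not part of this commit.

import cv2
from tensorpack.dataflow import DataFromList, MapData, PrefetchDataZMQ, BatchData

# Hypothetical list of image paths; any plain Python data source works here.
files = [[line.strip()] for line in open("train.txt")]

ds = DataFromList(files, shuffle=True)                                    # wrap a Python list as a DataFlow
ds = MapData(ds, lambda dp: [cv2.resize(cv2.imread(dp[0]), (224, 224))])  # arbitrary Python preprocessing
ds = PrefetchDataZMQ(ds, nr_proc=8)                                       # parallelize across 8 worker processes
ds = BatchData(ds, 64)                                                    # group datapoints into batches

ds.reset_state()
for batch in ds.get_data():   # iterate numpy batches in pure Python
    pass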
docs/modules/dataflow.imgaug.rst (view file @ 4b25b6fa)

@@ -4,7 +4,7 @@ tensorpack.dataflow.imgaug package

 This package contains Tensorpack's augmentors.
 Note that other image augmentation libraries can be wrapped into Tensorpack's interface as well.
 For example, `imgaug.IAAugmentor <#tensorpack.dataflow.imgaug.IAAugmentor>`_
-and `imgaug.Albumentations <#tensorpack.dataflow.imgaug.Albumentations`_
+and `imgaug.Albumentations <#tensorpack.dataflow.imgaug.Albumentations>`_
 wrap two popular image augmentation libraries.

 .. container:: custom-index
examples/FasterRCNN/train.py (view file @ 4b25b6fa)

@@ -555,7 +555,7 @@ if __name__ == '__main__':
         train_dataflow = get_train_dataflow()
         # This is what's commonly referred to as "epochs"
         total_passes = cfg.TRAIN.LR_SCHEDULE[-1] * 8 / train_dataflow.size()
-        logger.info("Total passes of the training set is: {}".format(total_passes))
+        logger.info("Total passes of the training set is: {:.5g}".format(total_passes))

         callbacks = [
             PeriodicCallback(
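The only change here is the format spec: `LR_SCHEDULE[-1]` is a step count, multiplied by 8 presumably because the schedule is defined for 8 GPUs with one image each, so `total_passes` is a long float and `{:.5g}` trims the log output to 5 significant digits. A small sketch with made-up numbers illustrates the effect:

# Made-up numbers, only to show the effect of the "{:.5g}" change above.
lr_schedule_last = 360000   # hypothetical cfg.TRAIN.LR_SCHEDULE[-1]
dataflow_size = 118287      # hypothetical train_dataflow.size()
total_passes = lr_schedule_last * 8 / dataflow_size

print("Total passes of the training set is: {}".format(total_passes))      # full float repr
print("Total passes of the training set is: {:.5g}".format(total_passes))  # 5 significant digits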
tensorpack/dataflow/imgaug/external.py (view file @ 4b25b6fa)

@@ -10,7 +10,7 @@ __all__ = ['IAAugmentor', 'Albumentations']

 class IAAugmentor(ImageAugmentor):
     """
-    Wrap an augmentor form the IAA library: https://github.com/aleju/imgaug
+    Wrap an augmentor form the IAA library: https://github.com/aleju/imgaug .
     Both images and coordinates are supported.

     Note:

@@ -20,6 +20,19 @@ class IAAugmentor(ImageAugmentor):
         2. Only uint8 images are supported by the IAA library.
         3. The IAA library can only produces images of the same shape.

+    Example:
+
+    .. code-block:: python
+
+        from tensorpack import imgaug  # this is not the aleju/imgaug library
+        from imgaug import augmentors as iaa  # this is the aleju/imgaug library
+        myaug = imgaug.IAAugmentor(
+            iaa.Sequential([
+                iaa.Sharpen(alpha=(0, 1), lightness=(0.75, 1.5)),
+                iaa.Fliplr(0.5),
+                iaa.Crop(px=(0, 100)),
+            ])
+        )
     """

     def __init__(self, augmentor):

@@ -48,8 +61,16 @@ class IAAugmentor(ImageAugmentor):

 class Albumentations(ImageAugmentor):
     """
-    Wrap an augmentor form the albumentations library: https://github.com/albu/albumentations
+    Wrap an augmentor form the albumentations library: https://github.com/albu/albumentations .
     Coordinate augmentation is not supported by the library.

+    Example:
+
+    .. code-block:: python
+
+        from tensorpack import imgaug
+        import albumentations as AB
+        myaug = imgaug.Albumentations(AB.RandomRotate90(p=1))
     """

     def __init__(self, augmentor):
         """
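For context, a hedged sketch of how such a wrapped augmentor is typically plugged into a dataflow pipeline. `get_some_dataflow()` is a hypothetical placeholder, and the aleju/imgaug library exposes its augmenters as `imgaug.augmenters`:

from imgaug import augmenters as iaa       # the aleju/imgaug library
from tensorpack import imgaug
from tensorpack.dataflow import AugmentImageComponent

myaug = imgaug.IAAugmentor(
    iaa.Sequential([
        iaa.Fliplr(0.5),
        iaa.Crop(px=(0, 100)),
    ])
)

df = get_some_dataflow()                   # hypothetical DataFlow yielding [image, label] datapoints
df = AugmentImageComponent(df, [myaug])    # apply the wrapped augmentor to the image component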
tensorpack/dataflow/parallel.py (view file @ 4b25b6fa)

@@ -146,6 +146,9 @@ class MultiProcessPrefetchData(ProxyDataFlow):
     3. You can nest like this: ``PrefetchDataZMQ(PrefetchData(df, nr_proc=a), nr_proc=b)``.
        A total of ``a`` instances of ``df`` worker processes will be created.
     4. fork happens in `__init__`. `reset_state()` is a no-op. The worker processes won't get called.
+    5. This DataFlow does support windows. However, Windows requires more strict picklability on processes,
+       which means that some code that's forkable on Linux may not be forkable on Windows. If that happens you'll
+       need to re-organize some part of code that's not forkable.
     """

     class _Worker(mp.Process):

@@ -170,9 +173,10 @@ class MultiProcessPrefetchData(ProxyDataFlow):
             nr_prefetch (int): size of the queue to hold prefetched datapoints.
             nr_proc (int): number of processes to use.
         """
+        # https://docs.python.org/3.6/library/multiprocessing.html?highlight=process#the-spawn-and-forkserver-start-methods
         if os.name == 'nt':
-            logger.warn("MultiProcessPrefetchData does support windows. \
-            However, windows requires more strict picklability on processes, which may \
+            logger.warn("MultiProcessPrefetchData does support Windows. \
+            However, Windows requires more strict picklability on processes, which may \
             lead of failure on some of the code.")
         super(MultiProcessPrefetchData, self).__init__(ds)
         try:

@@ -210,8 +214,7 @@ PrefetchData = MultiProcessPrefetchData

 # TODO renamed to MultiProcessDataFlow{,ZMQ} if separated to a new project
 class PrefetchDataZMQ(_MultiProcessZMQDataFlow):
     """
-    Prefetch data from a DataFlow using multiple processes, with ZeroMQ for
-    communication.
+    Prefetch data from a DataFlow using multiple processes, with ZeroMQ for communication.
     It will fork the calling process of :meth:`reset_state()`,
     and collect datapoints from the given dataflow in each process by ZeroMQ IPC pipe.

@@ -237,7 +240,8 @@ class PrefetchDataZMQ(_MultiProcessZMQDataFlow):
        it won't be usable in the forked process. Therefore, do not nest two `PrefetchDataZMQ`.
     5. (Thread-safety) ZMQ is not thread safe. Therefore, do not call :meth:`get_data` of the same dataflow in
        more than 1 threads.
-    6. (For Mac only) A UNIX named pipe will be created in the current directory.
+    6. This dataflow does not support windows. Use `MultiProcessPrefetchData` which works on windows.
+    7. (For Mac only) A UNIX named pipe will be created in the current directory.
        However, certain non-local filesystem such as NFS/GlusterFS/AFS doesn't always support pipes.
        You can change the directory by ``export TENSORPACK_PIPEDIR=/other/dir``.
        In particular, you can use somewhere under '/tmp' which is usually local.
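The notes added above distinguish the two prefetchers. A minimal hedged sketch of how each is used, where `build_dataflow()` is a hypothetical function returning any DataFlow:

from tensorpack.dataflow import MultiProcessPrefetchData, PrefetchDataZMQ

# multiprocessing.Queue based: forks in __init__, reset_state() is a no-op,
# and (per note 5 above) it works on Windows if the dataflow is picklable there.
df1 = MultiProcessPrefetchData(build_dataflow(), nr_prefetch=256, nr_proc=4)

# ZeroMQ based: forks inside reset_state(), so call it exactly once, do not nest
# two PrefetchDataZMQ, and (per note 6 above) it does not work on Windows.
df2 = PrefetchDataZMQ(build_dataflow(), nr_proc=4)
df2.reset_state()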
tensorpack/models/batch_norm.py (view file @ 4b25b6fa)

@@ -103,12 +103,13 @@ def BatchNorm(inputs, axis=None, training=None, momentum=0.9, epsilon=1e-5,
           It uses the aggregated statistics of the whole batch (across all MPI ranks) to normalize.
           Note that on single machine this is significantly slower than the "nccl" implementation.

-          This implementation averages the per-GPU E[x] and E[x^2] among GPUs to compute
+          If not None, per-GPU E[x] and E[x^2] among all GPUs are averaged to compute
           global mean & variance. Therefore each GPU needs to have the same batch size.

-          It will match the BatchNorm layer on each GPU by its name (`BatchNorm('name', input)`).
-          If names do not match, the operation will hang.
-          This option has no effect when not training.
+          The BatchNorm layer on each GPU needs to use the same name (`BatchNorm('name', input)`), so that
+          statistics can be reduced. If names do not match, this layer will hang.
+          This option only has effect in standard training mode.

           This option is also known as "Cross-GPU BatchNorm" as mentioned in:
           `MegDet: A Large Mini-Batch Object Detector <https://arxiv.org/abs/1711.07240>`_.
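The rewritten paragraph documents the `sync_statistics` option ("Cross-GPU BatchNorm"). A hedged usage sketch, where `x` is a hypothetical feature tensor inside a multi-GPU tower function:

from tensorpack.models import BatchNorm

# `x` is a hypothetical feature tensor produced earlier in the tower function.
# Every GPU tower must create this layer under the same name so that the
# per-GPU E[x] / E[x^2] can be reduced; mismatched names hang, as noted above.
x = BatchNorm('bn_sync', x, sync_statistics='nccl')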