Commit 2238ca93 authored by Yuxin Wu

update docs

parent 77dc71e3
@@ -13,7 +13,7 @@ It's Yet Another TF high-level API, with __speed__, __readability__ and __flexibility__
 1. Focus on __training speed__.
    + Speed comes for free with tensorpack -- it uses TensorFlow in the __efficient way__ with no extra overhead.
-     On different CNNs, it runs training [1.2~5x faster](https://github.com/tensorpack/benchmarks/tree/master/other-wrappers) than the equivalent Keras code.
+     On common CNNs, it runs training [1.2~5x faster](https://github.com/tensorpack/benchmarks/tree/master/other-wrappers) than the equivalent Keras code.
    + Data-parallel multi-GPU/distributed training strategy is off-the-shelf to use.
      It scales as well as Google's [official benchmark](https://www.tensorflow.org/performance/benchmarks).
......
@@ -10,12 +10,12 @@ It's Yet Another TF wrapper, but different in:
 - Focus on **training speed**.
   - Speed comes for free with tensorpack -- it uses TensorFlow in the
-    **efficient way** with no extra overhead. On various CNNs, it runs
+    **efficient way** with no extra overhead. On common CNNs, it runs
     `1.2~5x faster <https://github.com/tensorpack/benchmarks/tree/master/other-wrappers>`_
     than the equivalent Keras code.
   - Data-parallel multi-GPU/distributed training strategy is off-the-shelf to use.
-    It is as fast as Google's
+    It scales as well as Google's
     `official benchmark <https://www.tensorflow.org/performance/benchmarks>`_.
     You cannot beat its speed unless you're a TensorFlow expert.
@@ -23,13 +23,13 @@ It's Yet Another TF wrapper, but different in:
 - Focus on **large datasets**.
-  - It's unnecessary to read/preprocess data with a new language called TF.
-    Tensorpack helps you load large datasets (e.g. ImageNet) in **pure Python** with autoparallelization.
+  - You don't usually need `tf.data`. Symbolic programming often makes data processing harder.
+    Tensorpack helps you efficiently process large datasets (e.g. ImageNet) in **pure Python** with autoparallelization.
 - It's not a model wrapper.
   - There are already too many symbolic function wrappers in the world.
-    Tensorpack includes only a few common models, but you can use any other wrappers within tensorpack, including sonnet/Keras/slim/tflearn/tensorlayer/....
+    Tensorpack includes only a few common models, but you can use any symbolic function library inside tensorpack, including tf.layers/Keras/slim/tflearn/tensorlayer/...
 See :doc:`tutorial/index` to know more about these features:
......
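The "pure Python with autoparallelization" claim above refers to tensorpack's DataFlow pipeline. The following is a minimal, hedged sketch (not part of this commit) of what that looks like for an ImageNet-style dataset; the dataset path is hypothetical and the exact class names and arguments should be checked against the tensorpack version in use.

# Hedged sketch: read and augment ImageNet data in pure Python with DataFlow,
# parallelized across worker processes. '/path/to/ILSVRC12' is a hypothetical path.
from tensorpack.dataflow import dataset, imgaug, AugmentImageComponent, BatchData, PrefetchDataZMQ

ds = dataset.ILSVRC12('/path/to/ILSVRC12', 'train', shuffle=True)   # plain-Python reader
ds = AugmentImageComponent(ds, [imgaug.ResizeShortestEdge(256),
                                imgaug.CenterCrop((224, 224))])     # plain-Python augmentations
ds = PrefetchDataZMQ(ds, nr_proc=16)                                # auto-parallelize over 16 processes
ds = BatchData(ds, 64)                                              # assemble batches of 64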
@@ -64,7 +64,9 @@ Note some __common problems__ when using these trainers:
    instead of taking one for all and split.
    So the total batch size would become ``(batch size of InputSource) * #GPU``.
-   Splitting a tensor for data-parallel training makes no sense at all, only to put unnecessary shape constraints on the data.
+   Splitting a tensor for data-parallel training makes no sense at all. First, why
+   waste time concatenating into large batches and then splitting them again?
+   Second, this puts unnecessary shape constraints on the data.
    By letting each GPU train on its own input tensors, they can train on inputs of different shapes simultaneously.
 2. The tower function (your model code) will get called multiple times.
......
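The notes above concern tensorpack's data-parallel trainers, where each GPU takes its own batch from the InputSource. As a hedged sketch of how such a trainer is typically launched (API names as commonly used around this tensorpack version; treat exact signatures and the `model`/`my_input_source` objects as assumptions):

# Hedged sketch (not part of this commit): data-parallel training on 2 GPUs.
# Each GPU consumes one batch of size 64 from the InputSource, so the effective
# total batch size is 64 * 2 = 128, matching the note above.
from tensorpack import TrainConfig, SyncMultiGPUTrainerReplicated, launch_train_with_config

config = TrainConfig(
    model=model,             # your ModelDesc; its tower function is called once per GPU
    data=my_input_source,    # hypothetical InputSource yielding batches of 64
    callbacks=[],
)
launch_train_with_config(config, SyncMultiGPUTrainerReplicated(2))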
@@ -28,7 +28,7 @@ GAMMA = 0.99
 MEMORY_SIZE = 1e6
 # will consume at least 1e6 * 84 * 84 bytes == 6.6G memory.
 INIT_MEMORY_SIZE = MEMORY_SIZE // 20
-STEPS_PER_EPOCH = 10000 // UPDATE_FREQ * 10  # each epoch is 100k played frames
+STEPS_PER_EPOCH = 100000 // UPDATE_FREQ  # each epoch is 100k played frames
 EVAL_EPISODE = 50
 NUM_ACTIONS = None
@@ -105,7 +105,7 @@ def get_config():
             every_k_steps=10000 // UPDATE_FREQ),    # update target network every 10k steps
         expreplay,
         ScheduledHyperParamSetter('learning_rate',
-                                  [(60, 4e-4), (100, 2e-4)]),
+                                  [(60, 4e-4), (100, 2e-4), (500, 5e-5)]),
         ScheduledHyperParamSetter(
             ObjAttrParam(expreplay, 'exploration'),
             [(0, 1), (10, 0.1), (320, 0.01)],   # 1->0.1 in the first million steps
@@ -116,7 +116,7 @@ def get_config():
             HumanHyperParamSetter('learning_rate'),
         ],
         steps_per_epoch=STEPS_PER_EPOCH,
-        max_epoch=1000,
+        max_epoch=800,
     )
......
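A quick sanity check of the constants touched above, assuming UPDATE_FREQ = 4 (one training step per 4 played frames) as in this DQN example; the arithmetic just verifies the numbers stated in the comments.

# Hedged sanity check (not part of this commit), assuming UPDATE_FREQ = 4.
UPDATE_FREQ = 4
MEMORY_SIZE = int(1e6)

replay_bytes = MEMORY_SIZE * 84 * 84        # one 84x84 uint8 frame per replay entry
print(replay_bytes / 1024 ** 3)             # ~6.57 GiB, i.e. the "6.6G" in the comment

STEPS_PER_EPOCH = 100000 // UPDATE_FREQ     # 25000 training steps per epoch
print(STEPS_PER_EPOCH * UPDATE_FREQ)        # 100000 played frames per epoch, as the comment says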
@@ -100,7 +100,7 @@ def get_config(model, fake=False):
         model=model,
         data=data,
         callbacks=callbacks,
-        steps_per_epoch=100 if args.fake else 1280000 // args.batch,
+        steps_per_epoch=100 if args.fake else 1281167 // args.batch,
         max_epoch=105,
     )
......
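1281167 is the number of images in the ILSVRC12 (ImageNet) training set, so with this change one epoch corresponds to exactly one pass over the training data. A tiny illustration with a hypothetical batch size:

# Hedged illustration (not part of this commit): epoch size after the fix.
total_batch = 256                          # hypothetical total batch size across all GPUs
steps_per_epoch = 1281167 // total_batch
print(steps_per_epoch)                     # 5004 steps per epoch, one full pass over ImageNet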
@@ -38,12 +38,12 @@ def BatchNorm(inputs, axis=None, training=None, momentum=0.9, epsilon=1e-5,
               internal_update=False):
     """
     Mostly equivalent to `tf.layers.batch_normalization`, but different in
-    the following:
+    the following due to historical reasons:
     1. Accepts `data_format` when `axis` is None. For 2D input, this argument will be ignored.
     2. Default values for `momentum` and `epsilon` are different.
     3. Default value for `training` is automatically obtained from `TowerContext`.
-    4. Support the `internal_update` option.
+    4. Supports the `internal_update` option, which can be very useful in certain models.
     Args:
         internal_update (bool): if False, add EMA update ops to
......
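For context, a hedged usage sketch of this layer inside a tower function; `training` is inferred from the surrounding TowerContext unless passed explicitly. Layer and argument names follow tensorpack's registered-layer convention of this era and should be treated as assumptions.

# Hedged usage sketch (not part of this commit). Under a TowerContext,
# BatchNorm picks up `training` automatically; internal_update=True applies
# the EMA updates inside the layer rather than through the UPDATE_OPS collection.
from tensorpack.models import Conv2D, BatchNorm

def tower_fn(image):
    x = Conv2D('conv0', image, 64, 3)                  # name-first layer call convention
    x = BatchNorm('bn0', x, internal_update=True)
    return x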
@@ -154,6 +154,39 @@ def execute_only_once():
     return True


+def _pick_tqdm_interval(file):
+    # Heuristics to pick an update interval for the progress bar that's nice-looking for users.
+    isatty = file.isatty()
+    # Jupyter notebook should be recognized as tty.
+    # Wait for https://github.com/ipython/ipykernel/issues/268
+    try:
+        from ipykernel import iostream
+        if isinstance(file, iostream.OutStream):
+            isatty = True
+    except ImportError:
+        pass
+
+    if isatty:
+        return 0.5
+    else:
+        # When run under mpirun/slurm, isatty is always False.
+        # Here we apply some hacky heuristics for slurm.
+        if 'SLURM_JOB_ID' in os.environ:
+            if int(os.environ.get('SLURM_JOB_NUM_NODES', 1)) > 1:
+                # multi-machine job, probably not interactive
+                return 60
+            else:
+                # possibly interactive, so let's be conservative
+                return 15
+
+        if 'OMPI_COMM_WORLD_SIZE' in os.environ:
+            if int(os.environ['OMPI_COMM_WORLD_SIZE']) > 1:
+                return 60
+
+        # If not a tty, don't refresh progress bar that often
+        return 180
+
+
 def get_tqdm_kwargs(**kwargs):
     """
     Return default arguments to be used with tqdm.
@@ -174,33 +207,8 @@ def get_tqdm_kwargs(**kwargs):
         # Use this env var to override the refresh interval setting
         interval = float(os.environ['TENSORPACK_PROGRESS_REFRESH'])
     except KeyError:
-        f = kwargs.get('file', sys.stderr)
-        isatty = f.isatty()
-        # Jupyter notebook should be recognized as tty.
-        # Wait for https://github.com/ipython/ipykernel/issues/268
-        try:
-            from ipykernel import iostream
-            if isinstance(f, iostream.OutStream):
-                isatty = True
-        except ImportError:
-            pass
-        if isatty:
-            interval = 0.5
-        else:
-            # When run under mpirun/slurm, isatty is always False.
-            # Here we apply some hacky heuristics for slurm.
-            if 'SLURM_JOB_ID' in os.environ:
-                if int(os.environ.get('SLURM_JOB_NUM_NODES', 1)) > 1:
-                    # multi-machine job, probably not interactive
-                    interval = 180
-                else:
-                    # possibly interactive, so let's be conservative
-                    interval = 15
-            # If not a tty, don't refresh progress bar that often
-            interval = 180
+        interval = _pick_tqdm_interval(kwargs.get('file', sys.stderr))
     default['mininterval'] = interval
     default.update(kwargs)
     return default
......
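A hedged sketch of how the refactored helper is consumed: `get_tqdm_kwargs` hands tqdm a `mininterval` chosen by `_pick_tqdm_interval`, and the `TENSORPACK_PROGRESS_REFRESH` environment variable seen in the hunk above still overrides it. The module path is assumed from the functions shown in this diff.

# Hedged usage sketch (not part of this commit).
import os
import tqdm
from tensorpack.utils.utils import get_tqdm_kwargs   # assumed module path

os.environ['TENSORPACK_PROGRESS_REFRESH'] = '2'       # optional: force a 2-second refresh interval
for _ in tqdm.trange(1000, **get_tqdm_kwargs(leave=True)):
    pass                                              # the actual work would go here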