Commit f2697f69 authored by Yuxin Wu

update docs

parent 34f0dd6d
# Summary and Logging
During training, everything other than the training iterations themselves is executed through callbacks, including all the logging.
This tutorial explains how summaries and logging are handled in callbacks, and how you can customize them.
The default logging behavior should be good enough for normal use cases, so you may skip this tutorial.
### Monitors
When a callback has something to log, it writes to the monitor backend through
`trainer.monitors`, by calling `put_{scalar,image,summary,...}`.
The call gets dispatched to multiple `TrainingMonitor` instances.
These monitors are a special type of callback that can process different types of log data,
and can be customized in `TrainConfig`.
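The dispatch described above can be sketched in plain Python. The classes below are simplified stand-ins for tensorpack's `Monitors` backend and `ScalarPrinter`, illustrating the pattern only, not the library's actual implementation:

```python
# Minimal sketch of the monitor-dispatch pattern (names are simplified
# stand-ins, not tensorpack's real classes).

class TrainingMonitorSketch:
    """Base class: a monitor that knows how to process scalar logs."""
    def process_scalar(self, name, value):
        pass

class ScalarPrinterSketch(TrainingMonitorSketch):
    """Prints every scalar it receives, similar in spirit to ScalarPrinter."""
    def __init__(self):
        self.printed = []

    def process_scalar(self, name, value):
        line = "{}: {:.4f}".format(name, value)
        self.printed.append(line)
        print(line)

class MonitorsSketch:
    """The backend: dispatches each put_* call to all registered monitors."""
    def __init__(self, monitors):
        self._monitors = monitors

    def put_scalar(self, name, value):
        for m in self._monitors:
            m.process_scalar(name, value)

# A callback would call trainer.monitors.put_scalar(...); here we call
# the sketch backend directly:
printer = ScalarPrinterSketch()
monitors = MonitorsSketch([printer])
monitors.put_scalar("train_error", 0.1234)  # dispatched to every monitor
```

Each monitor only needs to handle the data types it cares about, which is why new log destinations can be added without touching the callbacks that produce the data.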
### TensorFlow Summaries
This is how TensorFlow summaries eventually get logged/saved/printed:

1. __What to Log__: When you call `tf.summary.xxx` in your graph code, TensorFlow adds an op to
   the `tf.GraphKeys.SUMMARIES` collection (by default).
2. __When to Log__: A [MergeAllSummaries](../modules/callbacks.html#tensorpack.callbacks.MergeAllSummaries)
   callback is enabled by default in `TrainConfig`.
   It runs ops in the `SUMMARIES` collection (by default) every epoch (by default),
   and writes results to the monitors.
3. __Where to Log__:
Several monitor instances are enabled by default in [TrainConfig](../modules/train.html#tensorpack.train.TrainConfig):
   * A [TFEventWriter](../modules/callbacks.html#tensorpack.callbacks.TFEventWriter)
     writes things to an event file used by tensorboard.
   * A [ScalarPrinter](../modules/callbacks.html#tensorpack.callbacks.ScalarPrinter)
     prints all scalars in your terminal.
   * A [JSONWriter](../modules/callbacks.html#tensorpack.callbacks.JSONWriter)
     saves scalars to a JSON file.
All the "what, when, where" can be customized in either the graph or the `TrainConfig`.

Since TF summaries are evaluated every epoch by default, if the content is data-dependent, the results
are likely to have too much variance. To address this issue, you can:

1. Change "When to Log": log more frequently, but note that certain summaries can be expensive to
   log. You may want to use a separate collection for frequent logging.
2. Change "What to Log": you can call
   [tfutils.summary.add_moving_summary](../modules/tfutils.html#tensorpack.tfutils.summary.add_moving_summary)
   on scalar tensors, which will summarize the moving average of those scalars, instead of their instant values.
   The moving averages are maintained by the
   [MovingAverageSummary](../modules/callbacks.html#tensorpack.callbacks.MovingAverageSummary)
   callback (enabled by default).
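The smoothing idea behind moving-average summaries can be illustrated with a plain exponential moving average. This is a standalone sketch; the `decay` value is an arbitrary choice for illustration, not the one tensorpack uses:

```python
# Sketch of the moving-average idea: instead of logging the instant
# value of a scalar, maintain an EMA and log that instead.

class MovingAverage:
    def __init__(self, decay=0.9):
        self.decay = decay
        self.value = None

    def update(self, x):
        # EMA update: new = decay * old + (1 - decay) * x
        if self.value is None:
            self.value = x
        else:
            self.value = self.decay * self.value + (1 - self.decay) * x
        return self.value

ema = MovingAverage(decay=0.5)
for x in [1.0, 3.0]:
    smoothed = ema.update(x)
# The instant value is 3.0, but the logged moving average is 2.0:
# the spike is damped, so per-epoch summaries show less variance.
```

A higher decay smooths more aggressively but reacts more slowly to real changes in the underlying scalar.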
### Other Data
Besides TensorFlow summaries,
a callback can also write other data to the monitor backend anytime after the training has started.
As long as the type of data is supported, the data will be dispatched to and logged to the same place.
As a result, tensorboard will show not only summaries in the graph, but also your custom data.
For example, a precise validation error often needs to be computed manually, outside the TensorFlow graph.
With a uniform monitor backend, this number will show up in tensorboard as well.
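For instance, a validation error computed entirely outside the graph can be sent through the same backend. The helper below is illustrative; it mirrors the "aggregate over the whole set, don't average per batch" behavior described in `ClassificationError` further down, and the final `put_scalar` call is shown only as a comment:

```python
# Compute a precise validation error manually, outside the graph,
# then hand it to the monitor backend as an ordinary scalar.

def validation_error(incorrect_flags):
    """incorrect_flags: one 0/1 entry per validation sample (like the
    'incorrect_vector' tensor), aggregated over the WHOLE validation
    set rather than averaged per batch."""
    return sum(incorrect_flags) / float(len(incorrect_flags))

# e.g. 3 wrong predictions out of 8 samples:
err = validation_error([0, 1, 0, 0, 1, 0, 1, 0])  # -> 0.375
# Inside a callback, this number would then be logged with:
#     self.trainer.monitors.put_scalar('val_error', err)
# after which every monitor (tensorboard, terminal, JSON) sees it.
```

Aggregating over the full set matters when the last batch is smaller than the rest, which is exactly why per-batch averaging can give a different answer.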
```diff
@@ -145,7 +145,7 @@ class ClassificationError(Inferencer):
         Therefore the result can be different from averaging the error rate of each batch.
     """
-    def __init__(self, wrong_tensor_name='incorrect_vector', summary_name='val_error'):
+    def __init__(self, wrong_tensor_name='incorrect_vector', summary_name='validation_error'):
         """
         Args:
             wrong_tensor_name(str): name of the ``wrong`` tensor.
```
......
```diff
@@ -269,8 +269,9 @@ class MapDataComponent(MapData):
             func (TYPE -> TYPE|None): takes ``dp[index]``, returns a new value for ``dp[index]``.
                 return None to discard this datapoint.
             index (int): index of the component.
         """
+        index = int(index)
         def f(dp):
             r = func(dp[index])
             if r is None:
```
......