Commit af667ff4, authored Oct 27, 2017 by Yuxin Wu
update docs
parent 6a1e822d
Showing 7 changed files with 113 additions and 97 deletions.
- docs/tutorial/graph.md (+0, -58)
- docs/tutorial/index.rst (+1, -1)
- docs/tutorial/trainer.md (+31, -24)
- docs/tutorial/training-interface.md (+55, -0)
- examples/GAN/GAN.py (+1, -1)
- tensorpack/train/base.py (+16, -8)
- tensorpack/train/interface.py (+9, -5)
docs/tutorial/graph.md (deleted, 100644 → 0)
# Build the Graph

This tutorial explains how a graph is built in tensorpack.

### ModelDesc

`ModelDesc` is an abstraction over the most common type of models people train.
It assumes:

1. Training is single-cost optimization, driven by a single `tf.train.Optimizer`.
2. The graph can be trivially duplicated for data-parallel training or inference.

If your task is single-cost optimization, you can subclass `ModelDesc` and implement several methods:
```python
class MyModel(ModelDesc):
    def _get_inputs(self):
        return [InputDesc(...), InputDesc(...)]

    def _build_graph(self, inputs):
        tensorA, tensorB = inputs
        # build the graph
        self.cost = xxx  # define the cost tensor

    def _get_optimizer(self):
        return tf.train.GradientDescentOptimizer(0.1)
```
`_get_inputs` should define the metainfo of all the inputs your graph may need.
`_build_graph` should add tensors/operations to the graph, where the argument `inputs` is the list of input tensors matching `_get_inputs`.
You can use any symbolic functions in `_build_graph`, including TensorFlow core library functions and other symbolic libraries.
### How it is Used:

Most tensorpack trainers expect a `ModelDesc`, and use it as a __description of the TF graph to be built__.
These trainers will use `_get_inputs` to connect the given `InputSource` to the graph.
They will then use `_build_graph` to create the backbone model, and `_get_optimizer` to create the minimization op, and run it.

Note that data-parallel multi-GPU trainers will call `_build_graph` __multiple times__, once on each GPU.
A trainer may also make __extra calls__ to `_build_graph` for inference, if used by some callbacks.

`_build_graph` will always be called under some `TowerContext`, which contains the context information (e.g. training or inference, reuse or not, scope name) for your access.
Also, to respect variable reuse among multiple calls, use `tf.get_variable()` instead of `tf.Variable` in `_build_graph` if you need to create any variables.
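For illustration, a minimal sketch of such a `ModelDesc` (the linear model, the `image`/`label` inputs, and the import lines are assumptions made for this example, not part of the original tutorial):

```python
import tensorflow as tf
from tensorpack import ModelDesc, InputDesc   # assumed top-level imports

class LinearModel(ModelDesc):                 # hypothetical example model
    def _get_inputs(self):
        return [InputDesc(tf.float32, (None, 784), 'image'),
                InputDesc(tf.int32, (None,), 'label')]

    def _build_graph(self, inputs):
        image, label = inputs
        # use tf.get_variable (not tf.Variable) so the variables can be reused
        # when _build_graph is called once per GPU tower under a TowerContext
        W = tf.get_variable('W', shape=[784, 10],
                            initializer=tf.truncated_normal_initializer(stddev=0.01))
        b = tf.get_variable('b', shape=[10], initializer=tf.constant_initializer())
        logits = tf.matmul(image, W) + b
        self.cost = tf.reduce_mean(
            tf.nn.sparse_softmax_cross_entropy_with_logits(labels=label, logits=logits),
            name='cost')

    def _get_optimizer(self):
        return tf.train.GradientDescentOptimizer(0.1)
```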
### Build It Manually

When you need to deal with a complicated graph, it may be easier to build the graph manually.
You are free to do so, as long as you tell the trainer what to do in each step.
Check out [Write a Trainer](extend/trainer.html) for using a custom graph with a trainer.
docs/tutorial/index.rst
```
@@ -39,9 +39,9 @@ User Tutorials
   dataflow
   input-source
   efficient-dataflow
-  graph
   symbolic
   trainer
+  training-interface
   callback
   summary
   faq
```
docs/tutorial/trainer.md
# Trainer

Tensorpack trainers prepare and run the training, which consists of the following steps:

1. __Build graph__ for the model.
   Users can call whatever TensorFlow functions they need to set up the graph.
   Users may or may not use the tensorpack `InputSource` and `ModelDesc` to build the graph.
   This step defines "what to run" in every training step.

2. Train the model (the [Trainer.train() method](http://tensorpack.readthedocs.io/en/latest/modules/train.html#tensorpack.train.Trainer.train)):
   1. Setup callbacks/monitors.
   2. Finalize the graph, initialize session.
   3. Run the main loop.

## Assumptions of Base Trainer

In research we do training of various kinds.
Tensorpack trainers try to avoid making assumptions about what type of training you want to do (e.g., it doesn't have to be batched, SGD-like, or have `X` (inputs) and `y` (outputs)).
The only assumption the tensorpack `Trainer` class makes about your training is that it follows this pattern:
```python
...
```
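The block above is collapsed in this diff; a plausible, self-contained sketch of that double for-loop pattern is given below (the epoch/step names mirror the `TrainConfig` fields used elsewhere in this commit, and `run_step` is a stand-in for whatever a concrete trainer defines as one iteration):

```python
# A hedged sketch of the "double for-loop" the base Trainer assumes.
starting_epoch, max_epoch, steps_per_epoch = 1, 3, 5

def run_step():
    pass  # e.g. one sess.run(train_op) in a single-cost trainer

for epoch_num in range(starting_epoch, max_epoch + 1):
    for local_step in range(steps_per_epoch):
        run_step()
    # callbacks (validation, summary, checkpoint, ...) are triggered here
```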
@@ -15,47 +33,36 @@

1. Tensorpack base trainer implements the logic of __running the iteration__.
   Users or derived trainers should implement __what the iteration is__.

2. Trainer assumes the existence of an __"epoch"__, i.e. that the iterations run in double for-loops.
   But the epoch size can actually be any number you set, and it only affects the [schedule of callbacks](extend/callback.html).
   In other words, an "epoch" in tensorpack is the __default period to run callbacks__ (validation, summary, checkpoint, etc.).
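For illustration (using the `TrainConfig` interface shown further down this page; `ModelSaver` and the exact numbers are assumptions):

```python
# An "epoch" of exactly 500 iterations, regardless of the dataset size;
# callbacks such as checkpointing then run every 500 steps.
config = TrainConfig(
    model=MyModel(),
    dataflow=my_dataflow,
    callbacks=[ModelSaver()],
    steps_per_epoch=500,
)
```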
### Single-Cost Trainers

Most neural network training tasks are single-cost optimization.
Tensorpack provides some trainer implementations for such tasks.
These trainers will build the graph by themselves, given the following arguments:

1. Some `InputDesc`, the metadata about the input.
2. An `InputSource`, where the input comes from. See [Input Pipeline](input-source.html).
3. A function which takes input tensors and returns the cost.
4. A function which returns an optimizer.

The simplest way to use these trainers is to pass a `TrainConfig` to the `launch_train_with_config` high-level wrapper:
```python
config = TrainConfig(
    model=MyModel(),
    dataflow=my_dataflow,
    # data=my_inputsource,  # alternatively, use a customized InputSource
    callbacks=[...]
)
trainer = SomeTrainer()
# multi-GPU training with synchronous update:
# trainer = SyncMultiGPUTrainerParameterServer([0, 1, 2])
launch_train_with_config(config, trainer)
```
When you set a DataFlow (rather than an InputSource) in the config, `launch_train_with_config` automatically adopts a certain prefetch mechanism, as mentioned in the [Input Pipeline](input-source.html) tutorial.
You can set the InputSource instead, to customize this behavior.
See [SingleCostTrainer.setup_graph](http://tensorpack.readthedocs.io/en/latest/modules/train.html#tensorpack.train.SingleCostTrainer.setup_graph) for details.
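As a rough sketch of that customization (hedged: `QueueInput` and the `get_inputs_desc()` accessor are assumptions; the argument order of `setup_graph` follows the call shown in `tensorpack/train/interface.py` at the bottom of this commit):

```python
# Wire a customized InputSource into a single-cost trainer by hand, instead of
# letting launch_train_with_config choose the prefetch mechanism.
model = MyModel()
my_input = QueueInput(my_dataflow)    # assumed InputSource implementation
trainer = SomeTrainer()
trainer.setup_graph(
    model.get_inputs_desc(),          # [InputDesc] (assumed accessor name)
    my_input,                         # the customized InputSource
    model._build_graph_get_cost,      # get_cost_fn
    model.get_optimizer)              # get_opt_fn
# ... then call trainer.train(...) as usual, or keep using launch_train_with_config
```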
Existing multi-GPU trainers include the logic of data-parallel training.
You can enable them with just one line, and all the necessary logic to achieve the best performance is already baked into the trainers.
The trainers can reach the same performance as the [official tensorflow benchmark](https://www.tensorflow.org/performance/benchmarks).
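Concretely, the "one line" is just swapping the trainer object passed to `launch_train_with_config` (reusing the `config` from the example above):

```python
# Data-parallel training on 3 GPUs with synchronous parameter-server updates;
# the config stays the same as in the single-GPU example.
launch_train_with_config(config, SyncMultiGPUTrainerParameterServer([0, 1, 2]))
```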
Please note that in data-parallel training, in each iteration all towers (all replicates of the model) will take tensors from the `InputSource` (instead of taking one batch and splitting it among the towers). So the total batch size would be ``(batch size of InputSource/DataFlow) * #GPU``.
There are also high-level wrappers which have a slightly simpler interface (but exist mainly for old users).
See [High-Level Training Interface](training-interface.html).

### Custom Trainers

You can easily write a trainer for other types of training.

...
docs/tutorial/training-interface.md (new file, 0 → 100644)
# Training Interface

Tensorpack trainers provide a low-level API which requires a number of options to set up.
There are high-level interfaces built on top of the trainers to simplify their use, when you don't want to customize too much.

### With ModelDesc and TrainConfig

[SingleCost trainers](trainer.html#single-cost-trainers) expect `InputDesc`, `InputSource`, a get_cost function, and an optimizer.
`ModelDesc` describes a model by packing three of them together into one object:
```python
class MyModel(ModelDesc):
    def _get_inputs(self):
        return [InputDesc(...), InputDesc(...)]

    def _build_graph(self, inputs):
        tensorA, tensorB = inputs
        # build the graph
        self.cost = xxx  # define the cost tensor

    def _get_optimizer(self):
        return tf.train.GradientDescentOptimizer(0.1)
```
`_get_inputs` should define the metainfo of all the inputs your graph may need.
`_build_graph` should add tensors/operations to the graph, where the argument `inputs` is a list of tensors which will match `_get_inputs`.

You can use any symbolic functions in `_build_graph`, including TensorFlow core library functions and other symbolic libraries.
But you need to follow the requirements of [get_cost_fn](http://tensorpack.readthedocs.io/en/latest/modules/train.html#tensorpack.train.SingleCostTrainer.setup_graph), because this function will be used as part of `get_cost_fn`.
At last, you need to set `self.cost`.
After defining such a model, use it with `TrainConfig` and `launch_train_with_config`:
```python
config = TrainConfig(
    model=MyModel(),
    dataflow=my_dataflow,
    # data=my_inputsource,  # alternatively, use a customized InputSource
    callbacks=[...]
)
trainer = SomeTrainer()
# trainer = SyncMultiGPUTrainerParameterServer([0, 1, 2])
launch_train_with_config(config, trainer)
```
See the docs of [launch_train_with_config](http://tensorpack.readthedocs.io/en/latest/modules/train.html#tensorpack.train.launch_train_with_config) for its usage and detailed functionalities.
examples/GAN/GAN.py
```
@@ -90,7 +90,7 @@ class GANTrainer(TowerTrainer):
 class SeparateGANTrainer(TowerTrainer):
-    """ A GAN trainer which runs two optimization ops with a certain ratio, one in each step. """
+    """ A GAN trainer which runs two optimization ops with a certain ratio. """
     def __init__(self, input, model, d_period=1, g_period=1):
         """
         Args:
```
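A hedged usage sketch (the `QueueInput`/`Model`/DataFlow names are placeholders; only `d_period`/`g_period` come from the signature above): update the discriminator every step and the generator every 5th step.

```python
# Hypothetical wiring of SeparateGANTrainer; presumably at least one of
# d_period/g_period should be 1 so some op runs on every step.
gan_input = QueueInput(my_dataflow)     # placeholder InputSource over a DataFlow
trainer = SeparateGANTrainer(gan_input, Model(), d_period=1, g_period=5)
# then trainer.train(...) with callbacks/monitors as usual
```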
tensorpack/train/base.py
```
@@ -356,7 +356,7 @@ class SingleCostTrainer(TowerTrainer):
     Single-cost trainer has a :meth:`setup_graph` method which takes
     (inputs_desc, input, get_cost_fn, get_opt_fn), and build the training operations from them.

-    To use a SingleCostTrainer object, call `trainer.setup_graph(...); trainer.train(...)`.
+    To use a :class:`SingleCostTrainer` object, call `trainer.setup_graph(...); trainer.train(...)`.
     """

     @call_only_once
```
```
@@ -368,14 +368,16 @@ class SingleCostTrainer(TowerTrainer):
            inputs_desc ([InputDesc]):
            input (InputSource):
            get_cost_fn ([tf.Tensor] -> tf.Tensor): callable, takes some input tensors and returns a cost tensor.
            get_opt_fn (-> tf.train.Optimizer): callable which returns an
                optimizer. Will only be called once.

        Returns:
            [Callback]: a (possibly empty) list of callbacks needed for training.
                These callbacks will be automatically added when you call `train()`.
                So you can usually ignore the return value.

        Note:
            1. `get_cost_fn` will always be called under a :class:`TowerContext`,
               which will contain information about reuse,
               training/inference, scope name, etc.
            2. `get_cost_fn` might get called multiple times for data-parallel training or inference.
            3. To respect variable reuse, use `tf.get_variable` instead of
               `tf.Variable` in `get_cost_fn`.
        """
        get_cost_fn = TowerFuncWrapper(get_cost_fn, inputs_desc)
        get_opt_fn = memoized(get_opt_fn)
```
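To make that signature concrete, here is a minimal sketch of the two callables `setup_graph` expects (the network itself and the matching `inputs_desc` are placeholders, not part of this commit):

```python
import tensorflow as tf

# Assumes inputs_desc like [InputDesc(tf.float32, (None, 784), 'image'),
#                           InputDesc(tf.int32, (None,), 'label')]
def get_cost_fn(image, label):
    # called under a TowerContext, possibly once per GPU tower;
    # tf.layers.dense creates its variables via tf.get_variable, so reuse works
    logits = tf.layers.dense(image, 10, name='fc')
    return tf.reduce_mean(
        tf.nn.sparse_softmax_cross_entropy_with_logits(labels=label, logits=logits))

def get_opt_fn():
    # only called once (the trainer memoizes it, as shown above)
    return tf.train.AdamOptimizer(1e-3)

# trainer.setup_graph(inputs_desc, input, get_cost_fn, get_opt_fn)
```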
```
@@ -386,11 +388,17 @@ class SingleCostTrainer(TowerTrainer):
         internal_callbacks = input_callbacks + train_callbacks
         for cb in internal_callbacks:
             self._register_callback(cb)
         return internal_callbacks   # TODO register directly instead of return?

     @abstractmethod
     def _setup_graph(self, input, get_cost_fn, get_opt_fn):
-        pass
+        """
+        Implement the logic to build the graph, with an :class:`InputSource`
+        that's been setup already.
+
+        Returns:
+            [Callback]: list of callbacks needed
+        """

     def _setup_input(self, inputs_desc, input):
         assert not input.setup_done()
```
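For a sense of what a concrete `_setup_graph` might look like, here is a hedged sketch of a single-tower subclass (the import paths, the `TowerContext` arguments, and the `self.train_op` convention are assumptions about this version of the library, not taken from this commit):

```python
from tensorpack.train.base import SingleCostTrainer   # assumed import path
from tensorpack.tfutils.tower import TowerContext     # assumed import path

class MySimpleTrainer(SingleCostTrainer):
    """Hypothetical subclass: one tower, one training op."""
    def _setup_graph(self, input, get_cost_fn, get_opt_fn):
        with TowerContext('', is_training=True):
            # get_cost_fn was wrapped by TowerFuncWrapper in setup_graph above,
            # so calling it with the input tensors builds the tower
            cost = get_cost_fn(*input.get_input_tensors())
        opt = get_opt_fn()
        self.train_op = opt.minimize(cost, name='train_op')
        return []   # no extra callbacks needed
```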
tensorpack/train/interface.py
```
@@ -44,7 +44,14 @@ def apply_default_prefetch(input_source_or_dataflow, trainer, towers):
 def launch_train_with_config(config, trainer):
     """
     Train with a :class:`TrainConfig` and a :class:`Trainer`, to
-    mimic the old training interface.
+    mimic the old training interface. It basically does the following
+    3 things (and you can easily do them by yourself):
+
+    1. Setup the :class:`InputSource` with automatic prefetching,
+       for `config.data` or `config.dataflow`.
+    2. Call `trainer.setup_graph` with the :class:`InputSource`,
+       as well as `config.model`.
+    3. Call `trainer.train` with the rest of the attributes of config.

     Args:
         config (TrainConfig):
```
```
@@ -79,7 +86,4 @@ def launch_train_with_config(config, trainer):
     trainer.setup_graph(
         inputs_desc, input,
         model._build_graph_get_cost, model.get_optimizer)
-    trainer.train(
-        config.callbacks, config.monitors,
-        config.session_creator, config.session_init,
-        config.steps_per_epoch, config.starting_epoch, config.max_epoch)
+    trainer.train_with_config(config)
```