Shashank Suhas / seminar-breakout · Commits · 9b707d91

Commit 9b707d91, authored Aug 04, 2017 by Yuxin Wu
Parent 19a6d4ef

    update trainer doc

Showing 2 changed files with 42 additions and 25 deletions (+42 −25)
docs/tutorial/extend/trainer.md   +22 −8
docs/tutorial/trainer.md          +20 −17
docs/tutorial/extend/trainer.md (view file @ 9b707d91)

```diff
-## Write a trainer
+## Write a Trainer

-The existing trainers should be enough for single-cost optimization tasks. If you
-want to do something inside the trainer, consider writing it as a callback, or
-write an issue to see if there is a better solution than creating new trainers.
+The existing trainers should be enough for single-cost optimization tasks.
+If you want to do something different during training, first consider writing it as a callback,
+or write an issue to see if there is a better solution than creating new trainers.

-For certain tasks, you might need a new trainer.
-The [GAN trainer](../../examples/GAN/GAN.py) is one example of how to implement
-new trainers.
+For certain tasks, you do need a new trainer.

-More details to come.
+Trainers just run __some__ iterations, so there is no limit in where the data come from or what to do in an iteration.
+The existing common trainers do two things:
+1. Setup the graph and input pipeline, from `TrainConfig`.
+2. Minimize `model.cost` in each iteration.
+
+But you can customize it by using the base `Trainer` class.
+
+* To customize the graph:
+  Create the graph, add any tensors and ops either before creating the trainer or inside `Trainer.__init__`.
+
+* Two ways to customize the iteration:
+  1. Set `Trainer.train_op`. This op will be run by default.
+  2. Subclass `Trainer` and override the `run_step()` method. This way you can run more ops in one iteration.
+
+There are several different [GAN trainers](../../examples/GAN/GAN.py) for reference.
```
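The two customization points named in the added text can be made concrete with a short sketch (not part of this commit). It assumes the 2017-era tensorpack API, where the base `Trainer` exposes `self.train_op` and a `self.hooked_sess` session wrapper and calls `run_step()` once per iteration; `extra_op` is a hypothetical op introduced only for illustration:

```python
# Sketch only: customizing the iteration with the base Trainer class.
# Assumes tensorpack's 2017-era attribute names (hooked_sess, train_op);
# check them against the version you actually use.
from tensorpack import Trainer

class MyTrainer(Trainer):
    """Runs an extra op alongside the training op in every iteration."""

    def __init__(self, config, extra_op):
        self._extra_op = extra_op  # hypothetical: any TF op to run per step
        super(MyTrainer, self).__init__(config)

    def run_step(self):
        # Way 2: override run_step() to run more ops in one iteration.
        self.hooked_sess.run([self.train_op, self._extra_op])
```

Way 1 needs no subclass at all: assign your own minimize op to `Trainer.train_op`, and the default `run_step()` will run it each iteration.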
docs/tutorial/trainer.md (view file @ 9b707d91)

```diff
 # Trainer

-Training is **running something again and again**.
+In research we do training of various kind.
+The only assumption tensorpack `Trainer` class makes about your training, is that your training
+follows this pattern:
+
+```python
+for epoch_num in range(starting_epoch, max_epochs):
+    for local_step in range(steps_per_epoch):
+        run_step()
+```
+
+1. Training is **running some iteration**.
 Tensorpack base trainer implements the logic of __running the iteration__.
 Users or derived trainers should implement __what the iteration is__.
+
+2. Trainer assumes the existence of "epoch", i.e. that the iterations run in double for loops.
+But it doesn't need to be a full pass of your dataset, ``steps_per_epoch`` can be any number you set
+and it only affects the [schedule of callbacks](http://tensorpack.readthedocs.io/en/latest/tutorial/extend/callback.html).
+In other words, an "epoch" is the __default period__ to run callbacks (validation, summary, checkpoint, etc.).

 ### Common Trainers
 ...
```
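Point 2 of the added text can be illustrated with a configuration sketch (again, not from the commit): `steps_per_epoch` decouples the "epoch" from the dataset size. `MyModel` and `my_dataflow` are hypothetical placeholders; the `TrainConfig` fields follow the tensorpack API of this period:

```python
# Sketch: an "epoch" is only the default period for callbacks, not a dataset pass.
from tensorpack import TrainConfig, ModelSaver

config = TrainConfig(
    model=MyModel(),           # hypothetical ModelDesc subclass defining model.cost
    dataflow=my_dataflow,      # hypothetical DataFlow; its size need not match steps_per_epoch
    callbacks=[ModelSaver()],  # checkpointing fires at every "epoch" boundary
    steps_per_epoch=500,       # any number you set: here 500 iterations count as one "epoch"
    max_epoch=100,             # total iterations = 500 * 100
)
```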
```diff
@@ -36,25 +50,14 @@ You can set the InputSource instead, to customize this behavior.

 Existing multi-GPU trainers include the logic of data-parallel training.
 You can enable them by just one line, and all the necessary logic to achieve the best performance was baked into the trainers already.
-The trainers can reach the same performance as the [official tensorflow benchmark](https://github.com/tensorflow/benchmarks).
+The trainers can reach the same performance as the [official tensorflow benchmark](https://www.tensorflow.org/performance/benchmarks).

 Please note that in data-parallel training, in each iteration all towers (all replicates of the model) will take
 tensors from the InputSource (instead of taking one for all and split). So the total batch size
-would be multiplied by the number of GPUs.
+would be ``(batch size of InputSource/DataFlow) * #GPU``.

 ### Custom Trainers

-Trainers just run __some__ iterations, so there is no limit in where the data come from or what to do in an iteration.
-The existing trainers implement the default logic, but you can implement them yourself by using the base `Trainer` class.
-
-* Two ways to customize the graph:
-  1. Create the graph, add any tensors and ops before creating the trainer.
-  2. Subclass `Trainer` and override the `_setup()` method which will be called in `Trainer.__init__`.
-
-* Two ways to customize the iteration:
-  1. Set `Trainer.train_op`. This op will be run by default.
-  2. Subclass `Trainer` and override the `run_step()` method.
-
-There are several different [GAN trainers](../examples/GAN/GAN.py) for reference.
+You can easily write a trainer for other types of training.
+See [Write a Trainer](http://tensorpack.readthedocs.io/en/latest/tutorial/extend/trainer.html).
```
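As a usage note on the batch-size wording above, here is a hedged sketch (not part of the diff) of the "one line" that enables data-parallel training, using trainer names from tensorpack around this time; verify them against your version:

```python
# Sketch: data-parallel training is enabled by picking a different trainer.
# Every tower takes its own batch from the InputSource, so with a DataFlow
# batch size of 64 and 4 GPUs, one iteration consumes 64 * 4 = 256 samples.
from tensorpack import SimpleTrainer, SyncMultiGPUTrainer

# single-GPU baseline:
# SimpleTrainer(config).train()

# data-parallel across the visible GPUs:
SyncMultiGPUTrainer(config).train()
```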