Shashank Suhas / seminar-breakout
Commit c04e0e11, authored May 03, 2019 by Yuxin Wu
update docs
parent 0c2c44e2
Showing 1 changed file with 10 additions and 8 deletions
docs/tutorial/save-load.md
@@ -34,7 +34,10 @@ It dumps the model to a `var-name: value` dict saved in npz format.
 ## Load a Model to a Session
 Model loading (in both training and inference) is through the `session_init` interface.
-Currently there are two major ways a session can be restored:
+For training, use `session_init` in `TrainConfig` or `Trainer.train()`.
+For inference, use `session_init` in `PredictConfig`.
+There are two ways a session can be initialized:
 [session_init=SaverRestore(...)](../modules/tfutils.html#tensorpack.tfutils.sessinit.SaverRestore)
 which restores a TF checkpoint,
 or [session_init=DictRestore(...)](../modules/tfutils.html#tensorpack.tfutils.sessinit.DictRestore) which restores a dict.
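As a rough illustration of the two initializers named in this hunk, the sketch below builds one or the other from a single source. `make_session_init` and its parameters are hypothetical names introduced here; `SaverRestore` and `DictRestore` are the tensorpack classes linked above, imported lazily so the function can be defined without tensorpack installed.

```python
def make_session_init(checkpoint_path=None, param_dict=None):
    """Hypothetical helper: build a session initializer from exactly one source."""
    # Reject ambiguous calls before touching tensorpack at all.
    if (checkpoint_path is None) == (param_dict is None):
        raise ValueError("pass exactly one of checkpoint_path or param_dict")

    # Lazy import: needs tensorpack (and TensorFlow) only when actually called.
    from tensorpack.tfutils.sessinit import DictRestore, SaverRestore

    if checkpoint_path is not None:
        # Restores variables from a TF checkpoint.
        return SaverRestore(checkpoint_path)
    # Restores variables from a `var-name: value` dict.
    return DictRestore(param_dict)
```

The returned object would then be passed as `session_init=` to `TrainConfig` for training or to `PredictConfig` for inference.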
@@ -43,7 +46,7 @@ you need (e.g., remove variables, rename variables) to the dict.
 To load multiple models, use [ChainInit](../modules/tfutils.html#tensorpack.tfutils.sessinit.ChainInit).
-To load an npz file to a session, you can use `DictRestore(dict(np.load(filename)))`.
+To load an npz file from the tensorpack model zoo to a session, you can use `DictRestore(dict(np.load(filename)))`.
 You can also use
 [get_model_loader(filename)](../modules/tfutils.html#tensorpack.tfutils.sessinit.get_model_loader),
 a small helper which returns either a `SaverRestore` or a `DictRestore` based on the file name.
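The file-name dispatch described for `get_model_loader` can be pictured with a small stand-in. This is only a sketch: `loader_kind` is a hypothetical function, and the rule it applies (an `.npz` suffix means a dict, anything else a TF checkpoint) is an assumption drawn from the text above; the real helper may recognise other suffixes.

```python
import os


def loader_kind(filename):
    """Hypothetical stand-in: name the initializer a file name would map to."""
    filename = os.path.expanduser(filename)
    # Assumption: npz files hold a `var-name: value` dict.
    if filename.endswith(".npz"):
        return "DictRestore"
    # Assumption: everything else is treated as a TF checkpoint path.
    return "SaverRestore"
```

For example, `loader_kind("model.npz")` names `DictRestore`, while a checkpoint path without the `.npz` suffix names `SaverRestore`.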
@@ -64,23 +67,22 @@ graph, or rename/remove the variables in your loader.
 ## Resume Training
-"resume training" is mostly just "loading the last known checkpoint".
-Therefore you should refer to the [previous section](#load-a-model-to-a-session)
-on how to load a model.
+"Resume training" is mostly just "loading the last known checkpoint".
+To load a model, you should refer to the previous section: [Load a Model to a Session](#load-a-model-to-a-session).
 ```eval_rst
 .. note:: **A checkpoint does not resume everything!**
-The TensorFlow checkpoint only saves TensorFlow variables,
+Loading the checkpoint does most of the work in "resume training", but note that
+a TensorFlow checkpoint only saves TensorFlow variables,
 which means other Python state that is not a TensorFlow variable will not be saved
 and resumed. This means:
 1. Training epoch number will not be resumed.
-You can set it by providing a ``starting_epoch`` to your resume job.
+You can set it by providing a ``starting_epoch`` to your ``TrainConfig``.
 2. State in your callbacks will not be resumed. Certain callbacks maintain a state
 (e.g., current best accuracy) in Python, which cannot be saved automatically.
 ```
 The [AutoResumeTrainConfig](../modules/train.html#tensorpack.train.AutoResumeTrainConfig)
 is an alternative to `TrainConfig` which applies some heuristics to load the latest epoch number and latest checkpoint.
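Because the epoch number lives outside the checkpoint, a resume job has to recover it some other way. A minimal sketch, assuming the previous run left a JSON file containing a list of per-epoch records, each with an `epoch_num` field. That file layout and the name `next_starting_epoch` are assumptions for illustration, not something this diff specifies.

```python
import json
import os


def next_starting_epoch(stats_path):
    """Return the epoch to resume from, or 1 if no usable record exists."""
    if not os.path.isfile(stats_path):
        return 1
    try:
        with open(stats_path) as f:
            records = json.load(f)
        # Assumed layout: a list of dicts; the last one is the latest epoch.
        return int(records[-1]["epoch_num"]) + 1
    except (IndexError, KeyError, TypeError, ValueError):
        # Empty, malformed, or unexpected content: start from scratch.
        return 1
```

The returned value could then be passed as `starting_epoch` to `TrainConfig`, alongside a `session_init` that loads the last checkpoint.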