update docs

Commit 1bdf0a45 authored Apr 17, 2019 by Yuxin Wu

Showing with 26 additions and 16 deletions

docs/tutorial/save-load.md +23 -16

examples/FasterRCNN/NOTES.md +3 -0

--- a/docs/tutorial/save-load.md
+++ b/docs/tutorial/save-load.md

 # Save and Load models

-## Inspect a TF Checkpoint
+## Work with a TF Checkpoint

 The `ModelSaver` callback saves the model to the directory defined by `logger.get_logger_dir()`,
 in TensorFlow checkpoint format.
@@ -17,34 +17,41 @@ for details.
 [scripts/ls-checkpoint.py](../scripts/ls-checkpoint.py)
 demos how to print all variables and their shapes in a checkpoint.

-[scripts/dump-model-params.py](../scripts/dump-model-params.py) can be used to remove unnecessary variables in a checkpoint.
+Tensorpack also includes a tool to save variables to a TF checkpoint; see
+[save_chkpt_vars](../modules/tfutils.html#tensorpack.tfutils.varmanip.save_chkpt_vars).
+
+## Work with npz Files in Model Zoo
+
+Most models provided by tensorpack are in npz (dictionary) format,
+because it's easy to manipulate without TF dependency.
+You can read/write them with `np.load` and `np.savez`.
+
+[scripts/dump-model-params.py](../scripts/dump-model-params.py) can be used to remove unnecessary variables in a checkpoint
+and save the results to an npz file.
 It takes a metagraph file (which is also saved by `ModelSaver`) and only saves variables that the model needs at inference time.
 It dumps the model to a `var-name: value` dict saved in npz format.

 ## Load a Model to a Session

 Model loading (in both training and inference) is through the `session_init` interface.
-Currently there are two ways a session can be restored:
+Currently there are two major ways a session can be restored:
 [session_init=SaverRestore(...)](../modules/tfutils.html#tensorpack.tfutils.sessinit.SaverRestore)
 which restores a TF checkpoint,
 or [session_init=DictRestore(...)](../modules/tfutils.html#tensorpack.tfutils.sessinit.DictRestore) which restores a dict.
+`DictRestore` is the most general loader because you can make arbitrary changes
+you need (e.g., remove variables, rename variables) to the dict.
+
 To load multiple models, use [ChainInit](../modules/tfutils.html#tensorpack.tfutils.sessinit.ChainInit).

-Many models in tensorpack model zoo are provided in the form of numpy dictionary (`.npz`),
-because it is easier to load and manipulate without requiring TensorFlow.
-To load such files to a session, use `DictRestore(dict(np.load(filename)))`.
+To load an npz file to a session, you can use `DictRestore(dict(np.load(filename)))`.
 You can also use
-[get_model_loader](../modules/tfutils.html#tensorpack.tfutils.sessinit.get_model_loader),
-a small helper to create a `SaverRestore` or `DictRestore` based on the file name.
-
-`DictRestore` is the most general loader because you can make arbitrary changes
-you need (e.g., remove variables, rename variables) to the dict.
-To load a TF checkpoint into a dict in order to make changes, use
-[load_chkpt_vars](../modules/tfutils.html#tensorpack.tfutils.varmanip.load_chkpt_vars).
+[get_model_loader(filename)](../modules/tfutils.html#tensorpack.tfutils.sessinit.get_model_loader),
+a small helper which returns either a `SaverRestore` or a `DictRestore` based on the file name.

-Variable restoring is completely based on __name match__ between
+Variable restoring is completely based on __exact name match__ between
 variables in the current graph and variables in the `session_init` initializer.
 Variables that appear in only one side will be printed as warning.
+Variables with the same name but incompatible shapes will cause an error.

 ## Transfer Learning

@@ -52,7 +59,7 @@ Therefore, transfer learning is trivial.

 If you want to load a pre-trained model, just use the same variable names.
 If you want to re-train some layer, either rename the variables in the
-graph or rename/remove the variables in your loader.
+graph, or rename/remove the variables in your loader.


 ## Resume Training
@@ -76,4 +83,4 @@ on how to load a model.


 The [AutoResumeTrainConfig](../modules/train.html#tensorpack.train.AutoResumeTrainConfig)
-is an alternative of `TrainConfig` which applies some heuristics to
+is an alternative to `TrainConfig` which applies some heuristics to load the latest epoch number and the latest checkpoint.
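
Note on the npz workflow described in the save-load.md changes above: the text says npz files can be read and written with `np.load` and `np.savez`, and that variables can be removed or renamed before loading. A minimal sketch of that idea follows, using NumPy only. The helper name `edit_npz`, the file names, and the variable names (`conv0/W`, `fc/W`, `fc/b`) are hypothetical and chosen for illustration; they are not taken from tensorpack or from any model in the zoo.

```python
import os
import tempfile

import numpy as np


def edit_npz(src_path, dst_path, drop_prefixes=(), rename=None):
    """Load an npz, drop and rename entries, and save the result.

    Hypothetical helper for illustration; not part of tensorpack.
    """
    rename = rename or {}
    # Read every array into a plain dict so it can be edited freely.
    with np.load(src_path) as data:
        params = {name: data[name] for name in data.files}

    kept = {}
    for name, value in params.items():
        # Drop variables whose name starts with any unwanted prefix.
        if any(name.startswith(p) for p in drop_prefixes):
            continue
        # Rename when a mapping is given, otherwise keep the name.
        kept[rename.get(name, name)] = value

    np.savez(dst_path, **kept)
    return kept


# Build a small fake "model" (made-up names and shapes) and edit it.
tmpdir = tempfile.mkdtemp()
src = os.path.join(tmpdir, "model.npz")
dst = os.path.join(tmpdir, "model-edited.npz")
np.savez(src, **{
    "conv0/W": np.zeros((3, 3, 3, 8), dtype=np.float32),
    "fc/W": np.zeros((8, 81), dtype=np.float32),
    "fc/b": np.zeros((81,), dtype=np.float32),
})
edited = edit_npz(
    src, dst,
    drop_prefixes=("fc/",),
    rename={"conv0/W": "backbone/conv0/W"},
)
```

Per the tutorial text, the edited file could then be passed to a session via `DictRestore(dict(np.load(filename)))`. Dropping the `fc/` entries here mirrors the NOTES.md advice to remove weights whose shapes depend on the number of categories.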
--- a/examples/FasterRCNN/NOTES.md
+++ b/examples/FasterRCNN/NOTES.md
@@ -24,6 +24,9 @@ Data:
     rewrite all the methods of
     `DetectionDataset` following its documents.
     You'll implement the logic to load your dataset and evaluate predictions.
+	 + If you want to load a model trained on COCO, you'll see error messages
+		 complaining about unmatched number of categories for certain weights in the checkpoint.
+		 Remove those weights in checkpoint, or rename them in the model.

 2. You can easily add more augmentations such as rotation, but be careful how a box should be
 	 augmented. The code now will always use the minimal axis-aligned bounding box of the 4 corners,