Multi-GPU version of the A3C algorithm in
[Asynchronous Methods for Deep Reinforcement Learning](http://arxiv.org/abs/1602.01783),
with <500 lines of code.
Results of the same code trained on 47 different Atari games were uploaded to OpenAI Gym.
You can see them in [my gym page](https://gym.openai.com/users/ppwwyyxx).
...
...
The speed is about 6~10 iterations/s on 1 GPU plus 12+ CPU cores.
With 2 TitanX GPUs and 20+ CPU cores, by setting `SIMULATOR_PROC=240, PREDICT_BATCH_SIZE=30, PREDICTOR_THREAD_PER_GPU=6`, the speed can be improved to 16 it/s (2K images/s).
Note that the network architecture is larger than what's used in the original paper.
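For orientation, here is a minimal sketch of what these knobs control, assuming they are plain constants in the training script (the names are taken from the setting above; the comments describe assumed meanings, not documented behavior):

```python
# Throughput knobs for the async pipeline (assumed meanings, for illustration only).
SIMULATOR_PROC = 240           # simulator (environment) processes generating experience
PREDICT_BATCH_SIZE = 30        # states batched into a single inference forward pass
PREDICTOR_THREAD_PER_GPU = 6   # inference threads serving each GPU
```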
The uploaded models are all trained with 4 GPUs for about 2 days.
But on simple games like Breakout, you can get good performance within several hours.
Also note that multi-GPU doesn't give you an obvious speedup here,
because the bottleneck in this implementation is not computation but data.
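To make the data-bound argument concrete, here is a minimal, hypothetical sketch (not the repository's actual code) of the batched-prediction pattern that knobs like `PREDICT_BATCH_SIZE` and `PREDICTOR_THREAD_PER_GPU` suggest: many simulator processes request actions, and each predictor thread turns the pending requests into one batched forward pass.

```python
import queue

PREDICT_BATCH_SIZE = 30  # same knob as above; the value here is illustrative only

def predictor_loop(request_queue, predict_fn):
    """One predictor thread: gather up to PREDICT_BATCH_SIZE pending requests,
    run a single batched forward pass, then send each action back."""
    while True:
        batch = [request_queue.get()]                     # block until one request arrives
        while len(batch) < PREDICT_BATCH_SIZE:
            try:
                batch.append(request_queue.get_nowait())  # drain whatever else is queued
            except queue.Empty:
                break                                     # don't wait; keep simulators moving
        states, reply_fns = zip(*batch)                   # each request is (state, reply_callback)
        for reply, action in zip(reply_fns, predict_fn(list(states))):
            reply(action)                                 # hand the action back to its simulator
```

In a setup like this, each GPU runs several predictor threads and stays busy only as long as the simulators keep the queue full, which is consistent with extra GPUs helping less than extra simulator processes.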
...
...
Some practical notes:
1. Prefer Python 3.
2. Occasionally, processes may not get terminated completely. It is suggested to use `systemd-run` to run any
multiprocess Python program to get a cgroup dedicated for the task.
3. Training at a significantly slower speed (e.g. on CPU) will result in a very bad score, probably because of the slightly off-policy implementation.