Commit d0e410ad authored by Yuxin Wu

update docs about launch_train_with_config (#1087)

parent 02c38f26
...@@ -54,7 +54,7 @@ These trainers will take care of step 1 (define the graph), with the following a
4. A function which returns an optimizer.
These are documented in [SingleCostTrainer.setup_graph](../modules/train.html#tensorpack.train.SingleCostTrainer.setup_graph).
In practice you won't use this method directly, but use the [high-level interface](../tutorial/training-interface.html#with-modeldesc-and-trainconfig) instead.
### Write a Trainer
...
...@@ -58,6 +58,26 @@ and
[launch_train_with_config](../modules/train.html#tensorpack.train.launch_train_with_config)
for detailed functionalities.
The function `launch_train_with_config(config, trainer)`
uses the raw trainer interface and is almost equivalent to the following two lines of code:
```python
trainer.setup_graph(
    my_model.get_inputs_desc(), my_input_source,  # or QueueInput(my_dataflow)
    my_model.build_graph, my_model.get_optimizer)
trainer.train_with_defaults(
    callbacks=config.callbacks,
    monitors=config.monitors,
    session_creator=config.session_creator,
    session_init=config.session_init,
    steps_per_epoch=config.steps_per_epoch,
    starting_epoch=config.starting_epoch,
    max_epoch=config.max_epoch,
    extra_callbacks=config.extra_callbacks)
```
If you need more control (e.g., you want to construct the callbacks after
setting up the graph), you can write these two calls yourself instead.
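For reference, a minimal usage sketch of the high-level interface (assuming `MyModel` is a `ModelDesc` subclass and `my_dataflow` is a `DataFlow`; both names are placeholders, not part of the library):

```python
from tensorpack import QueueInput, SimpleTrainer, TrainConfig, launch_train_with_config

config = TrainConfig(
    model=MyModel(),               # placeholder: your ModelDesc subclass
    data=QueueInput(my_dataflow),  # placeholder dataflow; `dataflow=` also works
    callbacks=[],                  # e.g. ModelSaver(), scheduled hyperparameters
    steps_per_epoch=1000,
    max_epoch=100,
)
launch_train_with_config(config, SimpleTrainer())
```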
### Keras Interface
Some wrappers were made on top of tensorpack trainers to create a Keras-like
...
...@@ -3,30 +3,28 @@
Multi-GPU version of the A3C algorithm in
[Asynchronous Methods for Deep Reinforcement Learning](http://arxiv.org/abs/1602.01783).
Results of the code trained on 47 different Atari games were uploaded to OpenAI Gym and are available for download.
Most of them were the best reproducible results on gym.
However, OpenAI later removed the leaderboard from their site.
### To train on an Atari game:
`./train-atari.py --env Breakout-v0 --gpu 0`
In each iteration it trains on a batch of 128 new states.
The speed is about 20 iterations/s (2.5k images/s) on 1 V100 GPU plus 12+ CPU cores.
Note that the network architecture is larger than what's used in the original paper.
The pretrained models are all trained with 4 GPUs for about 2 days.
But on simple games like Breakout, you can get decent performance within several hours.
For example, it takes only __2 hours__ on a V100 to reach 400 average score on Breakout.
Some practical notes:
1. Prefer Python 3; Windows not supported.
2. Training at a significantly slower speed (e.g. on CPU) will result in a very bad score, probably because of the slightly off-policy implementation.
3. Occasionally, processes may not get terminated completely.
   If you're using Linux, install [python-prctl](https://pypi.org/project/python-prctl/) to prevent this (see the sketch below).
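As an illustration only (this is not tensorpack's own code; `simulator_main` is a hypothetical name), a minimal sketch of what python-prctl enables on Linux — a child process asking the kernel to kill it when its parent exits:

```python
import multiprocessing as mp
import signal

import prctl  # from the python-prctl package (Linux only)


def simulator_main():
    # Ask the kernel to send SIGKILL to this process when its parent dies,
    # so simulator processes cannot be left orphaned if the trainer crashes.
    prctl.set_pdeathsig(signal.SIGKILL)
    # ... the simulation loop would run here ...


if __name__ == '__main__':
    mp.Process(target=simulator_main).start()
```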
### To test a model:
...@@ -66,4 +64,4 @@ The most notable differences are:
+ An episode is limited to 60000 steps.
+ Loss of a life does not end an episode.
Also see the [DQN implementation in tensorpack](../DeepQNetwork)
...@@ -4,13 +4,14 @@
# Author: Yuxin Wu
import argparse
import cv2
import gym
import multiprocessing as mp
import numpy as np
import os
import six
import sys
import uuid
import tensorflow as tf
from six.moves import queue
...@@ -40,9 +41,9 @@ LOCAL_TIME_MAX = 5
STEPS_PER_EPOCH = 6000
EVAL_EPISODE = 50
BATCH_SIZE = 128
PREDICT_BATCH_SIZE = 16  # batch for efficient forward
SIMULATOR_PROC = mp.cpu_count() * 2
PREDICTOR_THREAD_PER_GPU = 4
PREDICTOR_THREAD = None
NUM_ACTIONS = None
...
...@@ -56,6 +56,10 @@ def launch_train_with_config(config, trainer):
2. Call `trainer.setup_graph` with the input as well as `config.model`.
3. Call `trainer.train` with the rest of the attributes of config.
See the tutorial at
<https://tensorpack.readthedocs.io/tutorial/training-interface.html#with-modeldesc-and-trainconfig>
to learn more.
Args:
    config (TrainConfig):
    trainer (Trainer): an instance of :class:`SingleCostTrainer`.
...