Commit d0e410ad authored by Yuxin Wu

update docs about launch_train_with_config (#1087)

parent 02c38f26
...@@ -54,7 +54,7 @@ These trainers will take care of step 1 (define the graph), with the following a
4. A function which returns an optimizer.
These are documented in [SingleCostTrainer.setup_graph](../modules/train.html#tensorpack.train.SingleCostTrainer.setup_graph).
In practice you won't use this method directly, but use the [high-level interface](../tutorial/training-interface.html#with-modeldesc-and-trainconfig) instead.
### Write a Trainer
...
...@@ -58,6 +58,26 @@ and
[launch_train_with_config](../modules/train.html#tensorpack.train.launch_train_with_config)
for detailed functionalities.
The function `launch_train_with_config(config, trainer)`
uses the raw trainer interface and is almost equivalent to the following two lines of code:
```python
trainer.setup_graph(
    my_model.get_inputs_desc(), my_input_source,  # or QueueInput(my_dataflow)
    my_model.build_graph, my_model.get_optimizer)
trainer.train_with_defaults(
    callbacks=config.callbacks,
    monitors=config.monitors,
    session_creator=config.session_creator,
    session_init=config.session_init,
    steps_per_epoch=config.steps_per_epoch,
    starting_epoch=config.starting_epoch,
    max_epoch=config.max_epoch,
    extra_callbacks=config.extra_callbacks)
```
If you need more control (e.g., you want to construct the callbacks after
setting up the graph), you can write these two calls yourself instead.
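For reference, a minimal usage sketch of the high-level interface (assuming `MyModel` is a `ModelDesc` subclass and `my_dataflow` is a `DataFlow`; both names are placeholders, not part of the library):

```python
from tensorpack import QueueInput, SimpleTrainer, TrainConfig, launch_train_with_config

config = TrainConfig(
    model=MyModel(),               # placeholder: your ModelDesc subclass
    data=QueueInput(my_dataflow),  # placeholder dataflow; `dataflow=` also works
    callbacks=[],                  # e.g. ModelSaver(), scheduled hyperparameters
    steps_per_epoch=1000,
    max_epoch=100,
)
launch_train_with_config(config, SimpleTrainer())
```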
### Keras Interface
Some wrappers were made on top of tensorpack trainers to create a Keras-like
...
...@@ -3,30 +3,28 @@
Multi-GPU version of the A3C algorithm in
[Asynchronous Methods for Deep Reinforcement Learning](http://arxiv.org/abs/1602.01783).
Results of the code trained on 47 different Atari games were uploaded to OpenAI Gym and are available for download.
Most of them were the best reproducible results on gym.
However, OpenAI later removed the leaderboard from their site.
### To train on an Atari game:
`./train-atari.py --env Breakout-v0 --gpu 0`
In each iteration it trains on a batch of 128 new states.
The speed is about 20 iterations/s (2.5k images/s) on 1 V100 GPU plus 12+ CPU cores.
Note that the network architecture is larger than what's used in the original paper.
The pretrained models are all trained with 4 GPUs for about 2 days.
But on simple games like Breakout, you can get decent performance within several hours.
For example, it takes only __2 hours__ on a V100 to reach 400 average score on Breakout.
Some practical notes:
1. Prefer Python 3; Windows not supported.
2. Training at a significantly slower speed (e.g. on CPU) will result in a very bad score, probably because of the slightly off-policy implementation.
3. Occasionally, processes may not get terminated completely.
   If you're using Linux, install [python-prctl](https://pypi.org/project/python-prctl/) to prevent this (see the sketch below).
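As an illustration only (this is not tensorpack's own code; `simulator_main` is a hypothetical name), a minimal sketch of what python-prctl enables on Linux — a child process asking the kernel to kill it when its parent exits:

```python
import multiprocessing as mp
import signal

import prctl  # from the python-prctl package (Linux only)


def simulator_main():
    # Ask the kernel to send SIGKILL to this process when its parent dies,
    # so simulator processes cannot be left orphaned if the trainer crashes.
    prctl.set_pdeathsig(signal.SIGKILL)
    # ... the simulation loop would run here ...


if __name__ == '__main__':
    mp.Process(target=simulator_main).start()
```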
### To test a model:
...@@ -66,4 +64,4 @@ The most notable differences are:
+ An episode is limited to 60000 steps.
+ Loss of a life does not end an episode.
Also see the [DQN implementation in tensorpack](../DeepQNetwork)
...@@ -4,13 +4,14 @@
# Author: Yuxin Wu
import argparse
import cv2
import gym
import multiprocessing as mp
import numpy as np
import os
import six
import sys
import uuid
import tensorflow as tf
from six.moves import queue
...@@ -40,9 +41,9 @@ LOCAL_TIME_MAX = 5
STEPS_PER_EPOCH = 6000
EVAL_EPISODE = 50
BATCH_SIZE = 128
PREDICT_BATCH_SIZE = 16  # batch for efficient forward
SIMULATOR_PROC = mp.cpu_count() * 2
PREDICTOR_THREAD_PER_GPU = 4
PREDICTOR_THREAD = None
NUM_ACTIONS = None
...
...@@ -56,6 +56,10 @@ def launch_train_with_config(config, trainer):
2. Call `trainer.setup_graph` with the input as well as `config.model`.
3. Call `trainer.train` with the rest of the attributes of config.
See the tutorial at
<https://tensorpack.readthedocs.io/tutorial/training-interface.html#with-modeldesc-and-trainconfig>
to learn more.
Args:
    config (TrainConfig):
    trainer (Trainer): an instance of :class:`SingleCostTrainer`.
...