[Asynchronous Methods for Deep Reinforcement Learning](http://arxiv.org/abs/1602.01783),
with <500 lines of code.
Results of the same code trained on 47 different Atari games were uploaded to OpenAI Gym.
Most of them were the best reproducible results on gym.
However, OpenAI has since removed the leaderboard from their site.
### To train on an Atari game:
...
...
The speed is about 6~10 iterations/s on 1 GPU plus 12+ CPU cores.
With 2 TitanX GPUs and 20+ CPU cores, setting `SIMULATOR_PROC=240, PREDICT_BATCH_SIZE=30, PREDICTOR_THREAD_PER_GPU=6` improves throughput to about 16 it/s (2K images/s).
Note that the network architecture is larger than what's used in the original paper.
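
For reference, a hypothetical way to apply these parallelism settings. This sketch assumes they can be passed as environment variables; in some versions of `train-atari.py` they are plain constants near the top of the script, in which case edit them there instead:

```bash
# Hypothetical: assumes the knobs are read from environment variables.
# If they are module-level constants in train-atari.py, edit those instead.
SIMULATOR_PROC=240 PREDICT_BATCH_SIZE=30 PREDICTOR_THREAD_PER_GPU=6 \
    ./train-atari.py --env Breakout-v0
```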
The pretrained models are all trained with 4 GPUs for about 2 days.
But on simple games like Breakout, you can get good performance within several hours.
Also note that multi-GPU doesn't give you an obvious speedup here,
because the bottleneck in this implementation is not computation but simulation.
Some practical notes:
...
...
Download models from [model zoo](http://models.tensorpack.com/OpenAIGym/).
Watch the agent play:
`./train-atari.py --task play --env Breakout-v0 --load Breakout-v0.npz`
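
For example, an end-to-end sketch; the exact file name under the model zoo URL is an assumption, so check the listing there for what is actually available:

```bash
# Hypothetical file name -- check the model zoo listing for the exact one.
wget http://models.tensorpack.com/OpenAIGym/Breakout-v0.npz
./train-atari.py --task play --env Breakout-v0 --load Breakout-v0.npz
```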
Note that Atari game settings in gym (AtariGames-v0) are quite different from those in the DeepMind papers, so the scores are not directly comparable. The most notable differences are: