multiple RL stats

Commit f6acf786 authored Jul 22, 2016 by Yuxin Wu

Showing with 11 additions and 8 deletions

README.md +5 -5

tensorpack/RL/envbase.py +6 -3

--- a/README.md
+++ b/README.md
@@ -5,14 +5,14 @@ Still in development, but usable.
 See some interesting [examples](https://github.com/ppwwyyxx/tensorpack/tree/master/examples) to learn about the framework:
-+ [DoReFa-Net: low bitwidth CNN](https://github.com/ppwwyyxx/tensorpack/tree/master/examples/DoReFa-Net)
+ [DoReFa-Net: training binary / low bitwidth CNN](examples/DoReFa-Net)
-+ [Double-DQN for playing Atari games](https://github.com/ppwwyyxx/tensorpack/tree/master/examples/Atari2600)
+ [Double-DQN for playing Atari games](examples/Atari2600)
-+ [ResNet for Cifar10 classification](https://github.com/ppwwyyxx/tensorpack/tree/master/examples/ResNet)
+ [ResNet for Cifar10 classification](examples/ResNet)
-+ [char-rnn language model](https://github.com/ppwwyyxx/tensorpack/tree/master/examples/char-rnn)
+ [char-rnn language model](examples/char-rnn)
 ## Features:
-Focused on modularity. Just have to define the three components in training:
+Focused on modularity. Just have to define the three components to start a training:
 1. The model, or the graph. Define the graph as well as its inputs and outputs. `models/` has some scoped abstraction of common models.

--- a/tensorpack/RL/envbase.py
+++ b/tensorpack/RL/envbase.py
@@ -51,16 +51,19 @@ class RLEnvironment(object):
    def play_one_episode(self, func, stat='score'):
        """ play one episode for eval.
            :param func: call with the state and return an action
-            :returns: the score of this episode
+            :param stat: a key or list of keys in stats
+            :returns: the stat(s) after running this episode
        """
+        if not isinstance(stat, list):
+            stat = [stat]
        while True:
            s = self.current_state()
            act = func(s)
            r, isOver = self.action(act)
            if isOver:
-                s = self.stats[stat]
+                s = [self.stats[k] for k in stat]
                self.reset_stat()
-                return s
+                return s if len(s) > 1 else s[0]
 class ActionSpace(object):
    def __init__(self):
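
Note on the `play_one_episode` change: the return type now depends on how `stat` is passed. The sketch below illustrates this with a toy environment. `ToyEnv`, its `episode_len` parameter, the `'steps'` key, and the integer-valued `stats` layout are all hypothetical and only stand in for a real `RLEnvironment` subclass; the body of `play_one_episode` is copied from the patched hunk.

```python
from collections import defaultdict


class ToyEnv(object):
    """Hypothetical stand-in for an RLEnvironment subclass (not part of tensorpack)."""

    def __init__(self, episode_len=3):
        self.episode_len = episode_len
        self.step = 0
        self.reset_stat()

    def reset_stat(self):
        # Assumption: stats maps a key to a number accumulated over one episode.
        self.stats = defaultdict(int)

    def current_state(self):
        return self.step

    def action(self, act):
        # Every step gives reward 1; the episode ends after episode_len steps.
        self.step += 1
        self.stats['score'] += 1
        self.stats['steps'] += 1
        is_over = self.step >= self.episode_len
        if is_over:
            self.step = 0
        return 1, is_over

    def play_one_episode(self, func, stat='score'):
        # Same logic as the patched method in this commit.
        if not isinstance(stat, list):
            stat = [stat]
        while True:
            s = self.current_state()
            act = func(s)
            r, isOver = self.action(act)
            if isOver:
                s = [self.stats[k] for k in stat]
                self.reset_stat()
                return s if len(s) > 1 else s[0]


env = ToyEnv()
single = env.play_one_episode(lambda s: 0)                      # one key: scalar
multi = env.play_one_episode(lambda s: 0, ['score', 'steps'])   # two keys: list
one_key_list = env.play_one_episode(lambda s: 0, ['score'])     # one-element list: scalar
```

Two things follow from the patched logic. A one-element list such as `['score']` still returns a bare value rather than a list, so callers that build the key list dynamically may want to normalize the result. Only `list` instances are treated as multiple keys; a tuple is wrapped and looked up as a single key.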