Updates to manual (particularly for action chart); next exploration step is non-fullstate

Commit 63f3a046 authored Aug 07, 2017 by drallensmith
5 changed files
--- a/doc/manual.pdf
+++ b/doc/manual.pdf
--- a/doc/manual.tex
+++ b/doc/manual.tex
@@ -435,7 +435,7 @@ Landmark features encode the relative angle and proximity of the agent
 to a landmark of interest. Each landmark feature consists of three
 floating point values, two to encode the agent's relative angle to the
 landmark and one to encode the landmark's proximity. Note that if the
-agent's self position is invalid, then the landmark feature values are
+agent's self position is invalid, the landmark feature values are
 zeroed.

 \subsubsection{Player Features}
@@ -451,7 +451,7 @@ In the low-level feature space, unknown uniform numbers, or \textit{unums},
 are encoded as -1, while known ones are encoded as $\frac{unum}{100}$, thus
 remaining well within the $[-1, 1]$ range. (Note that roundoff error may need
 to be allowed for when converting these back to integers, such as for use in
-high-level actions; e.g., 0.0799 will need to be converted back to 8.)
+high-level actions; \textit{e.g.}, 0.0799 will need to be converted back to 8.)
 Uniform number features, a later addition to the low-level feature space,
 are positioned after all other features to hopefully ensure compatibility
 with older programs.
@@ -490,8 +490,8 @@ low-level features:
  \itemrange{2}{\textbf{Penalty Box Center} [Landmark] Center of the penalty box line.}
  \itemrange{2}{\textbf{Penalty Box Top} [Landmark] Top corner of the penalty box.}
  \itemrange{2}{\textbf{Penalty Box Bot} [Landmark] Bottom corner of the penalty box.}
-  \itemrange{2}{\textbf{Center Field} [Landmark] The left middle point of the
-    HFO play area.}
+  \itemrange{2}{\textbf{Center Field} [Landmark] The left middle point of the RoboCup field
+    (note that this is \textbf{not} the center of the HFO play area).}
  \itemrange{2}{\textbf{Corner Top Left} [Landmark] Top left corner HFO Playfield.}
  \itemrange{2}{\textbf{Corner Top Right} [Landmark] Top right corner HFO Playfield.}
  \itemrange{2}{\textbf{Corner Bot Right} [Landmark] Bottom right corner HFO Playfield.}
@@ -614,25 +614,25 @@ below the table for the action abbreviations and notes.
 Action                  & Da & Tu & Ta & K & KT & MT & DT & I  & M  & S  & P  & Dr & C  & RG & DG & G & MP \\
 \hline \hline
 Self position invalid   & Y  & Y  & Y  & Y & N  & N  & N? & N? & N  & N? & N? & Y  & ?  & N  & N  & N & N \\
-Self velocity invalid   & N  & Y? & Y? & Y? & ?  & N  & N  & N  & N  & ?  & ?  & Y  & Y  & N  & N  & N  & N \\
-Ball position invalid   & Y  & Y  & Y? & N  & N  & Y  & N  & N  & N  & N  & N  & Y? & N  & N  & N  & N  & N \\
+Self velocity invalid   & N  & Y? & Y? & Y & ?  & N  & N  & N  & N  & ?  & ?  & Y  & Y  & N  & N  & N & N \\
+Ball position invalid   & Y  & Y  & Y? & N & N  & Y  & N  & N  & N  & N  & N  & Y  & N  & N  & N  & N & N \\
 Ball velocity invalid   & Y  & Y  & Y  & ? & ?  & Y  & Y  & N? & ?  & N  & N? & Y? & Y? & Y  & Y  & Y & Y \\
 Teammate loc invalid    & Y  & Y  & Y  & Y & Y  & Y  & Y  & Y  & N  & Y  & N  & Y? & Y  & Y  & Y  & Y & Y \\
 Team. unum invalid      & Y  & Y  & Y  & Y & Y  & Y  & Y  & Y  & Y  & Y  & N  & Y  & Y  & Y  & Y  & Y & Y \\
 Opponent loc invalid    & Y  & Y  & Y  & Y & Y  & Y  & Y? & Y  & N  & Y? & Y  & Y? & Y  & Y  & Y  & Y & N \\
 Opp. unum invalid       & Y  & Y  & Y  & Y & Y  & Y  & Y  & Y  & Y  & Y  & Y  & Y  & Y  & Y  & Y  & Y & N \\
 \hline
-Ball kickable           & Y  & Y  & Y  & Y  & Y  & Y  & Y  & N  & *  & Y  & Y  & Y  & Y  & ?  & ?  & N  & Y \\
+Ball kickable           & Y  & Y  & Y  & Y & Y  & N  & Y  & N  & *  & Y  & Y  & Y  & Y  & ?  & ?  & N & Y \\
 Ball not kickable       & Y  & Y  & Y  & N & N  & Y  & Y  & Y  & Y  & N  & N  & N  & Y  & Y  & Y  & Y & Y \\
 \hline
 Frozen                  & N  & N  & N  & N & N  & N  & N? & N  & N? & N  & N  & Y  & N? & N  & N  & N & N \\
-Colliding w/ball        & Y  & Y  & ?  & N  & Y  & Y  & Y  & Y  & ?  & ?  & ?  & ?  & ?  & ?  & ?  & N  & ? \\
-Colliding w/player      & Y  & Y? & ?  & N? & N? & Y? & ?  & Y? & ?  & ?  & ?  & Y? & ?  & ?  & ?  & Y? & ? \\
-Colliding w/post        & Y  & Y  & N? & N? & N? & Y  & Y  & Y  & Y  & ?  & ?  & Y  & ?  & Y  & Y  & Y  & Y \\
+Colliding w/ball        & Y  & Y  & ?  & N & Y  & N  & Y  & Y  & Y  & ?  & Y  & Y  & ?  & ?  & ?  & N & ? \\
+Colliding w/player      & Y  & Y  & ?  & Y & Y  & Y  & Y  & Y  & Y  & ?  & N  & Y  & ?  & ?  & ?  & Y & ? \\
+Colliding w/post        & Y  & Y  & N? & Y & Y  & Y  & N  & Y  & Y  & ?  & N  & Y  & ?  & Y  & Y  & Y & Y \\
 \hline
 Offense                 & Y  & Y  & N  & Y & Y  & Y  & Y  & Y  & Y  & Y  & Y  & Y  & N  & N  & N  & Y & N \\
-Defense, not goalie     & Y  & Y  & Y  & N? & N? & Y  & N  & Y  & Y  & N  & N  & N  & N  & Y  & ?  & Y  & Y \\
-Goalie (defense)        & Y  & Y  & Y? & N? & N? & Y  & N  & Y  & ?  & N  & N  & N  & Y  & ?  & ?  & ?  & ? \\
+Defense, not goalie     & Y  & Y  & Y  & N & N  & Y  & N  & Y  & Y  & N  & N  & N  & N  & Y  & ?  & Y & Y \\
+Goalie (defense)        & Y  & Y  & Y  & N & N  & Y  & N  & Y  & ?  & N  & N  & N  & Y  & ?  & ?  & ? & N \\
 \end{tabular}
 }
 \end{center}
@@ -647,8 +647,8 @@ Goalie (defense)        & Y  & Y  & Y? & N? & N? & Y  & N  & Y  & ?  & N  & N  &

 New agents may be developed in C++ or Python. In Python, as long as
 the hfo interface has been installed, the agent only needs to
-\verb+import hfo+ (or \verb+from hfo import *+). In C++ it is necessary to
-\verb+#include <HFO.hpp>+ and also to link against the shared object
+\verb+import hfo+ (or \verb+from hfo import *+). In C++, it is necessary to
+\verb+#include <HFO.hpp>+ and also link against the shared object
 library \verb+lib/libhfo.so+ when compiling:

 \begin{verbatim}
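
Note on the unum hunk above: the manual text says known uniform numbers are stored as unum/100 and unknown ones as -1, and that roundoff has to be allowed for when converting back. A minimal Python sketch of that conversion follows. `decode_unum` is a hypothetical helper name (it is not part of the hfo interface), and treating every negative value as "unknown" is an assumption made for illustration.

```python
def decode_unum(feature_value):
    """Convert a low-level unum feature back to an integer uniform number.

    Hypothetical helper, not part of hfo. Returns None for the "unknown"
    encoding (assumed here to be any negative value, e.g. -1).
    """
    if feature_value < 0:
        return None
    # Round instead of truncating: 0.0799 * 100 is roughly 7.99,
    # and int() alone would yield 7 rather than 8.
    return int(round(feature_value * 100))


if __name__ == "__main__":
    print(decode_unum(0.0799))
    print(decode_unum(-1.0))
```

Rounding, rather than truncating with `int()`, is what lets a value such as 0.0799 map back to 8 as the manual describes.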

--- a/example/explore_offense_actions_fullstate.py
+++ b/example/explore_offense_actions_fullstate.py
@@ -624,15 +624,6 @@ def evaluate_previous_action(hfo_env,
    return
  bit_list, self_dict, goal_dict, ball_dict = filter_low_level_state(state)
  action_status = hfo_env.getLastActionStatus(namespace.action)
-  if action_status == hfo.ACTION_STATUS_UNKNOWN:
-    print(
-      "Last action {0!s} (prestate bit_list {1!s}, current bit list {2!s}) gave status {3:n}".format(
-        hfo_env.actionToString(namespace.action),
-        "".join(map(str,map(int,namespace.prestate_bit_list))),
-        "".join(map(str,map(int,bit_list))), action_status),
-      file=sys.stderr)
-    sys.stderr.flush()
-    return

  action_string = hfo_env.actionToString(namespace.action)
  if namespace.action_params:
@@ -740,9 +731,9 @@ def evaluate_previous_action(hfo_env,
    if namespace.prestate_bit_list[4]:
      if not bit_list[4]:
        action_status_observed = hfo.ACTION_STATUS_BAD
-      elif namespace.prestate_bit_list[3] and (not bit_list[3]): # goal collision
+      elif (namespace.intent != INTENT_GOAL_COLLISION) and namespace.prestate_bit_list[3] and (not bit_list[3]): # goal collision
        action_status_observed = hfo.ACTION_STATUS_MAYBE
-      elif namespace.prestate_bit_list[2] and (not bit_list[2]): # ball collision
+      elif (namespace.intent != INTENT_BALL_COLLISION) and namespace.prestate_bit_list[2] and (not bit_list[2]): # ball collision
        action_status_observed = hfo.ACTION_STATUS_MAYBE
      elif (namespace.prestate_self_dict['x_pos'] is not None) and (self_dict['x_pos'] is not None):
        dist_before = get_dist_real(namespace.prestate_self_dict['x_pos'],
@@ -759,9 +750,9 @@ def evaluate_previous_action(hfo_env,
          action_status_observed = hfo.ACTION_STATUS_BAD
    elif bit_list[4]:
      action_status_observed = hfo.ACTION_STATUS_MAYBE
-    elif namespace.prestate_bit_list[3] and (not bit_list[3]): # goal collision
+    elif (namespace.intent != INTENT_GOAL_COLLISION) and namespace.prestate_bit_list[3] and (not bit_list[3]): # goal collision
      action_status_observed = hfo.ACTION_STATUS_MAYBE
-    elif namespace.prestate_bit_list[2] and (not bit_list[2]): # ball collision
+    elif (namespace.intent != INTENT_BALL_COLLISION) and namespace.prestate_bit_list[2] and (not bit_list[2]): # ball collision
      action_status_observed = hfo.ACTION_STATUS_MAYBE
    elif namespace.prestate_ball_dict['dist'] > ball_dict['dist']:
      action_status_observed = hfo.ACTION_STATUS_MAYBE
@@ -799,7 +790,7 @@ def evaluate_previous_action(hfo_env,
    action_status_guessed = action_status
  else:
    action_status_guessed = action_status_observed
-    if action_status != action_status_observed:
+    if (action_status != action_status_observed) and (action_status != hfo.ACTION_STATUS_UNKNOWN):
      print(
        "{0!s}: Difference between feedback ({1!s}), observed ({2!s}) action_status (prestate bit_list {3!s}, current bit list {4!s})".format(
          action_string,
@@ -810,7 +801,7 @@ def evaluate_previous_action(hfo_env,
        file=sys.stderr)
      sys.stderr.flush()

-  if (action_status_guessed != hfo.ACTION_STATUS_MAYBE) and (not namespace.checking_intent):
+  if (action_status_guessed == hfo.ACTION_STATUS_BAD) and (not namespace.checking_intent):
    # unexpected lack of success
    print(
      "Unexpected lack of success for last action {0!s} (prestate bit_list {1!s}, current bit list {2!s})".format(
@@ -1006,18 +997,30 @@ def do_intent(hfo_env,
            goal_dict['x_pos'],goal_dict['y_pos']),
          file=sys.stderr)
        sys.stderr.flush()
+        if not bit_list[4]:
          hfo_env.act(*save_action_prestate(action=hfo.DASH,
                                            action_params=[80, goal_rel_angle],
                                            **prestate_dict))
+        else:
+          hfo_env.act(*save_action_prestate(action=hfo.DRIBBLE_TO,
+                                            action_params=[(self_dict['x_pos']+
+                                                            (2*(goal_dict['x_pos']-
+                                                                self_dict['x_pos']))),
+                                                           (self_dict['y_pos']+
+                                                            (2*(goal_dict['y_pos']-
+                                                                self_dict['y_pos'])))],
+                                            **prestate_dict))

      return
    
+    poss_actions_list.append(hfo.DRIBBLE_TO)
+    if not bit_list[4]:
      poss_actions_list.append(hfo.MOVE_TO)

    if goal_rel_angle is not None:
      if abs(goal_rel_angle) > 0.5:
        poss_actions_list.append(hfo.TURN)
-      if abs(goal_rel_angle) < 1.0:
+      if (abs(goal_rel_angle) < 1.0) and ((not bit_list[4]) or (goal_dict['dist'] < 0.36)):
        poss_actions_list.append(hfo.DASH)
            
    action = determine_which_action(poss_actions_list, namespace, bit_list)
@@ -1030,6 +1033,15 @@ def do_intent(hfo_env,
      hfo_env.act(*save_action_prestate(action=hfo.MOVE_TO,
                                        action_params=[goal_dict['x_pos'], goal_dict['y_pos']],
                                        **prestate_dict))
+    elif (action == hfo.DRIBBLE_TO):
+      if bit_list[4] and (get_dist_real(self_dict['x_pos'],self_dict['y_pos'],
+                                        goal_dict['x_pos'],goal_dict['y_pos']) <= 0.36):
+        x_pos_desired = get_x_normalized(get_x_unnormalized(goal_dict['x_pos'])+0.35)
+      else:
+        x_pos_desired = get_x_normalized(get_x_unnormalized(goal_dict['x_pos'])-0.35)
+      hfo_env.act(*save_action_prestate(action=hfo.DRIBBLE_TO,
+                                        action_params=[x_pos_desired, goal_dict['y_pos']],
+                                        **prestate_dict))
    elif (action == hfo.DASH):
      hfo_env.act(*save_action_prestate(action=hfo.DASH,
                                        action_params=[80, goal_rel_angle],
@@ -1104,7 +1116,10 @@ def do_next_action(hfo_env,

  # figure out what to do next

-  if not (bit_list[2] or bit_list[3]):
+  if not (bit_list[3]):
+    if bit_list[0]:
+      poss_intent_set = set([INTENT_GOAL_COLLISION])
+    else:
      poss_intent_set = set([INTENT_BALL_KICKABLE,INTENT_BALL_COLLISION,INTENT_GOAL_COLLISION])
    if not bit_list[5]:
      poss_intent_set.remove(INTENT_BALL_KICKABLE)
@@ -1131,10 +1146,10 @@ def do_next_action(hfo_env,
 ##    actions_want_check = set([hfo.INTERCEPT, hfo.GO_TO_BALL])
 ##    if bit_list[2] or bit_list[3]:
 ##      actions_want_check |= set([hfo.KICK, hfo.KICK_TO])
-  elif bit_list[2]: # colliding with ball
-    actions_want_check = set([hfo.GO_TO_BALL])
-    if bit_list[3]: # colliding with goal
-      actions_want_check |= set([hfo.KICK, hfo.KICK_TO, hfo.DRIBBLE_TO])
+##  elif bit_list[2]: # colliding with ball
+##    actions_want_check = set([hfo.GO_TO_BALL])
+##    if bit_list[3]: # colliding with goal
+##      actions_want_check |= set([hfo.KICK, hfo.KICK_TO, hfo.DRIBBLE_TO])
  else: # colliding with goal
    actions_want_check = set([hfo.KICK, hfo.KICK_TO, hfo.DRIBBLE_TO])
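
Note on the collision guards added in evaluate_previous_action: each changed elif now treats a lost collision flag as a MAYBE result only when that collision was not the current intent. The sketch below isolates that condition so it can be checked on its own. The function names and the string values of the intent constants are hypothetical; the bit indices (3 for goal collision, 2 for ball collision) follow the comments in the diff.

```python
# Placeholder values; the real constants are defined elsewhere in the script.
INTENT_BALL_COLLISION = "ball_collision"
INTENT_GOAL_COLLISION = "goal_collision"


def lost_flag(prestate_bits, current_bits, index):
    """True if the flag at `index` was set before the action and is clear now."""
    return bool(prestate_bits[index]) and not bool(current_bits[index])


def lost_collision_unexpectedly(intent, collision_intent,
                                prestate_bits, current_bits, index):
    """Same shape as the guarded elif conditions in the diff.

    A lost collision flag only counts when the agent was not
    deliberately working on that kind of collision.
    """
    return (intent != collision_intent) and lost_flag(
        prestate_bits, current_bits, index)


if __name__ == "__main__":
    before = [0, 0, 0, 1, 0, 0]  # goal-collision bit (index 3) set
    after = [0, 0, 0, 0, 0, 0]   # goal-collision bit cleared
    print(lost_collision_unexpectedly(INTENT_BALL_COLLISION,
                                      INTENT_GOAL_COLLISION,
                                      before, after, 3))
```

Factoring the test out this way is only an illustration; the commit itself keeps the condition inline in each elif.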


--- a/example/explore_offense_actions_fullstate.twoplayer.py
+++ b/example/explore_offense_actions_fullstate.twoplayer.py
--- a/example/explore_offense_actions_fullstate.twoplayer.v2.py
+++ b/example/explore_offense_actions_fullstate.twoplayer.v2.py
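
Note on the DRIBBLE_TO branch added to do_intent in explore_offense_actions_fullstate.py: the target passed as action_params is self + 2*(goal - self) in each coordinate, i.e. a point on the line from the agent through the goal_dict position, at twice the distance. A small sketch of that arithmetic, with `overshoot_target` and its `factor` parameter as hypothetical names introduced here:

```python
def overshoot_target(self_x, self_y, goal_x, goal_y, factor=2.0):
    """Return the point on the ray from (self_x, self_y) through
    (goal_x, goal_y), at `factor` times the original distance.

    With factor=2.0 this matches the expression used in the diff:
    self + 2 * (goal - self).
    """
    return (self_x + factor * (goal_x - self_x),
            self_y + factor * (goal_y - self_y))


if __name__ == "__main__":
    # Agent at the origin, target at (0.25, -0.5).
    print(overshoot_target(0.0, 0.0, 0.25, -0.5))
```

The result can fall outside the normalized [-1, 1] range when the agent is far from the target; the diff passes the value through without clamping, and this sketch does the same.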