Adaptive AI Engine for RTS Games

Discussing the theory and practice

Case-Based Planning and Execution for Real-Time Strategy Games

Posted by Ogail on February 5, 2010

  • Abstract:
    • Artificial Intelligence techniques have been successfully applied to several computer games. However in some kinds of computer games, like real-time strategy (RTS) games, traditional artificial intelligence techniques fail to play at a human level because of the vast search spaces that they entail. In this paper we present a real-time case based planning and execution approach designed to deal with RTS games. We propose to extract behavioral knowledge from expert demonstrations in form of individual cases. This knowledge can be reused via a case based behavior generator that proposes behaviors to achieve the specific open goals in the current plan. Specifically, we applied our technique to the WARGUS domain with promising results.
  • As we said before, one of the main goals of our research is to create AI techniques that can be used by game manufacturers to reduce the effort required to develop the AI component of their games.
  • Using the architecture presented in this paper the game developers will be able to specify the AI behavior just by demonstration; i.e. instead of having to code the behavior using a programming language, the behavior can be specified simply by demonstrating it to the system.
  • Case Based Planning work is based on the idea of planning by remembering instead of planning from scratch.
  • Systems Architecture:

  • Behavior Acquisition:
    • During this first stage, an expert plays a game of WARGUS and the trace of that game is stored. Then, the expert annotates the trace explaining the goals he was pursuing with the actions he took while playing. Using those annotations, a set of behaviors are extracted from the trace and stored as a set of cases. Each case is a triple: situation/ goal/behavior, representing that the expert used a particular behavior to achieve a certain goal in a particular situation.
  • Execution:
    • The execution engine consists of two main modules:
      • A real-time plan expansion and execution (RTEE) module.
      • A behavior generation (BG) module.
    • The RTEE module maintains an execution tree of the current active goals and subgoals and which behaviors are being executed to achieve each of the goals.
    • Execution Steps:
      • Each time there is an open goal, the RTEE queries the BG module to generate a behavior to solve it.
      • The BG then retrieves the most appropriate behavior from its case base, and sends it to the RTEE.
      • Finally, when the RTEE is about to start executing a behavior, it is sent back to the BG module for adaptation.
      • Notice that this delayed adaptation is a key feature different from traditional CBR required for real-time domains where the environment continuously changes.
  • They developed Behavior Reasoning Language (BRL) to allow a system to learn behaviors, represent them, and to reason about the behaviors and their intended and actual effects.
  • Behavior is the main basic constructing piece of the system that is divided into:
    • Declarative Part:
      • The declarative part has the purpose of providing information to the system about the intended use of the behavior (WHY).
      • This part consists of:
        • Goal:
          • For every domain, ontology of possible goals has to be defined. For instance, a behavior might have the goal of “having a tower”.
        • Set of Preconditions:
          • Must be satisfied before the behavior can be executed. For instance, a behavior can have as preconditions that a particular peasant exists and that a desired location is empty.
        • Set of Alive Conditions:
          • Conditions that must be satisfied during the execution of the behavior for it to have chances of success. If at some moment during the execution, the alive conditions are not met, the behavior can be stopped, since it will not achieve its intended goal. For instance, the peasant in charge of building a building must remain alive; if he is killed, the building will not be built.
    • Procedural Part:
      • Procedural part contains the executable behavior itself (HOW).
      • Can contain the following constructs: sequence, parallel, action, and subgoals
  • Summarizing, our behavior language is strongly inspired by ABL, but expands it with declarative annotations (expanding the representation of goals and defining alive and success conditions) to allow reasoning

  • Using temporal analysis to construct goals.
  • Open goal is a goal that has not assigned a behavior yet.
  • Behaviors state: pending, executing, succeeded and failed.
  • Open Goals States:
    • Ready: An open goal is ready when all the behaviors that had to be executed before this goal have succeeded
    • Waiting: otherwise.
  • For instance, in Figure 4, “behavior 0” is a sequential behavior and therefore the goal “build army” is ready since the “build base” goal has already succeeded and thus “build army” can be started. However, the goal “attack” is waiting, since “attack” has to be executed after “build army” succeed
  • The plan expansion module is constantly querying the current plan to see if there is any ready open goal. When this happens, the open goal is sent to the BG module to generate a behavior for it. Then, that behavior is inserted in the current plan, and it is marked as pending.
  • The plan execution module has two main functionalities:
    • Check for basic actions that can be sent directly to the game engine.
    • Check the status of plans that are in execution.
  • For each pending behavior, the execution module evaluates the preconditions, and as soon as they are met, the behavior starts its execution
  • If any of the execution behaviors have any basic actions, the execution module sends those actions to WARGUS to be executed.
  • Whenever a basic action succeeds or fails, the execution module updates the status of the behavior that contained it. When a basic action succeeds, the executing behavior can continue to the next step. When a basic action fails, the behavior is marked as failed, and thus its corresponding goal is open again (thus, the system will have to find another plan for that goal).
  • The execution module periodically evaluates the alive conditions and success conditions of each behavior. If the alive conditions of an executing behavior are not satisfied, the behavior is marked as failed, and its goal is open again. If the success conditions of a behavior are satisfied, the behavior is marked as succeeded.
  • Finally, if a behavior is about to be executed and the current game state has changed since the time the BG module generated it, the behavior is handed back to the BG and it will pass again through the adaptation phase to make sure that the plan is adequate for the current game state.
  • The goal of the BG module is to generate behaviors for specific goals in specific scenarios. Therefore, the input to the BG module is a particular scenario (i.e. the current game state in WARGUS) and a particular goal that has to be achieved (e.g. “Destroy The Enemy’s Cannon Tower”). To achieve that task, the BG system uses two separate processes: case retrieval and case adaptation.
  • Case base we that have several cases that contain different behaviors to solve each one of these problems under different circumstances. Is called heterogeneous case base.
  • Therefore we represent cases as triples: c = <S, G, B>, where S is a particular game state, G is a goal, and B is a behavior; representing that c.B is a good behavior to apply when we want to pursue goal c.G in a game state similar to c.S.

  • In particular, we have used a game state definition composed of 35 features that try to represent each aspect of the WARGUS game:
    • Twelve of them represent the number of troops (number of fighters, number of peasants, and so on)
    • Four of them represent the resources that the player disposes of (gold, oil, wood and food)
    • Fourteen represent the description of the buildings (number of town halls, number of barracks, and so on)
    • Five features represent the map (size in both dimensions, percentage of water, percentage of trees and number of gold mines).
  • The case retrieval process uses a standard nearest neighbor algorithm but with a similarity metric that takes into account both the goal and the game state. Specifically, we use the following similarity metric:
  • where dGS is a simple Euclidean distance between the game states of the two cases (where all the attributes are normalized between 0 and 1), dG is the distance metric between goals, and is a factor that controls the importance of the game state in the retrieval process (in our experiments we used = 0.5)
  • To measure distance between two goals g = name1(p1, …, pn) and g2 = name2(q1, …, qm) we use the following distance:
    • Where:
      • Pi: is the maximum value that the parameter i of a goal might take (we assume that all the parameters have positive values).
  • The adaptation process consists of a series of rules that are applied to each one of the basic operators of a behavior so that it can be applied in the current game state. Specifically, we have used two adaptation rules in our system:
    • Unit Adaptation:
      • Each basic action sends a particular command to a given unit. For instance the first action in the behavior shown in Figure 5 commands the unit “2” to build a “pig-farm”. However, when that case is retrieved and applied to a different map, that particular unit “2” might not correspond to a peon (the unit that can build farms) or might not even exist (the “2” is just an identifier). Thus, the unit adaptation rule finds the most similar unit to the one used in the case for this particular basic action. To perform that search, each unit is characterized by a set of 5 features: owner, type, position (x,y), hit-points, and status (that can be idle, moving, attacking, etc.) and then the most similar unit (according to an Euclidean distance using those 5 features) in the current map to the one specified in the basic action is used.
    • Coordinate Adaptation:
      • Some basic actions make reference to some particular coordinates in the map (such as the move or build commands). To adapt the coordinates, the BG module gets (from the case) how the map in the particular coordinates looks like by retrieving the content of the map in a 5×5 window surrounding the specified coordinates. Then, it looks in the current map for a spot in the map that is the most similar to that 5×5 window, and uses those coordinates.
  • Conclusion:
    • In this paper we have presented a case based planning framework for real-time strategy games. The main features of our approach are
    • The capability to deal with the vast decision spaces required by RTS games
    • Being able to deal with real-time problems by interleaving planning and execution in real-time
    • Solving the knowledge acquisition problem by automatically extracting behavioral knowledge from annotated expert demonstrations in form of cases.
    • We have evaluated our approach by applying it to the real-time strategy WARGUS with promising results.
    • The main contributions of this framework are:
      • A case based integrated real-time execution and planning framework.
      • The introduction of a behavior representation language that includes declarative knowledge as well as procedural knowledge to allow both reasoning and execution.
      • The idea of automatic extraction of behaviors from expert traces as a way to automatically extract domain knowledge from an expert.
      • The idea of heterogeneous case bases where cases that contain solutions for several different problems (characterized as goals in our framework) coexist.
    • The introduction of delayed adaptation to deal with dynamic environments (where adaptation has to be delayed as much as possible to adapt the behaviors with the most up to date information).
  • Future Work:
    • As future lines of research we plan to experiment with adding a case retention module in our system that retains automatically all the adapted behaviors that had successful results while playing.
    • And also annotating all the cases in the case base with their rate of success and failure allowing the system to learn from experience.
    • Additionally, we would like to systematically explore the transfer learning capabilities of our approach by evaluating how the knowledge learnt (both from expert traces and by experience) in a set of maps can be applied to a different set of maps.
    • We also plan to further explore the effect of adding more expert traces to the system and evaluate if the system is able to properly extract knowledge from each of them to deal with new scenarios.

Further, we would like to improve our current planning engine so that, in addition to sequential and parallel plans, it can also handle conditional plans. Specifically, one of the main challenges of this approach will be to detect and properly extract conditional behaviors from expert demonstrations.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

 
%d bloggers like this: