Adaptive AI Engine for RTS Games

Discussing the theory and practice

Archive for February, 2010

On-Line Case-Based Plan Adaptation for Real-Time Strategy Games – 2008

Posted by Ogail on February 7, 2010

  • Abstract:
    • Traditional artificial intelligence techniques do not perform well in applications such as real-time strategy games because of the extensive search spaces which need to be explored. In addition, this exploration must be carried out on-line during performance time; it cannot be precomputed. We have developed on-line case-based planning techniques that are effective in such domains. In this paper, we extend our earlier work using ideas from traditional planning to inform the real-time adaptation of plans. In our framework, when a plan is retrieved, a plan dependency graph is inferred to capture the relations between actions in the plan. The plan is then adapted in real-time using its plan dependency graph. This allows the system to create and adapt plans in an efficient and effective manner while performing the task. The approach is evaluated using WARGUS, a well-known real-time strategy game.


  • A key problem in CBP is plan adaptation;
    • Plans cannot be replayed exactly as they were stored, especially in games of the complexity of modern RTS games.
    • More specifically, CBP techniques for RTS games need adaptation techniques that are suitable for dynamic and unpredictable domains.
  • Plan adaptation techniques can be classified in two categories:
    • Those adaptation techniques based on domain specific rules (domain specific, but fast)
    • Those based on domain independent search-based techniques (domain independent, but slow)
  • However, most previous work on plan adaptation focus on one-shot adaptation where a single plan or set of plans are retrieved and adapted before execution. In this paper we are interested on developing domain independent and search-free plan adaptation techniques that can be interleaved with execution and that are suitable for RTS games.
  • Our plan adaptation technique is based on two ideas:
    • Removing useless operations from a plan can be done by analyzing a dependency graph without performing search.
    • The insertion of new operations in the plan can be delegated to the case-base planning cycle itself, thus also getting rid of the search.
  • A goal is a symbolic non-operational description of the goal, while success conditions are actual predicates that can be checked in the game state to evaluate whether a goal was satisfied or not.
  • Postconditions are conditions to be expected after a behavior executes while success conditions are the conditions that when satisfied we can consider the behavior to have completed its execution
  • A partial plan tree in our framework is represented as a tree consisting of two types of nodes: goals and behaviors (notice the similarity with HTN planning (Nau et al. 2005)).
  • While execution, basic actions that are ready and with all its preconditions satisfied are sent to WARGUS to be executed. If the preconditions are not satisfied, the behavior is sent back to the adaptation module to see if the plan can be repaired. If it cannot, then the behavior is marked as failed.
  • The plan adaptation module is divided in two submodules:
    • The parameter adaptation module.
      • The first one is in charge of adapting the parameters of the basic actions, i.e. the coordinates and specific units
    • The structural plan adaptation module.
  • Plans are composed of four basic types of elements:
    • Actions:
      • Which are the basic actions that can be executed
    • Parallel plans:
      • Which consist of component plans which can be executed in parallel
    • Sequential plans:
      • Which consist of component plans which need to be executed in sequence
    • Sub-goal plans:
      • Requiring further expansion.
  • We can further deduce dependencies between different plans using their preconditions and success conditions.
  • We specifically consider only plans which are completely expanded and do not contain a subgoal which further needs to be expanded.
  • We generate a plan dependency graph using the preconditions and success conditions of the actions. This plan dependency graph informs the plan adaptation process.
  • Plan Dependency Graph Generator:
    • The plan dependency graph generation, begins by taking the success conditions of the root of the plan and finding out which of the component actions in the plan contribute to the achievement of these success conditions.
      • These actions are called direct sub-plans for the subgoal.
    • All direct sub-plans are added to the plan dependency graph.
    • Then, the plan dependency graph generator analyzes the preconditions of each of these direct sub-plans.
    • Let B1 be an action in the plan which contributes to satisfying the preconditions of a direct sub-plan D1. Then, it adds B1 to the plan dependency graph and forms a directed edge from B1 to D1.
      • This directed edge can be considered as a dependency between B1 and D1.
      • This is done for each of the direct sub-plans.
    • Further this is repeated for each of the actions which are added to the graph.
    • The generation stops when there’s more nodes are added to it.
    • This process results in the formation of a plan dependency graph with directed edges between actions representing that one action contributes to the achievement of the preconditions of another action.
  • However, a challenge in our work is that simple comparison of preconditions of a plan P1 with success conditions of another plan P2 is not sufficient to determine whether P2 contributes to achievement of preconditions of P1. This is because there isn’t necessarily a direct correspondence between preconditions and success.
  • An example is the case where P1 has a precondition testing the existence of a single unit of a particular type. However, P2 may have a success condition testing the existence of a given number n of units of the same type. An edge should clearly be formed from P2 to P1, however a direct comparison of the conditions will not yield this relation.
  • For that purpose, the plan dependency graph generation component needs a precondition-success condition matcher (ps-matcher). In our system, we have developed a rule-based ps-matcher that incorporates a collection of rules for the appropriate condition matching.
  • If a dependency was not detected by our system, a necessary action in the plan might get deleted. However, if our system adds extra dependencies that do not exist the only thing that can happen is that the system ends up executing some extra actions that would not be required. Clearly, executing extra actions is better than missing some needed actions.
  • The plan adaptation following the creation of the plan dependency graph has two sub-processes:
    • Elimination of unnecessary actions.
    • Insertion of extra actions.
  • The first one is performed as soon as the plan is retrieved, and the second one is performed on-line as the plan executes.
  • Removal of unnecessary actions:
    • Given a plan for a goal, the plan dependency graph is generated and the success conditions of each direct plan are evaluated for the game state at that point of execution.
      • This gives a list L of direct plans whose all success conditions are satisfied and hence do not need to be executed.
    • Now, consider L as the list of actions that can be removed from the plan corresponding to the plan dependency graph.
    • All actions B which are present only on paths terminating on an action D such that D L can be removed and hence can be added to L.
    • This is repeated until L becomes stable and no new plan is added to it.
    • All actions in L are removed from the plan dependency graph.
    • The resulting plan dependency graph has only those actions whose success conditions are not satisfied in the given game state and which have a path to a direct plan, also with success conditions not satisfied in the given game state.
  • Adaptation for unsatisfied preconditions:
    • If the execution of an action fails because one or more of its preconditions are not satisfied, the system needs to create a game state where the given preconditions are satisfied so that the execution of the plan can proceed.
    • To enable this, the adaptation module inserts satisfying goals in the plan (one goal per unsatisfied precondition).
    • The satisfying goals are such that when plans to achieve the goals is retrieved and executed, the success of the plans implies that the preconditions of the failed action are satisfied.
    • Example: When an action P1 fails to proceed because a precondition C1 is not satisfied, a satisfying goal G1 for C1 is formed. P1 is replaced by a sequential plan containing a subgoal plan with goal G1 followed by P1.
  • Notice that the plan adaptation module performs two basic operations:
    • Delete unnecessary actions (which is performed by an analysis of the plan dependency graph)
    • Insert additional actions needed to satisfy unsatisfied preconditions.
      • This last process is performed as collaboration between several modules:
        • The plan execution module: identifies actions that cannot be executed
        • Then, the adaptation component:
          • Identifies the failed preconditions
          • And generates goals for them
        • Finally, the plan expansion and plan retrieval modules expand the inserted goals.
  • Future Work:
    • Our techniques still have several limitations. Currently, our plan adaptation techniques require a plan to be fully instantiated in order to be adapted, thus we cannot adapt plans that are still half expanded.
    • As a consequence, the high level structure of the plan cannot be adapted unless it’s fully instantiated.
      • This could be addressed by reasoning about interactions between higher level goals, by estimating which are the preconditions and success conditions of such goals by analyzing the stored plans in the case-base to achieve those goals.
    • Another line of further research is to incorporate ideas from MPA in order to be able to merge several plans into a single plan.
  • This can be very useful and can increase vastly the flexibility of the approach since sometimes no single plan in the case base can achieve a goal, but a combination will.



Posted in Papers Summaries | Tagged: , , , , | Leave a Comment »

Thank You Kariem Mohamed Ali !

Posted by merothehero on February 7, 2010

Kariem Mohamed Ali – our friend in our very same college- managed to design our Project’s Poster and Brochure yesterday to be used in the ICT Conference today.He proved to be a true professional in his work. This was observed through his great art work and his speed. His Patience to our many demands made us even much more grateful.

Our Designer Friend Kariem Mohamed Ali

Our Designer Friend Kariem Mohamed Ali

Oh ! I Forgot to say that we have chosen the name “I-Strategizer” as the “cool” or “commercial” name of our project instead of the current scientific name.

Some of Kariem’s Work in less than 2 days :

Thank You Kariem … We Owe you a lot !

The I-Strategizer Team

Posted in Thanks | Tagged: , | 4 Comments »

Case-Based Planning and Execution for Real-Time Strategy Games

Posted by Ogail on February 5, 2010

  • Abstract:
    • Artificial Intelligence techniques have been successfully applied to several computer games. However in some kinds of computer games, like real-time strategy (RTS) games, traditional artificial intelligence techniques fail to play at a human level because of the vast search spaces that they entail. In this paper we present a real-time case based planning and execution approach designed to deal with RTS games. We propose to extract behavioral knowledge from expert demonstrations in form of individual cases. This knowledge can be reused via a case based behavior generator that proposes behaviors to achieve the specific open goals in the current plan. Specifically, we applied our technique to the WARGUS domain with promising results.
  • As we said before, one of the main goals of our research is to create AI techniques that can be used by game manufacturers to reduce the effort required to develop the AI component of their games.
  • Using the architecture presented in this paper the game developers will be able to specify the AI behavior just by demonstration; i.e. instead of having to code the behavior using a programming language, the behavior can be specified simply by demonstrating it to the system.
  • Case Based Planning work is based on the idea of planning by remembering instead of planning from scratch.
  • Systems Architecture:

  • Behavior Acquisition:
    • During this first stage, an expert plays a game of WARGUS and the trace of that game is stored. Then, the expert annotates the trace explaining the goals he was pursuing with the actions he took while playing. Using those annotations, a set of behaviors are extracted from the trace and stored as a set of cases. Each case is a triple: situation/ goal/behavior, representing that the expert used a particular behavior to achieve a certain goal in a particular situation.
  • Execution:
    • The execution engine consists of two main modules:
      • A real-time plan expansion and execution (RTEE) module.
      • A behavior generation (BG) module.
    • The RTEE module maintains an execution tree of the current active goals and subgoals and which behaviors are being executed to achieve each of the goals.
    • Execution Steps:
      • Each time there is an open goal, the RTEE queries the BG module to generate a behavior to solve it.
      • The BG then retrieves the most appropriate behavior from its case base, and sends it to the RTEE.
      • Finally, when the RTEE is about to start executing a behavior, it is sent back to the BG module for adaptation.
      • Notice that this delayed adaptation is a key feature different from traditional CBR required for real-time domains where the environment continuously changes.
  • They developed Behavior Reasoning Language (BRL) to allow a system to learn behaviors, represent them, and to reason about the behaviors and their intended and actual effects.
  • Behavior is the main basic constructing piece of the system that is divided into:
    • Declarative Part:
      • The declarative part has the purpose of providing information to the system about the intended use of the behavior (WHY).
      • This part consists of:
        • Goal:
          • For every domain, ontology of possible goals has to be defined. For instance, a behavior might have the goal of “having a tower”.
        • Set of Preconditions:
          • Must be satisfied before the behavior can be executed. For instance, a behavior can have as preconditions that a particular peasant exists and that a desired location is empty.
        • Set of Alive Conditions:
          • Conditions that must be satisfied during the execution of the behavior for it to have chances of success. If at some moment during the execution, the alive conditions are not met, the behavior can be stopped, since it will not achieve its intended goal. For instance, the peasant in charge of building a building must remain alive; if he is killed, the building will not be built.
    • Procedural Part:
      • Procedural part contains the executable behavior itself (HOW).
      • Can contain the following constructs: sequence, parallel, action, and subgoals
  • Summarizing, our behavior language is strongly inspired by ABL, but expands it with declarative annotations (expanding the representation of goals and defining alive and success conditions) to allow reasoning

  • Using temporal analysis to construct goals.
  • Open goal is a goal that has not assigned a behavior yet.
  • Behaviors state: pending, executing, succeeded and failed.
  • Open Goals States:
    • Ready: An open goal is ready when all the behaviors that had to be executed before this goal have succeeded
    • Waiting: otherwise.
  • For instance, in Figure 4, “behavior 0” is a sequential behavior and therefore the goal “build army” is ready since the “build base” goal has already succeeded and thus “build army” can be started. However, the goal “attack” is waiting, since “attack” has to be executed after “build army” succeed
  • The plan expansion module is constantly querying the current plan to see if there is any ready open goal. When this happens, the open goal is sent to the BG module to generate a behavior for it. Then, that behavior is inserted in the current plan, and it is marked as pending.
  • The plan execution module has two main functionalities:
    • Check for basic actions that can be sent directly to the game engine.
    • Check the status of plans that are in execution.
  • For each pending behavior, the execution module evaluates the preconditions, and as soon as they are met, the behavior starts its execution
  • If any of the execution behaviors have any basic actions, the execution module sends those actions to WARGUS to be executed.
  • Whenever a basic action succeeds or fails, the execution module updates the status of the behavior that contained it. When a basic action succeeds, the executing behavior can continue to the next step. When a basic action fails, the behavior is marked as failed, and thus its corresponding goal is open again (thus, the system will have to find another plan for that goal).
  • The execution module periodically evaluates the alive conditions and success conditions of each behavior. If the alive conditions of an executing behavior are not satisfied, the behavior is marked as failed, and its goal is open again. If the success conditions of a behavior are satisfied, the behavior is marked as succeeded.
  • Finally, if a behavior is about to be executed and the current game state has changed since the time the BG module generated it, the behavior is handed back to the BG and it will pass again through the adaptation phase to make sure that the plan is adequate for the current game state.
  • The goal of the BG module is to generate behaviors for specific goals in specific scenarios. Therefore, the input to the BG module is a particular scenario (i.e. the current game state in WARGUS) and a particular goal that has to be achieved (e.g. “Destroy The Enemy’s Cannon Tower”). To achieve that task, the BG system uses two separate processes: case retrieval and case adaptation.
  • Case base we that have several cases that contain different behaviors to solve each one of these problems under different circumstances. Is called heterogeneous case base.
  • Therefore we represent cases as triples: c = <S, G, B>, where S is a particular game state, G is a goal, and B is a behavior; representing that c.B is a good behavior to apply when we want to pursue goal c.G in a game state similar to c.S.

  • In particular, we have used a game state definition composed of 35 features that try to represent each aspect of the WARGUS game:
    • Twelve of them represent the number of troops (number of fighters, number of peasants, and so on)
    • Four of them represent the resources that the player disposes of (gold, oil, wood and food)
    • Fourteen represent the description of the buildings (number of town halls, number of barracks, and so on)
    • Five features represent the map (size in both dimensions, percentage of water, percentage of trees and number of gold mines).
  • The case retrieval process uses a standard nearest neighbor algorithm but with a similarity metric that takes into account both the goal and the game state. Specifically, we use the following similarity metric:
  • where dGS is a simple Euclidean distance between the game states of the two cases (where all the attributes are normalized between 0 and 1), dG is the distance metric between goals, and is a factor that controls the importance of the game state in the retrieval process (in our experiments we used = 0.5)
  • To measure distance between two goals g = name1(p1, …, pn) and g2 = name2(q1, …, qm) we use the following distance:
    • Where:
      • Pi: is the maximum value that the parameter i of a goal might take (we assume that all the parameters have positive values).
  • The adaptation process consists of a series of rules that are applied to each one of the basic operators of a behavior so that it can be applied in the current game state. Specifically, we have used two adaptation rules in our system:
    • Unit Adaptation:
      • Each basic action sends a particular command to a given unit. For instance the first action in the behavior shown in Figure 5 commands the unit “2” to build a “pig-farm”. However, when that case is retrieved and applied to a different map, that particular unit “2” might not correspond to a peon (the unit that can build farms) or might not even exist (the “2” is just an identifier). Thus, the unit adaptation rule finds the most similar unit to the one used in the case for this particular basic action. To perform that search, each unit is characterized by a set of 5 features: owner, type, position (x,y), hit-points, and status (that can be idle, moving, attacking, etc.) and then the most similar unit (according to an Euclidean distance using those 5 features) in the current map to the one specified in the basic action is used.
    • Coordinate Adaptation:
      • Some basic actions make reference to some particular coordinates in the map (such as the move or build commands). To adapt the coordinates, the BG module gets (from the case) how the map in the particular coordinates looks like by retrieving the content of the map in a 5×5 window surrounding the specified coordinates. Then, it looks in the current map for a spot in the map that is the most similar to that 5×5 window, and uses those coordinates.
  • Conclusion:
    • In this paper we have presented a case based planning framework for real-time strategy games. The main features of our approach are
    • The capability to deal with the vast decision spaces required by RTS games
    • Being able to deal with real-time problems by interleaving planning and execution in real-time
    • Solving the knowledge acquisition problem by automatically extracting behavioral knowledge from annotated expert demonstrations in form of cases.
    • We have evaluated our approach by applying it to the real-time strategy WARGUS with promising results.
    • The main contributions of this framework are:
      • A case based integrated real-time execution and planning framework.
      • The introduction of a behavior representation language that includes declarative knowledge as well as procedural knowledge to allow both reasoning and execution.
      • The idea of automatic extraction of behaviors from expert traces as a way to automatically extract domain knowledge from an expert.
      • The idea of heterogeneous case bases where cases that contain solutions for several different problems (characterized as goals in our framework) coexist.
    • The introduction of delayed adaptation to deal with dynamic environments (where adaptation has to be delayed as much as possible to adapt the behaviors with the most up to date information).
  • Future Work:
    • As future lines of research we plan to experiment with adding a case retention module in our system that retains automatically all the adapted behaviors that had successful results while playing.
    • And also annotating all the cases in the case base with their rate of success and failure allowing the system to learn from experience.
    • Additionally, we would like to systematically explore the transfer learning capabilities of our approach by evaluating how the knowledge learnt (both from expert traces and by experience) in a set of maps can be applied to a different set of maps.
    • We also plan to further explore the effect of adding more expert traces to the system and evaluate if the system is able to properly extract knowledge from each of them to deal with new scenarios.

Further, we would like to improve our current planning engine so that, in addition to sequential and parallel plans, it can also handle conditional plans. Specifically, one of the main challenges of this approach will be to detect and properly extract conditional behaviors from expert demonstrations.

Posted in Case-Based Planning | Leave a Comment »

Hierarchical Task Network (HTN)

Posted by Ogail on February 5, 2010

In artificial intelligence, the hierarchical task network, or HTN, is an approach to automated planning in which the dependency among actions can be given in the form of networks.

Planning problems are specified in the hierarchical task network approach by providing a set of tasks, which can be:

  1. primitive tasks, which roughly correspond to the actions of STRIPS;
  2. compound tasks, which can be seen as composed of a set of simpler tasks;
  3. goal tasks, which roughly corresponds to the goals of STRIPS, but are more general.

A primitive task is an action that can be executed. A compound task is a complex task composed of a sequence of actions. A goal task is a task of satisfying a condition. The difference between primitive and other tasks is that the primitive actions can be directly executed. Compound and goal tasks both require a sequence of primitive actions to be performed; however, goal tasks are specified in terms of conditions that have to be made true, while compound tasks can only be specified in terms of other tasks via the task network outlined below.

Constraints among tasks are expressed in form of networks, called task networks. A task network is a set of tasks and constraints among them. Such a network can be used as the precondition for another compound or goal task to be feasible. This way, one can express that a given task is feasible only if a set of other actions (those mentioned in the network) are done, and they are done in such a way that the constraints among them (specified by the network) are satisfied. One particular formalism for representing hierarchical task networks that has been fairly widely used is TAEMS.

A task network can for example specify that a condition is necessary for a primitive action to be executed. When this network is used as the precondition for a compound or goal task, it means that the compound or goal task requires the primitive action to be executed and that the condition must be true for its execution to successfully achieve the compound or goal task.

The best-known domain-independent HTN-planning software is:

HTN is a useful way to provide the planning engine with information about the hierarchical structure of the planning domain. HTN-like planning (as it is practically used) has the same expressivity (i.e. can solve the same domains) as STRIPS. The theoretical model of HTN is strictly more expressive than STRIPS, but cannot be directly used because of its undecidability.

Example of HTN:


Posted in Planning | Tagged: , , | 2 Comments »