Adaptive AI Engine for RTS Games

Discussing the theory and practice

Posts Tagged ‘CBP’

Case-Based Planner Architecture

Posted by MHesham on December 4, 2010

Kristian Hammond , . 1994. Case-Based Planning: A Framework for Planning from Experience. In Cognitive Science Journal.

After reading the first 20 page of the paper, I felt that an architecture for the planning framework is needed. The below diagram represents the architecture for a basic Case-Based Planner. The architecture is inferred from the paper author description.

CBP Architecture Colored

Thanks to Abdelrahman for the architecture review.

Advertisements

Posted in Case-Based Planning | Tagged: , , , , , | Leave a Comment »

On-Line Case-Based Plan Adaptation for Real-Time Strategy Games – 2008

Posted by Ogail on February 7, 2010

  • Abstract:
    • Traditional artificial intelligence techniques do not perform well in applications such as real-time strategy games because of the extensive search spaces which need to be explored. In addition, this exploration must be carried out on-line during performance time; it cannot be precomputed. We have developed on-line case-based planning techniques that are effective in such domains. In this paper, we extend our earlier work using ideas from traditional planning to inform the real-time adaptation of plans. In our framework, when a plan is retrieved, a plan dependency graph is inferred to capture the relations between actions in the plan. The plan is then adapted in real-time using its plan dependency graph. This allows the system to create and adapt plans in an efficient and effective manner while performing the task. The approach is evaluated using WARGUS, a well-known real-time strategy game.

image

  • A key problem in CBP is plan adaptation;
    • Plans cannot be replayed exactly as they were stored, especially in games of the complexity of modern RTS games.
    • More specifically, CBP techniques for RTS games need adaptation techniques that are suitable for dynamic and unpredictable domains.
  • Plan adaptation techniques can be classified in two categories:
    • Those adaptation techniques based on domain specific rules (domain specific, but fast)
    • Those based on domain independent search-based techniques (domain independent, but slow)
  • However, most previous work on plan adaptation focus on one-shot adaptation where a single plan or set of plans are retrieved and adapted before execution. In this paper we are interested on developing domain independent and search-free plan adaptation techniques that can be interleaved with execution and that are suitable for RTS games.
  • Our plan adaptation technique is based on two ideas:
    • Removing useless operations from a plan can be done by analyzing a dependency graph without performing search.
    • The insertion of new operations in the plan can be delegated to the case-base planning cycle itself, thus also getting rid of the search.
  • A goal is a symbolic non-operational description of the goal, while success conditions are actual predicates that can be checked in the game state to evaluate whether a goal was satisfied or not.
  • Postconditions are conditions to be expected after a behavior executes while success conditions are the conditions that when satisfied we can consider the behavior to have completed its execution
  • A partial plan tree in our framework is represented as a tree consisting of two types of nodes: goals and behaviors (notice the similarity with HTN planning (Nau et al. 2005)).
  • While execution, basic actions that are ready and with all its preconditions satisfied are sent to WARGUS to be executed. If the preconditions are not satisfied, the behavior is sent back to the adaptation module to see if the plan can be repaired. If it cannot, then the behavior is marked as failed.
  • The plan adaptation module is divided in two submodules:
    • The parameter adaptation module.
      • The first one is in charge of adapting the parameters of the basic actions, i.e. the coordinates and specific units
    • The structural plan adaptation module.
  • Plans are composed of four basic types of elements:
    • Actions:
      • Which are the basic actions that can be executed
    • Parallel plans:
      • Which consist of component plans which can be executed in parallel
    • Sequential plans:
      • Which consist of component plans which need to be executed in sequence
    • Sub-goal plans:
      • Requiring further expansion.
  • We can further deduce dependencies between different plans using their preconditions and success conditions.
  • We specifically consider only plans which are completely expanded and do not contain a subgoal which further needs to be expanded.
  • We generate a plan dependency graph using the preconditions and success conditions of the actions. This plan dependency graph informs the plan adaptation process.
  • Plan Dependency Graph Generator:
    • The plan dependency graph generation, begins by taking the success conditions of the root of the plan and finding out which of the component actions in the plan contribute to the achievement of these success conditions.
      • These actions are called direct sub-plans for the subgoal.
    • All direct sub-plans are added to the plan dependency graph.
    • Then, the plan dependency graph generator analyzes the preconditions of each of these direct sub-plans.
    • Let B1 be an action in the plan which contributes to satisfying the preconditions of a direct sub-plan D1. Then, it adds B1 to the plan dependency graph and forms a directed edge from B1 to D1.
      • This directed edge can be considered as a dependency between B1 and D1.
      • This is done for each of the direct sub-plans.
    • Further this is repeated for each of the actions which are added to the graph.
    • The generation stops when there’s more nodes are added to it.
    • This process results in the formation of a plan dependency graph with directed edges between actions representing that one action contributes to the achievement of the preconditions of another action.
  • However, a challenge in our work is that simple comparison of preconditions of a plan P1 with success conditions of another plan P2 is not sufficient to determine whether P2 contributes to achievement of preconditions of P1. This is because there isn’t necessarily a direct correspondence between preconditions and success.
  • An example is the case where P1 has a precondition testing the existence of a single unit of a particular type. However, P2 may have a success condition testing the existence of a given number n of units of the same type. An edge should clearly be formed from P2 to P1, however a direct comparison of the conditions will not yield this relation.
  • For that purpose, the plan dependency graph generation component needs a precondition-success condition matcher (ps-matcher). In our system, we have developed a rule-based ps-matcher that incorporates a collection of rules for the appropriate condition matching.
  • If a dependency was not detected by our system, a necessary action in the plan might get deleted. However, if our system adds extra dependencies that do not exist the only thing that can happen is that the system ends up executing some extra actions that would not be required. Clearly, executing extra actions is better than missing some needed actions.
  • The plan adaptation following the creation of the plan dependency graph has two sub-processes:
    • Elimination of unnecessary actions.
    • Insertion of extra actions.
  • The first one is performed as soon as the plan is retrieved, and the second one is performed on-line as the plan executes.
  • Removal of unnecessary actions:
    • Given a plan for a goal, the plan dependency graph is generated and the success conditions of each direct plan are evaluated for the game state at that point of execution.
      • This gives a list L of direct plans whose all success conditions are satisfied and hence do not need to be executed.
    • Now, consider L as the list of actions that can be removed from the plan corresponding to the plan dependency graph.
    • All actions B which are present only on paths terminating on an action D such that D L can be removed and hence can be added to L.
    • This is repeated until L becomes stable and no new plan is added to it.
    • All actions in L are removed from the plan dependency graph.
    • The resulting plan dependency graph has only those actions whose success conditions are not satisfied in the given game state and which have a path to a direct plan, also with success conditions not satisfied in the given game state.
  • Adaptation for unsatisfied preconditions:
    • If the execution of an action fails because one or more of its preconditions are not satisfied, the system needs to create a game state where the given preconditions are satisfied so that the execution of the plan can proceed.
    • To enable this, the adaptation module inserts satisfying goals in the plan (one goal per unsatisfied precondition).
    • The satisfying goals are such that when plans to achieve the goals is retrieved and executed, the success of the plans implies that the preconditions of the failed action are satisfied.
    • Example: When an action P1 fails to proceed because a precondition C1 is not satisfied, a satisfying goal G1 for C1 is formed. P1 is replaced by a sequential plan containing a subgoal plan with goal G1 followed by P1.
  • Notice that the plan adaptation module performs two basic operations:
    • Delete unnecessary actions (which is performed by an analysis of the plan dependency graph)
    • Insert additional actions needed to satisfy unsatisfied preconditions.
      • This last process is performed as collaboration between several modules:
        • The plan execution module: identifies actions that cannot be executed
        • Then, the adaptation component:
          • Identifies the failed preconditions
          • And generates goals for them
        • Finally, the plan expansion and plan retrieval modules expand the inserted goals.
  • Future Work:
    • Our techniques still have several limitations. Currently, our plan adaptation techniques require a plan to be fully instantiated in order to be adapted, thus we cannot adapt plans that are still half expanded.
    • As a consequence, the high level structure of the plan cannot be adapted unless it’s fully instantiated.
      • This could be addressed by reasoning about interactions between higher level goals, by estimating which are the preconditions and success conditions of such goals by analyzing the stored plans in the case-base to achieve those goals.
    • Another line of further research is to incorporate ideas from MPA in order to be able to merge several plans into a single plan.
  • This can be very useful and can increase vastly the flexibility of the approach since sometimes no single plan in the case base can achieve a goal, but a combination will.

image

Posted in Papers Summaries | Tagged: , , , , | Leave a Comment »

Example of situation assessment for case retrieval

Posted by Ahmad Atta on December 3, 2009

Posted in Case-Based Planning, Presentations | Tagged: , , | Leave a Comment »

Situation Assessment for Plan Retrieval

Posted by Ahmad Atta on December 2, 2009

Traditionally, in Case-Based Reasoning the process of case retrieval is done by selecting the case from the case-base that has the closest similarity to the world state of the problem. This similarity is measured over the various features that are computed from the representation of the world state.

Omar asked “How will you determine the importance of a feature?”

This paper : Situation Assessment for Case Retrieval by Kinshuk Mishra, Santiago Ontan╦ťon, and Ashwin Ram answers this question as follows:

The importance of features depends on their relevance in representing the high-level world state, this is high level game state is defined here as a Situation. For example, in the WARGUS domain the player might be in an attacking situation, or in a base building situation among others. Situations can be predicted based on the raw features that are directly computed from the world state i.e. shallow features.

THUS, Situation assessment is a process of gathering high-level information using low-level data to help in better decision making.

The general situation assessment algorithm is described in Figure 3. It comprises of four main steps:

1- Shallow feature selection

a. Given situation annotated trace T is provided to a feature selection algorithm. This algorithm returns the set of shallow features which have high information gain. Specifically, in Darmok, we have used best-first greedy hill-climbing algorithm for filtering the high information gain shallow features.

2- Model Generation.

a. The Situation-Classification Model Mcf

This model is useful for classification of a world state to a situation using shallow features. In Darmok, we have used a standard algorithm inducing a decision tree classifier model.

b. The Situation-Case Model Mc

Provides a mapping from the set of situations S to a subset of cases in the case-base C. It can be built using statistical or empirical analysis.

c. The Situation-Deep feature Model Mdf

Provides a mapping from S to deep features in the set Fd. This mapping is done using a feature selection algorithm or by using empirical knowledge.

3- Model Execution.

In this third step, the models generated in the previous step are executed to get the current situation S, the subset of cases C’ from the case-base C and the subset of deep features Fd’ which are most relevant to S. S is obtained by running Mcf over Fs’. Once S is known, using Mc and Mdf , C ‘ and Fd’ are obtained respectively.

4- Case Retrieval.

This is the last step where using Fd’ and FS the most similar case in retrieved from C’ using normal retrieval techniques.

clip_image002

Posted in Case-Based Planning | Tagged: , , , , , | Leave a Comment »

Case-Based Planning – A Framework for planning from experience (Part I)

Posted by Ogail on October 31, 2009

  • Abstract:
    • Presenting planning as a task supported by dynamic memory
    • Case-Based Planning: integrating models of memory, learning and planning into one system
    • Successful plans are stored in memory, indexed by the goals they satisfy and the problems the avoid
    • Also, failure is stored. Planner is able to anticipate an avoid future plans failures
    • CBP aims to improve planning in three areas:
      • Failure avoidance
      • Plan repair
      • Plan reuse
  • Case Based Planning (CBP) is idea of planning as remembering
  • CBP differs from rule-based planning in that it rests on the notion that new plans should be based on the planner’s knowledge of what has succeeded and failed in the past
  • Plan for a set of goals in not build up piece by piece from the individual plans for each goal. Instead constructed by modifying a plan from memory that already satisfies or particularly satisfies many if not all goals
  • CBP is concerned with previous planning errors in order to avoid them
  • CBP storing mechanism must have the ability to avoid past failure and reuse succeeded plans
  • CBP Differs from other planning techniques in:
    • Initial plan building
    • Reaction to plan failure
    • Vocabulary for describing and storing plans

Building a Plan

  • CBP must start building a plan for a set of goals by considering how they will interact
  • The CB approach to finding an initial plan is to anticipate problems so the planner can find plans that avoid them
  • Previous planning mechanism was create and debug paradigm
  • The main difference between them is that create and debug paradigm creates a plan and debugs a failure after it arises

Debugging Failed Plans

  • CBPer should be able to recognize failed plans to mark them as failed
  • A planning fails when a plan doesn’t satisfy some goal that it was designed to deal with
  • An exception failure is different from planning failure. It occurs when an unexpected event occurs. The response for exception failure differs from plan failure where in PF the planner alter the plan and in EF the planner should alters its understanding of the world
  • A planner must respond to plan failure by building casual explanations of why the failure has occurred and then use these explanations to access replanning strategies designed for the situation in general
  • A planner must respond to exception failure by using the explanations to add new inference rules that will allow it to anticipate the problem that it previously was unable to foresee. The planner should ask first “What went wrong with the plan?” and then ask “What went wrong with planning?”
  • The planner has to repair its expectations bout the world when those expectations lead to plans that fail
  • A CBPer responds to planning failures by repairing both the faulty plan and its knowledge base that allowed it to build the plan incorrectly

Storing Plans for Later Reuse

  • The basic vocabluray of plan indexing is nessicary the vocalary of planner’s domain and of the goals domain
  • Plans must be stored by descriptions of the negative gaol interactions they avoid (umbrella example)

Learning from Planning

  • Learning by remembering differs from theories of concept of learning and explanation-driven programming
  • CBP Learning is divided into 3 types:
    • Plan Learning
      • It’s the creation and storage of new plans as the result of planning for situations that the planner has never encountered before
      • The planner should build a new plan and decide what features are best for indexing it in memory
    • Exception Learning:
      • Its more complex than plan learning and closely linked to the indexing of plans in memory
      • It involves learning the features in a domain that are predictive of negative interactions between plans steps
      • This is used to anticipate particular problems and then to search plans in memory designed to avoid them
    • Critic Learning:
      • Learning the repairs that have to be made if those problems arise again in different circumstances

The Structure of Cased-Based Planning

  • Old theories concentrate on simulating behavior but don’t explain why it arises. They called “non-explanation explanations
  • Another look at human behavior is to ask what function this behavior serves

Building it from the Bottom

Why Case-Based?

Plan Retrieval

  • A planner must know:
    • Its initial planning situation
    • The states that are currently true
    • The goals it needs to satisfy
  • Several plans could satisfy the same goal for different situations
  • When the planner wants to retrieve a plan it probably will not find an exact plan that satisfied the goals requires. So it seeks for plans with similar goals as a starting point
  • This similarity could be expressed by:
    • Grouping similar goals into sets
    • Building them in ISA hierarchy
      • I’ve found a paper talking about representing ISA hierarchy using sets rather than trees. This paper state that trees have some problems in identifying similarities and it solves these problems using sets
    • Dynamically evaluating goal similarity on the basis of individual features
    • Similarity Matrix
  • If a planner have a set of plans that satisfy goals partially how to choose the best match plan? How to determine what plan out of the group
  • The abstraction hierarchy is used to determine the similarity between plans whereas value hierarchy used to determine the relative utilities of different plans with respect to a set of goals
  • Goals that are easier to incorporate into existing plans are less important than those that are more difficult to satisfy (buildings example 10)
  • In order to get a plan that is able to find best match for a set of goals, a planner needs to know 3 things:
    • A memory of plans indexed by goals they satisfy
    • A similarity matrix for judging the similarities of goals that is required for determining how close a plan comes to satisfy a set of goals
    • A value hierarchy of goals used to judge the relative utilities of plans with respect to a set of goals

Plan Modification

  • Plan modification is about modifying retrieved plan in order to satisfy unsatisfied goals in it
  • To alter old plans to meet new goals the planner needs:
    • Set of modification rules
      • These rules are sets of steps that can be added to a particular plans to achieve particular goal
      • These rules can just be the modifications that are needed to alter existing plan to achieve particular goal
    • Critics with knowledge of goal specific requirements
      • This information will let the planner tailor the general modifications of a plan to the specific needs of the items required to achieve particular goals
    • General plan specifications
      • This is needed so planner doesn’t violate the goals of overall the plan when it modify it to satisfy a particular goal
  • For now, the RETRIVER finds a good plan and the MODIFIER makes it better

Plan Storage

  • To place new plans in memory, the STORER needs to index them under the same features that the RETRIVER uses to find them:
    • Goals they satisfy
    • Situations in which they are appropriate

Plan Repair

  • Giving that the planner is going to make mistakes, we’ve to give it some mechanism for repairing the faulty plans it builds. This mechanism will be called the REPAIRER
  • Input to a REPAIRER:
    • Faulty plan
    • Some description of the fault itself
      • The desired state that it has failed to achieve
      • The undesired state that it has caused to come about
  • How planner can get this information?
    • Run the plan and examine the results
    • Run simulation for the plan and use their results to diagnose errors
    • It can ask outside source if the plan will do what it wants to do
  • The REPAIRER is going to have some vocabulary for describing plan failures that can be used to index methods for repairing the plan itself
  • The relationship between problems and repair is like the relationship between goals and plans
  • Plans are indexed under problems they solve and repair methods are indexed under the types of problems they resolve
  • A plan repairer needs access to two types of knowledge:
    • Vocabulary for describing plan failures
    • Set of rapier methods that correspond to those description

Learning from Failure

  • There’s a problem about how to anticipate that a problem will arise again
  • The fact that a plan solves a particular problem is useless unless the planner can anticipate that problem will arise
  • To decide which features in a situation are to blame for failure, the ASSIGNER needs to be able to describe the causes of failure. The more extensive its vocabulary for this description, the more exact its credit assignment will be
  • In anticipation, if one of the features that caused failure is high, then the suitable plan will be used

Problem Anticipation

  • The job of an anticipator is to look at the planner’s goals and the current situation that surrounds them and decide if there is anything in the situation that is predictive of a problem before any other planning is done
  • The whole point of anticipator is to provide
    • Information about problems that have to be avoided
    • Information that will be used by retriever to find plan that does so
  • To anticipate a problem on the basis of surface features, the anticipator needs the base of information built by the assigner

Learning from Planning

  • A CBPer learns by correctly indexing its planning experience in memory
  • CBP learning theory presents in:
    • Learning new plans
    • Learning new problems
    • Learning new solutions
  • A CBPer learns new plans, the features that predict failures and past repairs to faulty plans that it can reuse

Posted in Case-Based Planning | Tagged: , , , | 2 Comments »