Adaptive AI Engine for RTS Games

Discussing the theory and practice

Archive for the ‘Case-Based Planning’ Category

Case-Based Planner Architecture

Posted by MHesham on December 4, 2010

Kristian Hammond , . 1994. Case-Based Planning: A Framework for Planning from Experience. In Cognitive Science Journal.

After reading the first 20 page of the paper, I felt that an architecture for the planning framework is needed. The below diagram represents the architecture for a basic Case-Based Planner. The architecture is inferred from the paper author description.

CBP Architecture Colored

Thanks to Abdelrahman for the architecture review.

Posted in Case-Based Planning | Tagged: , , , , , | Leave a Comment »

Introducing – “On-line Planning for Resource Production in RTS”

Posted by ferasferas on October 31, 2010

On-line Planning for Resource Production in Real-Time Strategy Games

Hei Chan, Alan Fern, Soumya Ray, Nick Wilson and Chris Ventura

School of Electrical Engineering and Computer Science

Oregon State University

Corvallis, OR 97330

{chanhe,afern,sray,wilsonic,ventura}@eecs.oregonstate.edu

Goal :

Develop action-selection mechanism in producing certain amount of resources as fast as possible.

Planner :

Computationally efficient “action-selection” mechanism which at each epoch creates a possibly Sub-Optimal concurrent plan from the current state to the goal, then begin executing the set of initial actions.

How it’s done :

formed via a combination of MEA(means-ends-analysis) and Scheduling and Bounded Search over sub-goals that aren’t required for goal achievements but may improve the make span.

Two Key problem domains :

-Resource production and tactical battles.

In Resource Production : player has to produce ( or gather ) varies raw materials, buildings, civilian and military Units to improve their economic and military power.

Tactical Battles : a player uses military units to gain territory and defeat enemy Units ( offensive or defensive actions ).

“Winning the Resource Production race is often a key factor in overall success”.

Uses :

1- either in computer opponent A.I.

2- Human can use it to specify resources needed and the Planner will get the best way to get this “RTS resource production is interesting from a pure A.I. Planning perspective as it encompasses a number of challenging issues”.

First, resource production involves temporal actions with numeric effects.

Second, performing well in this task requires highly concurrent activity.

Third, real-time constraints of the problem require action selection be computational efficient in apractical sense.

Why? :

Most existing planners are :

1- not handling temporal and numerical domains.

2- simply too inefficient to be useful.

3- produce highly Sub-Optimal plans.

The planning component used by online planner is based on a combination of means-ends analysis (MEA) and scheduling.

Given an initial state and resource goal, is used to compute a sequential plan that reaches the goal using the minimum number of actions and resources in sense that any valid plan must include all of the actions in the MEA plan.

Importantly, the special structure of our domain guarantees that MEA will produce such a plan and do so efficiently (linear time in the plan length).

Given the minimum sequential plan, we then reschedule those actions, allowing for concurrency, in an attempt to minimize the make span. This scheduling step is computationally hard, however, we have found that simple worst-case quadratic time heuristic methods work quite well. Thus both the MEA step and scheduling step are both low-order polynomial operations in the minimum number of actions required to achieve the goal, allowing for real-time efficiency.

Refrences :

MEA( means-ends analysis) : http://en.wikipedia.org/wiki/Means-ends_analysis

Posted in AI for Games, Case-Based Planning, Case-Based Reasoning, Papers Summaries, Planning | 3 Comments »

Case-Based Planning and Execution for Real-Time Strategy Games

Posted by Ogail on February 5, 2010

  • Abstract:
    • Artificial Intelligence techniques have been successfully applied to several computer games. However in some kinds of computer games, like real-time strategy (RTS) games, traditional artificial intelligence techniques fail to play at a human level because of the vast search spaces that they entail. In this paper we present a real-time case based planning and execution approach designed to deal with RTS games. We propose to extract behavioral knowledge from expert demonstrations in form of individual cases. This knowledge can be reused via a case based behavior generator that proposes behaviors to achieve the specific open goals in the current plan. Specifically, we applied our technique to the WARGUS domain with promising results.
  • As we said before, one of the main goals of our research is to create AI techniques that can be used by game manufacturers to reduce the effort required to develop the AI component of their games.
  • Using the architecture presented in this paper the game developers will be able to specify the AI behavior just by demonstration; i.e. instead of having to code the behavior using a programming language, the behavior can be specified simply by demonstrating it to the system.
  • Case Based Planning work is based on the idea of planning by remembering instead of planning from scratch.
  • Systems Architecture:

  • Behavior Acquisition:
    • During this first stage, an expert plays a game of WARGUS and the trace of that game is stored. Then, the expert annotates the trace explaining the goals he was pursuing with the actions he took while playing. Using those annotations, a set of behaviors are extracted from the trace and stored as a set of cases. Each case is a triple: situation/ goal/behavior, representing that the expert used a particular behavior to achieve a certain goal in a particular situation.
  • Execution:
    • The execution engine consists of two main modules:
      • A real-time plan expansion and execution (RTEE) module.
      • A behavior generation (BG) module.
    • The RTEE module maintains an execution tree of the current active goals and subgoals and which behaviors are being executed to achieve each of the goals.
    • Execution Steps:
      • Each time there is an open goal, the RTEE queries the BG module to generate a behavior to solve it.
      • The BG then retrieves the most appropriate behavior from its case base, and sends it to the RTEE.
      • Finally, when the RTEE is about to start executing a behavior, it is sent back to the BG module for adaptation.
      • Notice that this delayed adaptation is a key feature different from traditional CBR required for real-time domains where the environment continuously changes.
  • They developed Behavior Reasoning Language (BRL) to allow a system to learn behaviors, represent them, and to reason about the behaviors and their intended and actual effects.
  • Behavior is the main basic constructing piece of the system that is divided into:
    • Declarative Part:
      • The declarative part has the purpose of providing information to the system about the intended use of the behavior (WHY).
      • This part consists of:
        • Goal:
          • For every domain, ontology of possible goals has to be defined. For instance, a behavior might have the goal of “having a tower”.
        • Set of Preconditions:
          • Must be satisfied before the behavior can be executed. For instance, a behavior can have as preconditions that a particular peasant exists and that a desired location is empty.
        • Set of Alive Conditions:
          • Conditions that must be satisfied during the execution of the behavior for it to have chances of success. If at some moment during the execution, the alive conditions are not met, the behavior can be stopped, since it will not achieve its intended goal. For instance, the peasant in charge of building a building must remain alive; if he is killed, the building will not be built.
    • Procedural Part:
      • Procedural part contains the executable behavior itself (HOW).
      • Can contain the following constructs: sequence, parallel, action, and subgoals
  • Summarizing, our behavior language is strongly inspired by ABL, but expands it with declarative annotations (expanding the representation of goals and defining alive and success conditions) to allow reasoning

  • Using temporal analysis to construct goals.
  • Open goal is a goal that has not assigned a behavior yet.
  • Behaviors state: pending, executing, succeeded and failed.
  • Open Goals States:
    • Ready: An open goal is ready when all the behaviors that had to be executed before this goal have succeeded
    • Waiting: otherwise.
  • For instance, in Figure 4, “behavior 0” is a sequential behavior and therefore the goal “build army” is ready since the “build base” goal has already succeeded and thus “build army” can be started. However, the goal “attack” is waiting, since “attack” has to be executed after “build army” succeed
  • The plan expansion module is constantly querying the current plan to see if there is any ready open goal. When this happens, the open goal is sent to the BG module to generate a behavior for it. Then, that behavior is inserted in the current plan, and it is marked as pending.
  • The plan execution module has two main functionalities:
    • Check for basic actions that can be sent directly to the game engine.
    • Check the status of plans that are in execution.
  • For each pending behavior, the execution module evaluates the preconditions, and as soon as they are met, the behavior starts its execution
  • If any of the execution behaviors have any basic actions, the execution module sends those actions to WARGUS to be executed.
  • Whenever a basic action succeeds or fails, the execution module updates the status of the behavior that contained it. When a basic action succeeds, the executing behavior can continue to the next step. When a basic action fails, the behavior is marked as failed, and thus its corresponding goal is open again (thus, the system will have to find another plan for that goal).
  • The execution module periodically evaluates the alive conditions and success conditions of each behavior. If the alive conditions of an executing behavior are not satisfied, the behavior is marked as failed, and its goal is open again. If the success conditions of a behavior are satisfied, the behavior is marked as succeeded.
  • Finally, if a behavior is about to be executed and the current game state has changed since the time the BG module generated it, the behavior is handed back to the BG and it will pass again through the adaptation phase to make sure that the plan is adequate for the current game state.
  • The goal of the BG module is to generate behaviors for specific goals in specific scenarios. Therefore, the input to the BG module is a particular scenario (i.e. the current game state in WARGUS) and a particular goal that has to be achieved (e.g. “Destroy The Enemy’s Cannon Tower”). To achieve that task, the BG system uses two separate processes: case retrieval and case adaptation.
  • Case base we that have several cases that contain different behaviors to solve each one of these problems under different circumstances. Is called heterogeneous case base.
  • Therefore we represent cases as triples: c = <S, G, B>, where S is a particular game state, G is a goal, and B is a behavior; representing that c.B is a good behavior to apply when we want to pursue goal c.G in a game state similar to c.S.

  • In particular, we have used a game state definition composed of 35 features that try to represent each aspect of the WARGUS game:
    • Twelve of them represent the number of troops (number of fighters, number of peasants, and so on)
    • Four of them represent the resources that the player disposes of (gold, oil, wood and food)
    • Fourteen represent the description of the buildings (number of town halls, number of barracks, and so on)
    • Five features represent the map (size in both dimensions, percentage of water, percentage of trees and number of gold mines).
  • The case retrieval process uses a standard nearest neighbor algorithm but with a similarity metric that takes into account both the goal and the game state. Specifically, we use the following similarity metric:
  • where dGS is a simple Euclidean distance between the game states of the two cases (where all the attributes are normalized between 0 and 1), dG is the distance metric between goals, and is a factor that controls the importance of the game state in the retrieval process (in our experiments we used = 0.5)
  • To measure distance between two goals g = name1(p1, …, pn) and g2 = name2(q1, …, qm) we use the following distance:
    • Where:
      • Pi: is the maximum value that the parameter i of a goal might take (we assume that all the parameters have positive values).
  • The adaptation process consists of a series of rules that are applied to each one of the basic operators of a behavior so that it can be applied in the current game state. Specifically, we have used two adaptation rules in our system:
    • Unit Adaptation:
      • Each basic action sends a particular command to a given unit. For instance the first action in the behavior shown in Figure 5 commands the unit “2” to build a “pig-farm”. However, when that case is retrieved and applied to a different map, that particular unit “2” might not correspond to a peon (the unit that can build farms) or might not even exist (the “2” is just an identifier). Thus, the unit adaptation rule finds the most similar unit to the one used in the case for this particular basic action. To perform that search, each unit is characterized by a set of 5 features: owner, type, position (x,y), hit-points, and status (that can be idle, moving, attacking, etc.) and then the most similar unit (according to an Euclidean distance using those 5 features) in the current map to the one specified in the basic action is used.
    • Coordinate Adaptation:
      • Some basic actions make reference to some particular coordinates in the map (such as the move or build commands). To adapt the coordinates, the BG module gets (from the case) how the map in the particular coordinates looks like by retrieving the content of the map in a 5×5 window surrounding the specified coordinates. Then, it looks in the current map for a spot in the map that is the most similar to that 5×5 window, and uses those coordinates.
  • Conclusion:
    • In this paper we have presented a case based planning framework for real-time strategy games. The main features of our approach are
    • The capability to deal with the vast decision spaces required by RTS games
    • Being able to deal with real-time problems by interleaving planning and execution in real-time
    • Solving the knowledge acquisition problem by automatically extracting behavioral knowledge from annotated expert demonstrations in form of cases.
    • We have evaluated our approach by applying it to the real-time strategy WARGUS with promising results.
    • The main contributions of this framework are:
      • A case based integrated real-time execution and planning framework.
      • The introduction of a behavior representation language that includes declarative knowledge as well as procedural knowledge to allow both reasoning and execution.
      • The idea of automatic extraction of behaviors from expert traces as a way to automatically extract domain knowledge from an expert.
      • The idea of heterogeneous case bases where cases that contain solutions for several different problems (characterized as goals in our framework) coexist.
    • The introduction of delayed adaptation to deal with dynamic environments (where adaptation has to be delayed as much as possible to adapt the behaviors with the most up to date information).
  • Future Work:
    • As future lines of research we plan to experiment with adding a case retention module in our system that retains automatically all the adapted behaviors that had successful results while playing.
    • And also annotating all the cases in the case base with their rate of success and failure allowing the system to learn from experience.
    • Additionally, we would like to systematically explore the transfer learning capabilities of our approach by evaluating how the knowledge learnt (both from expert traces and by experience) in a set of maps can be applied to a different set of maps.
    • We also plan to further explore the effect of adding more expert traces to the system and evaluate if the system is able to properly extract knowledge from each of them to deal with new scenarios.

Further, we would like to improve our current planning engine so that, in addition to sequential and parallel plans, it can also handle conditional plans. Specifically, one of the main challenges of this approach will be to detect and properly extract conditional behaviors from expert demonstrations.

Posted in Case-Based Planning | Leave a Comment »

Example of situation assessment for case retrieval

Posted by Ahmad Atta on December 3, 2009

Posted in Case-Based Planning, Presentations | Tagged: , , | Leave a Comment »

Situation Assessment for Plan Retrieval

Posted by Ahmad Atta on December 2, 2009

Traditionally, in Case-Based Reasoning the process of case retrieval is done by selecting the case from the case-base that has the closest similarity to the world state of the problem. This similarity is measured over the various features that are computed from the representation of the world state.

Omar asked “How will you determine the importance of a feature?”

This paper : Situation Assessment for Case Retrieval by Kinshuk Mishra, Santiago Ontan˜on, and Ashwin Ram answers this question as follows:

The importance of features depends on their relevance in representing the high-level world state, this is high level game state is defined here as a Situation. For example, in the WARGUS domain the player might be in an attacking situation, or in a base building situation among others. Situations can be predicted based on the raw features that are directly computed from the world state i.e. shallow features.

THUS, Situation assessment is a process of gathering high-level information using low-level data to help in better decision making.

The general situation assessment algorithm is described in Figure 3. It comprises of four main steps:

1- Shallow feature selection

a. Given situation annotated trace T is provided to a feature selection algorithm. This algorithm returns the set of shallow features which have high information gain. Specifically, in Darmok, we have used best-first greedy hill-climbing algorithm for filtering the high information gain shallow features.

2- Model Generation.

a. The Situation-Classification Model Mcf

This model is useful for classification of a world state to a situation using shallow features. In Darmok, we have used a standard algorithm inducing a decision tree classifier model.

b. The Situation-Case Model Mc

Provides a mapping from the set of situations S to a subset of cases in the case-base C. It can be built using statistical or empirical analysis.

c. The Situation-Deep feature Model Mdf

Provides a mapping from S to deep features in the set Fd. This mapping is done using a feature selection algorithm or by using empirical knowledge.

3- Model Execution.

In this third step, the models generated in the previous step are executed to get the current situation S, the subset of cases C’ from the case-base C and the subset of deep features Fd’ which are most relevant to S. S is obtained by running Mcf over Fs’. Once S is known, using Mc and Mdf , C ‘ and Fd’ are obtained respectively.

4- Case Retrieval.

This is the last step where using Fd’ and FS the most similar case in retrieved from C’ using normal retrieval techniques.

clip_image002

Posted in Case-Based Planning | Tagged: , , , , , | Leave a Comment »

Case-Based Planning – A Framework for planning from experience (Part II)

Posted by Ogail on December 1, 2009

  • Object critics are used in plan modification to reduce effort of modifying a plan. Where you can use some of the unrelated components in the partially matched plan.
  • When plan fails the explanation function has two tasks:
    • Identifies the parts of the plan to be altered.
    • Identifies places where planner’s knowledge is faulty
  • Plan failure means:
    • Change the steps that leads to the current failed plan
    • Change your understanding to the world (as planner)
  • Explanation of Failure Contains:
    • Why the failure has occurred?
    • Definition of the state of the failure.
    • Step that resulted in the failure.
    • Conditions that has to be true for failure to come out.
    • What was the planner trying to do when the failure occurred?
    • Description of steps that achieves the plan that was being applied goals.
  • Explanation example is page 25 is very useful!
  • Thematic Organization Packets (TOP) are set of strategies used to resolve the failure in the plan.
  • Usually the planning failure occurs from the casual interaction between goals,
  • After detecting a failure try to generalize it as possible (as if you discover a problem in beef generalize it to meat).
  • Demons are features that cause the failure to arise.
  • In failure anticipation, when the planner predicts a problem that will arise it adds a goal that will avoid it.
  • Plan is a series of steps and a list of ingredients that has been built to satisfy some particular set of goals.    
  • CBP need not to generalize plans it stores, it does need to generalize the features that are used to index them.
  • It’s easier to alter a plan to satisfy a new goal than to make the changes required to deal with an interaction between goals (in other words problems to avoid)
  • CHEF was using semantic network as data structure that holds the plans
  • A CBP can use its own memories of past failure to anticipate new plans by associating failures with the surface features that predict them.
  • A CBP can generalize surface features of a problem by generalizing to the level of the rules. This means generalizing an object in an explanation up to the highest possible level of generality while still staying within the confines of the rules that explain the failure.
  • Intermediate states that serve as links in the casual chain that led to a failure are also linked to the memory of the failure and thus can be used to predict it if they arise in late situations.
  • Why not just index plans by the features that predict the problem that they solve directly?
    • Problems are described in few words and it’s inefficient to store all the features that describe a problem!
    • Ability re recovering from failure.
  • CBP stores names or repairs that it makes (critics) indexed by problems they solve.
  • A CBP can only save those repairs that can be transferred to any plan in which the problem that they repair arises.
  • Here we imply that: Learning is the organization of experience.
  • The most important problem here is deciding how to describe and store an experience so that it can be used again.
  • Learner job is to collect features that should be used to index the planner’s work and then store that work away for later reuse.
  • Planner’s Job:
    • Provide learner with content of what is learned.
    • Building explanations that is used to assign blame for a failure.
    • Building batch that is stored as a new ingredient critic.
  • Learning means storing the different results of the planner’s activity indexed by the features that determine their usefulness.
  • A planner knows when to learn.
  • Learning from planning is an improvement over other types of learning in that:
    • It uses the knowledge of the planner to determine:
      • What it learns.
      • How to index it.
      • When to learn it at all.

Posted in Case-Based Planning | Leave a Comment »

A survey of common techniques used in developing CBR systems

Posted by Ahmad Atta on November 3, 2009

Abstract
After reading many papers related to CBR, CBP, and their use in RTS games, I collected the common techniques used for applying CBR concepts, and sub-tasks. At first, I want to assure that online adaptation of planning in RTS games has been already implemented (in contrast to what we had thought). Thus, I will mention some of the important techniques used in Darmok system; the CBR system that implements the architecture for case-based planning in the WARGUS RTS game. In addition to these techniques, I added some ideas which we can apply in our new CBR system.
Read This Doc

Posted in Case-Based Planning, Case-Based Reasoning | Tagged: , , , , , | Leave a Comment »

Introduction to Case-Based Planning

Posted by Ogail on November 1, 2009

Posted in Case-Based Planning, Presentations | 1 Comment »

Case-Based Planning – A Framework for planning from experience (Part I)

Posted by Ogail on October 31, 2009

  • Abstract:
    • Presenting planning as a task supported by dynamic memory
    • Case-Based Planning: integrating models of memory, learning and planning into one system
    • Successful plans are stored in memory, indexed by the goals they satisfy and the problems the avoid
    • Also, failure is stored. Planner is able to anticipate an avoid future plans failures
    • CBP aims to improve planning in three areas:
      • Failure avoidance
      • Plan repair
      • Plan reuse
  • Case Based Planning (CBP) is idea of planning as remembering
  • CBP differs from rule-based planning in that it rests on the notion that new plans should be based on the planner’s knowledge of what has succeeded and failed in the past
  • Plan for a set of goals in not build up piece by piece from the individual plans for each goal. Instead constructed by modifying a plan from memory that already satisfies or particularly satisfies many if not all goals
  • CBP is concerned with previous planning errors in order to avoid them
  • CBP storing mechanism must have the ability to avoid past failure and reuse succeeded plans
  • CBP Differs from other planning techniques in:
    • Initial plan building
    • Reaction to plan failure
    • Vocabulary for describing and storing plans

Building a Plan

  • CBP must start building a plan for a set of goals by considering how they will interact
  • The CB approach to finding an initial plan is to anticipate problems so the planner can find plans that avoid them
  • Previous planning mechanism was create and debug paradigm
  • The main difference between them is that create and debug paradigm creates a plan and debugs a failure after it arises

Debugging Failed Plans

  • CBPer should be able to recognize failed plans to mark them as failed
  • A planning fails when a plan doesn’t satisfy some goal that it was designed to deal with
  • An exception failure is different from planning failure. It occurs when an unexpected event occurs. The response for exception failure differs from plan failure where in PF the planner alter the plan and in EF the planner should alters its understanding of the world
  • A planner must respond to plan failure by building casual explanations of why the failure has occurred and then use these explanations to access replanning strategies designed for the situation in general
  • A planner must respond to exception failure by using the explanations to add new inference rules that will allow it to anticipate the problem that it previously was unable to foresee. The planner should ask first “What went wrong with the plan?” and then ask “What went wrong with planning?”
  • The planner has to repair its expectations bout the world when those expectations lead to plans that fail
  • A CBPer responds to planning failures by repairing both the faulty plan and its knowledge base that allowed it to build the plan incorrectly

Storing Plans for Later Reuse

  • The basic vocabluray of plan indexing is nessicary the vocalary of planner’s domain and of the goals domain
  • Plans must be stored by descriptions of the negative gaol interactions they avoid (umbrella example)

Learning from Planning

  • Learning by remembering differs from theories of concept of learning and explanation-driven programming
  • CBP Learning is divided into 3 types:
    • Plan Learning
      • It’s the creation and storage of new plans as the result of planning for situations that the planner has never encountered before
      • The planner should build a new plan and decide what features are best for indexing it in memory
    • Exception Learning:
      • Its more complex than plan learning and closely linked to the indexing of plans in memory
      • It involves learning the features in a domain that are predictive of negative interactions between plans steps
      • This is used to anticipate particular problems and then to search plans in memory designed to avoid them
    • Critic Learning:
      • Learning the repairs that have to be made if those problems arise again in different circumstances

The Structure of Cased-Based Planning

  • Old theories concentrate on simulating behavior but don’t explain why it arises. They called “non-explanation explanations
  • Another look at human behavior is to ask what function this behavior serves

Building it from the Bottom

Why Case-Based?

Plan Retrieval

  • A planner must know:
    • Its initial planning situation
    • The states that are currently true
    • The goals it needs to satisfy
  • Several plans could satisfy the same goal for different situations
  • When the planner wants to retrieve a plan it probably will not find an exact plan that satisfied the goals requires. So it seeks for plans with similar goals as a starting point
  • This similarity could be expressed by:
    • Grouping similar goals into sets
    • Building them in ISA hierarchy
      • I’ve found a paper talking about representing ISA hierarchy using sets rather than trees. This paper state that trees have some problems in identifying similarities and it solves these problems using sets
    • Dynamically evaluating goal similarity on the basis of individual features
    • Similarity Matrix
  • If a planner have a set of plans that satisfy goals partially how to choose the best match plan? How to determine what plan out of the group
  • The abstraction hierarchy is used to determine the similarity between plans whereas value hierarchy used to determine the relative utilities of different plans with respect to a set of goals
  • Goals that are easier to incorporate into existing plans are less important than those that are more difficult to satisfy (buildings example 10)
  • In order to get a plan that is able to find best match for a set of goals, a planner needs to know 3 things:
    • A memory of plans indexed by goals they satisfy
    • A similarity matrix for judging the similarities of goals that is required for determining how close a plan comes to satisfy a set of goals
    • A value hierarchy of goals used to judge the relative utilities of plans with respect to a set of goals

Plan Modification

  • Plan modification is about modifying retrieved plan in order to satisfy unsatisfied goals in it
  • To alter old plans to meet new goals the planner needs:
    • Set of modification rules
      • These rules are sets of steps that can be added to a particular plans to achieve particular goal
      • These rules can just be the modifications that are needed to alter existing plan to achieve particular goal
    • Critics with knowledge of goal specific requirements
      • This information will let the planner tailor the general modifications of a plan to the specific needs of the items required to achieve particular goals
    • General plan specifications
      • This is needed so planner doesn’t violate the goals of overall the plan when it modify it to satisfy a particular goal
  • For now, the RETRIVER finds a good plan and the MODIFIER makes it better

Plan Storage

  • To place new plans in memory, the STORER needs to index them under the same features that the RETRIVER uses to find them:
    • Goals they satisfy
    • Situations in which they are appropriate

Plan Repair

  • Giving that the planner is going to make mistakes, we’ve to give it some mechanism for repairing the faulty plans it builds. This mechanism will be called the REPAIRER
  • Input to a REPAIRER:
    • Faulty plan
    • Some description of the fault itself
      • The desired state that it has failed to achieve
      • The undesired state that it has caused to come about
  • How planner can get this information?
    • Run the plan and examine the results
    • Run simulation for the plan and use their results to diagnose errors
    • It can ask outside source if the plan will do what it wants to do
  • The REPAIRER is going to have some vocabulary for describing plan failures that can be used to index methods for repairing the plan itself
  • The relationship between problems and repair is like the relationship between goals and plans
  • Plans are indexed under problems they solve and repair methods are indexed under the types of problems they resolve
  • A plan repairer needs access to two types of knowledge:
    • Vocabulary for describing plan failures
    • Set of rapier methods that correspond to those description

Learning from Failure

  • There’s a problem about how to anticipate that a problem will arise again
  • The fact that a plan solves a particular problem is useless unless the planner can anticipate that problem will arise
  • To decide which features in a situation are to blame for failure, the ASSIGNER needs to be able to describe the causes of failure. The more extensive its vocabulary for this description, the more exact its credit assignment will be
  • In anticipation, if one of the features that caused failure is high, then the suitable plan will be used

Problem Anticipation

  • The job of an anticipator is to look at the planner’s goals and the current situation that surrounds them and decide if there is anything in the situation that is predictive of a problem before any other planning is done
  • The whole point of anticipator is to provide
    • Information about problems that have to be avoided
    • Information that will be used by retriever to find plan that does so
  • To anticipate a problem on the basis of surface features, the anticipator needs the base of information built by the assigner

Learning from Planning

  • A CBPer learns by correctly indexing its planning experience in memory
  • CBP learning theory presents in:
    • Learning new plans
    • Learning new problems
    • Learning new solutions
  • A CBPer learns new plans, the features that predict failures and past repairs to faulty plans that it can reuse

Posted in Case-Based Planning | Tagged: , , , | 2 Comments »

On-Line Case-Based Plan Adaptation for RTS Games

Posted by Ahmad Atta on October 22, 2009

Abstract

After reading about Case-based Reasoning, the second step we should talk is understanding Case-based planning, my reference in this post will be the paper of Neha Sugandh, Santiago Ontanon, and Ashwin Ram titled: On-Line Case-Based Plan Adaptation for Real-Time Strategy Games. This paper presents techniques allow the system to create and adapt plans in an efficient and effective manner while performing the task.

Case-Based Planning in WARGUS

In this section we will briefly describe the Darmok system, which applies the plan adaptation techniques. Darmok was designed to play the game of WARGUS . In order to achieve this, Darmok learns behaviors from expert demonstrations, and then uses case-based planning to play the game reusing the learnt behaviors. Figure 2 shows an overview of our case-based planning approach.

Darmok System

Basically, we divide the process in two main stages:

• Behavior acquisition:

During this first stage, an expert plays a game of WARGUS and the trace of that game is stored. Then, the expert annotates the trace explaining the goals he was pursuing with the actions he took while playing. Using those annotations, a set of behaviors are extracted from the trace and stored as a set of cases.

Behavior Execution:

The execution engine consists of several modules that together maintain a current plan to win the game.

  1. The Plan Execution module: Executes the current plan, and updates its state (marking which actions succeeded or failed).
  2. The Plan Expansion module: Identifies open goals(a goal still doesn’t have an assigned behavior) in the current plan and expands them.
  3. The Behavior Retrieval module: Given an open goal and the current game state, it retrieves the most appropriate behavior to fulfill the open goal.
  4. The Plan Adaptation module (the focus of this paper): Adapts the retrieved plans according to the current game state.

————————–

Posted in Case-Based Planning | Leave a Comment »