Adaptive AI Engine for RTS Games

Discussing the theory and practice

Archive for the ‘Planning’ Category

On-line Planning for Resource Production in Real-Time Strategy Games -part5

Posted by ferasferas on November 5, 2010

Heuristic Scheduler

The previous component(sequential planner) finds a sequential plan. However, to accurately estimate the utility of any renewable resources, we need to reschedule actions to allow concurrency and decrease the makespan of the found plan. We do this by using a heuristic scheduling procedure that traverses the found action sequence in order. For each action Ai , the procedure moves the start time of Ai to the earliest possible time such that its preconditions are still satisfied.

Assume that Ai starts at time ts , and the state R+ (ts ) is the resource state at time ts after the effects of all actions that end at time ts are added to the game state, and R− (ts ) is the resource game state before the effects are added.

Obviously, the preconditions of Ai are satisfied by R+ (ts ). If they are also satisfied by R− (ts ), this means the satisfaction of the preconditions of Ai is not due to any of the actions that end at time ts , Then we can now move action Ai to start earlier than ts , to the previous decision epoch (time where an action starts or ends).This is repeated until the preconditions of A are satisfied by some R+ (ts ) but not R− (ts ), i.e., the satisfaction of the preconditions of A is due to the actions that end at time ts .

The plan is now rescheduled such that action A starts at time ts , and we can proceed to attempt to reschedule the next action in our sequential plan. The pseudo-code for this procedure is given in Algorithm 3.

We can show this procedure is sound using the following informal argument. We need to ensure that when we reschedule an action, every action between the new start time and the old start time remains executable.

Now when processing each action, we can always schedule it before a previous action if they do not consume or borrow the same resource:

1- Consider a pair of actions A and B, A before B, in a valid plan that both consume or borrow resource R, and assume we are about to reschedule B. First, if A and B are adjacent, the state before A must have enough R for A and B to execute. This means that at that state, A and B could be executed concurrently, or B could be scheduled before A, if possible.

2- On the other hand, if A and B were separated by any actions that produced R in order to satisfy B’s precondition, then our procedure would not shift B before the effects of those actions. If A and B are separated by actions not producing R, this effectively reduces to the adjacent case. Thus, this procedure is sound.

Future Work

While this procedure is not guaranteed to produce a plan with the optimal makespan, it is fast (at most quadratic in the number of actions) and performs well in practice. Investigating more complex scheduling algorithms for this problem is a topic of future work.


Posted in AI for Games, Papers Summaries, Planning, Technical Documents | Leave a Comment »

On-line Planning for Resource Production in Real-Time Strategy Games -part4

Posted by ferasferas on November 4, 2010

Sequential Planner

The first component of our online planner is a sequential planner which outputs a sequential plan to achieve the goal from the given initial state.

In principle, any off-the-shelf sequential planner can be used in this step. However, given the specific properties of our domain discussed earlier, a simple sequential planner based on means-ends analysis suffices.

MEA operates by selecting a subgoal to solve which will decrease the difference between the initial state and the goal state, and then executing the necessary actions to solve the sub-goal. Then from the new state which satisfies the subgoal the process is recursively applied until we reach the goal state. Notice that if one subgoal becomes unsolved while solving another, it will be resolved later, in the recursive step.


The pseudo-code is given in Algorithm 2. For simplicity this pseudo-code assumes that there is no resource that is both produced and consumed by a single action. It is straightforward to lift this assumption while still maintaining the polynomial-time guarantee outlined below.

First, means-ends analysis repeatedly picks an unsatisfied sub-goal Ri ≥ gi , and constructs a sub-plan Plan which satisfies it.

We will first solves all renewable resource goals before any non-renewable resource goals (because renewable resource goals, once solved, always stay solved using the monotonic increase property of renewable resources). In this ordering, every action added to the plan is necessary to solve the final goal.

Further, if we choose any other ordering of goals that necessitates revisiting previously solved goals, we will only generate permutations of the set of actions produced by the “canonical” ordering. This is because, if we revisit a goal, it must be because some actions used to solve that goal were “used up” by the preconditions of some other goal. Thus, we are effectively permuting the sequence of necessary actions if we choose a different ordering.

Since the plan found by MEA has the minimal set of actions, it consumes the minimal set of resources necessary to reach the goal. Finally, because each step of means-ends analysis adds at least one useful action to the plan, its running time is bounded by the minimum number of actions to the goal.

Notice that if the dependency graph between resources contains cycles, it is possible for means-ends analysis to get stuck in an infinite loop for certain initial states.

For example, in Wargus, collecting gold requires a townhall and borrows a peasant, while building a townhall or a peasant consumes certain amounts of gold. The presence of these cycles means there is a possibility that there is no plan to achieve a goal in some cases.

However, we can easily extend our algorithm to detect such cases if they happen. Further, as we have noted above, if the initial game state contains a peasant and a townhall, we can guarantee that there is always a plan no matter what the goal state is.

Posted in AI for Games, Papers Summaries, Planning, Technical Documents | Leave a Comment »

On-line Planning for Resource Production in Real-Time Strategy Games -part3

Posted by ferasferas on November 4, 2010

Quick Revision

Some key properties of our domain are:

1. Actions have durations.

2. There are multiple units, so actions can be executed concurrently.

3. Units and buildings can be created as the game progresses.

4. Many actions involve numeric fluents.

5. Solution plans typically involve a large number of actions compared to most standard planning benchmarks, and

6. In our setting, the planner must find a plan in real-time.

Planning Architecture

As the game proceeds, the world map and the planner’s action models may change, requiring fast re-planning with respect to the changed environment. In fact, a changing environment may make it impossible for the agent to execute a plan that is constructed offline ahead of time. Therefore, a planner which takes a long time to plan or does not provide a bound on its planning time is undesirable for our domain even if it may otherwise return the optimal plan. Instead, we aim to a planner which finds a good plan quickly while being able to scale with the number of resources, and the number of actions in the plan.

Characteristics of the Online-Planner:

1- To adapt to changing goals and environments, planner re-plans every certain Game Cycle with the current goal and game state.

2- To find a new plan, it carries out a bounded search over possible intermediate goals. The set of possible intermediate goals includes all states that have an extra renewable resource of every type. For each such goal, the planner employs means-ends analysis followed by a heuristic scheduling process to generate a plan to reach the overall goal via the intermediate goal.

3- To select an action to be executed, the planner chooses the plan with the smallest makespan. If this plan has any action that is executable at the current game state, that action (or actions) is started.


Notice that the plans generated by the planner are not usually completely executed when the planner replans at the next decision epoch using the game state at that point, it may not obtain a suffix of the plan it found at the current epoch. However, constructing such plans are valuable because they help in action selection at the current step.


Algorithm :

The pseudo-code of the main loop of the algorithm is shown in Algorithm 1. Every few game “cycles” (the time unit in Stratagus), the client checks if there are some available actions that can be executed. If so, it calls the planner to find plans which satisfy our goal from the current state. One plan will aim to satisfy the overall resource goal without any intermediate goals, while the others will aim to satisfy each of the intermediate goals which explicitly creates additional renewable resources. The plan with the shortest makespan is preferred, and actions in the plan that should start now are executed.

The planner consists of two main components:

A sequential planner that uses means-ends analysis to find a plan from game state S to goal G with the minimum number of actions, M EA(S, G), and

A heuristic scheduler which reorders actions in a sequential plan to allow concurrency and decrease its makespan, Schedule(Plan).

Next, we discuss the sequential planner and the heuristic scheduler components.

Posted in AI for Games, Papers Summaries, Planning, Technical Documents | Leave a Comment »

On-line Planning for Resource Production in Real-Time Strategy Games -part2

Posted by ferasferas on November 2, 2010

The RTS Resource Production Domain

Human players typically have no difficulty selecting actions that at least achieve a particular set of resource goals. However, it is much more difficult, and requires much more expertise, to find close to minimal makespan plans.

As an example consider the seemingly simple problem of collecting a large amount of gold starting with a single peasant and “townhall”. One could simply repeatedly collect gold with the single peasant, which would eventually achieve the goal, but it’s far from optimal where another player can choose to execute one or more actions, defined from the action set of the game. Each action produces a certain amount of products, but also consumes a certain amount of other resources, and requires that some preconditions are met before it can be executed.

Actions are Durative : take a certain amount of time to finish upon which the products are added to the game state.

In RTS games, resource-production actions are usually deterministic, and the preconditions, effects, and durations of each action are usually given or can be easily discovered through game-play. For certain actions, where a unit has to travel to a destination for an action to take place, the duration of the action will vary due to the spatial properties of a game map. However, for simplicity we assume we have a constant duration for each instance of an action. On average over the many actions taken during a game turns out to be a reasonable assumption.

Game State Representation at time t :

( 1 ) For each resource Ri , the amount ri possessed by the agent.

( 2 ) List of actions Ai , i = 1, . . . , m ,currently being executed along with the start and end times “ts” and “te” for each action ( ts < t < te ).

Goal State Reached When :

All actions currently executing in S have terminated as the projected game state, denoted by Proj(S). This state is timestamped with t = maximum te of all defined actions, has resources updated according to the effects of the actions Ai , and no actions being executed. reach a certain resource goal, G = {R1 ≥ g1 , . . . , Rn ≥ gn },defined as constraints on the resources, from the current game state. In essence, the player must determine a plan, which is a list of actions, ((A1 , t1 s , t1 e ), . . . , (Ak , tk s , tk e )), where Ai is an action that starts at time ti s and ends at time ti e .

Four Resource Tags

Require: An action requires a certain amount of a resource if it needs to be present throughout the execution of the action.

For example, the action collect-gold requires the presence of the townhall In this case, the same town-hall can be used for concurrent collect-gold actions, as the specifications townhall is not “locked up” by the collect-gold actions. Thus, the requires tag allows for sharing of resources.

Borrow: An action borrows a certain amount of a re-production source if it requires that the resource amount be “locked up” throught the execution of the action, so that no other action is allowed to borrow those resources during it’s execution. After the action has completed the resource amount is freed up for use by other actions.

For example, the collect-gold action borrows a peasant. During the execution of collect-gold action, the borrowed peasant may not be borrowed by another action. After the collect-gold action is finished, the peasant becomes available again and can be used for other action. Therefore to allow concurrent collect-gold actions,multiple peasants must be used.

Consume: An action consumes a certain amount of a resource at the start of it’s execution, as this amount is deducted from the game state. As the game state must obey the constraint that every resource value is non-negative the inferred precondition of the action is that this resource amount must be present at the start of the action.

For example, the build-barracks action consumes 700 units of gold and 450 units of wood.

Produce: An action produces a certain amount of a resource at the end of it’s execution, as this amount is added to the game state.

Resources are Divided into two Classes:

-Consumable Resources: are those that are those consumed by actions, such as gold, wood, and supply(a peasant of footman cannot be built unless there is unused supply).

-Renewable Resources: are those that are required or borrowed by actions, such as peasants, town halls and barracks.

Model Domain Characteristics :

1- Dependency Structure between resources is such that, if the initial state has a townhall and a peasant( and assuming the world map has enough consumable resources like gold and wood), there always exists a plan for any resource goal. Further if such a state can’t be reached, then no such plan exists.

2- The amount of renewable resources in a problem never decreases, since no unit is destroyed in our scenarios.

3- By Wargus action specification, all effects at the start of an action are subtractive effects, while all effects at the end of an action are additive effects.

4- Again by Wargus action specification foreach resource there exactly one action that produces it. This property implies that every plan that produces the goal resource from a game state must contain the same set of actions( though possibly not in the same sequence).

Posted in AI Game Engine Programming, Papers Summaries, Planning, Technical Documents | 1 Comment »

Introducing – “On-line Planning for Resource Production in RTS”

Posted by ferasferas on October 31, 2010

On-line Planning for Resource Production in Real-Time Strategy Games

Hei Chan, Alan Fern, Soumya Ray, Nick Wilson and Chris Ventura

School of Electrical Engineering and Computer Science

Oregon State University

Corvallis, OR 97330


Goal :

Develop action-selection mechanism in producing certain amount of resources as fast as possible.

Planner :

Computationally efficient “action-selection” mechanism which at each epoch creates a possibly Sub-Optimal concurrent plan from the current state to the goal, then begin executing the set of initial actions.

How it’s done :

formed via a combination of MEA(means-ends-analysis) and Scheduling and Bounded Search over sub-goals that aren’t required for goal achievements but may improve the make span.

Two Key problem domains :

-Resource production and tactical battles.

In Resource Production : player has to produce ( or gather ) varies raw materials, buildings, civilian and military Units to improve their economic and military power.

Tactical Battles : a player uses military units to gain territory and defeat enemy Units ( offensive or defensive actions ).

“Winning the Resource Production race is often a key factor in overall success”.

Uses :

1- either in computer opponent A.I.

2- Human can use it to specify resources needed and the Planner will get the best way to get this “RTS resource production is interesting from a pure A.I. Planning perspective as it encompasses a number of challenging issues”.

First, resource production involves temporal actions with numeric effects.

Second, performing well in this task requires highly concurrent activity.

Third, real-time constraints of the problem require action selection be computational efficient in apractical sense.

Why? :

Most existing planners are :

1- not handling temporal and numerical domains.

2- simply too inefficient to be useful.

3- produce highly Sub-Optimal plans.

The planning component used by online planner is based on a combination of means-ends analysis (MEA) and scheduling.

Given an initial state and resource goal, is used to compute a sequential plan that reaches the goal using the minimum number of actions and resources in sense that any valid plan must include all of the actions in the MEA plan.

Importantly, the special structure of our domain guarantees that MEA will produce such a plan and do so efficiently (linear time in the plan length).

Given the minimum sequential plan, we then reschedule those actions, allowing for concurrency, in an attempt to minimize the make span. This scheduling step is computationally hard, however, we have found that simple worst-case quadratic time heuristic methods work quite well. Thus both the MEA step and scheduling step are both low-order polynomial operations in the minimum number of actions required to achieve the goal, allowing for real-time efficiency.

Refrences :

MEA( means-ends analysis) :

Posted in AI for Games, Case-Based Planning, Case-Based Reasoning, Papers Summaries, Planning | 3 Comments »

Hierarchical Task Network (HTN)

Posted by Ogail on February 5, 2010

In artificial intelligence, the hierarchical task network, or HTN, is an approach to automated planning in which the dependency among actions can be given in the form of networks.

Planning problems are specified in the hierarchical task network approach by providing a set of tasks, which can be:

  1. primitive tasks, which roughly correspond to the actions of STRIPS;
  2. compound tasks, which can be seen as composed of a set of simpler tasks;
  3. goal tasks, which roughly corresponds to the goals of STRIPS, but are more general.

A primitive task is an action that can be executed. A compound task is a complex task composed of a sequence of actions. A goal task is a task of satisfying a condition. The difference between primitive and other tasks is that the primitive actions can be directly executed. Compound and goal tasks both require a sequence of primitive actions to be performed; however, goal tasks are specified in terms of conditions that have to be made true, while compound tasks can only be specified in terms of other tasks via the task network outlined below.

Constraints among tasks are expressed in form of networks, called task networks. A task network is a set of tasks and constraints among them. Such a network can be used as the precondition for another compound or goal task to be feasible. This way, one can express that a given task is feasible only if a set of other actions (those mentioned in the network) are done, and they are done in such a way that the constraints among them (specified by the network) are satisfied. One particular formalism for representing hierarchical task networks that has been fairly widely used is TAEMS.

A task network can for example specify that a condition is necessary for a primitive action to be executed. When this network is used as the precondition for a compound or goal task, it means that the compound or goal task requires the primitive action to be executed and that the condition must be true for its execution to successfully achieve the compound or goal task.

The best-known domain-independent HTN-planning software is:

HTN is a useful way to provide the planning engine with information about the hierarchical structure of the planning domain. HTN-like planning (as it is practically used) has the same expressivity (i.e. can solve the same domains) as STRIPS. The theoretical model of HTN is strictly more expressive than STRIPS, but cannot be directly used because of its undecidability.

Example of HTN:


Posted in Planning | Tagged: , , | 2 Comments »