Learning-Assisted Automated Planning: Looking Back, Taking Stock, Going Forward
This article reports on an extensive survey and analysis of research work related to machine learning as it applies to automated planning over the past 30 years. Major research contributions are broadly characterized by learning method and then descriptive subcategories. Survey results reveal learning techniques that have been extensively applied and a number that have received scant attention. We extend the survey analysis to suggest promising avenues for future research in learning based on both previous experience and current needs in the planning community.

In this article, we consider the symbiosis of two of the most broadly recognized hallmarks of intelligence: (1) planning—solving problems in which one uses beliefs about actions and their consequences to construct a sequence of actions that achieve one's goals—and (2) learning—using past experience and percepts to improve one's ability to act in the future. Within the AI research community, machine learning is viewed as a potentially powerful means of endowing an agent with greater autonomy and flexibility, often compensating for the designer's incomplete knowledge of the world that the agent will face and incurring low overhead in terms of human oversight and control. If we view a computer program with learning capabilities as an agent, then we can say that learning takes place as a result of the interaction of the agent and the world and observation by the agent of its own decision-making processes. Planning is one such decision-making process that such an agent might undertake, and a corpus of work spanning some 30 years attests that it is an interesting, broad, and fertile field in which learning techniques can be applied to advantage. We focus here on this learning-in-planning research and utilize both tables and graphic maps of existing studies to spotlight the combinations of planning-learning methods that have received the most attention as well as those that have scarcely been explored. We do not attempt to provide, in this limited space, a tutorial of the broad range of planning and learning methodologies, assuming instead that the interested reader has at least passing familiarity with these fields.

A cursory review of the state of the art in learning in planning during the early to mid-1990s reveals that the primary impetus for learning was to make up for often debilitating weaknesses in the planners themselves. The general-purpose planning systems of even a decade ago struggled to solve simple problems in the classical benchmark domains; blocks-world problems of 10 blocks lay beyond their capabilities, as did most logistics problems (Kodratoff and Michalski 1990; Minton 1993). The planners of the period used only weak guidance in traversing their search spaces, so it is not surprising that augmenting the systems to learn some such guidance was often a winning strategy. Relative to the largely naïve base planner, the learning-enhanced systems demonstrated improvements in both the size of problems that could be addressed and the speed with which they could be solved (Kambhampati, Katukam, and Qu 1996; Leckie and Zukerman 1998; Minton et al. 1989; Veloso and Carbonell 1993).
[Figure 1. Five Dimensions Characterizing Automated Planning Systems Augmented with a Learning Component. CSP = constraint-satisfaction programming. EBL = explanation-based learning. SAT = satisfiability.]
Each research effort covered in the survey is characterized along five dimensions: (1) the type of planning problem addressed, (2) the planning approach taken, (3) the goal of the planner's learning component, (4) the planning or execution phase in which learning is conducted, and (5) the type of learning method.

We hope to show that this set of dimensions is useful both in gaining perspective on the work that has been done in learning-augmented planning and in speculating about profitable directions for future research. Admittedly, these are not independent or orthogonal dimensions; they also do not make up an exhaustive list of relevant factors in the design of an effective learning component for a given planner. Among other candidate dimensions that could have been included are type of plan (for example, conditional, conformant, serial, or parallel actions), type of knowledge learned (domain or search control), learning impetus (data driven or knowledge driven), and type of organization (hierarchical or flat). Given the corpus of work to date and the difficulty of visualizing and presenting patterns and relationships in high-dimensional data, we settled on the five dimensions of figure 1 as the most revealing. Before reporting on the literature survey, we briefly discuss each of these dimensions.

Planning Problem Type

The nature of the environment in which the planner must conduct its reasoning defines where a given problem lies in the continuum of classes from classical to full-scope planning. Here, classical planning refers to a world model in which fluents are propositional and do not change unless the planning agent acts to change them, all relevant attributes can be observed at any time, the impact of executing an
action on the environment is known and deterministic, and the effects of taking an action occur instantly. If we relax all these constraints—fluents can take on a continuous range of values (for example, metric quantities), a fluent might change its value spontaneously or for reasons other than agent actions (for example, the world has hidden variables), the exact impact of acting cannot be predicted, and actions have durations—then we are in the class of full-scope planning problems. In between these extremes lies a wide variety of interesting and practical planning problem types, such as classical planning with a partially observable world (for example, playing poker) and classical planning where actions realistically require significant periods of time to execute (for example, logistics domains). The difficulty of even the classical planning problem is such that it largely occupied the full attention of the research community until the past few years. The current extension into various neoclassical, temporal, and metric planning modes has been spurred in part by impressive advances in automated planning technology over the past six years or so.

Planning Approach

Planning as a subfield of AI has roots in Newell and Simon's 1960-era problem-solving system, GPS, and in theorem proving. At a high level, planning can be viewed as either problem solving or theorem proving. Planning methods can further be seen as either search processes or model checking. Among planners most commonly characterized by search mode, there are two broad categories: (1) search in state space and (2) search in a space of plans. It is possible to further partition current state-space planners into those that maintain a conjunctive state representation and those that search in a disjunctive representation of possible states.

Planners most generally characterized as model checkers (although they also conduct search) involve recompiling the planning problem into a representation that can be tackled by a particular problem solution engine. These systems can be partitioned into three categories: (1) satisfiability (SAT), (2) constraint-satisfaction problems (CSPs), and (3) integer linear programming (IP). Figure 1 lists these three different methods along with representative planning systems for each. These categories are not entirely disjoint for purposes of classifying planners because some systems use a hybrid approach or can be viewed as examples of more than one method. GRAPHPLAN (Blum and Furst 1997), for example, can be seen as either a dynamic CSP or as a conductor for disjunctive state-space search (Kambhampati 2000). BLACKBOX (Kautz and Selman 1999) uses GRAPHPLAN's disjunctive representation of states and iteratively converts the search into a SAT problem.
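To give a flavor of such compilation (an illustrative fragment in the style of SATPLAN-like encodings; this is our example, not the actual clause set of any system named here): a single action a with precondition p and effect q, unrolled over time steps 0 and 1, contributes clauses such as

```latex
% Precondition axiom: if a executes at step 0, its precondition holds at step 0.
a_0 \rightarrow p_0
% Effect axiom: executing a at step 0 makes q true at step 1.
a_0 \rightarrow q_1
% Explanatory frame axiom: q is true at step 1 only if a ran or q persisted.
q_1 \rightarrow (a_0 \lor q_0)
```

A satisfying assignment over the action variables is then read off as a plan.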
Goal of Planner's Learning Component

There is a wide variety of targets that the learning component of a planning system might aim toward, such as learning search control rules, learning to avoid dead-end or unpromising states, or improving an incomplete domain theory. As indicated in figure 1, these goals can be categorized broadly into one of three groups: (1) learning to speed up planning, (2) learning to elicit or improve the planning domain theory, or (3) learning to improve the quality of the plans produced (where quality can have a wide range of definitions).

Learning and Improving Domain Theory. Automated planning implies the presence of a domain theory—the descriptions of the actions available to the planner. When an exact model of how an agent's actions affect its world is unavailable (a nonclassical planning problem), there are obvious advantages to a planner that can evolve its domain theory by learning. Few interesting environments are simple and certain enough to admit a complete model of their physics, so it's likely that even "the best laid plans" based on a static domain theory will occasionally (that is, too often) go astray. Each such instance, appropriately fed back to the planner, provides a learning opportunity for evolving the domain theory toward a version more consistent with the actual environment in which its plans must succeed.

Even in classical planning, the designer of a problem domain generally has many valid alternative ways of specifying the actions, and it is well known that the exact form of the action descriptions can have a large impact on the efficiency of a given planner on a given problem. Even if the human designer can identify some of the complex manner in which the actions in a domain description will interact, he or she will likely be faced with trade-offs between efficiency and factors such as compactness, comprehensibility, and expressiveness.

Planning Speedup. In all but the most trivial of problems, a planner will have to conduct considerable search to construct a solution, in the course of which it will be forced to backtrack numerous times. The primary goals of speedup learning are to avoid unpromising portions of the search space and to bias the search in directions most likely to lead to high-quality plans.
Improving Plan Quality. This category ranges from learning to bias the planner toward plans with a specified attribute or metric value to learning a user's preferences in plans and variations of mixed-initiative planning.

Planning Phase in Which Learning Is Conducted

At least three opportunities for learning present themselves over the course of a planning and execution cycle: (1) before planning starts, (2) during the process of finding a valid plan, and (3) during the execution of a plan.

Learning before Planning Starts. Before the solution search even begins, the specification of the planning problem itself presents learning opportunities. This phase is closely connected to the aspect of learning and improving the domain theory but encompasses only preprocessing of a given domain theory. It is done offline and produces a modified domain that is useful for all future domain problems.

Learning during the Process of Finding a Valid Plan. Planners capable of learning in this mode have been augmented with some means of observing their own decision-making process. They then take advantage of their experience during planning to expedite further planning or to improve the quality of the plans generated. The learning process itself can be either online or offline.

Learning during the Execution of a Plan. A planner has yet another opportunity to improve its performance when it is an embedded component of a system that can execute a plan and provide sensory feedback. A system that seeks to improve an incomplete domain theory would conduct learning in this phase, as might a planner seeking to improve plan quality based on actual execution experience. The learning process itself can be either online or offline.

Type of Learning

The machine learning techniques themselves can be classified in a variety of ways, irrespective of the learning goal or the planning phase in which they might be used. Two of the broadest traditional class distinctions are between so-called inductive (or empirical) methods and deductive (or analytic) methods. In figure 1, we broadly partition the machine learning–techniques dimension into these two categories along with a multistrategy approach. We then consider additional properties that can be used to characterize a given method. The inductive-deductive classification is drawn based on the following formulations of the learning problem:

Inductive learning: The learner is confronted with a hypothesis space H and a set of training examples D. The desired output is a hypothesis h from H that is consistent with these training examples.

Analytic learning: The learner is confronted with the same hypothesis space and training examples as for inductive learning. However, the learner has an additional input: a domain theory B composed of background knowledge that can be used to help explain observed training examples. The desired output is a hypothesis h from H that is consistent with both the training examples D and the domain theory B.
theory B.
learning opportunities. This phase is closely
Understanding the advantages and disad-
connected to the aspect of learning and im-
vantages of applying a given machine learning
proving the domain theory but encompasses
technique to a given planning system can help
only preprocessing of a given domain theory. It
to make sense of any research bias that be-
is done offline and produces a modified do-
comes apparent in the survey tables. The pri-
main that is useful for all future domain prob-
mary types of analytic learning systems devel-
lems.
oped to date, along with their relative
Learning during the Process of Finding strengths and weaknesses and an indication of
a Valid Plan Planners capable of learning their inductive biases, are listed in table 1. The
in this mode have been augmented with some major types of pure inductive learning systems
means of observing their own decision-making are similarly described in table 2. Admittedly,
process. They then take advantage of their ex- the various subcategories within these tables
perience during planning to expedite the fur- are not disjoint, and they don’t nicely partition
ther planning or improve the quality of plans the entire class (inductive or analytic).
generated. The learning process itself can ei- The research literature itself conflicts at
ther be online or offline. times about what constitutes learning in a giv-
Learning during the Execution of a Plan en implementation, so tables 1 and 2 reflect
A planner has yet another opportunity to im- the decisions made in this regard for this study.
prove its performance when it is an embedded The classification scheme we propose for
component of a system that can execute a plan learning-augmented planning systems is per-
and provide sensory feedback. A system that haps most inadequate when it comes to rein-
seeks to improve an incomplete domain theory forcement learning. We discuss this special case,
would conduct learning in this phase, as might in which planning and learning are inextricably
a planner seeking to improve plan quality intertwined, in the sidebar “Reinforcement
based on actual execution experience. The Learning: The Special Case.”
learning process itself can either be online or Analogical learning is only represented in
offline. table 1 by a specialized and constrained form
known as derivational analogy and the closely
Type of Learning related case-based reasoning formulism. More
The machine learning techniques themselves flexible and powerful forms of analogy can be
can be classified in a variety of ways, irrespec- envisioned (compare Hofstadter and Marshall
tive of the learning goal or the planning phase [1996, 1993]), but the lack of active research in
they might be used in. Two of the broadest tra- this area within the machine learning commu-
ditional class distinctions that can be drawn nity effectively eliminates more general analo-
are between so-called inductive (or empirical) gy as a useful category in our learning-in-plan-
methods and deductive (or analytic) methods. ning survey.
In figure 1, we broadly partition the machine The three columns for each technique given
learning–techniques dimension into these two in tables 1 and 2 give a sense of the degree to
categories along with a multistrategy ap- which the method can be effective when ap-
proach. We then consider additional properties plied to a given learning problem, in our case,
that can be used to characterize a given meth- automated planning. Two columns summarize
od. The inductive-deductive classification is the relative strengths and weaknesses of each
SUMMER 2003 77
Articles
Table 1. Primary Types of Analytic Learning Systems: Models, Strengths, and Weaknesses.

Nogood Learning (Memoization, Caching)
  Models: Inconsistent states and sets of fluents.
  Strengths: Simple, fast learning; generally low computational overhead; practical and widely used.
  Weaknesses: Low-strength learning—each nogood typically prunes only a small section of the search space; difficult to generalize across problems; memory requirements can be high.

Explanation-Based Learning (EBL)
  Models: Search control rules; domain refinement.
  Strengths: Uses a domain theory—the available background knowledge; can learn from a single training example; if-then rules are generally intuitive (readable); widely used.
  Weaknesses: Requires a domain theory—an incorrect domain theory can lead to incorrect deductions; rule utility problem.

Static Analysis and Abstractions
  Models: Existing problem and domain invariants or structure.
  Strengths: Performed offline; benefits are generally available for all subsequent problems in the domain.
  Weaknesses: Benefits vary greatly depending on domain and problem.

Derivational Analogy / Case-Based Reasoning (CBR)
  Models: Similarity between the current state and previously cataloged states.
  Strengths: Holds potential for shortcutting much planning effort where similar problem states arise frequently; extendable to full analogy?
  Weaknesses: Large space required as the case library builds; case-matching overhead; revising an old plan can be costly.
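To make the EBL row concrete, the following minimal sketch (our illustration; the operator representation and the rule itself are hypothetical, not drawn from any surveyed system) shows the kind of readable if-then search-control rule such learners can derive from a single explained search episode:

```python
from collections import namedtuple

# Hypothetical grounded-operator representation for a blocks-world planner.
Operator = namedtuple("Operator", ["name", "args"])

def select_unstack(state_literals, current_goal, candidate):
    """A learned select rule in the spirit of PRODIGY-style control rules:
    if the current goal is (holding x) and x sits on some block y, prefer
    the candidate operator UNSTACK(x, y)."""
    if candidate.name != "UNSTACK":
        return False
    x, y = candidate.args
    return current_goal == ("holding", x) and ("on", x, y) in state_literals

# Example use when ranking applicable operators at a search node:
state = {("on", "A", "B"), ("clear", "A")}
print(select_unstack(state, ("holding", "A"), Operator("UNSTACK", ("A", "B"))))  # True
```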
Two columns summarize the relative strengths and weaknesses of each technique. The column headed Models refers to the type of function or structure that the method was designed to represent or process. A method chosen to learn a particular function is not well suited if it is either incapable of expressing the function or is inherently much more expressive than required. This choice of representation involves a crucial trade-off. A very expressive representation that allows the target function to be represented as closely as possible will also require more training data to choose among the alternative hypotheses it can represent.

The heart of the learning problem is how to successfully generalize from examples. Analytic learning leans on the learner's background knowledge to analyze a given training instance to discern the relevant features. In many domains, such as the stock market, complete and correct background knowledge is not available. In these cases, inductive techniques that can discern regularities over many examples in the absence of a domain model can prove useful. One possible motivation for adopting a multistrategy approach is that analytic learning methods generate logically justified hypotheses, whereas inductive methods generate statistically justified hypotheses. The logical justifications fall short when the prior knowledge is flawed, and the statistical justifications are suspect when data are scarce or assumptions about distributions are questionable.

We next consider the learning-in-planning work that has been done in light of the characterization structure given in figure 1.

What Role Has Learning Played in Planning?

We report here the results of an extensive survey of AI research literature focused on applications of machine learning techniques to planning. Research in the area of machine learning goes back at least as far as 1959, with Arthur Samuel's (1959) checkers-playing program that improved its performance through learning. It is noteworthy that perhaps the first work in what was to become the AI field of planning (STRIPS [Fikes and Nilsson 1971]) was quickly followed by a learning-augmented version that could improve its performance by analyzing its search experience (Fikes, Hart, and Nilsson 1972).
Table 2. Major Types of Pure Inductive Learning Systems: Models, Strengths, and Weaknesses.

Artificial Neural Networks
  Models: Discrete-, real-, and vector-valued functions.
  Strengths: Robust to noisy and complex data and to errors in the data.
  Weaknesses: Long training times are common; the learned target function is largely inscrutable.

Inductive Logic Programming (ILP)
  Models: First-order logic; theories as logic programs.
  Strengths: Robust to noisy data and missing values; more expressive than propositional-based learners; able to generate new predicates; if-then rules (Horn clauses) are easily understandable.
  Weaknesses: A large training sample might be needed to acquire an effective set of predicates; rule utility problem.

Bayesian Learning
  Models: Probabilistic inference; hypotheses that make probabilistic predictions.
  Strengths: Readily combines prior knowledge with observed data; modifies hypothesis probability incrementally based on each training example.
  Weaknesses: Requires large initial probability sets; high computational cost to obtain the Bayes-optimal hypothesis.

Reinforcement Learning
  Models: A control policy to maximize rewards; fits the MDP setting.
  Strengths: No domain theory required; handles actions with nondeterministic outcomes; converges toward an optimal policy from nonoptimal training sets; facilitates life-long learning.
  Weaknesses: Depends on a real-valued reward signal for each transition; difficulty handling large state spaces; convergence can be slow, and space requirements can be huge.
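As a concrete instance of the reinforcement-learning row above (learning a reward-maximizing control policy from nothing more than a per-transition reward signal), consider this minimal tabular Q-learning sketch. It is our illustration only; the env.reset/actions/step interface is an assumed stand-in, not the API of any surveyed system:

```python
import random
from collections import defaultdict

def q_learning(env, episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1):
    """Tabular Q-learning: learns Q[(state, action)] from per-step rewards.
    Assumes env.reset() -> state, env.actions(state) -> list of actions,
    and env.step(state, action) -> (next_state, reward, done)."""
    Q = defaultdict(float)
    for _ in range(episodes):
        s, done = env.reset(), False
        while not done:
            acts = env.actions(s)
            if random.random() < epsilon:                   # explore
                a = random.choice(acts)
            else:                                           # exploit current estimates
                a = max(acts, key=lambda act: Q[(s, act)])
            s2, r, done = env.step(s, a)                    # real-valued reward each step
            best_next = max((Q[(s2, a2)] for a2 in env.actions(s2)), default=0.0)
            Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
            s = s2
    return Q
```

The table Q itself illustrates the row's weaknesses: it grows with the state space, and many episodes may be needed before the induced policy stabilizes.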
Space considerations preclude an all-inclusive survey for this 30-year span, but we wanted to list either seminal studies in each category or a typical representative study if the category has many.

It is difficult to present the survey results in a two-dimensional (2D) format such that the five dimensions represented in figure 1 are usefully reflected. We used three different formats, emphasizing different combinations and orderings of the figure 1 dimensions:

First is a set of three tables organized around just two dimensions: (1) type of learning and (2) type of planning.

Second is a set of tables reflecting all five dimensions for each relevant study in the survey.

Third is a graphic representation providing a visual mapping of the studies' demographics along the five dimensions.

We discuss each of these representations in the following subsections.

Survey Tables according to Learning Type and Planning Type

Table 3A deals with studies focused primarily on analytic (deductive) learning in its various forms, and table 3B is concerned with inductive learning. Table 3C addresses studies and multistrategy systems that aim at some combination of analytic and inductive techniques. All studies and publications appearing in these tables are listed in full in the reference section.
Table 3A. Analytic (Explanation-Based) Learning: General Applications and Planning Applications.

Explanation-Based Learning (EBL)
  General applications: General problem solving (chunking): Laird et al. (1987) SOAR. Horn clause rules: Kedar-Cabelli (1987) Prolog-EBG. Symbolic integration: Mitchell et al. (1986) LEX-2 (see also multistrategy).
  State space (conjunctive/disjunctive): Fikes and Nilsson (1972) STRIPS; Minton et al. (1989) PRODIGY; Gratch and DeJong (1992) COMPOSER [PRODIGY]; Bhatnagar and Mostow (1994) FAILSAFE; Borrajo and Veloso (1997) HAMLET (see also multistrategy).
  Plan space: Chien (1989); Kambhampati, Katukam, and Qu (1996) UCPOP-EBL.
  Compilation (CSP/SAT/IP): Wolfman and Weld (1999) LPSAT [RELSAT]; nogood learning: Kautz and Selman (1999) BLACKBOX (using RELSAT) and Do and Kambhampati (2001) GP-CSP [GRAPHPLAN]; Kambhampati (2000) GRAPHPLAN-EBL.
Table 3B. Inductive Learning: General Applications and Planning Applications.

Reflex/Reactive
  General applications: Pomerleau (1993) ALVINN.

First-Order Logic, Hornlike Clauses: Inductive Logic Programming (ILP)
  General applications: Quinlan (1990) FOIL; Muggleton and Feng (1990) GOLEM; Lavrac, Dzeroski, and Grobelnik (1991) LINUS.
  State space (conjunctive/disjunctive): Leckie and Zukerman (1998) GRASSHOPPER [PRODIGY]; Zelle and Mooney (1993) (see also multistrategy); Reddy and Tadepalli (1999) ExEL.
  Plan space: Estlin and Mooney (1996) (see also multistrategy).
  Compilation (CSP/SAT/IP): Huang, Selman, and Kautz (2000) (see also multistrategy).

Text Classification
  General applications: Lang (1995) NEWSWEEDER.

Plan Rewriting
  Ambite, Knoblock, and Minton (2000) PBR.
Table 3C. Multistrategy Learning: General Applications and Planning Applications.

Explanation-Based Learning and Inductive Logic Programming
  General applications: Search control for logic programs: Cohen (1990) AxA-EBL; Zelle and Mooney (1993) DOLPHIN [FOIL/PRODIGY].
  State space (conjunctive/disjunctive): Zelle and Mooney (1993) DOLPHIN [PRODIGY/FOIL].
  Plan space: Estlin and Mooney (1996) SCOPE [FOIL].
  Compilation (CSP/SAT/IP): EBL, ILP, and some static analysis: Huang, Selman, and Kautz (2000) [BLACKBOX-FOIL].

Explanation-Based Learning and Reinforcement Learning
  State space: Dietterich and Flann (1997) EBRL policies.
The table rows feature the major learning types outlined in tables 1 and 2, occasionally further subdivided as indicated in the leftmost column. The second column contains a listing of some of the more important nonplanning studies and implementations of the learning technique in the first column. These General Applications were deemed particularly relevant to planning, and of course, the list is highly abridged.
Comparing the General Applications column with the Planning columns for each table provides a sense of which machine learning methods have been applied within the planning community. The three columns making up the Planning Applications partition subdivide the applications into state space; plan space; and CSP, SAT, and IP planning. Studies dealing with planning problems beyond classical planning (as defined in Planning Problem Type earlier) appear in shaded blocks in these tables.

Table 3C, covering multistrategy learning, reflects the fact that the particular combination of techniques used in some studies could not always be easily subcategorized relative to the analytic and inductive approaches of tables 3A and 3B. This is often the case, for example, with an inductive learning implementation that exploits the design of a particular planning system. Examples include HAMLET (Borrajo and Veloso 1997), which exploits the search tree produced by the PRODIGY 4.0 planning system to lazily learn search control heuristics, and EGBG and PEGG (Zimmerman and Kambhampati 2002, 1999), which exploit GRAPHPLAN's use of the planning graph structure to learn to shortcut the iterative search episodes. Studies such as these appear in table 3C under the broader category, analytic and inductive.

In addition to classifying the studies surveyed along the learning-type and planning-type dimensions, these tables illustrate several foci of this corpus of work. For example, the preponderance of research in analytic learning as it applies to planning rather than inductive learning styles is apparent, as is the heavy weighting in the area of state-space planning. We return to such issues when discussing implications for future research in the final section.

Survey Tables Based on All Five Dimensions

The same studies appearing in tables 3A, 3B, and 3C are tabulated in tables 4A and 4B according to all five dimensions in figure 1. We have used a block structure within the tables to emphasize shared attribute values wherever possible, given the left-to-right ordering of the dimensions. Here, the two dimensions not represented in the previous set of tables, "Planning-Learning Goal" and "Learning Phase," are ordered first, so this block structure reveals the most about the distribution of work across attributes in these dimensions. It is apparent that the major focus of learning-in-planning work has been on speedup, with much less attention given to the aspects of learning to improve plan quality or building and improving the domain theory. Also obvious is the extent to which research has focused on learning prior to or during planning, with scant attention paid to learning during plan execution.
Table 4A. Survey Studies Mapped across All Five Dimensions, Part 1.

Planning-learning goal: Speedup.

Before planning starts — Analytic: static analysis
  Plan space: Smith and Peot (1993) [SNLP]; Gerevini and Schubert (1996) [UCPOP].
  State space: Etzioni (1993) STATIC [PRODIGY]; Dawson and Siklossy (1977) REFLECT; Nebel, Koehler, and Dimopoulos (1997) RIFO; Fox and Long (1998, 1999) STAN/TIM [GRAPHPLAN]; Rintanen (2000).

Before planning starts — Static analysis: learn abstractions
  State space: Sacerdoti (1974) ABSTRIPS; Knoblock (1990) ALPINE [PRODIGY].

Before and during planning — Static analysis and EBL
  State space: Perez and Etzioni (1992) DYNAMIC [PRODIGY].

During planning — Analytic: EBL
  State space: Fikes and Nilsson (1972) STRIPS; Minton (1989) PRODIGY/EBL; Gratch and DeJong (1992) COMPOSER [PRODIGY]; Bhatnagar (1994) FAILSAFE; Kambhampati (2000) GRAPHPLAN-EBL.
  Plan space: Chien (1989); Kambhampati, Katukam, and Qu (1996) UCPOP-EBL.

During planning — Analytic: EBL (compilation)
  SAT: Nogood learning: Kautz and Selman (1999) BLACKBOX.
  LP and SAT: Wolfman and Weld (1999) LPSAT [RELSAT].
  CSP: Nogood learning: Do and Kambhampati (2001) GP-CSP [GRAPHPLAN].

During planning — Analytic learning: analogical / case-based reasoning
  State space: Various abstraction-level cases: Bergmann and Wilke (1996) PARIS. User-assist planning: Avesani, Perini, and Ricci (2000) CHARADE. Transformational analogy/adaptation: Hammond (1989) CHEF; Kambhampati and Hendler (1992) PRIAR; Hanks and Weld (1995) SPA; Leake, Kinley, and Wilson (1996) DIAL. Derivational analogy/adaptation: Veloso and Carbonell (1993) PRODIGY/ANALOGY. With EBL: Ihrig and Kambhampati (1997) [UCPOP].

CSP = constraint-satisfaction programming. EBL = explanation-based learning. LP = linear programming. SAT = satisfiability. Studies in heavily shaded blocks feature planners applied to problems beyond classical planning. Implemented system and program names appear in all caps, and underlying planners and learning subsystems appear in small caps but enclosed in brackets.
Table 4B. Survey Studies Mapped across All Five Dimensions, Part 2.

Planning-learning goal: Learn or improve domain theory and improve plan quality. Learning phase: During planning.

Inductive: inductive logic programming
  State space: Leckie and Zukerman (1998) GRASSHOPPER [PRODIGY].

EBL and ILP
  State space: Zelle and Mooney (1993) DOLPHIN [PRODIGY/FOIL].
  Plan space: Estlin and Mooney (1996) SCOPE [FOIL].

EBL and RL
  State space: Dietterich and Flann (1997) EBRL.

Inductive: reinforcement learning
  Incremental dynamic programming: Sutton (1991) DYNA. Planning with learned operators: García-Martínez and Borrajo (2000) LOPE.

EBL = explanation-based learning. ILP = inductive logic programming. RL = reinforcement learning. SAT = satisfiability. Studies in heavily shaded blocks feature planners applied to problems beyond classical planning. Implemented system and program names appear in small caps, and underlying planners and learning subsystems appear in small caps but enclosed in brackets.
Graphic Analysis of Survey

There are obvious limitations to what can readily be gleaned from any tabular presentation of a data set across more than two or three dimensions. To more easily visualize patterns and relationships in learning-in-planning work, we have devised a graphic method of depicting the corpus of work in this survey with respect to the five dimensions given in figure 1. Figure 2 illustrates this method of depiction by mapping two studies from the survey onto a version of figure 1. In this manner, every study or project covered in the survey has been mapped onto at least one 5-node, directed subgraph of figure 3 (classical planning systems) or figure 4 (systems designed to handle problems beyond the classical paradigm). The edges express which combinations of the figure 1 dimensional attributes were actually realized in a system covered by the survey.

Besides providing a visual characterization of the corpus of research in learning in planning, this graphic presentation mode permits quick identification of all planner-learning system configurations that embody any of the aspects of the five dimensions (nodes). For example, because the survey tables don't show all possible values in each dimension's range, aspects of learning in planning that have received scant attention are not obvious until one glances at the graphs, which entails simply observing the edges incident on any given node. Admittedly, a disadvantage of this presentation mode is that the specific planning system associated with a given subgraph cannot be extracted from the figure alone. However, the tables can assist in this regard.

Learning within the Classical Planning Framework. Figure 3 indicates with dashed lines and fading those aspects (nodes) of the five dimensions of learning in planning that are not relevant to classical planning. Specifically, Learning or Improving the Domain Theory is inconsistent with the classical planning assumption of a complete and correct domain theory. Similarly, the strength of reinforcement learning lies in its ability to handle stochastic environments in which the domain theory is either unknown or incomplete. (Dynamic programming, a close cousin to reinforcement learning methods, requires a complete and perfect domain theory, but because of efficiency considerations, it has remained primarily of theoretical interest with respect to classical planning.)

[Figure 3. Survey studies mapped onto the five planning-learning dimensions: classical planning systems.]
Broadly, the figure indicates that some form of learning has been implemented with all planning approaches. If we consider the Learning Phase dimension of figure 3, it is obvious that the vast majority of the work to date has focused on learning conducted during the planning process. Work in automatic extraction of domain-specific knowledge through analysis of the domain theory (Fox and Long 1999, 1998; Gerevini and Schubert 1998) constitutes the learning conducted before planning. Not surprisingly, learning in the third phase, during plan execution, is not a focus for classical planning scenarios because this mode has clear affinity with improving a faulty domain theory—a nonclassical problem.

It is apparent, based on the figure 3 graph in combination with the survey tables, that explanation-based learning (EBL) has been extensively studied and applied to every planning approach and both relevant planning-learning goals. This is perhaps not surprising given that planning presumes the sort of domain theory that EBL can readily exploit.
Perhaps more notable is the scant attention paid to inductive learning techniques for classical planners. Although ILP has been extensively applied as a learning tool for planners, other inductive techniques, such as decision tree learning, neural networks, and Bayesian learning, have seen few planning applications.

Learning within a Nonclassical Planning Framework. Figure 4 covers planning systems designed to learn in the wide range of problem classes beyond the classical formulation (shown in shaded blocks in tables 3A, 3B, and 3C and 4A and 4B). There are, as yet, far fewer such learning-augmented systems, although this area of planning community interest is growing. Those "beyond classical planning" systems that exist extend the classical planning problem in a variety of different ways, but because of space considerations, we have not reflected these variations with separate versions of figure 4 for each combination. Learning in a dynamic, stochastic world is the natural domain of reinforcement learning systems, and as discussed earlier, this popular machine learning field does not so readily fit our five-dimensional learning-in-planning perspective. Figure 4 therefore represents reinforcement learning in a different manner than the other approaches; a single shaded, brick-crosshatch set of edges is used to span the five dimensions. The great majority of reinforcement learning systems to date adopt a state-space perspective, so there is an edge skirting this node. With respect to the planning-learning goal dimension, reinforcement learning can be viewed as both "improving plan quality" (the process moves toward the optimal policy) and "learning the domain theory" (it begins without a model of transition probability between states). This view is reflected in figure 4 as the vertical rein-
[Figure 4. Survey studies mapped onto the five planning-learning dimensions: systems addressing problems beyond classical planning (dynamic, stochastic worlds); reinforcement learning is shown as a single edge set spanning all five dimensions.]
…bining reinforcement learning with SAT, which does not capture the concept of a state). In assessing the survey tables here, however, we seek learning-in-planning configurations that are feasible, have been largely ignored, and appear to hold promise.

Nonanalytic Learning Techniques. The survey tables suggest a considerable bias toward analytic learning in planning, which deserves to be questioned. Why is analytic learning so favored? In a sense, a planner using EBL is learning guaranteed knowledge, control information that is provably correct. However, it is well known within the machine learning community that approximately correct knowledge can be at least as useful, particularly if we're careful not to sacrifice completeness. Given the presence of a high-level domain theory, it is reasonable to exploit it to learn. However, large constraints are placed on just what can be learned if the planner doesn't also take advantage of the full planning search experience. The tables and figures of this study indicate the extent to which ILP has been used in this spirit together with EBL. This is a logical marriage of two mature methodologies; ILP in particular has powerful engines for inducing logical expressions, such as FOIL (Quinlan 1990), that can readily be employed. It is curious to note, however, that decision tree learning has been used in only one study in this entire survey, yet this inductive technique is at least as mature and features its own very effective engines, such as ID3 and C4.5 (Quinlan 1993, 1986). In the 1980s, decision tree algorithms were generally not considered expressive enough to capture complex target concepts (such as under what conditions to apply an operator). However, given subsequent evolutions in both decision tree methods and the current opportunities for learning to assist the latest generation of planners, the potential of decision tree learning in planning merits reconsideration.

Learning across Problems. A learning aspect that has largely fallen out of favor in recent years is the compilation and retention of search guidance that can be used across different problems and perhaps even different domains. One of the earliest implementations of this took the form of learning search control rules (for example, using EBL). There might be two culprits that led to disenchantment with learning this interproblem search control: First is the utility problem that can surface when too many, or relatively ineffective, rules are learned. Second is the propositionalization of the planning problem, wherein lifted representations of the domain theory were forsaken for the faster processing of grounded versions involving only propositions. The cost of rule checking and matching in more recent systems that use grounded operators is much lower than for planning systems that handle uninstantiated variables.

Not conceding these hurdles to be insurmountable, we suggest the following research approaches: One trade-off associated with a move to planning with grounded operators is the loss of generality in the basic precepts that are most readily learned. For example, GRAPHPLAN can learn a great number of "nogoods" during search on a given problem, but in their basic form, they are only relevant to the given problem. GRAPHPLAN retains no interproblem memory. It is worth considering what might constitute effective interproblem learning for such a system.
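To ground the distinction, here is a minimal sketch of per-problem nogood learning in a GRAPHPLAN-flavored backtracking search (our illustration; expand is an assumed stand-in for regressing a subgoal set one level, yielding nothing at level 0). The memo cache is keyed to a single problem's search and is simply discarded afterward, which is precisely the missing interproblem memory noted above:

```python
def solve(goals, level, expand, nogoods):
    """Backward search for a plan achieving all subgoals at the given level.
    nogoods is a set shared across the whole search for one problem."""
    key = (frozenset(goals), level)
    if key in nogoods:                  # this subgoal set already failed here
        return None
    if not goals:
        return []                       # nothing left to achieve
    for actions, regressed in expand(goals, level):
        plan = solve(regressed, level - 1, expand, nogoods)
        if plan is not None:
            return plan + [actions]
    nogoods.add(key)                    # memoize the failure: a learned nogood
    return None

# Usage for one problem: solve(goal_set, top_level, expand, nogoods=set())
```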
The rule utility issue faced by analytic learning systems (and possibly all systems that learn search control rules) can be viewed as the problem of incurring the cost of a large set of sound, exact, and probably overspecific rules. Learning systems that can reasonably relax the soundness criterion for learned rules can move broadly toward a problem goal using generally correct search control. Some of the multistrategy studies reflected in table 3C are relevant to this view to the extent that they attempt to leverage the strengths of both analytic and inductive learning techniques to acquire more useful rules. Initial work with an approach that does not directly depend on a large set of training examples was reported in Kambhampati (1999). Here, a system is described that seeks to learn approximately correct rules by relaxing the constraint of the UCPOP-EBL system that requires regressed failure explanations from all branches of a search subtree before a search control rule is constructed.

Perhaps the most ambitious approach to learning across problems would be to extend some of the work being done in analogical reasoning elsewhere in AI to the planning field. The goal is to exploit any similarity between problems to speed up solution finding. Current case-based reasoning implementations in planning are capable of recognizing a narrow range of similarities between an archived partial plan and the current state the planner is working from. Such systems cannot apply knowledge learned in one logistics domain, for example, to another system—even though a human would find it natural to use what he or she has learned in solving an AIPS planning competition driver-log problem to a depot problem. We note that transproblem learning has been ap-
…sical planning—the learning of domain invariants before planning starts. This static analysis has been shown to be an effective speedup approach for many classical planning domains, and there is no reason to believe it cannot similarly boost nonclassical planning.

On another front, there has been much enthusiasm in parts of the planning community for applying domain-specific knowledge to speed up a given planner (for example, TLPLAN [Bacchus and Kabanza 2000] and BLACKBOX [Kautz and Selman 1998]). This advantage has also been realized in hierarchical task network (HTN) planning systems by supplying domain-specific task-reduction schemas to the planner (SHOP [Nau et al. 1999]). Such leveraging of user-supplied domain knowledge has been shown to greatly decrease planning time for a variety of domains and problems. One drawback of this approach is the burden it places on the user to correctly hand code the domain knowledge ahead of time and in a form usable by the particular planner. Offline learning techniques might be exploited here. If the user provides very high-level domain knowledge in a format readily understandable by humans, the system could learn in supervised fashion to operationalize this background knowledge into the particular formal representation usable by a given target planning system. If the user is not to be burdened with learning the planner's low-level language for knowledge representation, this approach might entail solving sample problems iteratively with combinations of these domain rules to determine both correctness and efficacy.

An interesting related issue is the question of which types of knowledge are easiest and hardest to learn, which has a direct impact on the types of knowledge that might actually be worth learning. The closely related machine learning aspect of sample complexity addresses the number and type of examples that are needed to induce a given concept or target function. To date, the relative difficulty of learning tasks has received little attention with respect to the domain-specific knowledge used by some planners. What are the differences in terms of the sample complexity of learning different types of domain-specific control knowledge? For example, it would be worth categorizing the TLPLAN control rules versus the SHOP/HTN-style schemas in terms of their sample complexity.

Learning to Improve Heuristics. The credit for both the revival of plan-space planning and the impressive performance of most state-space planners in recent years goes largely to the development of heuristics that guide the planner at key decision points in its search. As such, considerable research effort is focusing on finding more effective domain-independent heuristics and on tuning heuristics to particular problems and domains. The role that learning might play in acquiring or refining such heuristics has largely been unexplored. In particular, learning such heuristics inductively during the planning process would seem to hold promise. Generally, the heuristic values are calculated by a linear combination of weighted terms, where the designer chooses both the terms and their weights in hopes of obtaining an equation that will be robust across a variety of problems and domains. The search trace (states visited) resulting from a problem-solving episode could provide the negative and positive examples needed to train a neural network or learn a decision tree. Possible target functions for inductively learning or improving heuristics include term weights that are most likely to lead to higher-quality solutions for a given domain, term weights that will be most robust across many domains, attributes that are most useful for classifying states, exceptions to an existing heuristic such as that used in LRTA* (Korf 1990), and a metalevel function that selects or modifies a search heuristic based on the problem or domain. Multistrategy learning might also play a role in that the user might provide background knowledge in the form of the base heuristic.
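As one concrete possibility (our sketch, not a method from the survey): with a linear heuristic h(s) = w1·f1(s) + … + wn·fn(s), the states visited in a search trace, labeled by whether they lay on the eventual solution path, can drive a simple error-correcting update of the term weights. A hypothetical perceptron-style version:

```python
def update_weights(w, features, trace, lr=0.01):
    """Refine the weights of a linear heuristic from one problem-solving episode.
    trace: list of (state, on_solution_path) pairs harvested from the search.
    features: list of functions, each mapping a state to a numeric term value."""
    for state, on_path in trace:
        f = [feat(state) for feat in features]
        h = sum(wi * fi for wi, fi in zip(w, f))      # current heuristic value
        target = 0.0 if on_path else 1.0              # prefer low h on the path
        err = target - h
        w = [wi + lr * err * fi for wi, fi in zip(w, f)]
    return w
```

The same labeled states could equally train a decision tree or a neural network, as the text suggests; the weight update shown is only one of the possible target functions listed above.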
The ever-growing cadre of planning approaches and learning tools, each with its own strengths and weaknesses, suggests another inviting direction for speedup learning. Learning a rule set or heuristic that will direct the application of the most effective approach (or multiple approaches) for a given problem could lead to a metaplanning system with capabilities well beyond any individual planner. Interesting steps in this direction have been taken by Horvitz et al. (2001) using the construction and use of Bayesian models to predict the run time of various problem solvers.

Learning to Improve Plan Quality. The survey tables and figures suggest that the issue of improving plan quality using learning has received much less attention in the planning community than speedup learning. However, as planning systems are ported into real-world applications, this concern is likely to be a primary one. Many planning systems that successfully advance into the marketplace will need to interact frequently with human users in ways that have received scant attention in the lab. Such users are likely to have individual biases with respect to plan quality that they can be hard pressed to quantify. These plan-
cial Intelligence 72(1): 81–138. ings of the Fifth International Joint Conference on
Ashley, K. D., and McLaren, B. 1995. Reasoning with Artificial Intelligence, 465–471. Menlo Park, Calif.:
Reasons in Case-Based Comparisons. In Proceedings International Joint Conferences on Artificial Intelli-
of the First International Conference on Cased-Based gence.
Reasoning (ICCBR-95), 133–144. Berlin: Springer. Dearden, R.; Friedman, N.; and Russell, S. 1998.
Bacchus, F., and Kabanza, F. 2000. Using Temporal Bayesian Q-Learning. In Proceedings of the Fifteenth
Logics to Express Search Control Knowledge for Plan- National Conference on Artificial Intelligence (AAAI-
ning. Artificial Intelligence 116(1–2): 123–191. 98), 761–768. Menlo Park, Calif.: American Asso-
ciation for Artificial Intelligence.
Bennett, S. W., and DeJong, G. F. 1996. Real-World
Robotics: Learning to Plan for Robust Execution. Ma- Dempster, A. P.; Laird, N. M.; and Rubin, D. B. 1977.
chine Learning 23(2–3): 121–161. Maximum Likelihood from Incomplete Data via the
EM Algorithm. Journal of the Royal Statistical Society
Bergmann, R., and Wilke, W. 1996. On the Role of
B39(1): 1–38.
Abstractions in Case-Based Reasoning. In Proceedings
of EWCBR-96, the European Conference on Case-Based Dietterich, T. G., and Flann, N. S. 1997. Explanation-
Reasoning, 28–43. New York: Springer. Based Learning and Reinforcement Learning: A Uni-
fied View. Machine Learning 28:169–210.
Bhatnagar, N., and Mostow, J. 1994. Online Learning
from Search Failures. Machine Learning 15(1): 69–117. Do, B., and Kambhampati, S. 2003. Planning as Con-
straint Satisfaction: Solving the Planning Graph by
Blum, A., and Furst, M. L. 1997. Fast Planning
Compiling It into a CSP. Artificial Intelligence 132:
through Planning Graph Analysis. Artificial Intelli-
151–182.
gence 90(1–2): 281–300.
Estlin, T. A., and Mooney, R. J. 1996. Multi-Strategy
Borrajo D., and Veloso, M. 1997. Lazy Incremental
Learning of Search Control for Partial-Order Plan-
Learning of Control Knowledge for Efficiently Ob-
ning. In Proceedings of the Thirteenth National Con-
taining Quality Plans. Artificial Intelligence Review
ference on Artificial Intelligence, 843–848. Menlo
11(1–5): 371–405.
Park, Calif.: American Association for Artificial Intel-
Bylander, T. 1992. Complexity Results for Serial De-
ligence.
composability. In Proceedings of the Tenth National
Etzioni, O. 1993. Acquiring Search-Control Knowl-
Conference on Artificial Intelligence (AAAI-92),
edge via Static Analysis. Artificial Intelligence 62(2):
729–734. Menlo Park, Calif.: American Association
265–301.
for Artificial Intelligence.
Fikes, R. E., and Nilsson, N .J. 1971. STRIPS: A New Ap-
Calistri-Yeh, R.; Segre, A.; and Sturgill, D. 1996. The
proach to the Application of Theorem Proving to
Peaks and Valleys of ALPS: An Adaptive Learning and
Problem Solving. Artificial Intelligence 2(3–4):
Planning System for Transportation Scheduling. Pa-
per presented at the Third International Conference 189–208.
on Artificial Intelligence Planning Systems (AIPS-96), Fikes, R. E.; Hart, P.; and Nilsson, N. J. 1972. Learning
29–31 May, Edinburgh, United Kingdom. and Executing Generalized Robot Plans. Artificial In-
Carbonell, Y. G., and Gil, Y. 1990. Learning by Exper- telligence 3:251–288.
imentation: The Operator Refinement Method. In Fox, M., and Long, D. 1999. The Detection and Ex-
Machine Learning: An Artificial Intelligence Approach, ploitation of Symmetry in Planning Problems. Paper
Volume 3, eds. Y. Kodtratoff and R. S. Michalski, presented at the Sixteenth International Joint Con-
191–213. San Francisco, Calif.: Morgan Kaufmann. ference on Artificial Intelligence, 31 July–6 August,
Stockholm, Sweden.
Chien, S. A. 1989. Using and Refining Simplifi-
cations: Explanation-Based Learning of Plans in In- Fox, M., and Long, D. 1998. The Automatic Inference
tractable Domains. In Proceedings of the Eleventh of State Invariants in TIM. Journal of Artificial Intelli-
International Joint Conference on Artificial Intelli- gence Research 9: 317–371.
gence, 590–595. Menlo Park, Calif.: International Fu, L.-M. 1989. Integration of Neural Heuristics into
Joint Conferences on Artificial Intelligence. Knowledge-Based Inference. Connection Science 1(3):
Cohen, W. W. 1990. Learning Approximate Control 325–340.
Rules of High Utility. Paper presented at the Seventh García-Martínez, R., and Borrajo, D. 2000. An Inte-
International Conference on Machine Learning, grated Approach of Learning, Planning, and Execu-
21–23 June, Austin, Texas. tion. Journal of Intelligent and Robotic Systems 29(1):
Cohen, W. W., and Singer, Y. 1999. A Simple, Fast, 47-78.
and Effective Rule Learner. In Proceedings of the Six- Gerevini, A., and Schubert, L. 1998. Inferring State
teenth National Conference on Artificial Intelligence Constraints for Domain-Independent Planning. In
(AAAI-99), 335–342. Menlo Park, Calif.: American Proceedings of the Fifteenth National Conference on
Association for Artificial Intelligence. Artificial Intelligence, 905–912. Menlo Park, Calif.:
Craven, M., and Shavlik, J. 1993. Learning Symbolic American Association for Artificial Intelligence.
Rules Using Artificial Neural Networks. Paper pre- Gerevini, A., and Schubert, L. 1996. Accelerating Par-
sented at the Tenth International Conference on Ma- tial-Order Planners: Some Techniques for Effective
chine Learning, 24–27 July, London, United King- Search Control and Pruning. Journal of Artificial Intel-
dom. ligence Research 5:95–137.
Dawson, C., and Siklossy, L. 1977. The Role of Pre- Gil, Y. 1994. Learning by Experimentation: Incre-
processing in Problem-Solving Systems. In Proceed- mental Refinement of Incomplete Planning Do-
SUMMER 2003 93
Articles
mains. Paper presented at the Eleventh International Kambhampati, S. 2000. Planning Graph as (Dynam-
Conference on Machine Learning, 10–13 July, New ic) CSP: Exploiting EBL, DDB, and Other CSP Tech-
Brunswick, New Jersey. niques in GRAPHPLAN. Journal of Artificial Intelligence
Gratch, J., and Dejong, G. 1992. COMPOSER: A Proba- Research 12:1–34.
bilistic Solution to the Utility Problem in Speed-Up Kambhampati, S. 1998. On the Relations between In-
Learning, In Proceedings of the Tenth National Con- telligent Backtracking and Failure-Driven Explana-
ference on Artificial Intelligence (AAAI-92), 235–240. tion-Based Learning in Planning. Constraint Satisfac-
Menlo Park, Calif.: American Association for Artifi- tion and Artificial Intelligence 105(1–2): 161–208.
cial Intelligence. Kambhampati, S., and Hendler, J. 1992. A Validation
Hammond, K. 1989. Case-Based Planning: Viewing Structure–Based Theory of Plan Modification and
Planning as a Memory Task. San Diego, Calif.: Acade- Reuse. Artificial Intelligence 55(23): 193–258.
mic Press. Kambhampati, S., and Katukam, Y. Q. 1996. Failure-
Hanks, S., and Weld, D. 1995. A Domain-Independent Algorithm for Plan Adaptation. Journal of Artificial Intelligence Research 2:319–360.
Hinton, G. E. 1989. Connectionist Learning Procedures. Artificial Intelligence 40(1–3): 185–234.
Hofstadter, D. R., and Marshall, J. B. D. 1993. A Self-Watching Cognitive Architecture of High-Level Perception and Analogy-Making. Technical Report, TR100, Center for Research on Concepts and Cognition, Indiana University.
Hofstadter, D. R., and Marshall, J. B. D. 1996. Beyond Copycat: Incorporating Self-Watching into a Computer Model of High-Level Perception and Analogy Making. Paper presented at the 1996 Midwest Artificial Intelligence and Cognitive Science Conference, 26–28 April, Bloomington, Indiana.
Hollatz, J. 1999. Analogy Making in Legal Reasoning with Neural Networks and Fuzzy Logic. Artificial Intelligence and Law 7(2–3): 289–301.
Horvitz, E.; Ruan, Y.; Gomes, C.; Kautz, H.; Selman, B.; and Chickering, D. M. 2001. A Bayesian Approach to Tackling Hard Computational Problems. Paper presented at the Seventeenth Conference on Uncertainty in Artificial Intelligence, 2–5 August, Seattle, Washington.
Huang, Y.; Kautz, H.; and Selman, B. 2000. Learning Declarative Control Rules for Constraint-Based Planning. Paper presented at the Seventeenth International Conference on Machine Learning, 29 June–2 July, Stanford, California.
Hunt, E. B.; Marin, J.; and Stone, P. J. 1966. Experiments in Induction. San Diego, Calif.: Academic Press.
Ihrig, L., and Kambhampati, S. 1997. Storing and Indexing Plan Derivations through Explanation-Based Analysis of Retrieval Failures. Journal of Artificial Intelligence Research 7:161–198.
Ihrig, L., and Kambhampati, S. 1996. Design and Implementation of a Replay Framework Based on a Partial-Order Planner. In Proceedings of the Thirteenth National Conference on Artificial Intelligence (AAAI-96). Menlo Park, Calif.: American Association for Artificial Intelligence.
Jones, R., and Langley, P. 1995. Retrieval and Learning in Analogical Problem Solving. In Proceedings of the Seventeenth Conference of the Cognitive Science Society, 466–471. Pittsburgh, Pa.: Lawrence Erlbaum.
Kakuta, T.; Haraguchi, M.; Midori-ku, N.; and Okubo, Y. 1997. A Goal-Dependent Abstraction for Legal Reasoning by Analogy. Artificial Intelligence and Law 5(1–2): 97–118.
Kambhampati, S.; Katukam, S.; and Qu, Y. 1996. Failure-Driven Dynamic Search Control for Partial Order Planners: An Explanation-Based Approach. Artificial Intelligence 88(1–2): 253–315.
Kautz, H., and Selman, B. 1999. BLACKBOX: Unifying SAT-Based and Graph-Based Planning. In Proceedings of the Sixteenth International Joint Conference on Artificial Intelligence (IJCAI-99), 318–325. Menlo Park, Calif.: International Joint Conferences on Artificial Intelligence.
Kautz, H., and Selman, B. 1998. The Role of Domain-Specific Knowledge in the Planning as Satisfiability Framework. Paper presented at the Fourth International Conference on Artificial Intelligence Planning Systems (AIPS-98), 7–10 June, Pittsburgh, Pennsylvania.
Kedar-Cabelli, S., and McCarty, T. 1987. Explanation-Based Generalization as Resolution Theorem Proving. In Proceedings of the Fourth International Workshop on Machine Learning, 383–389. San Francisco, Calif.: Morgan Kaufmann.
Khardon, R. 1999. Learning Action Strategies for Planning Domains. Artificial Intelligence 113(1–2): 125–148.
Knoblock, C. 1990. Learning Abstraction Hierarchies for Problem Solving. In Proceedings of the Eighth National Conference on Artificial Intelligence, 923–928. Menlo Park, Calif.: American Association for Artificial Intelligence.
Kodtratoff, Y., and Michalski, R. S., eds. 1990. Machine Learning: An Artificial Intelligence Approach, Volume 3. San Francisco, Calif.: Morgan Kaufmann.
Korf, R. 1990. Real-Time Heuristic Search. Artificial Intelligence 42(2–3): 189–211.
Laird, J.; Newell, A.; and Rosenbloom, P. 1987. SOAR: An Architecture for General Intelligence. Artificial Intelligence 33(1): 1–64.
Lang, K. 1995. NEWSWEEDER: Learning to Filter Netnews. In Proceedings of the Twelfth International Conference on Machine Learning, 331–339. San Francisco, Calif.: Morgan Kaufmann.
Langley, P. 1997. Challenges for the Application of Machine Learning. In Proceedings of the ICML '97 Workshop on Machine Learning Application in the Real World: Methodological Aspects and Implications, 15–18. San Francisco, Calif.: Morgan Kaufmann.
Lau, T.; Domingos, P.; and Weld, D. 2000. Version Space Algebra and Its Application to Programming by Demonstration. Paper presented at the Seventeenth International Conference on Machine Learning, 29 June–2 July, Stanford, California.
Lavrac, N.; Dzeroski, S.; and Grobelnik, M. 1991. Learning Nonrecursive Definitions of Relations with LINUS. In Proceedings of the Fifth European Working Session on Learning, 265–281. Berlin: Springer.
Leake, D.; Kinley, A.; and Wilson, D. 1996. Acquiring Case Adaptation Knowledge: A Hybrid Approach. In Proceedings of the Thirteenth National Conference on Artificial Intelligence, 684–689. Menlo Park, Calif.: American Association for Artificial Intelligence.
Leckie, C., and Zukerman, I. 1998. Inductive Learning of Search Control Rules for Planning. Artificial Intelligence 101(1–2): 63–98.
Martin, M., and Geffner, H. 2000. Learning Generalized Policies in Planning Using Concept Languages. In Proceedings of the Seventh International Conference on Knowledge Representation and Reasoning (KR 2000), 667–677. San Francisco, Calif.: Morgan Kaufmann.
Minton, S., ed. 1993. Machine Learning Methods for Planning. San Francisco, Calif.: Morgan Kaufmann.
Minton, S.; Carbonell, J.; Knoblock, C.; Kuokka, D. R.; Etzioni, O.; and Gil, Y. 1989. Explanation-Based Learning: A Problem-Solving Perspective. Artificial Intelligence 40:63–118.
Mitchell, T. M., and Thrun, S. B. 1995. Learning Analytically and Inductively. In Mind Matters: A Tribute to Allen Newell (Carnegie Symposia on Cognition), eds. J. D. Steier, T. Mitchell, and A. Newell. New York: Lawrence Erlbaum.
Mitchell, T.; Keller, R.; and Kedar-Cabelli, S. 1986. Explanation-Based Generalization: A Unifying View. Machine Learning 1(1): 47–80.
Muggleton, S., and Feng, C. 1990. Efficient Induction of Logic Programs. Paper presented at the First Conference on Algorithmic Learning Theory, 8–10 October, Ohmsha, Tokyo, Japan.
Munoz-Avila, H.; Aha, D. W.; Breslow, L.; and Nau, D. 1999. HICAP: An Interactive Case-Based Planning Architecture and Its Application to Noncombatant Evacuation Operations. In Proceedings of the Eleventh Conference on Innovative Applications of Artificial Intelligence, 879–885. Menlo Park, Calif.: American Association for Artificial Intelligence.
Nau, D.; Cao, Y.; Lotem, A.; and Munoz-Avila, H. 1999. SHOP: Simple Hierarchical Ordered Planner. In Proceedings of the Sixteenth International Joint Conference on Artificial Intelligence (IJCAI-99), 968–975. Menlo Park, Calif.: International Joint Conferences on Artificial Intelligence.
Nebel, B.; Dimopoulos, Y.; and Koehler, J. 1997. Ignoring Irrelevant Facts and Operators in Plan Generation. Paper presented at the Fourth European Conference on Planning (ECP-97), 24–26 September, Toulouse, France.
Ourston, D., and Mooney, R. 1994. Theory Refinement Combining Analytical and Empirical Methods. Artificial Intelligence 66(2): 273–309.
Pazzani, M. J.; Brunk, C. A.; and Silverstein, G. 1991. A Knowledge-Intensive Approach to Learning Relational Concepts. Paper presented at the Eighth International Workshop on Machine Learning, 27–29 June, Evanston, Illinois.
Perez, M., and Etzioni, O. 1992. DYNAMIC: A New Role for Training Problems in EBL. Paper presented at the Ninth International Conference on Machine Learning, 1–3 July, Aberdeen, Scotland.
Pomerleau, D. A. 1993. Knowledge-Based Training of Artificial Neural Networks for Autonomous Robot Driving. In Robot Learning, eds. J. Connell and S. Mahadevan, 19–43. Boston: Kluwer Academic.
Quinlan, J. R. 1993. C4.5: Programs for Machine Learning. San Francisco, Calif.: Morgan Kaufmann.
Quinlan, J. R. 1990. Learning Logical Definitions from Relations. Machine Learning 5:239–266.
Quinlan, J. R. 1986. Induction of Decision Trees. Machine Learning 1(1): 81–106.
Reddy, C., and Tadepalli, P. 1999. Learning Horn Definitions: Theory and an Application to Planning. New Generation Computing 17(1): 77–98.
Rintanen, J. 2000. An Iterative Algorithm for Synthesizing Invariants. In Proceedings of the Seventeenth National Conference on Artificial Intelligence and the Twelfth Innovative Applications of AI Conference, 806–811. Menlo Park, Calif.: American Association for Artificial Intelligence.
Sacerdoti, E. 1974. Planning in a Hierarchy of Abstraction Spaces. Artificial Intelligence 5(2): 115–135.
Samuel, A. L. 1959. Some Studies in Machine Learning Using the Game of Checkers. IBM Journal of Research and Development 3(3): 210–229.
Schmill, M.; Oates, T.; and Cohen, P. 2000. Learning Planning Operators in Real-World, Partially Observable Environments. Paper presented at the Fifth Conference on Artificial Intelligence Planning Systems (AIPS-2000), 14–17 April, Breckenridge, Colorado.
Shavlik, J. W., and Towell, G. G. 1989. An Approach to Combining Explanation-Based and Neural Learning Algorithms. Connection Science 1(3): 231–253.
Sheppard, J., and Salzberg, S. 1995. Combining Genetic Algorithms with Memory-Based Reasoning. Paper presented at the Sixth International Conference on Genetic Algorithms, 15–19 July, Pittsburgh, Pennsylvania.
Smith, D., and Peot, M. 1993. Postponing Threats in Partial-Order Planning. In Proceedings of the Eleventh National Conference on Artificial Intelligence (AAAI-93), 500–506. Menlo Park, Calif.: American Association for Artificial Intelligence.
Sutton, R. 1991. Planning by Incremental Dynamic Programming. In Proceedings of the Eighth International Conference on Machine Learning, 353–357. San Francisco, Calif.: Morgan Kaufmann.
Sutton, R. 1988. Learning to Predict by the Methods of Temporal Differences. Machine Learning 3(1): 9–44.
Sutton, R., and Barto, G. 1998. Reinforcement Learning: An Introduction. Cambridge, Mass.: MIT Press.
Sycara, K.; Guttal, R.; Koning, J.; Narasimhan, S.; and Navinchandra, D. 1992. CADET: A Case-Based Synthesis Tool for Engineering Design. International Journal of Expert Systems 4(2).
Veloso, M., and Carbonell, J. 1993. Derivational Analogy in PRODIGY: Automating Case Acquisition, Storage, and Utilization. Machine Learning 10(3): 249–278.
Wang, X. 1996a. A Multistrategy Learning System for Planning Operator Acquisition. Paper presented at the
Zweben, M.; Davis, E.; Daun, B.; Drascher, E.; Deale, M.; and Eskey, M. 1992. Learning to Improve Constraint-Based Scheduling. Artificial Intelligence 58(1–3): 271–296.
heuristic search control, learning, and optimization over multiple-quality criteria. He previously conducted probabilistic risk assessment and reliability analysis for energy facilities and developed software for statistical analysis of experimental nuclear fuel assemblies. His e-mail address is [email protected].