A review of machine learning for automated planning

Published online by Cambridge University Press: 12 November 2012

Sergio Jiménez
Affiliation:
Departamento de Informática, Universidad Carlos III de Madrid, Avda. de la Universidad, 30. Leganés (Madrid), Spain
Tomás De La Rosa
Affiliation:
Departamento de Informática, Universidad Carlos III de Madrid, Avda. de la Universidad, 30. Leganés (Madrid), Spain
Susana Fernández
Affiliation:
Departamento de Informática, Universidad Carlos III de Madrid, Avda. de la Universidad, 30. Leganés (Madrid), Spain
Fernando Fernández
Affiliation:
Departamento de Informática, Universidad Carlos III de Madrid, Avda. de la Universidad, 30. Leganés (Madrid), Spain
Daniel Borrajo
Affiliation:
Departamento de Informática, Universidad Carlos III de Madrid, Avda. de la Universidad, 30. Leganés (Madrid), Spain

Abstract

Recent discoveries in automated planning are broadening the scope of planners, from toy problems to real applications. However, applying automated planners to real-world problems is far from simple. On the one hand, the definition of accurate action models for planning is still a bottleneck. On the other hand, off-the-shelf planners fail to scale up and to provide good solutions in many domains. In these problematic domains, planners can exploit domain-specific control knowledge to improve their performance in terms of both speed and solution quality. However, defining control knowledge manually is quite difficult. This paper reviews recent machine learning techniques for the automatic definition of planning knowledge. The review is organized according to the target of the learning process: automatic definition of planning action models and automatic definition of planning control knowledge. In addition, the paper reviews advances in the related field of reinforcement learning.

Type
Articles
Copyright
Copyright © Cambridge University Press 2012
