Learning Bayesian networks: approaches and issues

Rónán Daly; Qiang Shen; Stuart Aitken

doi:10.1017/S0269888910000251

Published online by Cambridge University Press: 12 May 2011


Rónán Daly: School of Computing Science, University of Glasgow, Glasgow, G12 8QQ, UK
Qiang Shen: Department of Computer Science, Aberystwyth University, Aberystwyth, SY23 3DB, UK
Stuart Aitken: School of Informatics, University of Edinburgh, Edinburgh, EH8 9LE, UK

Abstract

Bayesian networks have become a widely used method in the modelling of uncertain knowledge. Owing to the difficulty domain experts have in specifying them, techniques that learn Bayesian networks from data have become indispensable. Recently, however, there have been many important new developments in this field. This work takes a broad look at the literature on learning Bayesian networks—in particular their structure—from data. Specific topics are not focused on in detail, but it is hoped that all the major fields in the area are covered. This article is not intended to be a tutorial—for this, there are many books on the topic, which will be presented. However, an effort has been made to locate all the relevant publications, so that this paper can be used as a ready reference to find the works on particular sub-topics.

Type: Articles
Information: The Knowledge Engineering Review, Volume 26, Issue 2, 12 May 2011, pp. 99–157

DOI: https://doi.org/10.1017/S0269888910000251
Copyright: Copyright © Cambridge University Press 2011


References

Abramson, B., Finizza, A. 1991. Using belief networks to forecast oil prices. International Journal of Forecasting 7(3), 299–315.CrossRef Google Scholar

Abramson, B., Brown, J., Edwards, W., Murphy, A., Winkler, R. L. 1996. Hailfinder: a Bayesian system for forecasting severe weather. International Journal of Forecasting 12(1), 57–71.CrossRef Google Scholar

Acid, S., de Campos, L. M. 1995. Approximations of causal networks by polytrees: an empirical study. In Advances in Intelligent Computing – IPMU ’94, Lecture Notes in Computer Science 945, 149–158. Springer.CrossRef Google Scholar

Acid, S., de Campos, L. M. 1996a. An algorithm for finding minimum d-separating sets in belief networks. In Proceedings of the Twelfth Conference on Uncertainty in Artificial Intelligence (UAI-96), Horvitz, E. & Jensen, F. (eds). Morgan Kaufmann, 3–10.Google Scholar

Acid, S., de Campos, L. M. 1996b. An Algorithm for Finding Minimum d-Separating Sets in Belief Networks. Technical report DECSAI-96-02-14, Departamento de Ciencias de la Computación e Inteligencia Artificial, Universidad de Granada.Google Scholar

Acid, S., de Campos, L. M. 1996c. BENEDICT: an algorithm for learning probabilistic Bayesian networks. In Proceedings of the Sixth International Conference on Information Processing and Management of Uncertainty in Knowledge-based Systems, Granada, Spain, 979–984.Google Scholar

Acid, S., De Campos, L. M. 2000. Learning right sized belief networks by means of a hybrid methodology. In Principles of Data Mining and Knowledge Discovery: 4th European Conference, PKDD 2000, Zighed, D. Komorowski, J. & Żytkow, J. (eds). Lecture Notes in Artificial Intelligence 1910, 309–315, Springer.Google Scholar

Acid, S., de Campos, L. M. 2001. A hybrid methodology for learning belief networks: BENEDICT. International Journal of Approximate Reasoning 27(3), 235–262.CrossRef Google Scholar

Acid, S., de Campos, L. M. 2003. Searching for Bayesian network structures in the space of restricted acyclic partially directed graphs. Journal of Artificial Intelligence Research 18, 445–490.CrossRef Google Scholar

Acid, S., de Campos, L. M., Huete, J. F. 2001. The search of causal orderings: a short cut for learning belief networks. In Symbolic and Quantitative Approaches to Reasoning with Uncertainty: Proceedings of the Sixth European Conference, ECSQARU 2001, Lecture Notes in Artificial Intelligence 2143, 216–227. Springer.CrossRef Google Scholar

Acid, S., de Campos, L. M., Fernandez-Luna, J. M., Rodriguez, S., Rodriguez, J. M., Salcedo, J. L. 2004. A comparison of learning algorithms for Bayesian networks: a case study based on data from an emergency medical service. Artificial Intelligence in Medicine 30(3), 215–232.CrossRef Google Scholar PubMed

Aitken, S., Jirapech-Umpai, T., Daly, R. 2005. Inferring gene regulatory networks from classified microarray data: initial results. BMC Bioinformatics 6(Suppl. 3), S4.CrossRef Google Scholar

Aliferis, C. F., Tsamardinos, I. 2002. Algorithms for Large-scale Local Causal Discovery and Feature Selection in the Presence of Limited Sample or Large Causal Neighbourhoods. Technical report DSL-02-08, Department of Biomedical Informatics, Vanderbilt University.Google Scholar

Allen, T. V., Singh, A., Greiner, R., Hoope, P. 2008. Quantifying the uncertainty of a belief net response: Bayesian error-bars for belief net inference. Artficial Intelligence 172(4–5), 483–513.CrossRef Google Scholar

Anderson, B., Moore, A. 2005. Active learning for hidden Markov models: objective functions and algorithms. In Proceedings of the Twenty-Second International Conference on Machine Learning (ICML 2005), De Raedt, L. & Wrobel, S. (eds). ACM, 9–16.Google Scholar

Andersson, S. A., Madigan, D., Perlman, M. D. 1997. A characterization of Markov equivalence classes for acyclic digraphs. The Annals of Statistics 25(2), 505–541.Google Scholar

Andreassen, S., Jensen, F. V., Andersen, S. K., Falck, B., Kjrul, U., Woldbye, M., Srensen, A. R., Rosenfalck, A., Jensen, F. 1989. MUNIN–an expert EMG assistant. In Computer-aided Electromyography and Expert Systems, Desmedt, J. (ed.). Elsevier, 255–277.Google Scholar

Bach, F. R., Jordan, M. I. 2003. Learning graphical models with Mercer kernels. In Advances in Neural Information Processing Systems 15 (NIPS*2002), Becker, S., Thrun, S. & Obermayer, K. (eds). The MIT Press, 1009–1016.Google Scholar

Bauer, E., Koller, D., Singer, Y. 1997. Update rules for parameter estimation in Bayesian networks. In Proceedings of the Thirteenth Annual Conference on Uncertainty in Artificial Intelligence (UAI-97), Geiger, D. & Shenoy, P. P. (eds). Morgan Kaufmann, 3–13.Google Scholar

Beal, M. J., Ghahramani, Z. 2003. The variational Bayesian EM algorithm for incomplete data: with application to scoring graphical model structures. In Bayesian Statistics 7: Proceedings of the Seventh Valencia International Meeting, Bernardo, J. M., Bayarri, M. J., Berger, J. O., Dawid, A. P., Heckerman, D., Smith, A. F. M. & West, M. (eds). Oxford University Press, 453–464.CrossRef Google Scholar

Becker, A., Geiger, D. 1994. Approximation algorithms for the loop cutset problem. In Proceedings of the Tenth Conference on Uncertainty in Artificial Intelligence (UAI-94), de Mantaras, R. L. & Poole, D. (eds). Morgan Kaufmann, 60–68.Google Scholar

Becker, A., Geiger, D. 1996a. Optimization of Pearl’s method of conditioning and greedy-like approximation algorithms for the vertex feedback set problem. Artificial Intelligence 83(1), 167–188.CrossRef Google Scholar

Becker, A., Geiger, D. 1996b. A sufficiently fast algorithm for finding close to optimal junction trees. In Proceedings of the Twelfth Conference on Uncertainty in Artificial Intelligence (UAI-96), Horvitz, E. & Jensen, F. (eds). Morgan Kaufmann, 81–89.Google Scholar

Becker, A., Geiger, D. 2001. A sufficiently fast algorithm for finding close to optimal clique trees. Artificial Intelligence 125(1–2), 3–17.CrossRef Google Scholar

Beinlich, I., Suermondt, H., Chavez, R., Cooper, G. 1989. The ALARM monitoring system: a case study with two probabilistic inference techniques for belief networks. In Proceedings of the Second European Conference on Artificial Intelligence in Medicine (AIME 89), Lecture Notes in Medical Informatics 38, 247–256, Springer.CrossRef Google Scholar

Binder, J., Koller, D., Russell, S., Kanazawa, K. 1997. Adaptive probabilistic networks with hidden variables. Machine Learning 29(2–3), 213–244.CrossRef Google Scholar

Bishop, C., Lawrence, N., Jaakkola, T., Jordan, M. 1998. Approximating posterior distributions in belief networks using mixtures. In Advances in Neural Information Processing Systems 10 (NIPS*1997), Jordan, M. I., Kearns, M, J. & Solla, S. A. (eds). The MIT Press, 416–422.Google Scholar

Blanco, R., Inza, I., Larrañaga, P. 2003. Learning Bayesian networks in the space of structures by estimation of distribution algorithms. International Journal of Intelligent Systems 18(2), 205–220.CrossRef Google Scholar

Borchani, H., Amor, N. B., Mellouli, K. 2006. Learning Bayesian network equivalence classes from incomplete data. In Proceedings of the Ninth International Conference on Discovery Science, Lecture Notes in Artificial Intelligence 4265, 291–295, Springer.CrossRef Google Scholar

Borchani, H., Chaouachi, M., Amor, N. B. 2007. Learning causal Bayesian networks from incomplete observational data and interventions. In Proceedings of the Ninth European Conference on Symbolic and Quantitative Approaches to Reasoning with Uncertainty (ECSQARU 2007), Mellouli, K. (ed.). Lecture Notes in Artificial Intelligence 4724, 17–29, Springer.CrossRef Google Scholar

Borchani, H., Amor, N. B., Khalfallah, F. 2008. Learning and evaluating bayesian network equivalence classes from incomplete data. International Journal of Pattern Recognition and Artificial Intelligence 22(2), 253–278.Google Scholar

Bøttcher, S. G. 2004. Learning Bayesian Networks with Mixed Variables. PhD thesis, Department of Mathematical Sciences, Aalborg University.Google Scholar

Bouckaert, R. R. 1993. Probabilistic network construction using the minimum description length principle. In Symbolic and Quantitative Approaches to Reasoning and Uncertainty: European Conference ECSQARU ’93, Lecture Notes in Computer Science 747, 41–48, Springer.CrossRef Google Scholar

Bouckaert, R. R. 1994a. Probabilistic Network Construction Using the Minimum Description Length Principle. Technical report RUU-CS-94-27, Department of Computer Science, Utrecht University.Google Scholar

Bouckaert, R. R. 1994b. Properties of Measures for Bayesian Belief Network Learning. Technical report UU-CS-1994-35, Department of Information and Computing Sciences, Utrecht University.CrossRef Google Scholar

Bouckaert, R. R. 1994c. A Stratified Simulation Scheme for Inference in Bayesian Belief Networks. Technical report UU-CS-1994-16, Department of Computer Science, Utrecht University.CrossRef Google Scholar

Bouckaert, R. R., Castillo, E., Gutiérrez, J. M. 1996. A modified simulation scheme for inference in Bayesian networks. International Journal of Approximate Reasoning 14(1), 55–80.CrossRef Google Scholar

Boutilier, C., Friedman, N., Goldszmidt, M., Koller, D. 1996. Context-specific independence in Bayesian networks. In Proceedings of the Twelfth Conference on Uncertainty in Artificial Intelligence (UAI-96), Horvitz, E. & Jensen, F. (eds). Morgan Kaufmann, 115–123.Google Scholar

Boyen, X., Koller, D. 1998. Tractable inference for complex stochastic processes. In Proceedings of the Fourteenth Conference on Uncertainty in Artificial Intelligence (UAI-98), Cooper, G. F. & Moral, S. (eds). Morgan Kaufmann, 33–42.Google Scholar

Boyen, X., Friedman, N., Koller, D. 1999. Discovering the hidden structure of complex dynamic systems. In Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence (UAI-99), Prade, H. & Laskey, K. (eds). Morgan Kaufmann, 91–100.Google Scholar

Breese, J. S., Horvitz, E. 1991. Ideal reformulation of belief networks. In Uncertainty in Artificial Intelligence 6, Bonissone, P., Henrion, M., Kanal, L. & Lemmer, J. (eds). North-Holland, 129–144.Google Scholar

Bromberg, F., Margaritis, D. 2009. Improving the reliability of causal discovery from small data sets using argumentation. Journal of Machine Learning Research 10, 301–340.Google Scholar

Brown, L. E., Tsamardinos, I., Aliferis, C. F. 2004. A novel algorithm for scalable and accurate Bayesian network learning. In Proceedings of the Eleventh World Congress on Medical Informatics (MEDINFO) Fieschi, M., Coiera, E. & Li, Y. J. (eds). 1, IOS Press, 711–715.Google Scholar

Brown, L. E., Tsamardinos, I., Aliferis, C. F. 2005. A comparison of novel and state-of-the-art polynomial Bayesian network learning algorithms. In Proceedings of the Twentieth National Conference On Artificial Intelligence, Veloso, M. M. & Kambhampati, S. (eds). 2, AAAI Press, 739–745.Google Scholar

Buntine, W. 1991. Theory refinement on Bayesian networks. In Proceedings of the Seventh Annual Conference on Uncertainty in Artificial Intelligence (UAI ’91), Ambrosio, B. D. & Smets, P. (eds). Morgan Kaufmann, 52–60.Google Scholar

Buntine, W. L. 1994. Operations for learning with graphical models. Journal of Artificial Intelligence Research 2, 159–225.CrossRef Google Scholar

Buntine, W. 1996. A guide to the literature on learning probabilistic networks from data. IEEE Transactions on Knowledge and Data Engineering 8(2), 195–210.CrossRef Google Scholar

Burge, J., Lane, T. 2006. Improving Bayesian network structure search with random variable aggregation hierarchies. In Proceedings of the Seventeenth European Conference on Machine Learning (ECML 2006), Lecture Notes in Artificial Intelligence 4212, 66–77. Springer.CrossRef Google Scholar

Burge, J., Lane, T. 2007. Shrinkage estimator for Bayesian network parameters. In Proceedings of the Eighteenth European Conference on Machine Learning (EMCL 2007), Kok, J. N., Koronacki, J., de Mantaras, R. L., Matwin, S., Mladenič, D. & Skowron, A. (eds). Lecture Notes in Artificial Intelligence 4701, 67–78. Springer.Google Scholar

Butz, C., Hua, S., Chen, J., Yao, H. 2009. A simple graphical approach for understanding probabilistic inference in Bayesian networks. Information Sciences 179(6), 699–716.CrossRef Google Scholar

Cano, J. E., Hernández, L. D., Moral, S. 1996. Importance sampling algorithms for the propagation of probabilities in belief networks. International Journal of Approximate Reasoning 15(1), 77–92.CrossRef Google Scholar

Cartwright, N. 2001. What is wrong with Bayes nets? The Monist 84(2), 242–264.CrossRef Google Scholar

Cartwright, N. 2002. Against modularity, the causal Markov condition, and any link between the two: comments on Hausman and Woodward. The British Journal for the Philosophy of Science 53(3), 411–453.CrossRef Google Scholar

Cartwright, N. 2006. From metaphysics to method: comments on manipulability and the causal Markov condition. The British Journal for the Philosophy of Science 57(1), 197–218.CrossRef Google Scholar

Castelo, R., Kočka, T. 2003. On inclusion-driven learning of Bayesian networks. Journal of Machine Learning Research 4, 527–574.Google Scholar

Castelo, R., Perlman, M. D. 2002. Learning essential graph Markov models from data. In Proceedings of the First European Workshop on Probabilistic Graphical Models (PGM 2002), Gámez, J. A. & Salmerón, A. (eds). Cuenca, Spain, 17–24.Google Scholar

Castelo, R., Siebes, A. 2000. Priors on network structures. Biasing the search for Bayesian networks. International Journal of Approximate Reasoning 24(1), 39–57.CrossRef Google Scholar

Castillo, E., Gutiérrez, J. M., Hadi, A. S. 1995. Parametric structure of probabilities in Bayesian networks. In Proceedings of the European Conference on Symbolic and Quantitative Approaches to Reasoning and Uncertainty (ECSQARU ’95), Lecture Notes in Artificial Intelligence 946, 89–98. Springer.CrossRef Google Scholar

Castillo, E., Gutiérrez, J. M., Hadi, A. S. 1996. A new method for efficient symbolic propagation in discrete bayesian networks. Networks 28(1), 31–43.3.0.CO;2-E>CrossRef Google Scholar

Castillo, E., Gutiérrez, J. M., Hadi, A. S. 1997a. Expert Systems and Probabilistic Network Models. Monographs in Computer Science, Springer.CrossRef Google Scholar

Castillo, E., Hadi, A. S., Solares, C. 1997b. Learning and updating of uncertainty in Dirichlet models. Machine Learning 26(1), 43–63.CrossRef Google Scholar

Chang, K.-C., Fung, R. 1995. Symbolic probabilistic inference with both discrete and continuous variables. IEEE Transactions on Systems, Man and Cybernetics 25(6), 910–916.CrossRef Google Scholar

Chavez, R. M., Cooper, G. F. 1990. An empirical evaluation of a randomized algorithm for probabilistic inference. In Uncertainty in Artificial Intelligence 5, Henrion, M., Shachter, R., Kanal, L. & Lemmer, J. (eds). North-Holland, 191–208.CrossRef Google Scholar

Chavira, M., Darwiche, A. 2007. Compiling Bayesian networks using variable elimination. In Proceedings of the Twentieth International Joint Conference on Artificial Intelligence, Veloso, M. M. (ed.). Morgan Kaufmann, 2443–2449.Google Scholar

Cheeseman, P., Stutz, J. 1996. Bayesian classification (AutoClass): theory and results. In Advances in Knowledge Discovery and Data Mining, Fayyad, U. M., Piatetsky-Shapiro, G., Smyth, P. & Uthurusamy, R. (eds). AAAI Press, 153–180.Google Scholar

Chen, X.-W., Anantha, G., Lin, X. 2008. Improving Bayesian network structure learning with mutual information-based node ordering in the K2 algorithm. IEEE Transactions on Knowledge and Data Engineering 20(5), 628–640.CrossRef Google Scholar

Cheng, J., Druzdzel, M. J. 2000. AIS-BN: an adaptive importance sampling algorithm for evidential reasoning in large Bayesian networks. Journal of Artificial Intelligence Research 13, 155–188.CrossRef Google Scholar

Cheng, J., Druzdzel, M. 2001. Confidence inference in Bayesian networks. In Proceedings of the Seventeenth Conference on Uncertainty in Artificial Intelligence (UAI-01), Breese, J. & Koller, D. (eds). Morgan Kaufmann, 75–82.Google Scholar

Cheng, J., Greiner, R. 1999. Comparing Bayesian network classifiers. In Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence (UAI-99), Prade, H. & Laskey, K. (eds). Morgan Kaufmann, 101–108.Google Scholar

Cheng, J., Bell, D. A., Liu, W. 1997. An algorithm for Bayesian belief network construction from data. In Proceedings of the Sixth International Workshop on Artificial Intelligence and Statistics, Smyth, P. & Madigan, D. (eds). Fort Lauderdale, USA, 83–90.Google Scholar

Cheng, J., Greiner, R., Kelly, J., Bell, D., Liu, W. 2002. Learning Bayesian networks from data: an information-theory based approach. Artificial Intelligence 137(1–2), 43–90.CrossRef Google Scholar

Chickering, D. M. 1995. A transformational characterization of equivalent Bayesian network structures. In Proceedings of the Eleventh Conference on Uncertainty in Artificial Intelligence (UAI-95), Besnard, P. & Hanks, S. (eds). Morgan Kaufmann, 87–98.Google Scholar

Chickering, D. M. 1996a. Learning Bayesian networks is NP-complete. In Learning from Data: Artificial Intelligence and Statistics V, Fisher, D. & Lenz, H.-J. (eds). Lecture Notes in Statistics 112, 121–130. Springer.CrossRef Google Scholar

Chickering, D. M. 1996b. Learning equivalence classes of Bayesian network structures. In Proceedings of the Twelfth Conference on Uncertainty in Artificial Intelligence (UAI-96), Horvitz, E. & Jensen, F. (eds). Morgan Kaufmann, 150–157.Google Scholar

Chickering, D. M. 2002a. Learning equivalence classes of Bayesian-network structures. Journal of Machine Learning Research 2, 445–498.Google Scholar

Chickering, D. M. 2002b. Optimal structure identification with greedy search. Journal of Machine Learning Research 3, 507–554.Google Scholar

Chickering, D. M., Heckerman, D. 1997. Efficient approximations for the marginal likelihood of Bayesian networks with hidden variables. Machine Learning 29(2–3), 181–212.CrossRef Google Scholar

Chickering, D. M., Heckerman, D. 1999. Fast learning from sparse data. In Proceedings of the Fifteenth Annual Conference on Uncertainty in Artificial Intelligence (UAI-99), Morgan Kaufmann, 109–115.Google Scholar

Chickering, D. M., Meek, C. 2002. Finding optimal Bayesian networks. In Proceedings of the Eighteenth Conference on Uncertainty in Artificial Intelligence (UAI-02), Darwiche, A. & Friedman, N. (eds). Morgan Kaufmann, 94–102.Google Scholar

Chickering, D. M., Meek, C. 2006. On the incompatibility of faithfulness and monotone DAG faithfulness. Artificial Intelligence 170(8–9), 653–666.CrossRef Google Scholar

Chickering, D. M., Geiger, D., Heckerman, D. 1996. Learning Bayesian networks: search methods and experimental results. In Learning from Data: Artificial Intelligence and Statistics V, Fisher, D. & Lenz, H.-J. (eds). Lecture Notes in Statistics 112, 112–128. Springer.Google Scholar

Chickering, D. M., Heckerman, D., Meek, C. 1997a. A Bayesian approach to learning Bayesian networks with local structure. In Proceedings of the Thirteenth Annual Conference on Uncertainty in Artificial Intelligence (UAI-97). Morgan Kaufmann, 80–89.Google Scholar

Chickering, D. M., Heckerman, D., Meek, C. 1997b. A Bayesian Approach to Learning Bayesian Networks with Local Structure. Technical report MSR-TR-97-07, Microsoft Research.Google Scholar

Chickering, D. M., Heckerman, D., Meek, C. 2004. Large-sample learning of Bayesian networks is NP-hard. Journal of Machine Learning Research 5, 1287–1330.Google Scholar

Chow, C. K., Liu, C. N. 1968. Approximating discrete probability distributions with dependence trees. IEEE Transactions on Information Theory 14(3), 462–467.CrossRef Google Scholar

Cooper, G. F. 1990. The computational complexity of probabilistic inference using Bayesian belief networks. Artificial Intelligence 42(2–3), 393–405.CrossRef Google Scholar

Cooper, G. F. 1995. A Bayesian method for learning belief networks that contain hidden variables. Journal of Intelligent Information Systems 4(1), 71–88.CrossRef Google Scholar

Cooper, G. F. 1997. A simple constraint-based algorithm for efficiently mining observational databases for causal relationships. Data Mining and Knowledge Discovery 1(2), 203–224.CrossRef Google Scholar

Cooper, G. F., Herskovits, E. 1992. A Bayesian method for the induction of probabilistic networks from data. Machine Learning 9(4), 309–347.CrossRef Google Scholar

Cooper, G. F., Yoo, C. 1999. Causal discovery from a mixture of experimental and observational data. In Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence (UAI-99), Prade, H. & Laskey, K. (eds). Morgan Kaufmann, 116–125.Google Scholar

Correa, E. S., Freitas, A. A., Johnson, C. G. 2007. Particle swarm and Bayesian networks applied to attribute selection for protein functional classification. In Proceedings of the Genetic and Evolutionary Computation Conference, Lipson, H. (ed.). ACM, 2651–2658.Google Scholar

Cotta, C., Muruzábal, J. 2002. Towards a more efficient evolutionary induction of Bayesian networks. In Proceedings of the Seventh International Conference on Parallel Problem Solving from Nature (PPSN VII), Lecture Notes in Computer Science 2439, 730–739. Springer.CrossRef Google Scholar

Cotta, C., Muruzábal, J. 2004. On the learning of Bayesian network graph structures via evolutionary programming. In Proceedings of the Second European Workshop on Probabilistic Graphical Models, Lucas, P. (ed.). Leiden, Netherlands, 65–72.Google Scholar

Cousins, S. B., Chena, W., Frisse, M. E. 1993. A tutorial introduction to stochastic simulation algorithms for belief networks. Artificial Intelligence in Medicine 5(4), 315–340.CrossRef Google Scholar PubMed

Cowell, R. 2001. Conditions under which conditional independence and scoring methods lead to identical selection of Bayesian network models. In Proceedings of the Seventeenth Conference on Uncertainty in Artificial Intelligence (UAI-01), Breese, J. & Koller, D. (eds). Morgan Kaufmann, 91–97.Google Scholar

Cowell, R. G., Dawid, A. P., Lauritzen, S. L., Spiegelhalter, D. J. 1999. Probabilistic Networks and Expert Systems. Statistics for Engineering and Information Science, Springer.Google Scholar

Cruz-Ramírez, N., Acosta-Mesa, H.-G., Barrientos-Martnez, R.-E., Nava-Fernández, L.-A. 2006. How good are the Bayesian information criterion and the minimum description length principle for selection? A Bayesian network analysis. In Proceedings of the Fifth Mexican International Conference on Artificial Intelligence (MICAI 2006), Lecture Notes in Artificial Intelligence 4293, 494–504. Springer.CrossRef Google Scholar

Dagum, P., Horvitz, E. 1993. A Bayesian analysis of simulation algorithms for inference in belief networks. Networks 23(5), 499–516.CrossRef Google Scholar

Dagum, P., Luby, M. 1993. Approximating probabilistic inference in Bayesian belief networks is NP-hard. Artificial Intelligence 60(1), 141–154.CrossRef Google Scholar

Dagum, P., Luby, M. 1997. An optimal approximation algorithm for Bayesian inference. Artificial Intelligence 93(1–2), 1–27.CrossRef Google Scholar

Dagum, P., Galper, A., Horvitz, E. 1992. Dynamic network models for forecasting. In Proceedings of the Eighth Conference on Uncertainty in Artificial Intelligence (UAI-92), Dubois, D., Wellman, M. P., D’Ambrosio, B. & Smets, P. (eds). Morgan Kaufmann, 41–48.Google Scholar

Daly, R., Shen, Q. 2009. Learning Bayesian network equivalence classes with ant colony optimization. Journal of Artificial Intelligence Research 35, 391–447.CrossRef Google Scholar

Daly, R., Shen, Q, Aitken, S. 2006. Speeding up the learning of equivalence classes of Bayesian network structures. In Proceedings of the Tenth IASTED International Conference on Artificial Intelligence and Soft Computing, del Pobil, A. P. (ed.). ACTA Press, 34–39.Google Scholar

Darwiche, A. 1995. Conditioning methods for exact and approximate inference in causal networks. In Proceedings of the Eleventh Conference on Uncertainty in Artificial Intelligence (UAI-95), Besnard, P. & Hanks, S. (eds). Morgan Kaufmann, 99–107.Google Scholar

Darwiche, A. 1998. Dynamic jointrees. In Proceedings of the Fourteenth Conference on Uncertainty in Artificial Intelligence (UAI-98), Cooper G. F. & Moral S. (eds). Morgan Kaufmann, 97–104.Google Scholar

Darwiche, A. 2001a. Decomposable negation normal form. Journal of the ACM 48(4), 608–647.CrossRef Google Scholar

Darwiche, A. 2001b. Recursive conditioning. Artificial Intelligence 126(1–2), 5–41.CrossRef Google Scholar

Darwiche, A. 2002. A logical approach to factoring belief networks. In Proceedings of the Eight International Conference on Principles of Knowledge Representation and Reasoning (KR-02), Fensel, D., Giunchiglia, F., McGuinness, D. L. & Williams, M.-A. (eds). Morgan Kaufmann, 409–420.Google Scholar

Darwiche, A. 2003. A differential approach to inference in Bayesian networks. Journal of the ACM 50(3), 280–305.CrossRef Google Scholar

Darwiche, A. 2009. Modeling and Reasoning with Bayesian Networks, Cambridge University Press.CrossRef Google Scholar

Dasgupta, S. 1997. The sample complexity of learning fixed-structure Bayesian networks. Machine Learning 29(2–3), 165–180.CrossRef Google Scholar

Dasgupta, S. 1999. Learning polytrees. In Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence (UAI-99), Prade, H. & Laskey, K. (eds). Morgan Kaufmann, 134–141.Google Scholar

Dash, D., Cooper, G. F. 2004. Model averaging for prediction with discrete Bayesian networks. Journal of Machine Learning Research 5, 1177–1203.Google Scholar

Dash, D., Druzdzel, M. J. 1999. A hybrid anytime algorithm for the construction of causal models from sparse data. In Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence (UAI-99), Prade, H., Laskey, K. (eds). Morgan Kaufmann, 142–149.Google Scholar

Dash, D., Druzdzel, M. 2003. A robust independence test for constraint-based learning of causal structure. In Proceedings of the Ninteenth Conference on Uncertainty in Artificial Intelligence, Meek, C. & Kjærulff, U. (eds). Morgan Kaufmann, 167–174.Google Scholar

de Campos, L. M. 2006. A scoring function for learning bayesian networks based on mutual information and conditional independence tests. Journal of Machine Learning Research 7, 2149–2187.Google Scholar

de Campos, L. M. 1998. Independency relationships and learning algorithms for singly connected networks. Journal of Experimental & Theoretical Artificial Intelligence 10(4), 511–549.CrossRef Google Scholar

de Campos, L. M., Castellano, J. G. 2007. Bayesian network learning algorithms using structural restrictions. International Journal of Approximate Reasoning 45(2), 233–254.CrossRef Google Scholar

de Campos, L. M., Huete, J. F. 1997. On the use of independence relationships for learning simplified belief networks. International Journal of Intelligent Systems 12(7), 495–522.3.0.CO;2-G>CrossRef Google Scholar

de Campos, L. M., Huete, J. F. 2000a. Approximating causal orderings for Bayesian networks using genetic algorithms and simulated annealing. In Proceedings of the Eight Conference on Information Processing and Management of Uncertainty in Knowledge-Based Systems, Madrid, Spain, 333–340.Google Scholar

de Campos, L. M., Huete, J. F. 2000b. A new approach for learning belief networks using independence criteria. International Journal of Approximate Reasoning 24(1), 11–37.CrossRef Google Scholar

de Campos, L. M., Puerta, J.M. 2001. Stochastic local and distributed search algorithms for learning belief networks. In Proceedings of the Third International Symposium on Adaptive Systems: Evolutionary Computation and Probabilistic Graphical Models, Ochoa, A., Mühlenbein, H., English, T. & Larrañaga, P. (eds). ICIMAF, 109–115.Google Scholar

de Campos, L. M., Fernández-Luna, J. M., Gámez, J. A., Puerta, J. M. 2002a. Ant colony optimization for learning Bayesian networks. International Journal of Approximate Reasoning 31(3), 291–311.CrossRef Google Scholar

de Campos, L. M., Fernández-Luna, J. M., Puerta, J. M. 2002b. Local search methods for learning Bayesian networks using a modified neighborhood in the space of DAGs. In Advances in Artificial Intelligence: Proceedings of the Eight Ibero-American Conference on AI (IBERAMIA 2002), Lecture Notes in Artificial Intelligence 2527, 182–192. Springer.CrossRef Google Scholar

de Campos, L. M., Gámez, J. A., Puerta, J. M. 2002c. Learning Bayesian networks by ant colony optimisation: searching in two different spaces. Mathware & Soft Computing 9(3), 251–268.Google Scholar

de Campos, L. M., Fernández-Luna, J. M., Puerta, J. M. 2003. An iterated local search algorithm for learning Bayesian networks with restarts based on conditional independence tests. International Journal of Intelligent Systems 18(2), 221–235.CrossRef Google Scholar

de Santana, A. L., Frances, C. R., Rocha, C. A., Carvalho, S. V., Vijaykumar, N. L., Rego, L. P., Costa, J. C. 2007a. Strategies for improving the modeling and interpretability of Bayesian networks. Data and Knowledge Engineering 63(1), 91–107.CrossRef Google Scholar

de Santana, A. L., Francês, C. R. L., Costa, J. C. W. 2007b. Algorithm for graphical Bayesian modeling based on multiple regressions. In Proceedings of the Sixth Mexican International Conference on Artificial Intelligence (MICAI 2007), Gelbukh, A. & Morales, Á. F. K. (eds). Lecture Notes in Artificial Intelligence 4827, 496–506. Springer.Google Scholar

Dean, T., Kanazawa, K. 1989. A model for reasoning about persistence and causation. Computational Intelligence 5(2), 142–150.CrossRef Google Scholar

Delaplace, A., Brouard, T., Cardot, H. 2006. Two evolutionary methods for learning Bayesian network structures. In Proceedings of the International Conference on Computational Intelligence and Security, Wang, Y., Cheung, Y.-M. & Liu, H. (eds). 1, 137–142. IEEE.Google Scholar

Dempster, A. P., Laird, N. M., Rubin, D. B. 1977. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society. Series B (Methodological) 39(1), 1–38.

desJardins, M., Rathod, P., Getoor, L. 2008. Learning structured Bayesian networks: combining abstraction hierarchies and tree-structured conditional probability tables. Computational Intelligence 24(1), 1–22.

Díez, F. J. 1996. Local conditioning in Bayesian networks. Artificial Intelligence 87(1–2), 1–20.

Díez, F. J., Mira, J. 1994. Distributed inference in Bayesian networks. Cybernetics and Systems 25(1), 39–61.

Dojer, N. 2006. Learning Bayesian networks does not have to be NP-hard. In Proceedings of the Thirty-First International Symposium on Mathematical Foundations of Computer Science, Lecture Notes in Computer Science 4162, 305–314. Springer.

Dor, D., Tarsi, M. 1992. A Simple Algorithm to Construct a Consistent Extension of a Partially Oriented Graph. Technical report R-185, Cognitive Systems Laboratory, Department of Computer Science, UCLA.

Draper, D., Hanks, S. 1994. Localized partial evaluation of belief networks. In Proceedings of the Tenth Conference on Uncertainty in Artificial Intelligence (UAI-94), de Mantaras, R. L. & Poole, D. (eds). Morgan Kaufmann, 170–177.

Druzdzel, M. J. 1994. Some properties of joint probability distributions. In Proceedings of the Tenth Conference on Uncertainty in Artificial Intelligence (UAI-94), de Mantaras, R. L. & Poole, D. (eds). Morgan Kaufmann, 187–194.

Druzdzel, M. J. 1996. Qualitative verbal explanations in Bayesian belief networks. Artificial Intelligence and Simulation of Behaviour Quarterly 94, 43–54.

Druzdzel, M. J., Simon, H. A. 1993. Causality in Bayesian belief networks. In Proceedings of the Ninth Conference on Uncertainty in Artificial Intelligence (UAI-93), Heckerman, D. & Mamdani, A. (eds). Morgan Kaufmann, 3–11.

Eaton, D., Murphy, K. 2007a. Bayesian structure learning using dynamic programming and MCMC. In Proceedings of the Twenty-third Annual Conference on Uncertainty in Artificial Intelligence (UAI-07), Parr, R. & van der Gaag, L. (eds). AUAI Press, 101–108.

Eaton, D., Murphy, K. 2007b. Exact Bayesian structure learning from uncertain interventions. In Proceedings of the Eleventh International Conference on Artificial Intelligence and Statistics 2, Journal of Machine Learning Research: Workshop and Conference Proceedings, Meila, M. & Shen, X. (eds). JMLR, 107–114.

Eberhardt, F., Glymour, C., Scheines, R. 2005. On the number of experiments sufficient and in the worst case necessary to identify all causal relations among N variables. In Proceedings of the Twenty-first Conference on Uncertainty in Artificial Intelligence (UAI-05), Bacchus, F. & Jaakkola, T. (eds). AUAI Press, 178–184.

Elidan, G., Friedman, N. 2001. Learning the dimensionality of hidden variables. In Proceedings of the Seventeenth Conference on Uncertainty in Artificial Intelligence (UAI-01), Breese, J. & Koller, D. (eds). Morgan Kaufmann, 144–151.

Elidan, G., Gould, S. 2008. Learning bounded treewidth Bayesian networks. Journal of Machine Learning Research 9, 2699–2731.

Elidan, G., Lotner, N., Friedman, N., Koller, D. 2001. Discovering hidden variables: a structure-based approach. In Advances in Neural Information Processing Systems 13, Leen, T. K., Dietterich, T. G. & Tresp, V. (eds). MIT Press, 479–485.

Elidan, G., Ninio, M., Friedman, N., Schuurmans, D. 2002. Data perturbation for escaping local maxima in learning. In Proceedings of the Eighteenth National Conference on Artificial Intelligence (AAAI-02), Dechter, R., Kearns, M. & Sutton, R. (eds). AAAI Press, 132–139.

Elidan, G., Nachman, I., Friedman, N. 2007. “Ideal parent” structure learning for continuous variable Bayesian networks. Journal of Machine Learning Research 8, 1799–1833.

Faulkner, E. 2007. K2GA: heuristically guided evolution of Bayesian network structures from data. In Proceedings of the IEEE Symposium on Computational Intelligence and Data Mining (CIDM 2007), IEEE, 18–25. doi: 10.1109/CIDM.2007.368847.

Feelders, A., van Straalen, R. 2007. Parameter learning for Bayesian networks with strict qualitative influences. In Advances in Intelligent Data Analysis VII: Proceedings of the Seventh International Symposium on Intelligent Data Analysis (IDA 2007), Berthold, M. R., Shawe-Taylor, J. & Lavrač, N. (eds). Lecture Notes in Computer Science 4723, 48–58. Springer.

Flesch, I., Lucas, P. 2007. Independence decomposition in dynamic Bayesian networks. In Proceedings of the Ninth European Conference on Symbolic and Quantitative Approaches to Reasoning with Uncertainty (ECSQARU 2007), Mellouli, K. (ed.). Lecture Notes in Artificial Intelligence 4724, 560–571. Springer.

Forbes, J., Huang, T., Kanazawa, K., Russell, S. 1995. The BATmobile: towards a Bayesian automated taxi. In Proceedings of the Fourteenth International Joint Conference on Artificial Intelligence (IJCAI 95), Mellish, C. S. (ed.). Morgan Kaufmann, 1878–1885.

Friedman, N. 1997. Learning belief networks in the presence of missing values and hidden variables. In Proceedings of the Fourteenth International Conference on Machine Learning (ICML ’97), Fisher, D. H. (ed.). Morgan Kaufmann, 125–133.

Friedman, N. 1998. The Bayesian structural EM algorithm. In Proceedings of the Fourteenth Conference on Uncertainty in Artificial Intelligence (UAI-98), Cooper, G. F. & Moral, S. (eds). Morgan Kaufmann, 129–138.

Friedman, N. 2004. Inferring cellular networks using probabilistic graphical models. Science 303(5679), 799–805.

Friedman, N., Getoor, L. 1999. Efficient learning using constrained sufficient statistics. In Proceedings of the Seventh International Workshop on Artificial Intelligence and Statistics, Heckerman, D. & Whittaker, J. (eds). Morgan Kaufmann.

Friedman, N., Goldszmidt, M. 1996a. Discretizing continuous attributes while learning Bayesian networks. In Proceedings of the Thirteenth International Conference on Machine Learning (ICML '96), Saitta, L. (ed.). Morgan Kaufmann, 157–165.

Friedman, N., Goldszmidt, M. 1996b. Learning Bayesian networks with local structure. In Proceedings of the Twelfth Conference on Uncertainty in Artificial Intelligence (UAI-96), Horvitz, E. & Jensen, F. (eds). Morgan Kaufmann, 252–262.

Friedman, N., Goldszmidt, M. 1997. Sequential update of Bayesian network structure. In Proceedings of the Thirteenth Conference on Uncertainty in Artificial Intelligence (UAI-97), Geiger, D. & Shenoy, P. P. (eds). Morgan Kaufmann, 165–174.

Friedman, N., Koller, D. 2000. Being Bayesian about network structure. In Proceedings of the Sixteenth Conference on Uncertainty in Artificial Intelligence (UAI-00), Boutilier, C. & Goldszmidt, M. (eds). Morgan Kaufmann, 201–210.

Friedman, N., Koller, D. 2003. Being Bayesian about network structure: a Bayesian approach to structure discovery in Bayesian networks. Machine Learning 50(1–2), 95–125.

Friedman, N., Yakhini, Z. 1996. On the sample complexity of learning Bayesian networks. In Proceedings of the Twelfth Conference on Uncertainty in Artificial Intelligence (UAI-96), Horvitz, E. & Jensen, F. (eds). Morgan Kaufmann, 274–282.

Friedman, N., Geiger, D., Goldszmidt, M. 1997. Bayesian network classifiers. Machine Learning 29(2–3), 131–163.

Friedman, N., Murphy, K., Russell, S. 1998. Learning the structure of dynamic probabilistic networks. In Proceedings of the Fourteenth Conference on Uncertainty in Artificial Intelligence (UAI-98), Cooper, G. F. & Moral, S. (eds). Morgan Kaufmann, 139–148.

Friedman, N., Goldszmidt, M., Wyner, A. 1999a. On the application of the Bootstrap for computing confidence measures on features of induced Bayesian networks. In Proceedings of the Seventh International Workshop on Artificial Intelligence and Statistics, Heckerman, D. & Whittaker, J. (eds). Morgan Kaufmann, 197–202.

Friedman, N., Goldszmidt, M., Wyner, A. 1999b. Data analysis with Bayesian networks: a bootstrap approach. In Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence (UAI-99), Prade, H. & Laskey, K. (eds). Morgan Kaufmann, 196–205.

Friedman, N., Nachman, I., Pe’er, D. 1999c. Learning Bayesian network structure from massive datasets: the “Sparse Candidate” algorithm. In Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence (UAI-99), Prade, H. & Laskey, K. (eds). Morgan Kaufmann, 206–215.

Friedman, N., Linial, M., Nachman, I., Pe’er, D. 2000. Using Bayesian networks to analyze expression data. Journal of Computational Biology 7(3/4), 601–620.

Fu, L. D. 2005. A Comparison of State-of-the-Art Algorithms for Learning Bayesian Network Structure from Continuous Data. Master’s thesis, Vanderbilt University.

Fung, R. M., Chang, K.-C. 1990. Weighing and integrating evidence for stochastic simulation in Bayesian networks. In Uncertainty in Artificial Intelligence 5, Henrion, M., Shachter, R., Kanal, L. & Lemmer, J. (eds). North-Holland, 209–219.

Fung, R. M., Crawford, S. L. 1990. Constructor: a system for the induction of probabilistic models. In Proceedings of the Eighth National Conference on Artificial Intelligence 2, AAAI Press, 762–769.

Gámez, J. A., Puerta, J. M. 2002. Searching for the best elimination sequence in Bayesian networks by using ant colony optimization. Pattern Recognition Letters 23(1–3), 261–277.

Gao, S., Xiao, Q., Pan, Q., Li, Q. 2007. Learning dynamic Bayesian networks structure based on Bayesian optimization algorithm. In Advances in Neural Networks: Proceedings of the Fourth International Symposium on Neural Networks (ISNN 2007), Lecture Notes in Computer Science Part II 4492, 424–431. Springer.

Geiger, D. 1998. Graphical models and exponential families. In Proceedings of the Fourteenth Conference on Uncertainty in Artificial Intelligence (UAI-98), Cooper, G. F. & Moral, S. (eds). Morgan Kaufmann, 156–165.

Geiger, D., Heckerman, D. 1994. Learning Gaussian Networks. Technical report MSR-TR-94-10, Microsoft Research.

Geiger, D., Heckerman, D. 1995. A characterization of the Dirichlet distribution with application to learning Bayesian networks. In Proceedings of the Eleventh Conference on Uncertainty in Artificial Intelligence (UAI-95), Besnard, P. & Hanks, S. (eds). Morgan Kaufmann, 196–207.

Geiger, D., Heckerman, D. 1997. A characterization of the Dirichlet distribution through global and local parameter independence. The Annals of Statistics 25(3), 1344–1369.

Geiger, D., Paz, A., Pearl, J. 1990. Learning causal trees from dependence information. In Proceedings of the Eighth National Conference on Artificial Intelligence (AAAI 1990), AAAI Press, 770–776.

Geiger, D., Heckerman, D., Meek, C. 1996. Asymptotic model selection for directed networks with hidden variables. In Proceedings of the Twelfth Conference on Uncertainty in Artificial Intelligence (UAI-96), Horvitz, E. & Jensen, F. (eds). Morgan Kaufmann, 283–290.

Geiger, D., Heckerman, D., King, H., Meek, C. 2001. Stratified exponential families: graphical models and model selection. The Annals of Statistics 29(2), 505–529.

Ghahramani, Z. 1998. Learning dynamic Bayesian networks. In Adaptive Processing of Sequences and Data Structures, Giles, C. L. & Gori, M. (eds). Lecture Notes in Artificial Intelligence 1387, 168–197. Springer.

Ghahramani, Z., Jordan, M. I. 1997. Factorial hidden Markov models. Machine Learning 29(2–3), 245–273.

Gillispie, S., Perlman, M. D. 2001. Enumerating Markov equivalence classes of acyclic digraph models. In Proceedings of the Seventeenth Conference on Uncertainty in Artificial Intelligence (UAI-01), Breese, J. & Koller, D. (eds). Morgan Kaufmann, 171–177.

Gillispie, S. B., Perlman, M. D. 2002. The size distribution for Markov equivalence classes of acyclic digraph models. Artificial Intelligence 141(1–2), 137–155.

Giudici, P., Castelo, R. 2003. Improving Markov chain Monte Carlo model search for data mining. Machine Learning 50(1–2), 127–158.

Giudici, P., Green, P. J. 1999. Decomposable graphical Gaussian model determination. Biometrika 86(4), 785–801.

Giudici, P., Green, P., Tarantola, C. 1999. Efficient model determination for discrete graphical models. Discussion paper 99-93, Department of Statistics, Athens University of Economics and Business.

Glymour, C., Cooper, G. F. (eds). 1999. Computation, Causation, & Discovery. The MIT Press.

Glymour, C., Scheines, R., Spirtes, P., Kelly, K. 1986. Discovering Causal Structure: Artificial Intelligence, Philosophy of Science and Statistical Modeling. Report CMU-PHIL-1, Department of Philosophy, Carnegie Mellon University.

Glymour, C., Scheines, R., Spirtes, P., Kelly, K. 1987. Discovering Causal Structure: Artificial Intelligence, Philosophy of Science, and Statistical Modeling. Academic Press.

Gold, E. M. 1967. Language identification in the limit. Information and Control 10(5), 447–474.

Goldenberg, A., Moore, A. 2004. Tractable learning of large Bayes net structures from sparse data. In Proceedings of the Twenty-first International Conference on Machine Learning (ICML 2004), Brodley, C. E. (ed.). ACM, 44–51.

Gou, K. X., Jun, G. X., Zhao, Z. 2007. Learning Bayesian network structure from distributed homogeneous data. In Proceedings of the Eighth ACIS International Conference on Software Engineering, Artificial Intelligence, Networking, and Parallel/Distributed Computing (SNPD 2007) 3, Feng, W. & Gao, F. (eds). IEEE, 250–254.

Greiner, R., Grove, A., Schuurmans, D. 1997. Learning Bayesian nets that perform well. In Proceedings of the Thirteenth Conference on Uncertainty in Artificial Intelligence (UAI-97), Geiger, D. & Shenoy, P. P. (eds). Morgan Kaufmann, 198–207.

Grzegorczyk, M., Husmeier, D. 2008. Improving the structure MCMC sampler for Bayesian networks by introducing a new edge reversal move. Machine Learning 71(2–3), 265–305.

Guo, H., Hsu, W. 2002. A survey of algorithms for real-time Bayesian network inference. In Papers from the AAAI Workshop on Real-Time Decision Support and Diagnosis Systems, Guo, H., Horvitz, E., Hsu, W. H. & Santos, E. Jr (eds). AAAI Press, 1–12.

Guo, Y.-Y., Wong, M.-L., Cai, Z.-H. 2006. A novel hybrid evolutionary algorithm for learning Bayesian networks from incomplete data. In Proceedings of the IEEE Congress on Evolutionary Computation (CEC 2006), 916–923.

Guyon, I., Aliferis, C., Cooper, G., Elisseeff, A., Pellet, J.-P., Spirtes, P., Statnikov, A. 2008. Design and analysis of the causation and prediction challenge. In Causation and Prediction Challenge (WCCI 2008), Lawrence, N. (ed.). 3, JMLR Workshop and Conference Proceedings, Journal of Machine Learning Research, 1–33.

Gyftodimos, E., Flach, P. A. 2004. Hierarchical Bayesian networks: an approach to classification and learning for structured data. In Methods and Applications of Artificial Intelligence: Proceedings of the Third Hellenic Conference on AI (SETN 2004), Lecture Notes in Artificial Intelligence 3025, 291–300. Springer.

Hausman, D. M., Woodward, J. 1999. Independence, invariance and the causal Markov condition. The British Journal for the Philosophy of Science 50(4), 521–583.

Hausman, D. M., Woodward, J. 2004. Modularity and the causal Markov condition: a restatement. The British Journal for the Philosophy of Science 55(1), 147–161.

He, Y.-B., Geng, Z. 2008. Active learning of causal networks with intervention experiments and optimal designs. Journal of Machine Learning Research 9, 2523–2547.

Heckerman, D. 1995a. A Bayesian Approach to Learning Causal Networks. Technical report MSR-TR-95-04, Microsoft Research.

Heckerman, D. 1995b. A Tutorial on Learning with Bayesian Networks. Technical report MSR-TR-95-06, Microsoft Research.

Heckerman, D. 2007. A Bayesian approach to learning causal networks. In Advances in Decision Analysis: from Foundations to Applications, Edwards, W. & Miles, R. F. Jr (eds). Chapter 11, Cambridge University Press, 202–220.

Heckerman, D., Breese, J. S. 1996. Causal independence for probability assessment and inference using Bayesian networks. IEEE Transactions on Systems, Man, and Cybernetics–Part A 26(6), 826–831.

Heckerman, D., Geiger, D. 1995. Likelihoods and Parameter Priors for Bayesian Networks. Technical report MSR-TR-95-54, Microsoft Research.

Heckerman, D. E., Horvitz, E. J., Nathwani, B. N. 1992. Toward normative expert systems: part I. The Pathfinder project. Methods of Information in Medicine 31(2), 90–105.

Heckerman, D., Geiger, D., Chickering, D. M. 1995. Learning Bayesian networks: the combination of knowledge and statistical data. Machine Learning 20(3), 197–243.

Heng, X.-C., Qin, Z., Wang, X.-H., Shao, L.-P. 2006. Research on learning Bayesian networks by particle swarm optimization. Information Technology Journal 5(3), 540–545.

Henrion, M. 1988. Propagating uncertainty in Bayesian networks by probabilistic logic sampling. In Uncertainty in Artificial Intelligence 2, Lemmer, J. F. & Kanal, L. N. (eds). North-Holland, 149–163.

Hernández, L. D., Moral, S., Salmerón, A. 1998. A Monte Carlo algorithm for probabilistic propagation in belief networks based on importance sampling and stratified simulation techniques. International Journal of Approximate Reasoning 18(1–2), 53–91.

Herskovits, E., Cooper, G. 1991. Kutató: an entropy-driven system for construction of probabilistic expert systems from data. In Uncertainty in Artificial Intelligence 6, Bonissone, P., Henrion, M., Kanal, L. & Lemmer, J. (eds). North-Holland, 54–62.

Hewawasam, R., Premaratne, K. 2007. Learning Bayesian network parameters from imperfect data: enhancements to the EM algorithm. In Intelligent Computing: Theory and Applications V, Proceedings of SPIE, Priddy, K. E. & Ertin, E. (eds). 6560, SPIE, 65600E-1–65600E-10. doi: 10.1117/12.719290.

Hoeting, J. A., Madigan, D., Raftery, A. E., Volinsky, C. T. 1999. Bayesian model averaging: a tutorial. Statistical Science 14(4), 382–417.

Hofmann, R., Tresp, V. 1996. Discovering structure in continuous variables using Bayesian networks. In Advances in Neural Information Processing Systems 8 (NIPS*1995), Touretzky, D. S., Mozer, M. C. & Hasselmo, M. E. (eds). The MIT Press, 500–506.

Holness, G. F. 2007. A direct measure for the efficacy of Bayesian network structures learned from data. In Proceedings of the Fifth International Conference on Machine Learning and Data Mining in Pattern Recognition (MLDM 2007), Lecture Notes in Artificial Intelligence 4571, 601–615. Springer.

Hsu, W. H., Guo, H., Perry, B. B., Stilson, J. A. 2002. A permutation genetic algorithm for variable ordering in learning Bayesian networks from data. In Proceedings of the Genetic and Evolutionary Computation Conference (GECCO 2002), Langdon, W. B. et al. (eds). Morgan Kaufmann, 383–390.

Huang, C., Darwiche, A. 1996. Inference in belief networks: a procedural guide. International Journal of Approximate Reasoning 15(3), 225–263.

Huang, K., Henrion, M. 1996. Efficient search-based inference for noisy-OR belief networks: TopEpsilon. In Proceedings of the Twelfth Conference on Uncertainty in Artificial Intelligence (UAI-96), Horvitz, E. & Jensen, F. (eds). Morgan Kaufmann, 325–331.

Huang, Y., Valtorta, M. 2006. Identifiability in causal Bayesian networks: a sound and complete algorithm. In Proceedings of the Twenty-First National Conference on Artificial Intelligence (AAAI-06) 2, AAAI Press, 1149–1154.

Huang, J., Pan, H., Wan, Y. 2005. An algorithm for cooperative learning of Bayesian network structure from data. In Proceedings of the Eighth International Conference on Computer Supported Cooperative Work in Design (CSCWD 2004), Lecture Notes in Computer Science 3168, 86–94. Springer.

Huete, J. F., de Campos, L. M. 1993. Learning causal polytrees. In Proceedings of the European Conference on Symbolic and Quantitative Approaches to Reasoning and Uncertainty (ECSQARU ’93), Clarke, M., Kruse, R. & Moral, S. (eds). Lecture Notes in Computer Science 747, 180–185. Springer.

Husmeier, D. 2003. Sensitivity and specificity of inferring genetic regulatory interactions from microarray experiments with dynamic Bayesian networks. Bioinformatics 19(17), 2271–2282.

Hwang, K.-B., Lee, J. W., Chung, S.-W., Zhang, B.-T. 2002. Construction of large-scale Bayesian networks by local to global search. In Trends in Artificial Intelligence: Proceedings of the Seventh Pacific Rim International Conference on Artificial Intelligence (PRICAI 2002), Lecture Notes in Artificial Intelligence 2417, 375–384. Springer.

Hwang, K.-B., Kim, B.-H., Zhang, B.-T. 2006. Learning hierarchical Bayesian networks for large-scale data analysis. In Proceedings of the Thirteenth International Conference on Neural Information Processing (ICONIP 2006), Lecture Notes in Computer Science 4232, 670–679. Springer.

Imoto, S., Goto, T., Miyano, S. 2002. Estimation of genetic networks and functional structures between genes by using Bayesian networks and nonparametric regression. In Proceedings of the Seventh Pacific Symposium on Biocomputing, Altman, R. B., Dunker, A. K., Hunter, L. & Klein, T. E. (eds). World Scientific, 175–186.

Jaakkola, T. S., Jordan, M. I. 1996. Computing upper and lower bounds on likelihoods in intractable networks. A.I. Memo 1571, Artificial Intelligence Lab, Massachusetts Institute of Technology.

Jaakkola, T. S., Jordan, M. I. 1997. Recursive algorithms for approximating probabilities in graphical models. In Advances in Neural Information Processing Systems 9 (NIPS*1996), Mozer, M., Jordan, M. I. & Petsche, T. (eds). The MIT Press, 487–493.

Jaakkola, T. S., Jordan, M. I. 1999a. Improving the mean field approximation via the use of mixture distributions. In Learning in Graphical Models, Jordan, M. I. (ed.). MIT Press, 163–174.

Jaakkola, T. S., Jordan, M. I. 1999b. Variational probabilistic inference and the QMR-DT network. Journal of Artificial Intelligence Research 10, 291–322.

Jensen, F. V., Jensen, F. 1994. Optimal junction trees. In Proceedings of the Tenth Conference on Uncertainty in Artificial Intelligence (UAI-94), de Mantaras, R. L. & Poole, D. (eds). Morgan Kaufmann, 360–366.

Jensen, F. V., Nielsen, T. D. 2007. Bayesian networks and decision graphs. Information Science and Statistics, 2nd edn. Springer.

Jensen, F. V., Lauritzen, S. L., Olesen, K. G. 1990a. Bayesian updating in causal probabilistic networks by local computations. Computational Statistics Quarterly 4, 269–282.

Jensen, F. V., Olesen, K. G., Andersen, S. K. 1990b. An algebra of Bayesian belief universes for knowledge-based systems. Networks 20(5), 637–659.

Jia, H., Liu, D., Chen, J., Liu, X. 2007. A hybrid approach for learning Markov equivalence classes of Bayesian network. In Proceedings of the Second International Conference on Knowledge Science, Engineering and Management (KSEM 2007), Lecture Notes in Artificial Intelligence 4798, 611–616. Springer.

Jitnah, N., Nicholson, A. E. 1999. Arc weights for approximate evaluation of dynamic belief networks. In Proceedings of the Twelfth Australian Joint Conference on Artificial Intelligence (AI’99), Foo, N. (ed.). Lecture Notes in Artificial Intelligence 1747, 393–404. Springer.

John, G., Langley, P. 1995. Estimating continuous distributions in Bayesian classifiers. In Proceedings of the Eleventh Conference on Uncertainty in Artificial Intelligence (UAI-95), Besnard, P. & Hanks, S. (eds). Morgan Kaufmann, 338–345.

Jonsson, A., Barto, A. 2007. Active learning of dynamic Bayesian networks in Markov decision processes. In Proceedings of the Seventh International Symposium on Abstraction, Reformulation, and Approximation (SARA 2007), Lecture Notes in Artificial Intelligence 4612, 273–284. Springer.

Jordan, M. I., Ghahramani, Z., Jaakkola, T. S., Saul, L. K. 1999. An introduction to variational methods for graphical models. Machine Learning 37(2), 183–233.

Jurgelenaite, R., Heskes, T. 2008. Learning symmetric causal independence models. Machine Learning 71(2–3), 133–153.

Kalisch, M., Bühlmann, P. 2007. Estimating high-dimensional directed acyclic graphs with the PC-algorithm. Journal of Machine Learning Research 8, 613–636.

Kanazawa, K., Koller, D., Russell, S. 1995. Stochastic simulation algorithms for dynamic probabilistic networks. In Proceedings of the Eleventh Conference on Uncertainty in Artificial Intelligence (UAI-95), Besnard, P. & Hanks, S. (eds). Morgan Kaufmann, 346–351.

Kayaalp, M., Cooper, G. F. 2002. A Bayesian network scoring metric that is based on globally uniform parameter priors. In Proceedings of the Eighteenth Conference on Uncertainty in Artificial Intelligence (UAI-02), Darwiche, A. & Friedman, N. (eds). Morgan Kaufmann, 251–258.

Kennedy, J., Eberhart, R. 1995. Particle swarm optimization. In Proceedings of the IEEE International Conference on Neural Networks 4, IEEE, 1942–1948. doi: 10.1109/ICNN.1995.488968.

Kennedy, J., Eberhart, R. C. 1997. A discrete binary version of the particle swarm optimization algorithm. In Proceedings of the IEEE International Conference on Systems, Man and Cybernetics 5, IEEE, 4104–4108. doi: 10.1109/ICSMC.1997.637339.

Kennett, R. J., Korb, K. B., Nicholson, A. E. 2001. Seabreeze prediction using Bayesian networks. In Proceedings of the Fifth Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining (PAKDD 2001), Lecture Notes in Artificial Intelligence 2035, 148–153. Springer.

Kim, J. H., Pearl, J. 1983. A computational model for causal and diagnostic reasoning in inference systems. In Proceedings of the Eighth International Joint Conference on Artificial Intelligence (IJCAI 83), Bundy, A. (ed.). William Kaufmann, 190–193.

Kim, K.-J., Cho, S.-B. 2006. Evolutionary aggregation and refinement of Bayesian networks. In Proceedings of the IEEE Congress on Evolutionary Computation (CEC 2006), IEEE, 1513–1520. doi: 10.1109/CEC.2006.1688488.

Kirkpatrick, S., Gelatt, C. D. Jr, Vecchi, M. P. 1983. Optimization by simulated annealing. Science 220(4598), 671–680.

Kjærulff, U. 1992a. A computational scheme for reasoning in dynamic probabilistic networks. In Proceedings of the Eighth Conference on Uncertainty in Artificial Intelligence (UAI-92), Dubois, D., Wellman, M. P., D’Ambrosio, B. & Smets, P. (eds). Morgan Kaufmann, 121–129.

Kjærulff, U. 1992b. Optimal decomposition of probabilistic networks by simulated annealing. Statistics and Computing 2(1), 7–17.

Kjærulff, U. 1993. Approximation of Bayesian Networks Through Edge Removals. Technical report IR-93-2007, Department of Mathematics and Computer Science, Aalborg University.

Kjærulff, U. 1994. Reduction of computational complexity in Bayesian networks through removal of weak dependences. In Proceedings of the Tenth Conference on Uncertainty in Artificial Intelligence (UAI-94), de Mantaras, R. L. & Poole, D. (eds). Morgan Kaufmann, 374–382.

Kjærulff, U. 1997. Nested junction trees. In Proceedings of the Thirteenth Conference on Uncertainty in Artificial Intelligence (UAI-97), Geiger, D. & Shenoy, P. P. (eds). Morgan Kaufmann, 294–301.

Kjærulff, U. B., Madsen, A. L. 2008. Bayesian networks and influence diagrams: a guide to construction and analysis. Information Science and Statistics, Jordan, M., Kleinberg, J. & Schölkopf, B. (eds). Springer.

Kočka, T., Castelo, R. 2001. Improved learning of Bayesian networks. In Proceedings of the Seventeenth Conference on Uncertainty in Artificial Intelligence (UAI-01), Breese, J. & Koller, D. (eds). Morgan Kaufmann, 269–276.

Kočka, T., Bouckaert, R. R., Studený, M. 2001. On the Inclusion Problem. Research report 2010, Institute of Information Theory and Automation, Prague.

Koivisto, M. 2006. Advances in exact Bayesian structure discovery in Bayesian networks. In Proceedings of the Twenty-second Annual Conference on Uncertainty in Artificial Intelligence (UAI-06), Dechter, R. & Richardson, T. (eds). AUAI Press, 241–248.

Koivisto, M., Sood, K. 2004. Exact Bayesian structure discovery in Bayesian networks. Journal of Machine Learning Research 5, 549–573.

Korb, K. B., Nicholson, A. E. 2004. Bayesian Artificial Intelligence. Series in Computer Science and Data Analysis, Chapman & Hall/CRC.

Korb, K. B., Nyberg, E. 2006. The power of intervention. Minds and Machines 16(3), 289–302.

Ku, H. H., Kullback, S. 1969. Approximating discrete probability distributions. IEEE Transactions on Information Theory 15(4), 444–447.

Kullback, S., Leibler, R. A. 1951. On information and sufficiency. The Annals of Mathematical Statistics 22(1), 79–86.

Kwoh, C. K., Gillies, D. F. 1996. Using hidden nodes in Bayesian networks. Artificial Intelligence 88(1–2), 1–38.

Lähdesmäki, H., Shmulevich, I. 2008. Learning the structure of dynamic Bayesian networks from time series and steady state measurements. Machine Learning 71(2–3), 185–217.

Lam, W. 1998. Bayesian network refinement via machine learning approach. IEEE Transactions on Pattern Analysis and Machine Intelligence 20(3), 240–251.

Lam, W., Bacchus, F. 1993. Using causal information and local measures to learn Bayesian networks. In Proceedings of the Ninth Conference on Uncertainty in Artificial Intelligence (UAI-93), Heckerman, D. & Mamdani, A. (eds). Morgan Kaufmann, 243–250.

Lam, W., Bacchus, F. 1994a. Learning Bayesian belief networks: an approach based on the MDL principle. Computational Intelligence 10(3), 269–293.

Lam, W., Bacchus, F. 1994b. Using new data to refine a Bayesian network. In Proceedings of the Tenth Conference on Uncertainty in Artificial Intelligence (UAI-94), de Mantaras, R. L. & Poole, D. (eds). Morgan Kaufmann, 383–390.

Larrañaga, P., Kuijpers, C. M. H., Murga, R. H., Yurramendi, Y. 1996a. Learning Bayesian network structures by searching for the best ordering with genetic algorithms. IEEE Transactions on Systems, Man and Cybernetics—Part A 26(4), 487–493.

Larrañaga, P., Poza, M., Yurramendi, Y., Murga, R. H., Kuijpers, C. M. H. 1996b. Structure learning of Bayesian networks by genetic algorithms: a performance analysis of control parameters. IEEE Transactions on Pattern Analysis and Machine Intelligence 18(9), 912–926.

Lauritzen, S. L. 1995. The EM algorithm for graphical association models with missing data. Computational Statistics & Data Analysis 19(2), 191–201.

Lauritzen, S. L., Spiegelhalter, D. J. 1988. Local computations with probabilities on graphical structures and their application to expert systems. Journal of the Royal Statistical Society. Series B (Methodological) 50(2), 157–224.

Lauritzen, S. L., Wermuth, N. 1989. Graphical models for associations between variables, some of which are qualitative and some quantitative. The Annals of Statistics 17(1), 31–57.

Leray, P., François, O. 2005. Bayesian network structural learning and incomplete data. In Proceedings of the International and Interdisciplinary Conference on Adaptive Knowledge Representation and Reasoning (AKRR 2005), Honkela, T., Könönen, V., Pöllä, M. & Simula, O. (eds). Espoo, Finland, 33–40.

Li, Z., D’Ambrosio, B. 1994. Efficient inference in Bayes networks as a combinatorial optimization problem. International Journal of Approximate Reasoning 11(1), 55–81.

Li, J., Wang, Z. J. 2009. Controlling the false discovery rate of the association/causality structure learned with the PC algorithm. Journal of Machine Learning Research 10, 475–514.

Li, X.-L., Wang, S.-C., He, X.-D. 2006. Learning Bayesian networks structures based on memory binary particle swarm optimization. In Proceedings of the Sixth International Conference on Simulated Evolution and Learning (SEAL 2006), Lecture Notes in Computer Science 4247, 568–574. Springer.

Li, G., Dai, H., Tu, Y. 2002. Linear causal model discovery using the MML criterion. In Proceedings of the IEEE International Conference on Data Mining (ICDM 2002), Kumar, V., Tsumoto, S., Zhong, N., Yu, P. S. & Wu, X. (eds). IEEE, 274–281. doi: 10.1109/ICDM.2002.1183913.

Liang, F., Zhang, J. 2009. Learning Bayesian networks for discrete data. Computational Statistics & Data Analysis 53(4), 865–876.

Lin, Y., Druzdzel, M. J. 1999. Stochastic sampling and search in belief updating algorithms for very large Bayesian networks. In Working Notes of the AAAI Spring Symposium on Search Techniques for Problem Solving under Uncertainty and Incomplete Information, Zhang, W. & Koenig, S. (eds). AAAI Press, 77–82.

Liu, F., Zhu, Q. 2007a. The max-relevance and min-redundancy greedy Bayesian network learning algorithm. In Bio-inspired Modeling of Cognitive Tasks: Proceedings of the Second International Work-Conference on the Interplay between Natural and Artificial Computation (IWINAC 2007), Lecture Notes in Computer Science 4527, 346–356. Springer, Part I.

Liu, F., Zhu, Q. 2007b. Max-relevance and min-redundancy greedy Bayesian network learning on high dimensional data. In Proceedings of the Third International Conference on Natural Computation (ICNC 2007), Lei, J., Yoo, J. & Zhang, Q. (eds). 1, IEEE, 217–221.

Liu, F., Tian, F., Zhu, Q. 2007a. Bayesian network structure ensemble learning. In Proceedings of the Third International Conference on Advanced Data Mining and Applications (ADMA 2007), Lecture Notes in Artificial Intelligence 4632, 454–465. Springer.

Liu, F., Tian, F., Zhu, Q. 2007b. An improved greedy Bayesian network learning algorithm on limited data. In Proceedings of the Seventeenth International Conference on Artificial Neural Networks (ICANN 2007), Lecture Notes in Computer Science 4668, 49–57. Springer, Part I.

Lucas, P. 2002. Restricted Bayesian network structure learning. In Proceedings of the First European Workshop on Probabilistic Graphical Models (PGM 2002), Gámez, J. A. & Salmerón, A. (eds). 117–126.

Lucas, P. J. F., van der Gaag, L. C., Abu-Hanna, A. 2004. Bayesian networks in biomedicine and health-care. Artificial Intelligence in Medicine 30(3), 201–214.

Madigan, D., Raftery, A. E. 1994. Model selection and accounting for model uncertainty in graphical models using Occam’s window. Journal of the American Statistical Association 89(428), 1535–1546.

Madigan, D., Raftery, A. E., York, J. C., Bradshaw, J. M., Almond, R. G. 1993. Strategies for graphical model selection. In Proceedings of the Fourth International Workshop on Artificial Intelligence and Statistics, Cheeseman, P. & Oldford, R. W. (eds). Fort Lauderdale, USA, 331–336.

Madigan, D., Gavrin, J., Raftery, A. E. 1994. Enhancing the Predictive Performance of Bayesian Graphical Models. Technical report 270, Department of Statistics, University of Washington.

Madigan, D., York, J., Allard, D. 1995. Bayesian graphical models for discrete data. International Statistical Review 63(2), 215–232.

Madigan, D., Andersson, S. A., Perlman, M. D., Volinsky, C. T. 1996. Bayesian model averaging and model selection for Markov equivalence classes of acyclic digraphs. Communications in Statistics - Theory and Methods 25(11), 2493–2519.

Madigan, D., Mosurski, K., Almond, R. G. 1997. Graphical explanation in belief networks. Journal of Computational and Graphical Statistics 6(2), 160–181.

Malvestuto, F. 1991. Approximating discrete probability distributions with decomposable models. IEEE Transactions on Systems, Man and Cybernetics 21(5), 1287–1294.

Mansinghka, V., Kemp, C., Griffiths, T., Tenenbaum, J. 2006. Structured priors for structure learning. In Proceedings of the Twenty-Second Annual Conference on Uncertainty in Artificial Intelligence (UAI-06), Dechter, R. & Richardson, T. (eds). AUAI Press, 324–331.

Margaritis, D. 2004. Distribution-free Learning of Graphical Model Structure in Continuous Domains. Technical report TR-ISU-CS-04-06, Department of Computer Science, Iowa State University.

Margaritis, D., Thrun, S. 2000. Bayesian network induction via local neighborhoods. In Advances in Neural Information Processing Systems 12 (NIPS*1999), Solla, S. A., Leen, T. K. & Müller, K.-R. (eds). The MIT Press, 505–511.

Mascherini, M., Stefanini, F. M. 2007. Using weak prior information on structures to learn Bayesian networks. In Proceedings of the Eleventh International Conference on Knowledge-based Intelligent Information and Engineering Systems (KES 2007), Lecture Notes in Artificial Intelligence 4692, 413–420. Springer, Part I.

Meek, C. 1995. Causal inference and causal explanation with background knowledge. In Proceedings of the Eleventh Conference on Uncertainty in Artificial Intelligence (UAI-95), Besnard, P. & Hanks, S. (eds). Morgan Kaufmann, 403–410.

Meek, C. 1997. Graphical Models: Selecting Causal and Statistical Models. PhD thesis, Department of Philosophy, Carnegie Mellon University, Pittsburgh, PA.

Meek, C., Heckerman, D. 1997. Structure and parameter learning for causal independence and causal interaction models. In Proceedings of the Thirteenth Conference on Uncertainty in Artificial Intelligence (UAI-97), Geiger, D. & Shenoy, P. P. (eds). Morgan Kaufmann, 366–375.

Meganck, S., Leray, P., Manderick, B. 2006. Learning causal Bayesian networks from observations and experiments: a decision theoretic approach. In Proceedings of the Third International Conference on Modeling Decisions for Artificial Intelligence (MDAI 2006), Lecture Notes in Computer Science 3885, 58–69. Springer.

Meilă, M., Jaakkola, T. 2006. Tractable Bayesian learning of tree belief networks. Statistics and Computing 16(1), 77–92.

Middleton, B., Shwe, M., Heckerman, D., Henrion, M., Horvitz, E., Lehmann, H., Cooper, G. 1991. Probabilistic diagnosis using a reformulation of the INTERNIST-1/QMR knowledge base: II. Evaluation of diagnostic performance. Methods of Information in Medicine 30(4), 256–267.

Miguel, I., Shen, Q. 2001. Solution techniques for constraint satisfaction problems: advanced approaches. Artificial Intelligence Review 15(4), 269–293.

Mondragón-Becerra, R., Cruz-Ramírez, N., Garcia-López, D. A., Gutiérrez-Fragoso, K., Luna-Ramírez, W. A., Ortiz-Hernández, G., Piña-García, C. A. 2006. Automatic construction of Bayesian network structures by means of a concurrent search mechanism. In Proceedings of the Fifth Mexican International Conference on Artificial Intelligence (MICAI 2006), Lecture Notes in Artificial Intelligence 4293, 652–662. Springer.

Monti, S., Cooper, G. F. 1996. Bounded recursive decomposition: a search-based method for belief-network inference under limited resources. International Journal of Approximate Reasoning 15(1), 49–75.

Monti, S., Cooper, G. F. 1997a. Learning Bayesian belief networks with neural network estimators. In Advances in Neural Information Processing Systems 9 (NIPS*1996), Mozer, M., Jordan, M. I. & Petsche, T. (eds). The MIT Press, 578–584.

Monti, S., Cooper, G. F. 1997b. Learning Hybrid Bayesian Networks from Data. Technical report ISSP-97-01, Intelligent Systems Program, University of Pittsburgh.

Monti, S., Cooper, G. F. 1998. A multivariate discretization method for learning Bayesian networks from mixed data. In Proceedings of the Fourteenth Conference on Uncertainty in Artificial Intelligence (UAI-98), Cooper, G. F. & Moral, S. (eds). Morgan Kaufmann, 404–413.

Moore, A., Lee, M. S. 1998. Cached sufficient statistics for efficient machine learning with large datasets. Journal of Artificial Intelligence Research 8, 67–91.

Moore, A., Wong, W.-K. 2003. Optimal reinsertion: a new search operator for accelerated and more accurate Bayesian network structure learning. In Proceedings of the Twentieth International Conference on Machine Learning, Fawcett, T. & Mishra, N. (eds). AAAI Press, 552–559.

Morales, M. M., Domínguez, R. G., Ramírez, N. C., Hernández, A. G., Andrade, J. L. J. 2004. A method based on genetic algorithms and fuzzy logic to induce Bayesian networks. In Proceedings of the Fifth Mexican International Conference in Computer Science (ENC ’04), Baeza-Yates, R., Marroquin, J. L. & Chávez, E. (eds). IEEE Computer Society, 176–180.

Munteanu, P., Bendou, M. 2001. The EQ framework for learning equivalence classes of Bayesian networks. In Proceedings of the 2001 IEEE International Conference on Data Mining, Cercone, N., Lin, T. Y. & Wu, X. (eds). IEEE Computer Society, 417–424.

Munteanu, P., Cau, D. 2000. Efficient score-based learning of equivalence classes of Bayesian networks. In Proceedings of the Fourth European Conference on the Principles of Data Mining and Knowledge Discovery (PKDD 2000), Zighed, D. A., Komorowski, H. J. & Zytkow, J. M. (eds). Lecture Notes in Computer Science 1910, 96–105. Springer.

Murphy, K. P. 2001. Active Learning of Causal Bayes Net Structure. Technical report, Department of Computer Science, University of California, Berkeley.

Murphy, K. P., Mian, S. 1999. Modelling Gene Expression Data Using Dynamic Bayesian Networks. Technical report, Computer Science Division, University of California, Berkeley.

Murphy, K. P., Weiss, Y., Jordan, M. I. 1999. Loopy belief propagation for approximate inference: an empirical study. In Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence (UAI-99), Prade, H. & Laskey, K. (eds). Morgan Kaufmann, 467–475.

Muruzábal, J., Cotta, C. 2004. A primer on the evolution of equivalence classes of Bayesian-network structures. In Proceedings of the 8th International Conference on Parallel Problem Solving from Nature - PPSN VIII, Yao, X., Burke, E., Lozano, J. A., Smith, J., Merelo-Guervós, J. J., Bullinaria, J. A., Rowe, J., Tiňo, P., Kabán, A. & Schwefel, H.-P. (eds). Lecture Notes in Computer Science 3242, 612–621. Springer.

Myers, J. W., Laskey, K. B., DeJong, K. A. 1999a. Learning Bayesian networks from incomplete data using evolutionary algorithms. In Proceedings of the Genetic and Evolutionary Computation Conference, Banzhaf, W., Daida, J., Eiben, A. E., Garzon, M. H., Honavar, V., Jakiela, M. & Smith, R. E. (eds). 1, Morgan Kaufmann, 458–465.

Myers, J. W., Laskey, K. B., Levitt, T. S. 1999b. Learning Bayesian networks from incomplete data with stochastic search algorithms. In Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence (UAI-99), Prade, H. & Laskey, K. (eds). Morgan Kaufmann, 476–484.

Nägele, A., Dejori, M., Stetter, M. 2007. Bayesian substructure learning - approximate learning of very large network structures. In Proceedings of the Eighteenth European Conference on Machine Learning (ECML 2007), Lecture Notes in Artificial Intelligence 4701, 238–249. Springer.

Neal, R. M. 1992. Connectionist learning of belief networks. Artificial Intelligence 56(1), 71–113.

Neal, R. M., Hinton, G. E. 1999. A view of the EM algorithm that justifies incremental, sparse, and other variants. In Learning in Graphical Models, Jordan, M. I. (ed.). MIT Press.

Neapolitan, R. E. 2004. Learning Bayesian Networks. Series in Artificial Intelligence. Prentice Hall.

Neil, J. R., Korb, K. B. 1999. The evolution of causal models: a comparison of Bayesian metrics and structure priors. In Proceedings of the Third Pacific-Asia Conference on Methodologies for Knowledge Discovery and Data Mining (PAKDD ’99), Lecture Notes in Artificial Intelligence 1574, 432–437. Springer.

Neil, J., Wallace, C., Korb, K. 1999. Learning Bayesian networks with restricted causal interactions. In Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence (UAI-99), Prade, H. & Laskey, K. (eds). Morgan Kaufmann, 486–493.

Nielsen, S. H., Nielsen, T. D. 2008. Adapting Bayes network structures to non-stationary domains. International Journal of Approximate Reasoning 49(2), 379–397.

Nielsen, J. D., Kočka, T., Peña, J. 2003. On local optima in learning Bayesian networks. In Proceedings of the Nineteenth Conference on Uncertainty in Artificial Intelligence, Meek, C. & Kjærulff, U. (eds). Morgan Kaufmann, 435–444.

Novobilski, A. J. 2003. The random selection and manipulation of legally encoded Bayesian networks in genetic algorithms. In Proceedings of the First International Conference on Artificial Intelligence (IC-AI ’03), Arabnia, H. R., Joshua, R. & Mun, Y. (eds). 1, CSREA Press, 438–443.

O’Donnell, R. T., Allison, L., Korb, K. B. 2006a. Learning hybrid Bayesian networks by MML. In Advances in Artificial Intelligence: Proceedings of the Nineteenth Australian Joint Conference on Artificial Intelligence (AI 2006), Lecture Notes in Artificial Intelligence 4304, 192–203. Springer.

O’Donnell, R. T., Nicholson, A. E., Han, B., Korb, K. B., Alam, M. J., Hope, L. R. 2006b. Causal discovery with prior information. In Proceedings of the Nineteenth Australian Joint Conference on Artificial Intelligence (AI 2006), Lecture Notes in Artificial Intelligence 4304, 1162–1167. Springer.

Ott, S., Miyano, S. 2003. Finding optimal gene networks using biological constraints. Genome Informatics 14, 124–133.

Ott, S., Imoto, S., Miyano, S. 2004. Finding optimal models for small gene networks. In Proceedings of the Ninth Pacific Symposium on Biocomputing, Altman, R. B., Dunker, A. K., Hunter, L., Jung, T. A. & Klein, T. E. (eds). World Scientific, 557–567.

Pakzad, P., Anantharam, V. 2002. Belief propagation and statistical physics. In Proceedings of the 2002 Conference on Information Sciences and Systems, Princeton University, USA.

Park, J. D., Darwiche, A. 2004. A differential semantics for jointree algorithms. Artificial Intelligence 156(2), 197–216.

Pearl, J. 1982. Reverend Bayes on inference engines: a distributed hierarchical approach. In Proceedings of the Second National Conference on Artificial Intelligence, Waltz, D. L. (ed.). The AAAI Press, 133–136.

Pearl, J. 1986a. A constraint-propagation approach to probabilistic reasoning. In Uncertainty in Artificial Intelligence, Kanal, L. N. & Lemmer, J. F. (eds). North-Holland, 357–369.

Pearl, J. 1986b. Fusion, propagation, and structuring in belief networks. Artificial Intelligence 29(3), 241–288.

Pearl, J. 1987. Evidential reasoning using stochastic simulation of causal models. Artificial Intelligence 32(2), 245–257.

Pearl, J. 1988. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Series in Representation and Reasoning, Morgan Kaufmann.

Pearl, J. 2000. Causality. Cambridge University Press.

Pearl, J., Verma, T. S. 1991. A theory of inferred causation. In Proceedings of the Second International Conference on Principles of Knowledge Representation and Reasoning, Allen, J. F., Fikes, R. & Sandewall, E. (eds). San Mateo, California: Morgan Kaufmann, 441–452.

Peng, H., Ding, C. 2003. Structure search and stability enhancement of Bayesian networks. In Proceedings of the Third IEEE International Conference on Data Mining (ICDM 2003), Wu, X., Tuzhilin, A. & Shavlik, J. (eds). IEEE Computer Society, 621–624. doi: 10.1109/ICDM.2003.1250992.

Peot, M. A., Shachter, R. D. 1991. Fusion and propagation with multiple observations in belief networks. Artificial Intelligence 48(3), 299–318.

Perlman, M. D. 2001. Graphical Model Search Via Essential Graphs. Technical report 367, Department of Statistics, University of Washington.

Perrier, E., Imoto, S., Miyano, S. 2008. Finding optimal Bayesian network given a super-structure. Journal of Machine Learning Research 9, 2251–2286.

Poole, D. 1993a. Average-case analysis of a search algorithm for estimating prior and posterior probabilities in Bayesian networks with extreme probabilities. In Proceedings of the Thirteenth International Joint Conference on Artificial Intelligence (IJCAI 93), Bajcsy, R. (ed.). Morgan Kaufmann, 606–612.

Poole, D. 1993b. The use of conflicts in searching Bayesian networks. In Proceedings of the Ninth Conference on Uncertainty in Artificial Intelligence (UAI-93), Heckerman, D. & Mamdani, A. (eds). Morgan Kaufmann, 359–367.

Poole, D. 1996. Probabilistic conflicts in a search algorithm for estimating posterior probabilities in Bayesian networks. Artificial Intelligence 88(1–2), 69–100.

Poole, D. 1997. Probabilistic partial evaluation: exploiting rule structure in probabilistic inference. In Proceedings of the Fifteenth International Joint Conference on Artificial Intelligence (IJCAI 97), Pollack, M. E. (ed.). Morgan Kaufmann, 1284–1291.

Poole, D. 1998. Context-specific approximation in probabilistic inference. In Proceedings of the Fourteenth Conference on Uncertainty in Artificial Intelligence (UAI-98), Cooper, G. F. & Moral, S. (eds). Morgan Kaufmann, 447–454.

Poole, D., Zhang, N. L. 2003. Exploiting contextual independence in probabilistic inference. Journal of Artificial Intelligence Research 18, 263–313.

Pourret, O., Naïm, P., Marcot, B. (eds). 2008. Bayesian Networks: A Practical Guide to Applications. Statistics in Practice, Wiley.

Pradhan, M., Dagum, P. 1996. Optimal Monte-Carlo estimation of belief network inference. In Proceedings of the Twelfth Conference on Uncertainty in Artificial Intelligence (UAI-96), Horvitz, E. & Jensen, F. (eds). Morgan Kaufmann, 446–453.

Provan, G. M., Singh, M. 1996. Learning Bayesian networks using feature selection. In Learning from Data: Artificial Intelligence and Statistics V, Fisher, D. & Lenz, H.-J. (eds). Lecture Notes in Statistics 112, 450–456. Springer.

Ramoni, M., Sebastiani, P. 1997a. Learning Bayesian Networks from Incomplete Databases. Technical report KMI-TR-43, Knowledge Media Institute, The Open University.

Ramoni, M., Sebastiani, P. 1997b. The use of exogenous knowledge to learn Bayesian networks from incomplete databases. In Proceedings of the Second International Symposium on Advances in Intelligent Data Analysis, Reasoning about Data (IDA ’97), Lecture Notes in Computer Science 1280, 537–548. Springer.

Ramoni, M., Sebastiani, P. 1999. Learning conditional probabilities from incomplete databases: an experimental comparison. In Proceedings of the Seventh International Workshop on Artificial Intelligence and Statistics, Heckerman, D. & Whittaker, J. (eds). Morgan Kaufmann.

Ramoni, M., Sebastiani, P. 2001. Robust learning with missing data. Machine Learning 45(2), 147–170.

Rebane, G., Pearl, J. 1987. The recovery of causal poly-trees from statistical data. In Uncertainty in Artificial Intelligence 3, Kanal, L. N., Levitt, T. S. & Lemmer, J. F. (eds). North-Holland, 175–182.

Richardson, T., Spirtes, P. 2002. Ancestral graph Markov models. The Annals of Statistics 30(4), 962–1030.

Riggelsen, C. 2008. Learning Bayesian networks: a MAP criterion for joint selection of model structure and parameter. In Proceedings of the Eighth IEEE International Conference on Data Mining (ICDM ’08), Giannotti, F., Gunopulos, D., Turini, F., Zaniolo, C., Ramakrishnan, N. & Wu, X. (eds). IEEE, 522–529.

Riggelsen, C., Feelders, A. 2005. Learning Bayesian network models from incomplete data using importance sampling. In Proceedings of the Tenth International Workshop on Artificial Intelligence and Statistics, Cowell, R. G. & Ghahramani, Z. (eds). Society for Artificial Intelligence and Statistics, 301–308.

Rissanen, J. 1978. Modeling by shortest data description. Automatica 14(5), 465–471.

Robinson, J. W., Hartemink, A. J. 2009. Non-stationary dynamic Bayesian networks. In Advances in Neural Information Processing Systems 21 (NIPS*2008), Koller, D., Schuurmans, D., Bengio, Y. & Bottou, L. (eds). The MIT Press, 1369–1376.

Russell, S. J., Binder, J., Koller, D., Kanazawa, K. 1995. Local learning in probabilistic networks with hidden variables. In Proceedings of the Fourteenth International Joint Conference on Artificial Intelligence (IJCAI 95), Mellish, C. S. (ed.). 2, Morgan Kaufmann, 1146–1152.

Sahin, F., Devasia, A. 2007. Distributed particle swarm optimization for structural Bayesian network learning. In Swarm Intelligence: Focus on Ant and Particle Swarm Optimization, Chan, F. T. S. & Tiwari, M. K. (eds). chapter 27, I-Tech Education and Publishing, Vienna, Austria, 505–532.

Sanscartier, M. J., Neufeld, E. 2007. Identifying hidden variables from context-specific independencies. In Proceedings of the Twentieth International Florida Artificial Intelligence Research Society Conference (FLAIRS 2007), Wilson, D. C. & Sutcliffe, G. C. J. (eds). AAAI Press, 472–477.

Santos, E. Jr, Shimony, S. E. 1998. Deterministic approximation of marginal probabilities in Bayes nets. IEEE Transactions on Systems, Man, and Cybernetics–Part A 28(4), 377–393.

Santos, E. Jr, Shimony, S. E., Williams, E. 1996. Sample-and-accumulate algorithms for belief updating in Bayes networks. In Proceedings of the Twelfth Conference on Uncertainty in Artificial Intelligence (UAI-96), Horvitz, E. & Jensen, F. (eds). Morgan Kaufmann, 477–484.

Santos, E. Jr, Shimony, S. E., Williams, E. 1997. Hybrid algorithms for approximate belief updating in Bayes nets. International Journal of Approximate Reasoning 17(2–3), 191–216.

Sarkar, S., Murthy, I. 1996. Constructing efficient belief network structures with expert provided information. IEEE Transactions on Knowledge and Data Engineering 8(1), 134–143.

Scheines, R., Spirtes, P., Glymour, C. 1991. Building Latent Variable Models. Technical report CMU-PHIL-19, Department of Philosophy, Carnegie Mellon University.

Schmidt, T., Shenoy, P. P. 1998. Some improvements to the Shenoy-Shafer and Hugin architectures for computing marginals. Artificial Intelligence 102(2), 323–333.

Schulte, O., Luo, W., Greiner, R. 2007. Mind change optimal learning of Bayes net structure. In Learning Theory: Proceedings of the Twentieth Annual Conference on Learning Theory (COLT 2007), Lecture Notes in Artificial Intelligence 4539, 187–202. Springer.

Shachter, R. D. 1986a. Evaluating influence diagrams. Operations Research 34(6), 871–882.

Shachter, R. D. 1986b. Intelligent probabilistic inference. In Uncertainty in Artificial Intelligence, Kanal, L. N. & Lemmer, J. F. (eds). North-Holland, 371–382.

Shachter, R. D. 1988. Probabilistic inference and influence diagrams. Operations Research 36(4), 589–604.

Shachter, R., Peot, M. 1990. Simulation approaches to general probabilistic inference on belief networks. In Uncertainty in Artificial Intelligence 5, Henrion, M., Shachter, R., Kanal, L. & Lemmer, J. (eds). North-Holland, 221–234.

Shachter, R., Andersen, S., Szolovits, P. 1994. Global conditioning for probabilistic inference in belief networks. In Proceedings of the Tenth Conference on Uncertainty in Artificial Intelligence (UAI-94), de Mantaras, R. L. & Poole, D. (eds). Morgan Kaufmann, 514–522.

Shafer, G. R., Shenoy, P. P. 1990. Probability propagation. Annals of Mathematics and Artificial Intelligence 2(1–4), 327–351.

Shaughnessy, P., Livingston, G. 2005. Evaluating the Causal Explanatory Value of Bayesian Network Structure Learning Algorithms. Research paper 2005-013, Department of Computer Science, University of Massachusetts Lowell.

Shenoy, P. P. 1997. Binary join trees for computing marginals in the Shenoy-Shafer architecture. International Journal of Approximate Reasoning 17(2–3), 239–263.

Shenoy, P. P., Shafer, G. 1990. Axioms for probability and belief-function propagation. In Readings in Uncertain Reasoning, Shafer, G. & Pearl, J. (eds). chapter 7, Morgan Kaufmann, 575–610.

Shimony, S. E., Santos, E. Jr 1996. Exploiting case-based independence for approximating marginal probabilities. International Journal of Approximate Reasoning 14(1), 25–54.

Shwe, M., Cooper, G. 1991. An empirical analysis of likelihood-weighting simulation on a large, multiply connected medical belief network. Computers and Biomedical Research 24(5), 453–475.

Shwe, M., Middleton, B., Heckerman, D., Henrion, M., Horvitz, E., Lehmann, H., Cooper, G. 1991. Probabilistic diagnosis using a reformulation of the INTERNIST-1/QMR knowledge base: I. The probabilistic model and inference algorithms. Methods of Information in Medicine 30(4), 241–255.

Silander, T., Myllymäki, P. 2006. A simple approach for finding the globally optimal Bayesian network structure. In Proceedings of the Twenty-second Annual Conference on Uncertainty in Artificial Intelligence (UAI-06), Dechter, R. & Richardson, T. (eds). AUAI Press, 445–452.

Silander, T., Kontkanen, P., Myllymäki, P. 2007. On sensitivity of the MAP Bayesian network structure to the equivalent sample size parameter. In Proceedings of the Twenty-third Conference on Uncertainty in Artificial Intelligence (UAI-07), AUAI Press, 360–367.

Singh, A. P., Moore, A. W. 2005. Finding Optimal Bayesian Networks by Dynamic Programming. Technical report CMU-CALD-05-106, School of Computer Science, Carnegie Mellon University.

Singh, M., Valtorta, M. 1993. An algorithm for the construction of Bayesian network structures from data. In Proceedings of the Ninth Conference on Uncertainty in Artificial Intelligence (UAI-93), Heckerman, D. & Mamdani, A. (eds). Morgan Kaufmann, 259–265.

Singh, M., Valtorta, M. 1995. Construction of Bayesian network structures from data: a brief survey and an efficient algorithm. International Journal of Approximate Reasoning 12(2), 111–131.

Smyth, P. 1997. Belief networks, hidden Markov models, and Markov random fields: a unifying view. Pattern Recognition Letters 18(11–13), 1261–1268.

Spiegelhalter, D. J. 1986. Probabilistic reasoning in predictive expert systems. In Uncertainty in Artificial Intelligence, Kanal, L. N. & Lemmer, J. F. (eds). North-Holland, 47–67.

Spiegelhalter, D. J., Lauritzen, S. L. 1990. Sequential updating of conditional probabilities on directed graphical structures. Networks 20(5), 579–605.

Spirtes, P. 1991. Detecting causal relations in the presence of unmeasured variables. In Proceedings of the Seventh Annual Conference on Uncertainty in Artificial Intelligence (UAI-91), Morgan Kaufmann, San Mateo, CA, 392–397.

Spirtes, P., Glymour, C. 1990a. An Algorithm for Fast Recovery of Sparse Causal Graphs. Report CMU-PHIL-15, Department of Philosophy, Carnegie Mellon University.

Spirtes, P., Glymour, C. 1990b. Causal Structure among Measured Variables Preserved with Unmeasured Variables. Report CMU-PHIL-14, Department of Philosophy, Carnegie Mellon University.

Spirtes, P., Glymour, C. 1991. An algorithm for fast recovery of sparse causal graphs. Social Science Computer Review 9(1), 62–72.

Spirtes, P., Meek, C. 1995. Learning Bayesian networks with discrete variables from data. In Proceedings of the First International Conference on Knowledge Discovery and Data Mining, Fayyad, U. M. & Uthurusamy, R. (eds). AAAI Press, 294–299.

Spirtes, P., Glymour, C., Scheines, R. 1989. Causality from Probability. Report CMU-PHIL-12, Department of Philosophy, Carnegie Mellon University.

Spirtes, P., Glymour, C., Scheines, R. 1990. From probability to causality. Philosophical Studies 64(1), 1–36.

Spirtes, P., Glymour, C., Scheines, R. 1993. Causation, Prediction and Search, 1st edn. Lecture Notes in Statistics 81, Springer.

Spirtes, P., Meek, C., Richardson, T. 1995. Causal inference in the presence of latent variables and selection bias. In Proceedings of the Eleventh Conference on Uncertainty in Artificial Intelligence (UAI-95), Besnard, P. & Hanks, S. (eds). Morgan Kaufmann, 499–506.

Spirtes, P., Glymour, C., Scheines, R. 2000. Causation, Prediction, and Search, 2nd edn. Adaptive Computation and Machine Learning, The MIT Press.

Srinivas, S. 1993. A generalization of the noisy-or model. In Proceedings of the Ninth Conference on Uncertainty in Artificial Intelligence (UAI-93), Heckerman, D. & Mamdani, A. (eds). Morgan Kaufmann, 208–218.CrossRef Google Scholar

Steck, H. 2000. On the use of skeletons when learning in Bayesian networks. In Proceedings of the Sixteenth Conference on Uncertainty in Artificial Intelligence (UAI-00), Boutilier, C. & Goldszmidt, M. (eds). Morgan Kaufmann, 558–565.Google Scholar

Steck, H. 2008. Learning the Bayesian network structure: Dirichlet prior vs data. In Proceedings of the Twenty-fourth Conference on Uncertainty in Artificial Intelligence (UAI-08), McAllester, D. A. & Myllymäki, P. (eds). AUAI Press, 511–518.

Steck, H., Jaakkola, T. S. 2002. Unsupervised active learning in large domains. In Proceedings of the Eighteenth Conference on Uncertainty in Artificial Intelligence (UAI-02), Darwiche, A. & Friedman, N. (eds). Morgan Kaufmann, 469–476.

Steck, H., Jaakkola, T. S. 2003a. On the Dirichlet prior and Bayesian regularization. In Advances in Neural Information Processing Systems 15 (NIPS*2002), Becker, S., Thrun, S. & Obermayer, K. (eds). The MIT Press, 697–704.

Steck, H., Jaakkola, T. S. 2003b. (Semi-)predictive discretization during model selection. AI Memo 2003-002, Artificial Intelligence Laboratory, Massachusetts Institute of Technology.

Steel, D. 2005. Indeterminism and the causal Markov condition. The British Journal for the Philosophy of Science 56(1), 3–26.

Steel, D. 2006. Comment on Hausman & Woodward on the causal Markov condition. The British Journal for the Philosophy of Science 57(1), 219–231.

Steinsky, B. 2003. Efficient coding of labeled directed acyclic graphs. Soft Computing 7(5), 350–356.

Suermondt, H. J., Cooper, G. F. 1988. Updating Probabilities in Multiply-Connected Belief Networks. Technical report SMI-88-0207, Medical Computer Science Group, Stanford University.

Suermondt, H. J., Cooper, G. F. 1990. Probabilistic inference in multiply connected belief networks using loop cutsets. International Journal of Approximate Reasoning 4(4), 283–306.

Suermondt, H. J., Cooper, G. F. 1991. Initialization for the method of conditioning in Bayesian belief networks. Artificial Intelligence 50(1), 83–94.

Suzuki, J. 1993. A construction of Bayesian networks from databases based on an MDL principle. In Proceedings of the Ninth Conference on Uncertainty in Artificial Intelligence (UAI-93), Heckerman, D. & Mamdani, A. (eds). Morgan Kaufmann, 266–273.

Suzuki, J. 1999. Learning Bayesian belief networks based on the MDL principle: an efficient algorithm using the branch and bound technique. IEICE Transactions on Information and Systems E82-D(2), 356–367.

Teyssier, M., Koller, D. 2005. Ordering-based search: a simple and effective algorithm for learning Bayesian networks. In Proceedings of the Twenty-first Conference on Uncertainty in Artificial Intelligence (UAI-05), Bacchus, F. & Jaakkola, T. (eds). AUAI Press, 584–590.

Thiesson, B. 1995. Accelerated quantification of Bayesian networks with incomplete data. In Proceedings of the First International Conference on Knowledge Discovery and Data Mining (KDD-95), Fayyad, U. M. & Uthurusamy, R. (eds). AAAI Press, 306–311.

Thiesson, B. 1997. Score and information for recursive exponential models with incomplete data. In Proceedings of the Thirteenth Conference on Uncertainty in Artificial Intelligence (UAI-97), Geiger, D. & Shenoy, P. P. (eds). Morgan Kaufmann, 453–463.

Thiesson, B., Meek, C., Chickering, D. M., Heckerman, D. 1998a. Learning Mixtures of Bayesian Networks. Technical report MSR-TR-97-30, Microsoft Research.

Thiesson, B., Meek, C., Chickering, D. M., Heckerman, D. 1998b. Learning Mixtures of DAG Models. Technical report MSR-TR-97-30, Microsoft Research.

Tian, F., Zhang, H., Lu, Y., Shi, C. 2001. Incremental learning of Bayesian networks with hidden variables. In Proceedings of the 2001 IEEE International Conference on Data Mining (ICDM 2001), Cercone, N., Lin, T. Y. & Wu, X. (eds). IEEE Computer Society, 651–652. doi: 10.1109/ICDM.2001.989594.

Tian, F., Zhang, H., Lu, Y. 2003. Learning Bayesian networks from incomplete data based on EMI method. In Proceedings of the Third IEEE Conference on Data Mining (ICDM 2003), Wu, X., Tuzhilin, A. & Shavlik, J. (eds). IEEE Computer Society, 323–330. doi: 10.1109/ICDM.2003.1250936.

Tian, F., Li, H., Wang, Z., Yu, J. 2007. Learning Bayesian networks based on a mutual information scoring function and EMI method. In Advances in Neural Networks: Proceedings of the Fourth International Symposium on Neural Networks (ISNN 2007), Part II, Lecture Notes in Computer Science 4492, 414–423. Springer.

Tong, S., Koller, D. 2001a. Active learning for parameter estimation in Bayesian networks. In Advances in Neural Information Processing Systems 13 (NIPS*2000), Leen, T. K., Dietterich, T. G. & Tresp, V. (eds). MIT Press, 647–653.

Tong, S., Koller, D. 2001b. Active learning for structure in Bayesian networks. In Proceedings of the Seventeenth International Joint Conference on Artificial Intelligence (IJCAI 01), Nebel, B. (ed.). Morgan Kaufmann, 863–869.

Tsamardinos, I., Aliferis, C. F., Statnikov, A. 2003a. Algorithms for large scale Markov blanket discovery. In Proceedings of the Sixteenth International FLAIRS Conference, Russell, I. & Haller, S. M. (eds). AAAI Press, 376–381.

Tsamardinos, I., Aliferis, C. F., Statnikov, A. 2003b. Time and sample efficient discovery of Markov blankets and direct causal relations. In Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD ’03), Getoor, L., Senator, T. E., Domingos, P. & Faloutsos, C. (eds). ACM, 673–678.

Tsamardinos, I., Aliferis, C. F., Statnikov, A., Brown, L. E. 2003c. Scaling-up Bayesian Network Learning to Thousands of Variables Using Local Learning Techniques. Technical report DSL-03-02, Department of Biomedical Informatics, Vanderbilt University, Nashville, Tennessee.

Tsamardinos, I., Brown, L. E., Aliferis, C. F. 2006. The max-min hill-climbing Bayesian network structure learning algorithm. Machine Learning 65(1), 31–78.

Tucker, A., Liu, X. 2004. Learning dynamic Bayesian networks from multivariate time series with changing dependencies. In Advances in Intelligent Data Analysis V: Proceedings of the Fifth International Symposium on Intelligent Data Analysis (IDA 2003), Lecture Notes in Computer Science 2810, 100–110. Springer.

Tucker, A., Liu, X. 1999. Extending evolutionary programming methods to the learning of dynamic Bayesian networks. In Proceedings of the Genetic and Evolutionary Computation Conference, vol. 1, Banzhaf, W., Daida, J., Eiben, A. E., Garzon, M. H., Honavar, V., Jakiela, M. & Smith, R. E. (eds). Morgan Kaufmann, 923–929.

Tucker, A., Liu, X., Ogden-Swift, A. 2001. Evolutionary learning of dynamic probabilistic models with large time lags. International Journal of Intelligent Systems 16(5), 621–646.

Valtorta, M., Huang, Y. 2008. Identifiability in causal Bayesian networks: a gentle introduction. Cybernetics and Systems 39(4), 425–442.

van Dijk, S., Thierens, D. 2004. On the use of a non-redundant encoding for learning Bayesian networks from data with a GA. In Proceedings of the Eighth International Conference on Parallel Problem Solving from Nature (PPSN VIII), Yao, X. et al. (eds). Lecture Notes in Computer Science 3242, 141–150. Springer.

van Dijk, S., Thierens, D., van der Gaag, L. C. 2003a. Building a GA from design principles for learning Bayesian networks. In Proceedings of the Genetic and Evolutionary Computation Conference, Part I, Lecture Notes in Computer Science 2723, 886–897. Springer.

van Dijk, S., van der Gaag, L. C., Thierens, D. 2003b. A skeleton-based approach to learning Bayesian networks from data. In Proceedings of the Seventh European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD 2003), Lavrač, N., Gamberger, D., Todorovski, L. & Blockeel, H. (eds). Lecture Notes in Artificial Intelligence 2838, 132–143. Springer.

van Engelen, R. A. 1997. Approximating Bayesian belief networks by arc removal. IEEE Transactions on Pattern Analysis and Machine Intelligence 19(8), 916–920.

Verma, T., Pearl, J. 1991. Equivalence and synthesis of causal models. In Uncertainty in Artificial Intelligence 6, Bonissone, P., Henrion, M., Kanal, L. & Lemmer, J. (eds). North-Holland, 255–268.

Verma, T., Pearl, J. 1992. An algorithm for deciding if a set of observed independencies has a causal explanation. In Proceedings of the Eighth Conference on Uncertainty in Artificial Intelligence (UAI-92), Dubois, D., Wellman, M. P., D’Ambrosio, B. & Smets, P. (eds). Morgan Kaufmann, 323–330.

Wallace, C. S., Boulton, D. M. 1968. An information measure for classification. The Computer Journal 11(2), 185–194.

Wallace, C. S., Korb, K. B. 1999. Learning linear causal models by MML sampling. In Causal Models and Intelligent Data Management, Gammerman, A. (ed.). Springer, 89–111.

Wallace, C. S., Korb, K. B., Dai, H. 1996. Causal discovery via MML. In Proceedings of the Thirteenth International Conference on Machine Learning (ICML ’96), Saitta, L. (ed.). Morgan Kaufmann, 516–524.

Wang, H., Yu, K., Yao, H. 2006. Learning dynamic Bayesian networks using evolutionary MCMC. In Proceedings of the International Conference on Computational Intelligence and Security, vol. 1, Wang, Y., Cheang, Y. & Liu, H. (eds). IEEE, 45–50.

Wang, M., Chen, Z., Cloutier, S. 2007. A hybrid Bayesian network learning method for constructing gene networks. Computational Biology and Chemistry 31(5–6), 361–372.

Watanabe, K., Shiga, M., Watanabe, S. 2009. Upper bound for variational free energy of Bayesian networks. Machine Learning 75(2), 199–215.

Weiss, Y. 2000. Correctness of local probability propagation in graphical models with loops. Neural Computation 12(1), 1–41.

Wellman, M. P., Liu, C.-L. 1994. State-space abstraction for anytime evaluation of probabilistic networks. In Proceedings of the Tenth Conference on Uncertainty in Artificial Intelligence (UAI-94), de Mantaras, R. L. & Poole, D. (eds). Morgan Kaufmann, 567–574.

Whittaker, J. 1990. Graphical Models in Applied Multivariate Statistics. Wiley.

Williamson, J. 2005. Bayesian Nets and Causality. Oxford University Press.

Wong, M. L., Guo, Y. Y. 2006. Discover Bayesian networks from incomplete data using a hybrid evolutionary algorithm. In Proceedings of the Sixth International Conference on Data Mining (ICDM ’06), Clifton, C. W., Zhong, N., Liu, J., Wah, B. W. & Wu, X. (eds). IEEE, 1146–1150.

Wong, M. L., Guo, Y. Y. 2008. Learning Bayesian networks from incomplete databases using a novel evolutionary algorithm. Decision Support Systems 45(2), 368–383.

Wong, M. L., Leung, K. S. 2004. An efficient data mining method for learning Bayesian networks using an evolutionary algorithm-based hybrid approach. IEEE Transactions on Evolutionary Computation 8(4), 378–404.

Wong, M. L., Lam, W., Leung, K. S. 1999. Using evolutionary programming and minimum description length principle for data mining of Bayesian networks. IEEE Transactions on Pattern Analysis and Machine Intelligence 21(2), 174–178.

Wong, M. L., Lee, S. Y., Leung, K. S. 2002. A hybrid approach to discover Bayesian networks from databases using evolutionary programming. In Proceedings of the 2002 IEEE International Conference on Data Mining (ICDM 2002), Kumar, V., Tsumoto, S., Zhong, N., Yu, P. S. & Wu, X. (eds). IEEE Computer Society, 498–505. doi: 10.1109/ICDM.2002.1183994.

Xiang, Y., Chu, T. 1999. Parallel learning of belief networks in large and difficult domains. Data Mining and Knowledge Discovery 3(3), 315–339.

Xiang, Y., Wong, S. K. M., Cercone, N. 1996. Critical remarks on single link search in learning belief networks. In Proceedings of the Twelfth Conference on Uncertainty in Artificial Intelligence (UAI-96), Horvitz, E. & Jensen, F. (eds). Morgan Kaufmann, 564–571.

Xie, X., Geng, Z. 2008. A recursive method for structural learning of directed acyclic graphs. Journal of Machine Learning Research 9, 459–483.

Xing-Chen, H., Zheng, Q., Lei, T., Li-Ping, S. 2007a. Learning Bayesian network structures with discrete particle swarm optimization algorithm. In Proceedings of the IEEE Symposium on Foundations of Computational Intelligence (FOCI 2007), Mendel, J. M., Omori, T. & Yao, X. (eds). IEEE, 47–52. doi: 10.1109/FOCI.2007.372146.

Xing-Chen, H., Zheng, Q., Lei, T., Li-Ping, S. 2007b. Research on structure learning of dynamic Bayesian networks by particle swarm optimization. In Proceedings of the IEEE Symposium on Artificial Life (ALIFE ’07), IEEE, 85–91.

Yedidia, J. S., Freeman, W. T., Weiss, Y. 2001. Generalized belief propagation. In Advances in Neural Information Processing Systems 13 (NIPS*2000), Leen, T. K., Dietterich, T. G. & Tresp, V. (eds). MIT Press, 689–695.

Yehezkel, R., Lerner, B. 2006. Bayesian network structure learning by recursive autonomy identification. In Proceedings of the Joint IAPR International Workshops on Structural, Syntactic, and Statistical Pattern Recognition (SSPR 2006 and SPR 2006), Lecture Notes in Computer Science 4109, 154–162. Springer.

Yu, K., Wang, H., Wu, X. 2007. A parallel algorithm for learning Bayesian networks. In Proceedings of the Eleventh Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining (PAKDD 2007), Lecture Notes in Artificial Intelligence 4426, 1055–1063. Springer.

Zhang, J. 2008. Causal reasoning with ancestral graphs. Journal of Machine Learning Research 9, 1437–1474.

Zhang, J., Spirtes, P. 2008. Detection of unfaithfulness and robust causal inference. Minds and Machines 18(2), 239–271.

Zhang, N. L. 1996. Irrelevance and parameter learning in Bayesian networks. Artificial Intelligence 88(1–2), 359–373.

Zhang, N. L., Poole, D. 1994a. Intercausal independence and heterogeneous factorization. In Proceedings of the Tenth Conference on Uncertainty in Artificial Intelligence (UAI-94), de Mantaras, R. L. & Poole, D. (eds). Morgan Kaufmann, 606–614.

Zhang, N. L., Poole, D. 1994b. A simple approach to Bayesian network computations. In Proceedings of the Tenth Biennial Conference of the Canadian Society for Computational Studies of Intelligence, Banff, Canada, 171–178.

Zhang, N. L., Poole, D. 1996. Exploiting causal independence in Bayesian network inference. Journal of Artificial Intelligence Research 5, 301–328.

Zhang, N. L., Yan, L. 1998. Independence of causal influence and clique tree propagation. International Journal of Approximate Reasoning 19(3–4), 335–349.

Ziegler, V. 2008. Approximation algorithms for restricted Bayesian network structures. Information Processing Letters 108(2), 60–63.

Zuk, O., Margel, S., Domany, E. 2006. On the number of samples needed to learn the correct structure of a Bayesian network. In Proceedings of the Twenty-second Annual Conference on Uncertainty in Artificial Intelligence (UAI-06), Dechter, R. & Richardson, T. (eds). AUAI Press, 560–567.
