Learning the Structure of Bayesian Networks from Incomplete Data Using a Mixture Model
DOI:
https://doi.org/10.31449/inf.v47i1.4497
Abstract
In this paper, we present an approach to learning optimal Bayesian network (BN) structures from incomplete data, based on the BIC score function and using a mixture model to handle missing values. We have compared the proposed approach with other methods. Our experiments have been conducted on different models, some of them Belief Noisy-Or (BNO) models. We have performed experiments using datasets with values missing completely at random, with different missingness rates and data sizes. We have analyzed the significance of the differences between the algorithms' performance levels using the Wilcoxon test. The new approach typically learns additional edges in the case of Belief Noisy-Or models. We have analyzed this issue using the Chi-square test of independence between the variables in the true models; this analysis reveals that the additional edges can be explained by strong dependence in the generated data. An important property of our new method for learning BNs from incomplete data is that it can learn not only optimal general BNs but also specific Belief Noisy-Or models, which are used in many applications, such as medical applications.
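As a rough illustration of the score the abstract refers to, the sketch below computes the standard BIC of a discrete Bayesian network on complete data: the maximised log-likelihood minus (log N / 2) times the number of free parameters. It is not the authors' method; in particular it does not include the mixture model used to handle missing values. The function name `bic_score` and its argument layout are assumptions made for this example only.

```python
import math
from collections import Counter


def bic_score(data, parents, cardinality):
    """BIC of a discrete Bayesian network on COMPLETE data (illustrative only).

    data:        list of dicts mapping variable name -> observed value
    parents:     dict mapping each variable -> tuple of its parent names
    cardinality: dict mapping each variable -> number of states
    """
    n = len(data)
    score = 0.0
    for var, pa in parents.items():
        # Counts of (parent configuration, value) and of parent configuration alone.
        joint = Counter((tuple(row[p] for p in pa), row[var]) for row in data)
        marginal = Counter(tuple(row[p] for p in pa) for row in data)
        # Maximised log-likelihood of this node given its parents.
        loglik = sum(c * math.log(c / marginal[cfg]) for (cfg, _), c in joint.items())
        # Free parameters: (states - 1) per parent configuration.
        n_params = (cardinality[var] - 1) * math.prod(cardinality[p] for p in pa)
        score += loglik - 0.5 * math.log(n) * n_params
    return score


# Toy data in which B always equals A, so the edge A -> B should be preferred.
data = [{"A": 0, "B": 0}] * 50 + [{"A": 1, "B": 1}] * 50
card = {"A": 2, "B": 2}
empty = bic_score(data, {"A": (), "B": ()}, card)
edge = bic_score(data, {"A": (), "B": ("A",)}, card)
print(edge > empty)
```

On this toy dataset the structure with the edge scores higher because the gain in log-likelihood outweighs the penalty for the extra parameter. A structure learner would search over many such candidate graphs and keep the best-scoring one.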
License
I assign to Informatica, An International Journal of Computing and Informatics ("Journal") the copyright in the manuscript identified above and any additional material (figures, tables, illustrations, software or other information intended for publication) submitted as part of or as a supplement to the manuscript ("Paper") in all forms and media throughout the world, in all languages, for the full term of copyright, effective when and if the article is accepted for publication. This transfer includes the right to reproduce and/or to distribute the Paper to other journals or digital libraries in electronic and online forms and systems.
I understand that I retain the rights to use the pre-prints, off-prints, accepted manuscript and published journal Paper for personal use, scholarly purposes and internal institutional use.
In certain cases, I can ask for retaining the publishing rights of the Paper. The Journal can permit or deny the request for publishing rights, to which I fully agree.
I declare that the submitted Paper is original, has been written by the stated authors and has not been published elsewhere nor is currently being considered for publication by any other journal and will not be submitted for such review while under review by this Journal. The Paper contains no material that violates proprietary rights of any other person or entity. I have obtained written permission from copyright owners for any excerpts from copyrighted works that are included and have credited the sources in my article. I have informed the co-author(s) of the terms of this publishing agreement.
Copyright © Slovenian Society Informatika