Closed Itemset Mining: A graph theory perspective
DOI:
https://doi.org/10.31449/inf.v48i8.5480Abstract
Data Mining is the field which targets the extraction and the analysis ofusable data from a large database. In this paper, we focus on the most studiedproblems in the field. That is finding closed frequent itemsets. Up to now,various graph theory techniques have been proposed to solve the frequentitemsets problem. Unfortunately, these techniques have the drawback ofneglecting the existing relationship between each pair of items that appearin the same transaction. In this paper, first of all, we present a scalable newmodeling approach which allows the representation of a transaction datasetby an undirected and labeled graph. The labels are astutely computed andproperly assigned to the bonds. Secondly, based on the clique notion ingraph theory, we propose polynomial and exact algorithm that computes allthe closed frequent itemsets. In terms of CPU-time and the memory usage,our first testing results show the efficiency of our algorithm compared torecent methods selected in the literature. Thus, the benefit of using ourgraph model is its ability to be simply extended when the correspondingdataset is updated.In other words, the proposed labeled graph incorporatesall the information contained in the transaction database. This is the strongaspect of the model, which can be used to investigate more challenging issuesrelated to the problem dealt within this paper.References
Raedt, L.D., Guns, T., Nijssen, S.: Constraint programming for data mining and
machine learning. In: Proceedings of the Twenty-Fourth AAAI Conference on
Artificial Intelligence. AAAI’10, vol. 24, pp. 1671–1675. AAAI Press, Atlanta-
Georgia (2010)
Han, J., Kamber, M., Pei, J.: Data Mining: Concepts and Techniques, 3rd edn.
Morgan Kaufmann Series in Data Management Systems. Morgan Kaufmann,
Amsterdam (2011). http://www.sciencedirect.com/science/book/9780123814791
Aggarwal, C.C.: Data Mining-The Textbook. Springer, BM T.J.Watson Research
Center, Yorktown Heights, New York, USA (2015). https://doi.org/10.1007/
-3-319-14142-8 . https://doi.org/10.1007/978-3-319-14142-8
Fournier-Viger, P., Lin, J.C.-W., Kiran, R.U., Koh, Y.S.: A survey of sequential
pattern mining. Data Science and Pattern Recognition 1(1), 54–77 (2017)
Gan, W., Lin, J.C., Fournier-Viger, P., Chao, H., Zhan, J.: Mining of frequent
patterns with multiple minimum supports. Eng. Appl. Artif. Intell. 60, 83–96
(2017) https://doi.org/10.1016/j.engappai.2017.01.009
Agrawal, R., Imieli’nski, T., Swami, A.: Mining association rules between sets of
items in large databases, vol. 22, pp. 207–216 (1993). https://doi.org/10.1145/
170072
De Raedt, L., Guns, T., Nijssen, S.: Constraint programming for itemset mining.
In: Proceedings of the 14th ACM SIGKDD International Conference on
Knowledge Discovery and Data Mining. KDD ’08, pp. 204–212. Association for
Computing Machinery, New York, NY, USA (2008). https://doi.org/10.1145/
1401919 . https://doi.org/10.1145/1401890.1401919
Guns, T., Nijssen, S., Raedt, L.D.: Itemset mining: A constraint programming perspective. Artif. Intell. 175(12-13), 1951–1983 (2011) https://doi.org/10.1016/
j.artint.2011.05.002
Leung, C.K.: Frequent itemset mining with constraints. In: Liu, L., ¨ Ozsu, M.T. MA (2009). https://doi.org/10.1007/978-0-387-39940-9 170 . https://doi.org/10.
/978-0-387-39940-9 170
Belaid, M.-B., Bessiere, C., Lazaar, N.: Constraint programming for mining borders
of frequent itemsets. In: Proceedings of the Twenty-Eighth International
Joint Conference on Artificial Intelligence, IJCAI-19, pp. 1064–1070. International
Joint Conferences on Artificial Intelligence Organization, Macao, China
(2019). https://doi.org/10.24963/ijcai.2019/149 . https://doi.org/10.24963/ijcai.
/149
Mazouri, F.-Z.E., Jabbour, S., Raddaoui, B., Sais, L., Abounaima, M.C., Zenkouar,
K.: Breaking symmetries in association rules. Procedia Computer Science
, 283–290 (2019) https://doi.org/10.1016/j.procs.2019.01.052 . THE SECOND
INTERNATIONAL CONFERENCE ON INTELLIGENT COMPUTING
IN DATA SCIENCES, ICDS2018
Chi, Y., Wang, H., Yu, P.S., Muntz, R.R.: Moment: maintaining closed frequent
itemsets over a stream sliding window. In: Fourth IEEE International Conference
on Data Mining (ICDM’04), pp. 59–66 (2004). https://doi.org/10.1109/ICDM.
10084
Schlegel, B., Gemulla, R., Lehner, W.: Memory-efficient frequent-itemset mining,
pp. 461–472 (2011). https://doi.org/10.1145/1951365.1951420
Tiwari, V., Tiwari, V., Gupta, S., Tiwari, R.: Association rule mining: A graph
based approach for mining frequent itemsets. In: 2010 International Conference
on Networking and Information Technology, pp. 309–313 (2010). https://doi.org/
1109/ICNIT.2010.5508505
Gouda, K., Zaki, M.: Efficiently mining maximal frequent itemsets, pp. 163–170
(2001). https://doi.org/10.1109/ICDM.2001.989514
Alzoubi, W.: An improved graph based method for extracting association rules.
International Journal of Software Engineering & Applications 6, 1–10 (2015)
https://doi.org/10.5121/ijsea.2015.6301
Fournier-Viger, P., Chun-Wei Lin, J., Truong-Chi, T., Nkambou, R.: In: Fournier-
Viger, P., Lin, J.C.-W., Nkambou, R., Vo, B., Tseng, V.S. (eds.) A Survey of
High Utility Itemset Mining, pp. 1–45. Springer, Cham (2019). https://doi.org/
1007/978-3-030-04921-8 1 . https://doi.org/10.1007/978-3-030-04921-8 1
Jabbour, S., Khiari, M., Sais, L., Salhi, Y., Tabia, K.: Symmetry-based pruning in itemset mining. In: 25th IEEE International Conference on Tools with Artificial Intelligence, ICTAI 2013, Herndon, VA, USA, November 4-6, 2013, pp. 483–490. IEEE Computer Society, Los Alamitos, CA, USA (2013). https://doi.org/10.1109/ICTAI.2013.78 .
Agrawal, R., Srikant, R., et al.: Fast algorithms for mining association rules. In: Proc. 20th Int. Conf. Very Large Data Bases, VLDB, vol. 1215, pp. 487–499
(1994)
Le, T., Vo, B.: An n-list-based algorithm for mining frequent closed patterns. Expert Systems with Applications 42(19), 6648–6657 (2015) https://doi.org/10.1016/j.eswa.2015.04.048
Aryabarzan, N., Minaei-Bidgoli, B.: Neclatclosed: A vertical algorithm for mining frequent closed itemsets. Expert Syst. Appl. 174, 114738 (2021) https://doi.org/10.1016/j.eswa.2021.114738
Ledmi, M., Zidat, S., Hamdi-Cherif, A.: GrAFCI+ a fast generator-based algorithm for mining frequent closed itemsets. Knowl. Inf. Syst. 63(7), 1873–1908
(2021) https://doi.org/10.1007/s10115-021-01575-3
Grahne, G., Zhu, J.: High performance mining of maximal frequent itemsets. In: 6th International Workshop on High Performance Data Mining, vol. 16, p. 34 (2003)
Szathmary, L.: Symbolic data mining methods with the coron platform. PhD thesis, Universit´e Henri Poincar´e-Nancy 1 (2006)
Downloads
Published
How to Cite
Issue
Section
License
I assign to Informatica, An International Journal of Computing and Informatics ("Journal") the copyright in the manuscript identified above and any additional material (figures, tables, illustrations, software or other information intended for publication) submitted as part of or as a supplement to the manuscript ("Paper") in all forms and media throughout the world, in all languages, for the full term of copyright, effective when and if the article is accepted for publication. This transfer includes the right to reproduce and/or to distribute the Paper to other journals or digital libraries in electronic and online forms and systems.
I understand that I retain the rights to use the pre-prints, off-prints, accepted manuscript and published journal Paper for personal use, scholarly purposes and internal institutional use.
In certain cases, I can ask for retaining the publishing rights of the Paper. The Journal can permit or deny the request for publishing rights, to which I fully agree.
I declare that the submitted Paper is original, has been written by the stated authors and has not been published elsewhere nor is currently being considered for publication by any other journal and will not be submitted for such review while under review by this Journal. The Paper contains no material that violates proprietary rights of any other person or entity. I have obtained written permission from copyright owners for any excerpts from copyrighted works that are included and have credited the sources in my article. I have informed the co-author(s) of the terms of this publishing agreement.
Copyright © Slovenian Society Informatika