Closed Itemset Mining: A graph theory perspective

Authors

  • Fatima Zohra Lebbah Higher School of Electrical and Energetic Engineering of Oran (ESG2E)
  • Moussa Larbani School of Mathematics and Statistics, Carleton, Canada. National Higher School of Statistics and Applied Economics (ENSSEA), Pole Universitaire de Kolea, Tipaza, Algiers,Algeria.
  • Abdellatif Rahmoun Laboratory of Research in Computer Science-SBA, Computational Intelligence and Soft Computing Team (CISCO), Higher School of Computer Science, BP 73, Bureau de poste EL WIAM, Sidi-Belabbes, 22016, Algeria

DOI:

https://doi.org/10.31449/inf.v48i8.5480

Abstract

Data Mining is the field which targets the extraction and the analysis ofusable data from a large database. In this paper, we focus on the most studiedproblems in the field. That is finding closed frequent itemsets. Up to now,various graph theory techniques have been proposed to solve the frequentitemsets problem. Unfortunately, these techniques have the drawback ofneglecting the existing relationship between each pair of items that appearin the same transaction. In this paper, first of all, we present a scalable newmodeling approach which allows the representation of a transaction datasetby an undirected and labeled graph. The labels are astutely computed andproperly assigned to the bonds. Secondly, based on the clique notion ingraph theory, we propose polynomial and exact algorithm that computes allthe closed frequent itemsets. In terms of CPU-time and the memory usage,our first testing results show the efficiency of our algorithm compared torecent methods selected in the literature. Thus, the benefit of using ourgraph model is its ability to be simply extended when the correspondingdataset is updated.In other words, the proposed labeled graph incorporatesall the information contained in the transaction database. This is the strongaspect of the model, which can be used to investigate more challenging issuesrelated to the problem dealt within this paper.

Author Biographies

Fatima Zohra Lebbah, Higher School of Electrical and Energetic Engineering of Oran (ESG2E)

Department: Preparatory ClassesAssociate Professor

Moussa Larbani, School of Mathematics and Statistics, Carleton, Canada. National Higher School of Statistics and Applied Economics (ENSSEA), Pole Universitaire de Kolea, Tipaza, Algiers,Algeria.

National Higher School of Statistics and Applied Economics (ENSSEA). AlgeriaProfessor 

Abdellatif Rahmoun, Laboratory of Research in Computer Science-SBA, Computational Intelligence and Soft Computing Team (CISCO), Higher School of Computer Science, BP 73, Bureau de poste EL WIAM, Sidi-Belabbes, 22016, Algeria

Higher School of Computer Science, Sidi-Belabbes, Algeria.Full Proffessor

References

Raedt, L.D., Guns, T., Nijssen, S.: Constraint programming for data mining and

machine learning. In: Proceedings of the Twenty-Fourth AAAI Conference on

Artificial Intelligence. AAAI’10, vol. 24, pp. 1671–1675. AAAI Press, Atlanta-

Georgia (2010)

Han, J., Kamber, M., Pei, J.: Data Mining: Concepts and Techniques, 3rd edn.

Morgan Kaufmann Series in Data Management Systems. Morgan Kaufmann,

Amsterdam (2011). http://www.sciencedirect.com/science/book/9780123814791

Aggarwal, C.C.: Data Mining-The Textbook. Springer, BM T.J.Watson Research

Center, Yorktown Heights, New York, USA (2015). https://doi.org/10.1007/

-3-319-14142-8 . https://doi.org/10.1007/978-3-319-14142-8

Fournier-Viger, P., Lin, J.C.-W., Kiran, R.U., Koh, Y.S.: A survey of sequential

pattern mining. Data Science and Pattern Recognition 1(1), 54–77 (2017)

Gan, W., Lin, J.C., Fournier-Viger, P., Chao, H., Zhan, J.: Mining of frequent

patterns with multiple minimum supports. Eng. Appl. Artif. Intell. 60, 83–96

(2017) https://doi.org/10.1016/j.engappai.2017.01.009

Agrawal, R., Imieli’nski, T., Swami, A.: Mining association rules between sets of

items in large databases, vol. 22, pp. 207–216 (1993). https://doi.org/10.1145/

170072

De Raedt, L., Guns, T., Nijssen, S.: Constraint programming for itemset mining.

In: Proceedings of the 14th ACM SIGKDD International Conference on

Knowledge Discovery and Data Mining. KDD ’08, pp. 204–212. Association for

Computing Machinery, New York, NY, USA (2008). https://doi.org/10.1145/

1401919 . https://doi.org/10.1145/1401890.1401919

Guns, T., Nijssen, S., Raedt, L.D.: Itemset mining: A constraint programming perspective. Artif. Intell. 175(12-13), 1951–1983 (2011) https://doi.org/10.1016/

j.artint.2011.05.002

Leung, C.K.: Frequent itemset mining with constraints. In: Liu, L., ¨ Ozsu, M.T. MA (2009). https://doi.org/10.1007/978-0-387-39940-9 170 . https://doi.org/10.

/978-0-387-39940-9 170

Belaid, M.-B., Bessiere, C., Lazaar, N.: Constraint programming for mining borders

of frequent itemsets. In: Proceedings of the Twenty-Eighth International

Joint Conference on Artificial Intelligence, IJCAI-19, pp. 1064–1070. International

Joint Conferences on Artificial Intelligence Organization, Macao, China

(2019). https://doi.org/10.24963/ijcai.2019/149 . https://doi.org/10.24963/ijcai.

/149

Mazouri, F.-Z.E., Jabbour, S., Raddaoui, B., Sais, L., Abounaima, M.C., Zenkouar,

K.: Breaking symmetries in association rules. Procedia Computer Science

, 283–290 (2019) https://doi.org/10.1016/j.procs.2019.01.052 . THE SECOND

INTERNATIONAL CONFERENCE ON INTELLIGENT COMPUTING

IN DATA SCIENCES, ICDS2018

Chi, Y., Wang, H., Yu, P.S., Muntz, R.R.: Moment: maintaining closed frequent

itemsets over a stream sliding window. In: Fourth IEEE International Conference

on Data Mining (ICDM’04), pp. 59–66 (2004). https://doi.org/10.1109/ICDM.

10084

Schlegel, B., Gemulla, R., Lehner, W.: Memory-efficient frequent-itemset mining,

pp. 461–472 (2011). https://doi.org/10.1145/1951365.1951420

Tiwari, V., Tiwari, V., Gupta, S., Tiwari, R.: Association rule mining: A graph

based approach for mining frequent itemsets. In: 2010 International Conference

on Networking and Information Technology, pp. 309–313 (2010). https://doi.org/

1109/ICNIT.2010.5508505

Gouda, K., Zaki, M.: Efficiently mining maximal frequent itemsets, pp. 163–170

(2001). https://doi.org/10.1109/ICDM.2001.989514

Alzoubi, W.: An improved graph based method for extracting association rules.

International Journal of Software Engineering & Applications 6, 1–10 (2015)

https://doi.org/10.5121/ijsea.2015.6301

Fournier-Viger, P., Chun-Wei Lin, J., Truong-Chi, T., Nkambou, R.: In: Fournier-

Viger, P., Lin, J.C.-W., Nkambou, R., Vo, B., Tseng, V.S. (eds.) A Survey of

High Utility Itemset Mining, pp. 1–45. Springer, Cham (2019). https://doi.org/

1007/978-3-030-04921-8 1 . https://doi.org/10.1007/978-3-030-04921-8 1

Jabbour, S., Khiari, M., Sais, L., Salhi, Y., Tabia, K.: Symmetry-based pruning in itemset mining. In: 25th IEEE International Conference on Tools with Artificial Intelligence, ICTAI 2013, Herndon, VA, USA, November 4-6, 2013, pp. 483–490. IEEE Computer Society, Los Alamitos, CA, USA (2013). https://doi.org/10.1109/ICTAI.2013.78 .

Agrawal, R., Srikant, R., et al.: Fast algorithms for mining association rules. In: Proc. 20th Int. Conf. Very Large Data Bases, VLDB, vol. 1215, pp. 487–499

(1994)

Le, T., Vo, B.: An n-list-based algorithm for mining frequent closed patterns. Expert Systems with Applications 42(19), 6648–6657 (2015) https://doi.org/10.1016/j.eswa.2015.04.048

Aryabarzan, N., Minaei-Bidgoli, B.: Neclatclosed: A vertical algorithm for mining frequent closed itemsets. Expert Syst. Appl. 174, 114738 (2021) https://doi.org/10.1016/j.eswa.2021.114738

Ledmi, M., Zidat, S., Hamdi-Cherif, A.: GrAFCI+ a fast generator-based algorithm for mining frequent closed itemsets. Knowl. Inf. Syst. 63(7), 1873–1908

(2021) https://doi.org/10.1007/s10115-021-01575-3

Grahne, G., Zhu, J.: High performance mining of maximal frequent itemsets. In: 6th International Workshop on High Performance Data Mining, vol. 16, p. 34 (2003)

Szathmary, L.: Symbolic data mining methods with the coron platform. PhD thesis, Universit´e Henri Poincar´e-Nancy 1 (2006)

Downloads

Published

2024-05-24

How to Cite

Lebbah, F. Z., Larbani, M., & Rahmoun, A. (2024). Closed Itemset Mining: A graph theory perspective. Informatica, 48(8). https://doi.org/10.31449/inf.v48i8.5480