Hybrid Variable-Length Spider Monkey Optimization with Good-Point Set Initialization for Data Clustering
DOI:
https://doi.org/10.31449/inf.v47i8.4872
Abstract
Data clustering groups data points that are similar in their patterns or characteristics. It is used for purposes such as image analysis, pattern recognition, and data mining. The k-means algorithm, which is commonly used for clustering, has limitations: the number of clusters must be specified in advance, and the result is sensitive to the initial center points. To address these limitations, this study proposes a novel method that determines the optimal number of clusters and the initial centroids using a variable-length spider monkey optimization algorithm (VLSMO) with a proposed hybrid measure. Experiments on real-life datasets show that VLSMO performs better than standard k-means in terms of accuracy and clustering capacity.
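The two k-means limitations named in the abstract can be illustrated with a minimal sketch. This is plain Lloyd-style k-means in pure Python, not the paper's VLSMO method. The function names `kmeans` and `sse`, the toy points, and the use of the sum of squared errors as a quality score are assumptions made for illustration only; the abstract does not define the paper's hybrid measure.

```python
import math


def kmeans(points, init_centroids, max_iter=100):
    """Plain k-means. The number of clusters is fixed by len(init_centroids),
    so both k and the starting centers must be supplied by the caller."""
    centroids = [tuple(c) for c in init_centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                new_centroids.append(old)  # leave an empty cluster's center in place
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return centroids, labels


def sse(points, centroids, labels):
    """Sum of squared distances from each point to its assigned centroid
    (an illustrative score, not the paper's hybrid measure)."""
    return sum(math.dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))


# Toy data (assumed): two tight pairs, far apart along the x-axis.
points = [(0, 0), (0, 1), (4, 0), (4, 1)]

# Same data, same k = 2, different initial centers.
good_centroids, good_labels = kmeans(points, [(0, 0), (4, 0)])
bad_centroids, bad_labels = kmeans(points, [(2, 0), (2, 1)])

print(good_labels, sse(points, good_centroids, good_labels))
print(bad_labels, sse(points, bad_centroids, bad_labels))
```

With the first initialization the algorithm separates the left pair from the right pair. With the second it converges immediately to a split along the other axis, which has a much larger error. In both runs the caller had to choose k = 2 beforehand. The paper's approach targets these two inputs by searching over candidate solutions of varying length.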
License
Copyright © Slovenian Society Informatika