Hyperparameter Optimization for Convolutional Neural Networks using the Salp Swarm Algorithm
DOI: https://doi.org/10.31449/inf.v47i9.5148
Abstract
Convolutional neural networks (CNNs) have performed exceptionally well across a range of computer vision tasks. However, their effectiveness depends heavily on the careful selection of hyperparameters. Optimizing these hyperparameters can be challenging and time-consuming, especially with large datasets and complex network architectures. In response, we propose a novel approach to hyperparameter optimization in CNNs using the Salp Swarm Algorithm (SSA). Inspired by the swarming behavior of salps, chain-forming marine tunicates, SSA mimics the collective behavior that governs their foraging and navigation. Taking advantage of these properties, our approach explores the hyperparameter space to identify the configuration that maximizes CNN performance. This paper presents the architecture of the SSA-based framework for hyperparameter optimization and compares it with established optimization techniques such as Particle Swarm Optimization (PSO) and the Genetic Algorithm (GA). We also present experimental results on the MNIST dataset, where the method achieves a classification accuracy of 99.46%. We further evaluate the scalability and robustness of the proposed method across different CNN architectures. This case study contributes to deep learning and hyperparameter optimization by demonstrating the effectiveness of SSA in optimizing CNNs, and it benefits researchers and practitioners seeking well-tuned hyperparameter configurations for CNNs in a variety of computer vision applications. The insights gained highlight SSA's potential for addressing the challenges of hyperparameter optimization.
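As a rough illustration of the kind of SSA search loop the abstract describes, the following minimal Python sketch applies the basic update rules of the Salp Swarm Algorithm (a leader that moves around the best solution found so far, and followers that move toward their predecessor in the chain) to a two-dimensional hyperparameter space. It is not the paper's implementation. The function names, the swarm size, the iteration count, the 0.5 threshold for the leader's step direction, and the search ranges are all assumptions made for illustration, and `surrogate_validation_error` is a hypothetical stand-in for the expensive step of training a CNN and measuring its validation error.

```python
import math
import random


def salp_swarm_minimize(objective, bounds, n_salps=20, n_iters=100, seed=0):
    """Minimize `objective` over box `bounds` with a basic Salp Swarm Algorithm.

    Illustrative sketch only: a single leader, default sizes chosen arbitrarily.
    """
    rng = random.Random(seed)
    dim = len(bounds)

    # Random initial positions inside the bounds.
    salps = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_salps)]
    scores = [objective(s) for s in salps]
    best_idx = min(range(n_salps), key=scores.__getitem__)
    food, food_score = list(salps[best_idx]), scores[best_idx]

    for it in range(1, n_iters + 1):
        # c1 decays over the iterations, shifting from exploration to exploitation.
        c1 = 2.0 * math.exp(-((4.0 * it / n_iters) ** 2))
        new_salps = []
        for i in range(n_salps):
            if i == 0:
                # Leader: step around the food source (best solution so far).
                pos = []
                for j, (lo, hi) in enumerate(bounds):
                    step = c1 * ((hi - lo) * rng.random() + lo)
                    # Assumed threshold of 0.5 for choosing the step direction.
                    if rng.random() >= 0.5:
                        pos.append(food[j] + step)
                    else:
                        pos.append(food[j] - step)
            else:
                # Follower: midpoint between its own position and the
                # already-updated position of the salp ahead of it.
                prev = new_salps[i - 1]
                pos = [(salps[i][j] + prev[j]) / 2.0 for j in range(dim)]
            # Clamp each coordinate back into the search box.
            pos = [min(max(p, lo), hi) for p, (lo, hi) in zip(pos, bounds)]
            new_salps.append(pos)
        salps = new_salps

        # Keep the best position seen so far as the food source.
        for s in salps:
            score = objective(s)
            if score < food_score:
                food, food_score = list(s), score

    return food, food_score


def surrogate_validation_error(params):
    """Hypothetical cheap stand-in for 'train a CNN, return validation error'."""
    log10_lr, dropout = params
    return (log10_lr + 3.0) ** 2 + (dropout - 0.3) ** 2


# Assumed search ranges: log10(learning rate) in [-5, -1], dropout in [0, 0.7].
search_bounds = [(-5.0, -1.0), (0.0, 0.7)]
best_params, best_error = salp_swarm_minimize(surrogate_validation_error, search_bounds)
print(best_params, best_error)
```

In a real setting, the objective would train and validate a CNN for each candidate configuration, so the number of salps and iterations directly determines the compute budget. Discrete hyperparameters such as filter counts would additionally need rounding or another encoding, which this sketch does not cover.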
License
I assign to Informatica, An International Journal of Computing and Informatics ("Journal") the copyright in the manuscript identified above and any additional material (figures, tables, illustrations, software or other information intended for publication) submitted as part of or as a supplement to the manuscript ("Paper") in all forms and media throughout the world, in all languages, for the full term of copyright, effective when and if the article is accepted for publication. This transfer includes the right to reproduce and/or to distribute the Paper to other journals or digital libraries in electronic and online forms and systems.
I understand that I retain the rights to use the pre-prints, off-prints, accepted manuscript and published journal Paper for personal use, scholarly purposes and internal institutional use.
In certain cases, I can ask for retaining the publishing rights of the Paper. The Journal can permit or deny the request for publishing rights, to which I fully agree.
I declare that the submitted Paper is original, has been written by the stated authors and has not been published elsewhere nor is currently being considered for publication by any other journal and will not be submitted for such review while under review by this Journal. The Paper contains no material that violates proprietary rights of any other person or entity. I have obtained written permission from copyright owners for any excerpts from copyrighted works that are included and have credited the sources in my article. I have informed the co-author(s) of the terms of this publishing agreement.
Copyright © Slovenian Society Informatika