The Effect of Topic Modelling on Prediction of Criticality Levels of Software Vulnerabilities
DOI:
https://doi.org/10.31449/inf.v47i6.3712
Abstract
Software is an indispensable part of daily life, so keeping exploitable vulnerabilities in check has become a vital function of a software firm. The motivation of this paper is to gain a better understanding of vulnerabilities and to create a tool that helps industry practitioners identify critical vulnerabilities that could be detrimental to the firm’s assets. In this article, 1999 vulnerabilities related to Google Chrome were analysed to understand their behaviour. Identifying trends and patterns with a topic modelling technique led to the extraction of topics. The extracted topics were then used as input to 10 classifiers to predict the criticality of each vulnerability. The resulting performance was also compared with that of the classifiers without topic modelling. A 10-fold cross-validation was conducted on the proposed prediction model.
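The evaluation idea in the abstract, scoring a classifier with and without topic-derived features under 10-fold cross-validation, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the data are synthetic, the "topic weight" feature is a made-up stand-in for the output of a topic model, and a simple nearest-centroid classifier stands in for the 10 classifiers the authors evaluated.

```python
import random
from collections import Counter


def k_fold_indices(n, k=10, seed=0):
    """Shuffle indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_centroids(X, y):
    """Return the mean feature vector of each class (nearest-centroid model)."""
    counts = Counter(y)
    sums = {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
    return {c: [v / counts[c] for v in s] for c, s in sums.items()}


def predict(centroids, row):
    """Assign the class whose centroid is closest in squared Euclidean distance."""
    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(centroids[c], row))
    return min(centroids, key=dist)


def cross_val_accuracy(X, y, k=10, seed=0):
    """Mean accuracy over k folds; each fold is held out exactly once."""
    scores = []
    for test_idx in k_fold_indices(len(X), k, seed):
        held_out = set(test_idx)
        train = [i for i in range(len(X)) if i not in held_out]
        model = fit_centroids([X[i] for i in train], [y[i] for i in train])
        hits = sum(predict(model, X[i]) == y[i] for i in test_idx)
        scores.append(hits / len(test_idx))
    return sum(scores) / k


# Synthetic toy data (hypothetical, NOT the Google Chrome dataset from the paper).
rng = random.Random(42)
labels = ["critical" if i % 2 == 0 else "non-critical" for i in range(200)]

# Baseline representation: one feature that carries no information about the label.
base = [[rng.random()] for _ in labels]

# Assumed topic feature: a weight that tends to be higher for critical records.
with_topics = [
    [b[0], (0.7 if lab == "critical" else 0.3) + rng.uniform(-0.2, 0.2)]
    for b, lab in zip(base, labels)
]

acc_without = cross_val_accuracy(base, labels, k=10)
acc_with = cross_val_accuracy(with_topics, labels, k=10)
print(f"10-fold accuracy without topic feature: {acc_without:.3f}")
print(f"10-fold accuracy with topic feature:    {acc_with:.3f}")
```

In a real study, the topic feature would come from a topic model fitted on vulnerability descriptions, and that model should be fitted on the training folds only so that no information from the held-out fold leaks into the features.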
License
I assign to Informatica, An International Journal of Computing and Informatics ("Journal") the copyright in the manuscript identified above and any additional material (figures, tables, illustrations, software or other information intended for publication) submitted as part of or as a supplement to the manuscript ("Paper") in all forms and media throughout the world, in all languages, for the full term of copyright, effective when and if the article is accepted for publication. This transfer includes the right to reproduce and/or to distribute the Paper to other journals or digital libraries in electronic and online forms and systems.
I understand that I retain the rights to use the pre-prints, off-prints, accepted manuscript and published journal Paper for personal use, scholarly purposes and internal institutional use.
In certain cases, I can ask for retaining the publishing rights of the Paper. The Journal can permit or deny the request for publishing rights, to which I fully agree.
I declare that the submitted Paper is original, has been written by the stated authors and has not been published elsewhere nor is currently being considered for publication by any other journal and will not be submitted for such review while under review by this Journal. The Paper contains no material that violates proprietary rights of any other person or entity. I have obtained written permission from copyright owners for any excerpts from copyrighted works that are included and have credited the sources in my article. I have informed the co-author(s) of the terms of this publishing agreement.
Copyright © Slovenian Society Informatika