Efficient Transformer Based Sentiment Classification Models
DOI: https://doi.org/10.31449/inf.v46i8.4332

Abstract
Recently, transformer models have gained significance as a state-of-the-art technique for text-based sentiment prediction. The attention mechanism of the transformer speeds up training by allowing dependencies to be modelled without regard to their distance in the input or output sequences. Transformer models come in two variants: base models and large models. Since large transformer models require better hardware and more training time, in this work we propose new, simpler models, or weak learners, with lower training time for sentiment classification. These models increase speed without compromising classification accuracy. The proposed Efficient Transformer-based Sentiment Classification (ETSC) models are built by setting the configuration of the large models to a minimum, shuffling the dataset randomly, and experimenting with various percentages of training data. Early stopping and a smaller batch size during training further improve the accuracy of the proposed models. The proposed models exhibit promising performance, in terms of both speed and accuracy, in comparison with existing transformer-based sentiment classification models.
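To make the recipe in the abstract concrete, the sketch below shows how a "minimum" configuration of a large pre-trained architecture (here RoBERTa), random shuffling, a reduced percentage of training data, early stopping, and a small batch size could be combined using the Hugging Face Transformers and Datasets libraries. This is an illustrative assumption of the workflow only: the dataset (IMDB), the shrunken layer/width settings, the training-data fraction, and the batch size are placeholder values, not the settings or implementation reported for the ETSC models.

# Hypothetical sketch of the ETSC-style training setup described in the abstract.
# All hyperparameter values, the dataset, and the sampling fraction are assumptions.
from datasets import load_dataset
from transformers import (
    AutoTokenizer,
    RobertaConfig,
    RobertaForSequenceClassification,
    Trainer,
    TrainingArguments,
    EarlyStoppingCallback,
)

tokenizer = AutoTokenizer.from_pretrained("roberta-base")

# Set the configuration of the large architecture to a "minimum" (assumed sizes).
config = RobertaConfig(
    vocab_size=tokenizer.vocab_size,
    num_hidden_layers=2,
    hidden_size=256,
    num_attention_heads=4,
    intermediate_size=512,
    num_labels=2,
)
model = RobertaForSequenceClassification(config)

# Randomly shuffle the dataset and keep only a percentage of the training data.
dataset = load_dataset("imdb")                                   # assumed benchmark corpus
train = dataset["train"].shuffle(seed=42).select(range(5000))    # e.g. 20% of the training split
test = dataset["test"].shuffle(seed=42).select(range(2000))

def tokenize(batch):
    # Convert raw review text to fixed-length token IDs.
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=128)

train = train.map(tokenize, batched=True)
test = test.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="etsc-sketch",
    per_device_train_batch_size=16,   # smaller batch size
    evaluation_strategy="epoch",      # called eval_strategy in recent library versions
    save_strategy="epoch",
    num_train_epochs=10,
    load_best_model_at_end=True,      # required for early stopping
    metric_for_best_model="eval_loss",
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train,
    eval_dataset=test,
    callbacks=[EarlyStoppingCallback(early_stopping_patience=2)],  # stop when eval loss stops improving
)
trainer.train()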