Automatic Question Generation with Monolingual and Multilingual Pre-trained Models using RNN and Transformer in the Low-Resource Indonesian Language
DOI: https://doi.org/10.31449/inf.v46i7.4236

Abstract
Although Indonesian is the fourth most frequently used language on the internet, NLP for Indonesian has not been studied intensively. One NLP application classified as an NLG task is Automatic Question Generation (AQG). The task has generally been addressed with rule-based and cloze-test approaches, which work well but depend heavily on the defined rules: while suitable for small-scale automated question generation systems, they become less efficient as the system grows. Many recent NLG model architectures have shown significantly improved performance over earlier ones, such as generative pre-trained transformers, text-to-text transfer transformers, bidirectional and auto-regressive transformers, and many more. Previous studies on AQG in Indonesian were built on RNN-based architectures such as GRU and LSTM, and on the Transformer. This work compares the performance of the models from those studies with state-of-the-art models: the multilingual models mBART and mT5, and the monolingual models IndoBART and IndoGPT. As a result, fine-tuned IndoBART performed significantly better than either BiGRU or BiLSTM on the SQuAD dataset. Fine-tuned IndoBART also performed better on most metrics on the TyDiQA dataset, which contains fewer examples than the SQuAD dataset.
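The fine-tuning setup described above can be illustrated with a minimal data-preparation sketch. The "answer: … context: …" source format and the helper name below are assumptions for illustration only, not the paper's exact preprocessing; answer-aware question generation models such as IndoBART or mT5 are commonly fine-tuned on sequence pairs shaped this way.

```python
# Minimal sketch of turning a SQuAD/TyDiQA-style record into a
# (source, target) pair for fine-tuning a seq2seq model on
# answer-aware question generation. The "answer: ... context: ..."
# prompt format is an assumed convention, not the paper's own.

def make_qg_pair(context, answer, question=None):
    """Return a (source, target) pair for seq2seq QG fine-tuning.

    The encoder reads the source (answer plus passage) and the model
    is trained to emit the target question; at inference time the
    `question` argument is left as None.
    """
    source = f"answer: {answer} context: {context}"
    return source, question


# Example: one record converted into a training pair.
src, tgt = make_qg_pair(
    context="Jakarta adalah ibu kota Indonesia.",
    answer="Jakarta",
    question="Apa ibu kota Indonesia?",
)
```

In a Hugging Face Transformers pipeline, `src` would be tokenized as encoder input and `tgt` as decoder labels for a sequence-to-sequence trainer, and the generated questions would then be scored against references with BLEU/ROUGE-style metrics as in the evaluation above.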
Copyright © Slovenian Society Informatika