Modeling Semantic Compositionality of Croatian Multiword Expressions

Authors

  • Jan Šnajder
  • Petra Almić

Abstract

A distinguishing feature of many multiword expressions (MWEs) is their semantic non-compositionality. Determining the semantic compositionality of MWEs is important for many natural language processing tasks. We address the task of modeling semantic compositionality of Croatian MWEs. We adopt a composition-based approach within the distributional semantics framework. We build and evaluate models based on Latent Semantic Analysis and the recently proposed neural network-based Skip-gram model, and experiment with different composition functions. We show that the compositionality scores predicted by the Skip-gram additive models correlate well with human judgments (correlation of 0.50). When framed as a classification task, the model achieves an accuracy of 0.64.
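
The composition-based idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes a common formulation in which the compositionality score is the cosine similarity between the additively composed vector of the constituent words and the vector observed for the MWE as a whole; the abstract does not spell out the exact scoring function, so this is an assumption. The function names and the three-dimensional toy vectors are hypothetical and are not taken from the paper.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def additive_compositionality(w1, w2, mwe):
    """Score an MWE by comparing the sum of its constituent vectors
    with the vector observed for the whole expression.
    (Assumed formulation; hypothetical helper, not the paper's code.)"""
    composed = w1 + w2  # additive composition function
    return cosine(composed, mwe)

# Toy vectors, made up for illustration only.
w1 = np.array([1.0, 0.0, 1.0])
w2 = np.array([0.0, 1.0, 1.0])
mwe_compositional = np.array([1.0, 1.0, 2.0])       # close to w1 + w2
mwe_non_compositional = np.array([1.0, -1.0, 0.0])  # orthogonal to w1 + w2

score_comp = additive_compositionality(w1, w2, mwe_compositional)
score_noncomp = additive_compositionality(w1, w2, mwe_non_compositional)
print(score_comp, score_noncomp)
```

Under this assumed formulation, a higher score suggests that the meaning of the expression is closer to the combined meaning of its parts. In practice the vectors would come from a trained distributional model such as LSA or Skip-gram rather than being written by hand.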

How to Cite

Šnajder, J., & Almić, P. (2015). Modeling Semantic Compositionality of Croatian Multiword Expressions. Informatica, 39(3). Retrieved from https://puffbird.ijs.si/index.php/informatica/article/view/986

Section

Regular papers