Neural Network Models for Sequence Labeling in Turkish
Abstract
Agglutinative languages such as Turkish inflect many word forms from the same root, so modeling words as whole units causes a sparsity problem. Representing a word through its characters, or through its morphemes and morpheme labels, provides more detailed information about the word and thereby mitigates this sparsity.
This study proposes a deep neural network model for sequence labeling in Turkish. To cope with the sparsity problem, the model uses character and morpheme information, and the effect of this information on sequence labeling is examined. Existing deep learning models are applied with different word and sub-word representations to Named Entity Recognition (NER) and Part-of-Speech (POS) tagging in Turkish. The results show that using morpheme information improves sequence labeling in Turkish. Moreover, using contextual information in the model yields more accurate results.
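
To make the sparsity argument concrete, the following minimal sketch counts how often each unit recurs in a tiny set of forms inflected from the root "ev" (house). The word list and its hand-made morpheme segmentation are illustrative assumptions, not data or code from the study, and the names SEGMENTED and unit_counts are hypothetical. Each whole word form occurs only once, while the morphemes are shared across several forms.

```python
from collections import Counter

# Hand-segmented toy forms of the root "ev" (house).
# Assumption: this segmentation is for illustration only and is not
# the output of the morphological analysis used in the study.
SEGMENTED = {
    "evler": ["ev", "ler"],            # houses
    "evlerde": ["ev", "ler", "de"],    # in the houses
    "evlerim": ["ev", "ler", "im"],    # my houses
    "evimde": ["ev", "im", "de"],      # in my house
    "evde": ["ev", "de"],              # in the house
}


def unit_counts(segmented):
    """Return (word_counts, morpheme_counts) for a word -> morphemes mapping."""
    word_counts = Counter(segmented.keys())
    morpheme_counts = Counter(
        morpheme for parts in segmented.values() for morpheme in parts
    )
    return word_counts, morpheme_counts


word_counts, morpheme_counts = unit_counts(SEGMENTED)

# Every surface form is seen exactly once, so a word-level model has
# a single observation per type.
print("word forms:", dict(word_counts))

# The same morphemes recur across forms, so a morpheme-level model
# sees each unit several times.
print("morphemes:", dict(morpheme_counts))
```

In this toy set a word-level representation has five types with one observation each, whereas the morpheme-level representation has four types that are each observed more than once, which is the kind of sharing that sub-word representations are meant to exploit.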