
dc.contributor.author: Ozturk, Burak
dc.contributor.author: Can, Burcu
dc.date.accessioned: 2021-06-07T07:30:02Z
dc.date.available: 2021-06-07T07:30:02Z
dc.date.issued: 2019
dc.identifier.issn: 1300-0632
dc.identifier.uri: http://dx.doi.org/10.3906/elk-1804-10
dc.identifier.uri: http://hdl.handle.net/11655/24590
dc.description.abstract: Turkish is an agglutinative language with rich morphology. A Turkish verb can have thousands of different word forms. Therefore, sparsity becomes an issue in many Turkish natural language processing (NLP) applications. This article presents a model for Turkish lexicon expansion. We aimed to expand the lexicon with a morphological segmentation system, reversing the segmentation task into a generation task. Our model uses finite-state automata (FSA) to incorporate orthographic features and morphotactic rules. We extracted orthographic features by capturing the phonological operations that are applied to a word whenever a suffix is added. Each FSA state corresponds to either a stem category or a suffix category. Stems are clustered by part of speech (i.e. noun, verb, or adjective) and suffixes are clustered by their allomorphic features. We generated approximately 1 million word forms from only a few thousand Turkish stems with an accuracy of 82.36%, which will help to reduce the out-of-vocabulary size in other NLP applications. Although our experiments were performed on Turkish, the same model is also applicable to other agglutinative languages such as Hungarian and Finnish.
dc.language.iso: en
dc.relation.isversionof: 10.3906/elk-1804-10
dc.rights: Attribution 4.0 United States
dc.rights: info:eu-repo/semantics/openAccess
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/
dc.subject: finite-state automata
dc.subject: lexicon expansion
dc.subject: morphological generation
dc.subject: Morphology
dc.title: Turkish Lexicon Expansion By Using Finite State Automata
dc.type: info:eu-repo/semantics/article
dc.type: info:eu-repo/semantics/publishedVersion
dc.relation.journal: Turkish Journal Of Electrical Engineering And Computer Sciences
dc.contributor.department: Computer Engineering
dc.identifier.volume: 27
dc.identifier.issue: 2
dc.description.index: WoS
dc.description.index: Scopus
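
The abstract describes states that stand for stem or suffix categories, with allomorphs chosen by phonological context. The following is a minimal toy sketch of that idea in Python, not the authors' model: the state names (NOUN_STEM, PLURAL, CASE), the two suffixes covered (plural and locative), and all function names are assumptions made for illustration, and only vowel harmony and consonant voicing are handled.

```python
# Toy finite-state generator for Turkish noun forms (illustrative only).
BACK_VOWELS = set("aıou")
FRONT_VOWELS = set("eiöü")
VOICELESS = set("çfhkpsşt")


def last_vowel(word):
    """Return the last vowel of the word, which drives vowel harmony."""
    for ch in reversed(word):
        if ch in BACK_VOWELS or ch in FRONT_VOWELS:
            return ch
    raise ValueError(f"no vowel in {word!r}")


def plural(word):
    """Attach the plural allomorph: -lar after back vowels, -ler otherwise."""
    return word + ("lar" if last_vowel(word) in BACK_VOWELS else "ler")


def locative(word):
    """Attach the locative allomorph: -da, -de, -ta, or -te."""
    consonant = "t" if word[-1] in VOICELESS else "d"
    vowel = "a" if last_vowel(word) in BACK_VOWELS else "e"
    return word + consonant + vowel


# Hypothetical morphotactics: state -> list of (suffix function, next state).
TRANSITIONS = {
    "NOUN_STEM": [(plural, "PLURAL"), (locative, "CASE")],
    "PLURAL": [(locative, "CASE")],
    "CASE": [],
}
# Every state is accepting here, so the bare stem is also emitted.
ACCEPTING = {"NOUN_STEM", "PLURAL", "CASE"}


def generate(stem, start="NOUN_STEM"):
    """Walk every path from the start state and collect the word forms."""
    forms = []
    stack = [(stem, start)]
    while stack:
        word, state = stack.pop()
        if state in ACCEPTING:
            forms.append(word)
        for attach, next_state in TRANSITIONS[state]:
            stack.append((attach(word), next_state))
    return sorted(forms)


if __name__ == "__main__":
    print(generate("ev"))     # ['ev', 'evde', 'evler', 'evlerde']
    print(generate("kitap"))  # ['kitap', 'kitaplar', 'kitaplarda', 'kitapta']
```

Each path through the transition table yields one word form, so adding a suffix category multiplies the number of generated forms per stem. That growth is what lets a small stem list expand into a large lexicon.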


Unless otherwise indicated, this item is licensed under Attribution 4.0 United States