Identifying Collocations in Turkish Using Statistical Methods

dc.contributor.author: Metin, Senem Kumova
dc.contributor.author: Karaoglan, Bahar
dc.date.accessioned: 2019-10-27T23:09:28Z
dc.date.available: 2019-10-27T23:09:28Z
dc.date.issued: 2016
dc.department: Ege Üniversitesi [en_US]
dc.description.abstract: A collocation is a combination of words that appear together more often than chance would predict and that together form a unit of meaning. Because collocation extraction benefits the automatic processing and translation of Turkish texts as well as the learning of Turkish, it is an important problem in Turkish natural language processing. In this study, several statistical techniques, including occurrence frequency, pointwise mutual information, and hypothesis tests, are applied to a Turkey Turkish corpus to identify collocations automatically. We used both stemmed and surface forms of words in order to examine the effect of stemming on collocation extraction. The techniques are evaluated using the F-measure. The chi-square hypothesis test and pointwise mutual information produced better results than the other methods. In addition, we observed that when words are stemmed, the methods that can be considered successful at collocation extraction are more clearly distinguished from the rest. [en_US]
dc.identifier.endpage: 286 [en_US]
dc.identifier.issn: 1301-0549
dc.identifier.issue: 78 [en_US]
dc.identifier.startpage: 253 [en_US]
dc.identifier.uri: https://hdl.handle.net/11454/52656
dc.identifier.wos: WOS:000390423500010 [en_US]
dc.identifier.wosquality: Q4 [en_US]
dc.indekslendigikaynak: Web of Science [en_US]
dc.language.iso: tr [en_US]
dc.publisher: Ahmet Yesevi Univ [en_US]
dc.relation.ispartof: Bilig [en_US]
dc.relation.publicationcategory: Article - International Refereed Journal - Institutional Faculty Member [en_US]
dc.rights: info:eu-repo/semantics/closedAccess [en_US]
dc.subject: Collocation [en_US]
dc.subject: Turkey Turkish [en_US]
dc.subject: natural language processing [en_US]
dc.subject: corpus [en_US]
dc.title: Identifying Collocations in Turkish Using Statistical Methods [en_US]
dc.type: Article [en_US]
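
The abstract names three kinds of scores (occurrence frequency, pointwise mutual information, and the chi-square test) and evaluates them with the F-measure. The sketch below illustrates how such scores might be computed for adjacent word pairs. It is not the authors' implementation: the function names `bigram_scores` and `f_measure`, the restriction to adjacent bigrams, and the toy token list are all assumptions made for illustration, and stemming is not shown.

```python
import math
from collections import Counter


def bigram_scores(tokens):
    """Return {bigram: (frequency, pmi, chi_square)} for adjacent word pairs.

    Assumption: a candidate collocation is any pair of adjacent tokens.
    """
    bigrams = Counter(zip(tokens, tokens[1:]))
    n = sum(bigrams.values())  # total number of adjacent pairs
    first = Counter()   # how often a word occupies the first slot
    second = Counter()  # how often a word occupies the second slot
    for (w1, w2), count in bigrams.items():
        first[w1] += count
        second[w2] += count

    scores = {}
    for (w1, w2), o11 in bigrams.items():
        # 2x2 contingency table of observed counts
        o12 = first[w1] - o11      # w1 followed by something other than w2
        o21 = second[w2] - o11     # w2 preceded by something other than w1
        o22 = n - o11 - o12 - o21  # pairs containing neither in these slots

        # Pointwise mutual information: log2 of P(w1, w2) / (P(w1) * P(w2))
        pmi = math.log2(o11 * n / (first[w1] * second[w2]))

        # Pearson chi-square statistic for a 2x2 table
        denom = (o11 + o12) * (o11 + o21) * (o12 + o22) * (o21 + o22)
        chi2 = n * (o11 * o22 - o12 * o21) ** 2 / denom if denom else 0.0

        scores[(w1, w2)] = (o11, pmi, chi2)
    return scores


def f_measure(extracted, gold):
    """Balanced F-measure of an extracted set against a gold-standard set."""
    true_positives = len(extracted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(extracted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Toy input, not data from the study.
    tokens = "a b a b c d".split()
    for pair, (freq, pmi, chi2) in bigram_scores(tokens).items():
        print(pair, freq, round(pmi, 3), round(chi2, 3))
```

In a setup like this, each score would be used to rank candidate pairs, the top-ranked pairs would be taken as the extracted set, and `f_measure` would compare that set with a manually prepared list of collocations. Running the same procedure once on surface forms and once on stemmed tokens would give the kind of comparison the abstract describes.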
