Investigation of Luhn's claim on information retrieval

dc.contributor.authorKocabas, Ilker
dc.contributor.authorDincer, Bekir Taner
dc.contributor.authorKaraoglan, Bahar
dc.date.accessioned2019-10-27T21:25:44Z
dc.date.available2019-10-27T21:25:44Z
dc.date.issued2011
dc.departmentEge Üniversitesien_US
dc.description.abstractIn this study, we show how Luhn's claim about the degree of importance of a word in a document can be related to information retrieval. His basic idea is transformed into z-scores as the weights of terms for the purpose of modeling term frequency (tf) within documents. The Luhn-based models represented in this paper are considered as the TF component of proposed TF x IDF weighting schemes. Moreover, the final term weighting functions appropriate for the TF x IDF weighting scheme are applied to TREC-6, -7, and -8 databases. The experimental results show relevance to Luhn's claim by having high mean average precision (MAP) for the terms with frequencies around the mean frequency of terms within a document. On the other hand, the weighting, which significantly discriminates the importance between low/high frequencies and medium frequencies, degrades the retrieval performance. Therefore, any weighting scheme (TF) that is directly proportional to tf has a probability of high retrieval performance, if this can optimally indicate the difference of the importance regarding tf values and also optimally eliminate the terms that have high frequencies.en_US
dc.description.sponsorshipScientific and Technological Research Council of Turkey (TUBITAK); Turkiye Bilimsel ve Teknolojik Arastirma Kurumu (TUBITAK) [107E192]en_US
dc.description.sponsorshipThis work was supported by the Scientific and Technological Research Council of Turkey (TUBITAK) within the scope of Project No. 107E192. The authors thank TUBITAK for supporting this project.en_US
dc.identifier.doi10.3906/elk-1003-448
dc.identifier.endpage1004en_US
dc.identifier.issn1300-0632
dc.identifier.issn1303-6203
dc.identifier.issue6en_US
dc.identifier.scopusqualityQ3en_US
dc.identifier.startpage993en_US
dc.identifier.urihttps://doi.org/10.3906/elk-1003-448
dc.identifier.urihttps://hdl.handle.net/11454/44865
dc.identifier.volume19en_US
dc.identifier.wosWOS:000295497900013en_US
dc.identifier.wosqualityQ4en_US
dc.indekslendigikaynakWeb of Scienceen_US
dc.indekslendigikaynakScopusen_US
dc.language.isoenen_US
dc.publisherTubitak Scientific & Technical Research Council Turkeyen_US
dc.relation.ispartofTurkish Journal of Electrical Engineering and Computer Sciencesen_US
dc.relation.publicationcategoryArticle - International Refereed Journal - Institutional Faculty Memberen_US
dc.rightsinfo:eu-repo/semantics/closedAccessen_US
dc.subjectLuhnen_US
dc.subjectinformation retrievalen_US
dc.subjectterm weightingen_US
dc.subjectindexingen_US
dc.titleInvestigation of Luhn's claim on information retrievalen_US
dc.typeArticleen_US
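
The abstract describes turning within-document term frequencies into z-scores and using the result as the TF component of a TF x IDF weight. The following is a minimal sketch of that idea, not the paper's actual weighting functions, which the abstract does not specify. The Gaussian-shaped `bell_weight`, the smoothed IDF formula, and all function names are assumptions introduced here purely for illustration.

```python
import math
from collections import Counter


def z_scores(tf):
    """Map each term to the z-score of its frequency within one document."""
    freqs = list(tf.values())
    if not freqs:
        return {}
    mean = sum(freqs) / len(freqs)
    std = math.sqrt(sum((f - mean) ** 2 for f in freqs) / len(freqs))
    if std == 0.0:
        # All terms occur equally often: every term sits at the mean.
        return {term: 0.0 for term in tf}
    return {term: (f - mean) / std for term, f in tf.items()}


def bell_weight(z):
    """ASSUMPTION: Gaussian-shaped TF weight that peaks at the mean (z = 0)."""
    return math.exp(-0.5 * z * z)


def smoothed_idf(term, docs):
    """ASSUMPTION: a common smoothed IDF; the paper's IDF variant is not stated."""
    df = sum(1 for doc in docs if term in doc)
    return math.log((len(docs) + 1) / (df + 1)) + 1.0


def tf_idf_weights(doc, docs):
    """Combine the z-score-based TF weight with IDF for every term in `doc`."""
    tf = Counter(doc)
    z = z_scores(tf)
    return {term: bell_weight(z[term]) * smoothed_idf(term, docs) for term in tf}


# Toy collection: each document is a list of tokens.
docs = [
    "a a a a a b b b c".split(),
    "a d d e".split(),
]
weights = tf_idf_weights(docs[0], docs)
```

In this toy collection, the term `b` occurs exactly at the mean frequency of the first document, so it receives the largest TF weight under the assumed bell shape. The abstract reports that a weighting which discriminates too sharply between medium and low/high frequencies degrades retrieval, so the width of any such curve would need tuning against real relevance judgments.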
