DTreeSim: A new approach to compute decision tree similarity using re-mining
Date
2017
Authors
Journal Title
Journal ISSN
Volume Title
Publisher
Access Rights
info:eu-repo/semantics/openAccess
Abstract
A number of recent studies have used a decision tree approach as a data mining technique; some of them needed to evaluate the similarity of decision trees to compare the knowledge reflected in different trees or datasets. There have been multiple perspectives and multiple calculation techniques to measure the similarity of two decision trees, such as using a simple formula or an entropy measure. The main objective of this study is to compute the similarity of decision trees using data mining techniques. This study proposes DTreeSim, a new approach that applies multiple data mining techniques (classification, sequential pattern mining, and k-nearest neighbors) sequentially to identify similarities among decision trees. After the construction of decision trees from different data marts using a classification algorithm, sequential pattern mining was applied to the decision trees to obtain rules, and then the k-nearest neighbor algorithm was performed on these rules to compute similarities using two novel measures: general similarity and pieced similarity. Our experimental studies compared the results of these novel similarity measures and also compared our approach with existing approaches. Our comparisons indicate that our proposed approach performs better than existing approaches, because it takes into account the values of the branches in the trees through sequential pattern mining.
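The pipeline described in the abstract (trees, then branch sequences, then a nearest-neighbor comparison) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the nested-tuple tree encoding, the function names `branch_sequences`, `path_overlap`, and `nearest_trees`, and the toy trees are hypothetical. Plain enumeration of root-to-leaf paths stands in for sequential pattern mining, and a Jaccard overlap stands in for the similarity score. It is not the general similarity or pieced similarity measure of the paper, which the abstract does not define.

```python
from typing import Dict, List, Tuple, Union

# Hypothetical encoding: a tree is a class label (leaf, str) or a tuple
# (attribute, {branch_value: subtree}).
Tree = Union[str, Tuple[str, Dict[str, "Tree"]]]
Path = Tuple[Tuple[str, str], ...]


def branch_sequences(tree: Tree, prefix: Path = ()) -> List[Path]:
    """Return every root-to-leaf path as a sequence of (attribute, value) items."""
    if isinstance(tree, str):  # leaf: close the sequence with the class label
        return [prefix + (("class", tree),)]
    attribute, children = tree
    paths: List[Path] = []
    for value, subtree in children.items():
        paths.extend(branch_sequences(subtree, prefix + ((attribute, value),)))
    return paths


def path_overlap(a: Tree, b: Tree) -> float:
    """Jaccard overlap of two trees' branch sequences (a stand-in measure)."""
    sa, sb = set(branch_sequences(a)), set(branch_sequences(b))
    union = sa | sb
    return len(sa & sb) / len(union) if union else 1.0


def nearest_trees(query: Tree, others: Dict[str, Tree], k: int = 1) -> List[Tuple[str, float]]:
    """Rank the other trees by overlap with the query and keep the k closest."""
    scored = [(name, path_overlap(query, tree)) for name, tree in others.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:k]


# Toy trees, as if built from three different data marts.
tree_a: Tree = ("outlook", {
    "sunny": ("humidity", {"high": "no", "normal": "yes"}),
    "overcast": "yes",
    "rainy": "no",
})
tree_b: Tree = ("outlook", {
    "sunny": ("humidity", {"high": "no", "normal": "yes"}),
    "overcast": "yes",
    "rainy": "yes",  # differs from tree_a only in this branch's class
})
tree_c: Tree = ("humidity", {"high": "no", "normal": "yes"})

if __name__ == "__main__":
    print(nearest_trees(tree_a, {"b": tree_b, "c": tree_c}, k=2))
```

Because each sequence keeps the branch values (for example `("outlook", "rainy")`) and not only the attribute names, two trees with the same shape but different branch values score lower, which is the property the abstract credits for the approach's advantage.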
Description
Keywords
Engineering, Electrical and Electronic
Source
Turkish Journal of Electrical Engineering and Computer Sciences
WoS Q Value
Scopus Q Value
Volume
25
Issue
1