The effect of the data/Q-matrix augmentation method on latent class distributions in cognitive diagnosis models

Date

2019

Publisher

Ege Üniversitesi, Eğitim Bilimleri Enstitüsü

Access Rights

info:eu-repo/semantics/openAccess

Abstract

This study examined the effect of partial scoring in multiple-choice and constructed-response item formats on latent class invariance under cognitive diagnosis models (CDM), and aimed to obtain more precise ability estimates by applying data and Q-matrix augmentation methods to a test composed of mixed-format items. The usability of expanding and restructuring the Q matrix in cognitive diagnosis models was also examined. The measurement instruments used in the thesis are tests constructed to allow ability estimation with CDM, consisting of constructed-response and multiple-choice items. The tests were prepared for the research conducted within the scope of TÜBİTAK Project 115K531, and the data of the study were obtained from mathematics tests administered to 6th-grade students under that project. The items were designed to measure higher-order thinking skills. Before the items were written, training on writing items for higher-order thinking skills was provided; the items were then selected from the question pool after being rated by referees, including field experts. Following the pilot administrations, the test was finalized and the main administrations were carried out. Analyses showed that the test had the same level and psychometric properties as PISA and TIMSS. To enable CDM-based analyses, the attribute sets with which the test and items could be associated were determined, and Q matrices were constructed accordingly. With the Q matrix specified by field experts, the items were scored 1-0 and ability estimation was performed. The Q matrix was then restructured to identify the items that could be partially scored; these items were re-scored and ability estimation was repeated. In this way, the number of items doubled.
Classification-level ability estimates were obtained with the expanded data and compared with the results obtained from the original data. The effect of partial scoring of multiple-choice and constructed-response items on parameter estimates was also examined. The results show that when the Q matrix and the data are expanded so that items can be partially scored according to their structure, it becomes possible to obtain more detailed information about students, to locate them on a wider ability continuum, and to give them clearer feedback.
Introduction

In this study, multiple-choice and constructed-response item formats were analyzed with respect to latent class invariance under cognitive diagnosis models (CDM). Formative assessment plays a major role in quality education: learning deficiencies can be identified, and effective learning can be supported by making the necessary corrections in time. Tracking cognitive process skills is an important part of formative assessment, yet commonly used measurement instruments fail to identify these deficiencies. Through the CDM used in this research, however, content knowledge, higher-order thinking skills, and cognitive process skills were measured together in the 6th-grade mathematics course; more than one skill could be examined with a single question, and the level of mastery of cognitive process skills was determined. Research on the use of CDM is therefore important for tests that measure high-level skills in mathematics. Choosing the item format is a critical step in preparing a measurement instrument. Two factors should guide this choice: the cognitive process to be measured and the content. In this context, the best format is the one that represents the content well and best reveals the cognitive process intended to be observed (Haladyna, 2004). In other words, the best format is the one that provides the most accurate assessment for the purpose of the exam. Because different item formats and contents measure different cognitive processes, a mixed format is recommended to take full advantage of this (Haladyna, 2004). The competencies examined in this research require mathematical thinking skills such as establishing relationships, mathematical reasoning and proof, and strategy development. Since a single item format would not be sufficient for the intended outcomes, constructed-response and selected-response item formats were used together in the study.
In this sense, the study is among the first in its approach of using different item formats together. It will also contribute to the field in terms of preparing measurement instruments that assess higher-order thinking skills. In this study, DINA model parameters were estimated from a mixed-format test prepared on the basis of CDM. DINA model parameters are generally estimated from simulation data or from the Q matrices of previously prepared tests; conducting the study with real data therefore contributes to a literature that has largely remained at the theoretical level. The study further aimed to obtain more accurate ability estimates by applying data and Q-matrix augmentation methods to the mixed-format test, and it examined the usability of Q-matrix expansion and restructuring in CDM. The research also includes an application of the method of restructuring the Q matrix with partially scored items and re-analyzing the data, and it is among the first studies in the literature to use this method. Given the limited number of CDM studies based on partial scoring, and since no study has addressed the determination of latent classes through Q-matrix and data expansion for both binary and other scoring methods, the study is expected to contribute to the literature. In research, it is essential to reach accurate information and to generalize the obtained results; given its large sample size, this study is important in terms of access to realistic parameter estimates.

Method

The measurement instruments used in the thesis are tests designed for ability estimation with CDM, consisting of constructed-response and multiple-choice items.
The tests were prepared for the research conducted within the scope of TÜBİTAK Project 115K531, and the data of the study were obtained from mathematics tests administered to 6th-grade students under that project. The items were designed to measure higher-order thinking skills. To prepare the items, training on writing items for higher-order thinking skills was first provided. The items were then selected from the question pool by referees, including field experts. After the pilot administrations, the test was finalized and the main administrations were carried out. The analyses revealed that the test had the same level and psychometric properties as PISA and TIMSS. In this study, the attribute sets with which the test and items could be associated were determined, and Q matrices were formed according to the cognitive diagnosis model. With the Q matrix specified by the field experts, the items were scored 1-0 and ability estimation was performed. The Q matrix was then restructured, and the items that could be scored partially were identified; these items were re-scored and ability estimation was repeated. In this way, the number of items doubled. Classification-level ability estimates were obtained with the expanded data and compared with the results obtained from the original data. In addition, the effect of partial scoring of multiple-choice and constructed-response items on parameter estimates was examined.

Findings

When examining the DINA model parameters in CDM, the g and s parameters calculated at the item level and their standard errors are taken into consideration. The underlying idea is that each item divides the test group into two classes, and respondents within the same class are assumed to have equal probabilities of answering the item correctly.
In this probability-based model, the s parameter is the probability that a student answers an item incorrectly despite possessing all the attributes required by the item, and the g parameter is the probability that a student answers an item correctly despite lacking the required attributes (De La Torre, 2008). When the g parameters of the partially scored items were examined, the lowest value was 0 and the highest was 0.71; thus g ranges between 0 and 0.71 for the partially scored items. For the s parameters of the partially scored items, the lowest value was 0 and the highest 0.77, so s ranges between 0 and 0.77. For the dichotomously (binary) scored items, the lowest g parameter was 0.12 and the highest 0.42, so g ranges between 0.12 and 0.42; the lowest s parameter was 0.07 and the highest 0.77, so s ranges between 0.07 and 0.77. Another finding of the study concerns the latent classes and posterior probabilities of the dichotomously scored items (DSI) and the partially scored items (PSI). To compare the latent class posterior probabilities observed for the DSI with those of the PSI, the total probabilities of the corresponding probability patterns in the PSI were determined, and the number of latent classes observed after the PSI analysis was also reported. The partially scored items show lower individual probability values because the probability mass is distributed over more latent classes; however, the sums of these probability values are close to those of the dichotomous items.
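The s and g definitions above can be combined into a single response probability. The following is a minimal sketch of the standard DINA item response function, not code from the thesis; the function name and the example Q-matrix row are illustrative, and the g and s values are taken from the ranges reported for the dichotomous items.

```python
def dina_prob(alpha, q_row, g, s):
    """P(correct answer) under DINA for attribute profile alpha on an item with Q row q_row."""
    # eta = 1 when the respondent masters every attribute the item requires
    eta = all(a >= q for a, q in zip(alpha, q_row))
    # masters answer correctly with probability 1 - s (slip);
    # non-masters answer correctly with probability g (guess)
    return 1 - s if eta else g

# Illustrative item requiring attributes 1 and 3, with g = 0.12 and s = 0.07
q_row = [1, 0, 1, 0]
p_master = dina_prob([1, 1, 1, 0], q_row, 0.12, 0.07)      # 1 - s for a master
p_nonmaster = dina_prob([1, 0, 0, 0], q_row, 0.12, 0.07)   # g for a non-master
```

Note that the model is non-compensatory: missing even one required attribute drops the success probability all the way to g.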
There are 16 latent classes for the dichotomous items and 256 different latent classes for the partially scored items, so a single class in the dichotomous case may correspond to any of the latent classes (11110000), (01000000), (00010000), (01010000), (01100000), or (01110000). This indicates a more detailed analysis for the same students. In addition, the probability is observed to increase especially in classes with more attributes. When the partial scores were evaluated, the probabilities for the 8 attributes ranged between 0.5047 and 0.6342. While the observed probability for the communication and association skills was 0.4956 in the DSI, it increased to 0.6097 and 0.6347 in the PSI. This can be considered an indicator of increased sensitivity, in other words, of improved observability. When the items were transformed into constructed-response form and the average difficulty values for the latent classes and the IRT ability estimates were examined, the students' latent class distributions were close to each other. Although a high correlation was found in both cases, students who fell into the same range under dichotomous scoring were spread over a wider range of estimates when partial scoring was used. With respect to reliability, this can be taken as an indicator that more reliable measurements are obtained from partially scored items. After the Q matrix was restructured, the ability scores obtained from the partially scored items according to IRT and the absolute success percentages were observed to increase as the number of latent attributes increased, and the students' estimates increased in both types. Moreover, both lower and upper ability levels could be estimated for the partially scored items.
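The latent class counts above follow directly from enumerating all binary attribute-mastery profiles: K attributes yield 2^K classes, giving 16 for the four-attribute case and 256 for the eight-attribute case. A short illustrative sketch (not thesis code):

```python
from itertools import product

def latent_classes(k):
    """All 2**k binary attribute-mastery profiles, e.g. '0110' for k = 4."""
    return ["".join(map(str, p)) for p in product((0, 1), repeat=k)]

print(len(latent_classes(4)))  # 16 classes for the 4-attribute test
print(len(latent_classes(8)))  # 256 classes for the 8-attribute test
```

Each string is one latent class, e.g. "01110000" marks mastery of attributes 2 through 4 only.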
For example, while the mean of the group possessing none of the attributes was -0.92 in the IRT analysis, it reached -1.46 for the partially scored items. This shows that students can be evaluated on a wider scale. After the Q matrix was structured according to partial scoring, the attribute-mastery distributions were similar in both cases, but students with four attributes were distributed into higher classes under partial scoring. In addition, when the findings on mastery levels were examined, some attributes were found not to be fully hierarchical.

Discussion and Conclusion

In this section, the findings of the tests used in the research are discussed and interpreted within the framework of the literature. When the results of the study were examined, it was found that a test prepared using the data augmentation method was structurally more reliable and could be organized to give more detailed information about students. As seen in the results, the findings were consistent and similar between the four-attribute and eight-attribute tests. It can be said that partially scoring the items may be an appropriate way for CDM to create measurement situations that yield more detailed information from a single test. The findings also show that this method can be used not only for constructed-response items but also for selected-response items. As a method applied for the first time in the field, augmenting the Q matrix according to item properties also means increasing the number of items in the test. Furthermore, the psychometric properties of the test were observed to be maintained after this procedure.
As is known, there are problems with statistical inferences regarding the reliability and validity of tests that use constructed-response or open-ended items to obtain more detailed information about students. Although constructed-response items have the potential to provide such information, their validity and reliability are difficult to establish. In this study, the method was found to work with similar statistical robustness for partial scoring of constructed-response and selected-response items under CDM. At the same time, another result shows that it is possible to set up a test structure in which constructed-response and selected-response items work together, by organizing the analyses of tests administered in a mixed item format. When studies on the mixed item format are examined, the approach used here appears usable and worth further examination, especially considering the limited empirical data in the CDM field. Taking the findings into consideration, researchers who want to develop tests according to CDM can expand their matrices not only for correct-answer cases but also for the matrices of other alternative responses.
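The data/Q-matrix augmentation step discussed above, in which a partial-credit item is split into dichotomous pseudo-items with their own Q-matrix rows (doubling the item count), can be sketched as follows. This is a hypothetical illustration of the general idea, not the thesis's procedure: the function name, the score-to-step mapping, and the attribute assignments are all assumptions.

```python
def expand_item(scores, q_row_step1, q_row_step2, max_score=2):
    """Split partial-credit scores (0..max_score) into two 0/1 pseudo-item columns,
    each paired with its own Q-matrix row."""
    step1 = [1 if s >= 1 else 0 for s in scores]          # reached at least step 1
    step2 = [1 if s >= max_score else 0 for s in scores]  # earned full credit
    expanded_q = [q_row_step1, q_row_step2]
    return [step1, step2], expanded_q

# Example: five students' scores on one 0-2 item in a 4-attribute space;
# step 1 is assumed to require attribute 1, step 2 additionally attribute 3.
scores = [0, 1, 2, 2, 1]
cols, q = expand_item(scores, [1, 0, 0, 0], [1, 0, 1, 0])
print(cols)  # [[0, 1, 1, 1, 1], [0, 0, 1, 1, 0]]
print(q)     # [[1, 0, 0, 0], [1, 0, 1, 0]]
```

Applying such a split to every partial-credit item turns one polytomous column into two dichotomous columns, which is how an n-item test can yield a 2n-column expanded data matrix while each pseudo-item keeps an interpretable Q-matrix row.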

Keywords

Cognitive Diagnosis Models, Data/Q Matrix Augmentation Method, Latent Class Invariance, Mixed Item Format, Constructed Response Items, Selected Response Items
