Using Generalizability Theory in Estimating the Reliability of a Mathematical Competence Assessment Test for Fourth-Year Primary School Students

Farouq Tebaa, Mouloud Mammeri University, Algeria.

doi: 10.47015/16.1.1

JJES, 16(1), 2020, 1-18

Abstract: The current study used Generalizability Theory to estimate the reliability of a mathematical competence assessment test. The test comprised nine complex tasks distributed across three formats: a) three well-defined tasks, b) three ill-defined tasks, and c) three tasks containing parasite (irrelevant) data. It was administered, as a basis for assessing competence in numbers and arithmetic, to a sample of (331) fourth-year primary school students. Three trained raters scored student performance by means of analytic scoring rubrics, and the data were analyzed with a fully crossed two-facet "person × task × rater" design using the "EduG" package. The results showed substantial sources of error due to the person-task interaction effect and the task main effect. To ensure acceptable levels of reliability, the number of tasks, rather than the number of raters, should be increased; accordingly, special caution should be exercised when using complex tasks in competence assessment measures.

(Keywords: Generalizability Theory, Reliability, Competence Assessment Test, Sources of Error Variance, Complex Tasks)
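To make the basis of this recommendation explicit, the following is a brief, textbook-style sketch (not the study's own estimates) of the variance decomposition and D-study coefficients for a fully crossed person × task × rater (p × t × r) design, in standard G-theory notation (cf. Brennan, 2001; Cardinet, Johnson & Pini, 2010):

$$X_{ptr} = \mu + \nu_p + \nu_t + \nu_r + \nu_{pt} + \nu_{pr} + \nu_{tr} + \nu_{ptr,e}$$

Each effect contributes a variance component $\sigma^2$. Averaging over $n_t$ tasks and $n_r$ raters, the relative and absolute error variances are

$$\sigma^2_\delta = \frac{\sigma^2_{pt}}{n_t} + \frac{\sigma^2_{pr}}{n_r} + \frac{\sigma^2_{ptr,e}}{n_t n_r}, \qquad \sigma^2_\Delta = \sigma^2_\delta + \frac{\sigma^2_t}{n_t} + \frac{\sigma^2_r}{n_r} + \frac{\sigma^2_{tr}}{n_t n_r},$$

and the generalizability and dependability coefficients are

$$E\rho^2 = \frac{\sigma^2_p}{\sigma^2_p + \sigma^2_\delta}, \qquad \Phi = \frac{\sigma^2_p}{\sigma^2_p + \sigma^2_\Delta}.$$

Because the person-task interaction ($\sigma^2_{pt}$) and the task main effect ($\sigma^2_t$) are divided by $n_t$ alone, large values of these components can be averaged down only by sampling more tasks; increasing the number of raters shrinks only the rater-linked terms. This is why the D-study favors adding tasks over adding raters.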

 


References

Allam, S. (2000). Psychological and educational measurement and assessment: Principles, practices and modern perspectives. Cairo: Dar Al-Fikr Al-Arabi.

Allam, S. (2004). Alternative educational assessment: Theoretical and methodological basics and practical applications. Cairo: Dar Al-Fikr Al-Arabi.

Baartman, L., Bastiaens, T., Kirschner, P., & Van der Vleuten, C. (2006). The wheel of competency assessment: Presenting quality criteria for competency assessment programmes. Studies in Educational Evaluation, 32(2), 153-177.

Baartman, L., Prins, F., Kirschner, P., & Van der Vleuten, C. (2007). Determining the quality of competence assessment programs: A self-evaluation procedure. Studies in Educational Evaluation, 33(3-4), 258-281.

Bain, D. (2008). Radiographie d’une épreuve commune de mathématiques au moyen du modèle de la généralisabilité. Actes du 20ème Colloque. Genève: ADMEE-Europe. Available online at: https://plone.unige.ch/sites/admee08/symposiums.

Bain, D. (2014). Généralisabilité et évaluation des compétences: Pistes et fausses pistes. In C. Dierendonck (Ed.), L'évaluation des compétences en milieu scolaire et en milieu professionnel. Bruxelles: De Boeck Supérieur.

Bain, D., & Pini, G. (1996). Pour évaluer vos évaluations. La généralisabilité : Mode d’emploi. Genève: Centre de Recherches Psychopédagogiques.

Boston, C. (2002). Understanding scoring rubrics: A guide for teachers. University of Maryland: ERIC Clearinghouse.

Brennan, R. (1992). Generalizability theory. Educational Measurement: Issues and Practice, 11(4), 27-34.

Brennan, R. (2000). Performance assessments from the perspective of generalizability theory. Applied Psychological Measurement, 24(4), 339–353.

Brennan, R. (2001). Generalizability Theory. New York: Springer-Verlag.

Briesch, A., Swaminathan, H., Welsh, M., & Chafouleas, S. (2014). Generalizability theory: A practical guide to study design, implementation, and interpretation. Journal of School Psychology, 52(1), 13–35.

Cardinet, J. (1988). Evaluation scolaire et pratique. Bruxelles: De Boeck-Wesmael.

Cardinet, J., & Tourneur, Y. (1985). Assurer la mesure. Berne: Peter Lang.

Cardinet, J., Johnson, S., & Pini, G. (2010). Applying generalizability theory using EduG. New York: Routledge.

Cardinet, J., Tourneur, Y., & Allal, L. (1976). The symmetry of generalizability theory: Application to educational measurement. Journal of Educational Measurement, 13(2), 119-135.

Casanova, D., & Demeuse, M. (2011). Analyse de différentes facettes influant sur la fiabilité de l’épreuve d’expression écrite d’un test de français langue étrangère. Mesure et Évaluation en Éducation, 34(1), 25-53.

Chen, E., Niemi, D., Wang, H., & Mirocha, J. (2007). Examining the generalizability of direct writing assessment tasks. University of California, Los Angeles: CRESST (CSE Technical Report N° 718). Available online at: cresst.org/publications/cresst-publication-3089

Cronbach, L., Linn, R., Brennan, R., & Haertel, E. (1997). Generalizability analysis for performance assessments of student achievement or school effectiveness. Educational and Psychological Measurement, 57(3), 373-399.

Cronbach, L., Rajaratnam, N., & Gleser, G. (1963). Theory of generalizability: A liberalization of reliability theory. British Journal of Mathematical and Statistical Psychology, 16(2), 137–163.

De Ketele, J., & Gerard, F. (2005). La validation des épreuves selon l’approche par compétences. Mesure et Évaluation en Éducation, 28(3), 1-26.

Dunbar, S., Koretz, D., & Hoover, H. (1991). Quality control in the development and use of performance assessments. Applied Measurement in Education, 4(4), 289-303.

Feuer, M., & Fulton, K. (1993). The many faces of performance assessment. The Phi Delta Kappan, 74(6), 478.

Gao, X., & Brennan, R. (2001). Variability of estimated variance components and related statistics in a performance assessment. Applied Measurement in Education, 14(2), 191–203.

Gebril, A. (2009). Score generalizability of academic writing tasks: Does one test method fit it all? Language Testing, 26(4), 507–531.

Gerard, F. (2006). L’évaluation des acquis des élèves dans le cadre de la réforme éducative en Algérie. In N. Toualbi-Thaâlibi (Ed.), Réforme de l’éducation et innovation pédagogique en Algérie. UNESCO: ONPS.

Güler, N., & Gelbal, S. (2010). Studying reliability of open-ended mathematics items according to the generalizability theory. Educational Sciences: Theory & Practice, 10(2), 1011-1019.

Hébert, M., Valois, P., Scallon, G., & Frenette, E. (2014). Fiabilité d’un dispositif d’évaluation de l’habileté à déterminer le résultat d’une chaîne d’opérations chez des élèves québécois du secondaire. Mesure et évaluation en éducation, 37(1), 21-41.

Huang, C. (2009). Magnitude of task-sampling variability in performance assessment: A meta-analysis. Educational and Psychological Measurement, 69(6), 887-912.

In’nami, Y., & Koizumi, R. (2015). Task and rater effects in L2 speaking and writing: A synthesis of generalizability studies. Language Testing, 33(3), 341-366.

Johnson, R., Penny, J., & Gordon, B. (2009). Assessing performance: Designing, scoring, and validating performance tasks. New York: Guilford Press.

Lane, S., Liu, M., Ankenmann, R., & Stone, C. (1996). Generalizability and validity of a mathematics performance assessment. Journal of Educational Measurement, 33(1), 71-92.

Lee, Y., & Kantor, R. (2007). Evaluating prototype tasks and alternative rating schemes for a new ESL writing test through G-theory. International Journal of Testing, 7(4), 353-385.

McBee, M., & Barnes, L. (1998). The generalizability of a performance assessment measuring achievement in eighth-grade mathematics. Applied Measurement in Education, 11(2), 179-194.

Messick, S. (1995). Validity of psychological assessment: Validation of inferences from persons' responses and performances as scientific inquiry into score meaning. American Psychologist, 50(9), 741-749.

Meyer, J. (2010). Reliability. New York: Oxford University Press.

Miller, M., & Linn, R. (2000). Validation of performance-based assessments. Applied Psychological Measurement, 24(4), 367–378.

Ministry of National Education. (2011). Fourth year primary school curriculum. Algiers: The National Authority for School Publications.

Nie, Y., Yeo, S., & Lau, S. (2007). Application of generalizability theory in the investigation of the quality of journal writing in mathematics. Studies in Educational Evaluation, 33(3-4), 371-383.

Parkes, J. (2000). The relationship between the reliability and cost of performance assessments. Education Policy Analysis Archives, 8(16), 1-14.

Parkes, J. (2001). The role of transfer in the variability of performance assessment scores. Educational Assessment, 7(2), 143–164.

Parkes, J., Zimmaro, D., Zappe, S., & Suen, H. (2000). Reducing task-related variance in performance assessment using concept maps. Educational Research and Evaluation, 6(4), 357–378.

Roegiers, X. (2000). Pour une pédagogie de l'intégration. Bruxelles: De Boeck Supérieur.

Roegiers, X. (2004). L’école et l’évaluation. Bruxelles: De Boeck Supérieur.

Scallon, G. (2004). L’évaluation des apprentissages dans une approche par compétences. Bruxelles: De Boeck Supérieur.

Segers, M., Dochy, F., & Cascallar, E. (2003). Optimizing new modes of assessment: In search for qualities and standards. Boston: Kluwer Academic.

Shavelson, R., Baxter, G., & Pine, J. (1992). Performance assessments: Political rhetoric and measurement reality. Educational Researcher, 21(4), 22–27.

Shavelson, R., Baxter, G., & Gao, X. (1993). Sampling variability of performance assessments. Journal of Educational Measurement, 30(3), 215-232.

Shavelson, R., & Webb, N. (1991). Generalizability theory: A primer. California: Sage Publications.

Shavelson, R., & Webb, N. (2009). Generalizability theory and its contribution to the discussion of the generalizability of research findings. In K. Ercikan & W. Roth (Eds.), Generalizing from educational research (pp. 13-32). New York: Routledge.

Smit, R., & Birri, T. (2014). Assuring the quality of standards-oriented classroom assessment with rubrics for complex competencies. Studies in Educational Evaluation, 43, 5-13.

Swiss Society for Research in Education Working Group. (2010). EduG user guide. Neuchâtel: IRDP. Available online at: http://www.irdp.ch/edumetrie/logiciels.html

Taylor, A., & Pastor, D. (2013). An application of generalizability theory to evaluate the technical quality of an alternate assessment. Applied Measurement in Education, 26(4), 279–297.

Tebaa, F. (2017). A critique of using the competences concept in educational practices related to assessment. Social Sciences Journal, 24, 162-177.

Tebaa, F., & Lifa, N. (2015). Competences assessment from the perspective of generalizability theory. Psychological and Educational Studies Review, 12, 206-227.

Webb, N., Schlackman, J., & Sugrue, B. (2000). The dependability and interchangeability of assessment methods in science. Applied Measurement in Education, 13(3), 277–301.

Webb, N., Shavelson, R., & Steedle, J. (2012). Generalizability theory in assessment contexts. In C. Secolsky & B. Denison (Eds.), Handbook on measurement, assessment, and evaluation in higher education (pp. 132-149). New York: Routledge.

Yin, Y., & Shavelson, R. (2008). Application of generalizability theory to concept map assessment research. Applied Measurement in Education, 21(3), 273-291.