TY - JOUR
T1 - Assessment question characteristics predict medical student performance in general pathology
AU - Hernandez, Tahyna
AU - Magid, Margret S.
AU - Polydorides, Alexandros D.
N1 - Funding Information:
This work was partially presented during the 109th annual meeting of the United States and Canadian Academy of Pathology (USCAP); March 2, 2020; Los Angeles, California.
Publisher Copyright:
© 2021 College of American Pathologists. All rights reserved.
PY - 2021/10
Y1 - 2021/10
N2 - Context.-Evaluation of medical curricula includes appraisal of student assessments in order to encourage deeper learning approaches. General pathology is our institution's 4-week, first-year course covering universal disease concepts (inflammation, neoplasia, etc). Objective.-To compare types of assessment questions and determine which characteristics may predict student scores, degree of difficulty, and item discrimination. Design.-Item-level analysis was employed to categorize questions along the following variables: type (multiple choice question or matching answer), presence of clinical vignette (if so, whether simple or complex), presence of specimen image, information depth (simple recall or interpretation), knowledge density (first or second order), Bloom taxonomy level (1-3), and, for the final, subject familiarity (repeated concept and, if so, whether verbatim). Results.-Assessments comprised 3 quizzes and 1 final exam (total 125 questions), scored during a 3-year period, (total 417 students) for a total 52 125 graded attempts. Overall, 44 890 attempts (86.1%) were correct. In multivariate analysis, question type emerged as the most significant predictor of student performance, degree of difficulty, and item discrimination, with multiple choice questions being significantly associated with lower mean scores (P = .004) and higher degree of difficulty (P = .02), but also, paradoxically, poorer discrimination (P = .002). The presence of a specimen image was significantly associated with better discrimination (P = .04), and questions requiring data interpretation (versus simple recall) were significantly associated with lower mean scores (P = .003) and a higher degree of difficulty (P = .046). Conclusions.-Assessments in medical education should comprise combinations of questions with various characteristics in order to encourage better student performance, but also obtain optimal degrees of difficulty and levels of item discrimination.
AB - Context.-Evaluation of medical curricula includes appraisal of student assessments in order to encourage deeper learning approaches. General pathology is our institution's 4-week, first-year course covering universal disease concepts (inflammation, neoplasia, etc). Objective.-To compare types of assessment questions and determine which characteristics may predict student scores, degree of difficulty, and item discrimination. Design.-Item-level analysis was employed to categorize questions along the following variables: type (multiple choice question or matching answer), presence of clinical vignette (if so, whether simple or complex), presence of specimen image, information depth (simple recall or interpretation), knowledge density (first or second order), Bloom taxonomy level (1-3), and, for the final, subject familiarity (repeated concept and, if so, whether verbatim). Results.-Assessments comprised 3 quizzes and 1 final exam (total 125 questions), scored during a 3-year period, (total 417 students) for a total 52 125 graded attempts. Overall, 44 890 attempts (86.1%) were correct. In multivariate analysis, question type emerged as the most significant predictor of student performance, degree of difficulty, and item discrimination, with multiple choice questions being significantly associated with lower mean scores (P = .004) and higher degree of difficulty (P = .02), but also, paradoxically, poorer discrimination (P = .002). The presence of a specimen image was significantly associated with better discrimination (P = .04), and questions requiring data interpretation (versus simple recall) were significantly associated with lower mean scores (P = .003) and a higher degree of difficulty (P = .046). Conclusions.-Assessments in medical education should comprise combinations of questions with various characteristics in order to encourage better student performance, but also obtain optimal degrees of difficulty and levels of item discrimination.
UR - http://www.scopus.com/inward/record.url?scp=85116436610&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85116436610&partnerID=8YFLogxK
U2 - 10.5858/arpa.2020-0624-OA
DO - 10.5858/arpa.2020-0624-OA
M3 - Article
C2 - 33450752
AN - SCOPUS:85116436610
SN - 0003-9985
VL - 145
SP - 1280
EP - 1288
JO - Archives of Pathology and Laboratory Medicine
JF - Archives of Pathology and Laboratory Medicine
IS - 10
ER -