469 - Factors Influencing the Quality of Narrative Assessments of Medical Students’ Clerkship Performance
Friday, April 28, 2023
5:15 PM – 7:15 PM ET
Poster Number: 469 Publication Number: 469.123
Daniel Restifo, Weill Cornell Medicine, New York, NY, United States; Rachel A. Arnesen, Weill Cornell Medicine, Bethesda, MD, United States; Adin Nelson, Weill Cornell Medicine, New York, NY, United States; Thanakorn Jirasevijinda, Weill Cornell Medical College, New York, NY, United States
Medical Student, Weill Cornell Medicine, New York, NY, United States
Background: Narrative comments on clinical assessments guide students' development and influence residency selection. However, previous studies have shown that narrative comments rarely describe students' skills or offer actionable feedback, and the factors affecting comment quality have not been identified.
Objective: Identify factors associated with the quality of narrative comments on student assessments.
Design/Methods: We examined free-text comments on 2,950 clinical assessments of medical students on the pediatrics clerkship from 2017 to 2021 and rated their quality using the Feedback Evaluation Tool (FET) (Ross et al., 2013), which generates an overall quality score for each narrative comment. We then used ANOVA to test which demographic and contextual factors were associated with FET scores, and Pearson's correlation to test associations between FET scores and both the numerical rating of student performance and the word count of individual narrative comments.
Results: The mean FET score for the narrative comments was 3.42 out of 5. FET scores did not differ by student gender (p = 0.8) or assessor gender (p = 0.5), and there was no significant interaction between student and assessor gender (p = 0.8). Assessors' level of training significantly affected FET scores (interns: 3.53; residents: 3.34; attendings: 3.39; p < 0.01), as did clinical setting (inpatient: 3.45; outpatient: 3.53; ED: 3.33; nursery: 3.13; p < 0.01). The factor associated with the largest discrepancy in FET scores was clerkship site: mean FET scores at the two community hospitals were 3.05 and 3.57, while the mean FET score at the academic center was 3.52 (p < 0.01). Higher FET scores were associated with higher word counts (R2 = 0.35, p < 0.01) but not with the numerical ratings of student performance (R2 = 0.0002, p = 0.4).
Conclusion(s): The quality of narrative comments on clerkship students' clinical assessments is influenced by assessors' level of training, clinical setting, clerkship site, and comment word count. These differences may reflect the extent of student-assessor interaction and assessors' backgrounds; for example, the majority of residents at one of the community hospital sites were international medical graduates. Our results suggest that assessors at all levels of training need further instruction in writing narrative comments. Additionally, implementing minimum word counts for narrative comments may improve their quality.