Abstract
Introduction
Reducing examiner variability in Objective Structured Clinical Exams (OSCEs) is a priority within clinical performance assessment. In contrast to typical OSCE examiner training, video-based benchmarking (VBB) involves examiners scoring videos (a) from their specific station and (b) shortly before the OSCE, and then reflecting on and discussing the scores and justifications agreed by an expert panel. Whilst realist evaluation has described mechanisms and contexts by which VBB may operate, VBB’s overall efficacy is unknown.
Methods
We performed a multi-centre (12 UK medical schools) stratified randomised controlled trial of VBB versus control to determine the influence of VBB on examiners’ score variability and other score characteristics. Secondarily, we compared the average scores allocated by examiners from different schools.
Results
A total of 171 medically qualified, trained OSCE examiners participated in the study. VBB showed no significant effect on overall examiner variability. In pre-specified analyses, VBB reduced the variability from the group mean of initially ‘outlying’ examiners on the borderline performance (VBB mean variability 3.02 out of 27 (IQR 1.98–4.98), control 4.70 (IQR 3.91–5.70), p < 0.016) and made examiners more likely to correctly fail a minimally failing performance (p < 0.03, OR = 2.133 [95% CI 1.081–4.208]). VBB caused a small increase in confidence. There were no significant differences in average scores by school.
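As a reading aid for the odds ratio reported above, the following minimal sketch back-calculates the standard error and a two-sided p-value from the published OR and 95% CI. It assumes the interval is a Wald-type interval that is symmetric on the log-odds scale; the abstract does not state how the interval was computed, so this is an assumption, and the function name `wald_summary` is hypothetical.

```python
import math


def wald_summary(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Back-calculate SE, z and a two-sided p-value from an OR and its 95% CI.

    Assumes (not stated in the abstract) a Wald interval on the log-odds scale:
    log(OR) +/- z_crit * SE.
    """
    log_or = math.log(odds_ratio)
    # Width of the interval on the log scale equals 2 * z_crit * SE.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = log_or / se
    # Two-sided p-value under a standard normal reference distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    # If the interval is symmetric on the log scale, its geometric midpoint
    # should reproduce the reported odds ratio.
    midpoint_or = math.exp((math.log(ci_low) + math.log(ci_high)) / 2)
    return {"se": se, "z": z, "p": p, "midpoint_or": midpoint_or}


# Values as reported in the Results paragraph.
summary = wald_summary(2.133, 1.081, 4.208)
print(summary)
```

Under that assumption, the geometric midpoint of the interval closely matches the reported odds ratio and the implied p-value falls just under 0.03, which is consistent with the reported "p < 0.03".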
Conclusions
VBB may enhance trust in OSCEs through more accurate classification of borderline performances and by aligning outlying examiners’ scoring.
| Original language | English |
|---|---|
| Pages (from-to) | 1-15 |
| Number of pages | 15 |
| Journal | Medical Teacher |
| Early online date | 1 Mar 2026 |
| DOIs | |
| Publication status | Published - 1 Mar 2026 |
UN SDGs
This output contributes to the following UN Sustainable Development Goals (SDGs)
- SDG 3 Good Health and Well-being
Keywords
- Assessment
- OSCEs
- Examiner variability
- Randomised controlled trial as topic
- Video-based benchmarking
Fingerprint
Dive into the research topics of 'Determining the influence of video-based benchmarking (VBB) on examiner variability in objective structured clinical exams (OSCE): The Align study'. Together they form a unique fingerprint.