SimCapture Enterprise with Exam System: Inter-Rater Consistency Report
Learn how to generate and interpret the Inter‑Rater Consistency Report in SimCapture Enterprise with Exam System.
The Inter‑Rater Consistency Report helps you evaluate how consistently multiple raters score the same learners. This report is commonly used by administrators and faculty to support fair grading, rater calibration, and assessment quality assurance.
What the Report Shows
The report compares evaluator scoring patterns across shared learners and highlights variation using statistical measures. This allows you to identify potential inconsistencies between raters.
Skill Area and n Values
Each Skill Area includes an n = value, which represents the total number of evaluations included in that calculation.
- A higher n (for example, n = 200) indicates a more statistically reliable consistency measure.
- A lower n (for example, n = 2) should be interpreted with caution, as small sample sizes can exaggerate variation.
Tip: This report is best suited for identifying trends and patterns across multiple evaluations, not for evaluating a rater based on a small number of responses.
Access the Inter‑Rater Consistency Report
- Navigate to Reports in the global navigation bar.
- Select Inter‑Rater Consistency.
Filter the Report
You can refine the report results using the following filters:
- Date Range: Last 30 Days, Last 90 Days, Year to Date, or All Time
- Organization
- Course
- Scenario
- Evaluation
After applying filters, you can export the report by selecting Export Data > Inter‑Rater Consistency.
Note: In the export, usernames are formatted as Last Name, First Name, Middle Name (if applicable).
Included and Excluded Evaluations
When filtering by Organization, Course, or Scenario, the report includes some evaluation types and excludes others:
Included
- Patient Evaluations
- Monitor Evaluations
- Scoring Evaluations
Excluded
- Participant Evaluations
- Course Evaluations (self‑assessments)
Self‑assessments are excluded to ensure that inter‑rater comparisons are based only on objective evaluator scores.
Understanding Z‑Scores
The Inter‑Rater Consistency Report uses Z‑Scores to show how far a rater’s scoring deviates from the group average, measured in standard deviations.
Z‑Scores make it easier to compare scoring behavior across different question categories and evaluation sets.
Z‑Score Color Coding
Z‑Scores are visually indicated using color:
- Light Orange: ±1 standard deviation from the mean
- Dark Orange: ±2 standard deviations from the mean
Larger deviations may indicate scoring patterns that differ substantially from the group and may warrant review or rater calibration.
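As a rough illustration of the color bands, the sketch below maps a Z‑Score to a color. The cut-offs (an absolute value of at least 1 for light orange and at least 2 for dark orange) are an assumed reading of the list above, and the function name `z_score_color` is hypothetical. How SimCapture treats values that fall exactly on a boundary is not stated in this article.

```python
def z_score_color(z):
    """Return an assumed color band for a Z-Score, or None if unshaded.

    Assumption: |z| >= 2 is dark orange, 1 <= |z| < 2 is light orange.
    """
    magnitude = abs(z)
    if magnitude >= 2:
        return "dark orange"
    if magnitude >= 1:
        return "light orange"
    return None


# Example: a rater scoring well below the group, one slightly above,
# and one close to the group average.
for z in (-2.3, 1.2, 0.4):
    print(z, z_score_color(z))
```

The sign of the Z‑Score is ignored here because the color reflects distance from the mean, not direction.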
How Z‑Scores Are Calculated
Z‑Scores are calculated in two contexts within this report.
By Evaluator and Question Category
This calculation compares an evaluator’s average score for a specific question category to the average of all evaluators for that category:
$$Z = \frac{(\text{Evaluator mean for category}) - (\text{Category mean})}{\text{Category standard deviation}}$$
By Evaluator Overall
This calculation compares an evaluator’s overall average score across all evaluations to the overall average of all evaluators:
$$Z = \frac{(\text{Evaluator overall mean}) - (\text{Overall mean})}{\text{Overall standard deviation}}$$
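The two calculations can be sketched in Python as follows. This is a minimal illustration, not SimCapture's implementation. It assumes scores arrive as (evaluator, category, score) rows, that each mean is taken over individual scores, and that the standard deviation is the population standard deviation of all scores in the group. The article does not specify these details, and the function names are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev


def _z(value, group_scores):
    """Z-Score of value against a group; None if the group has no spread."""
    sd = pstdev(group_scores)  # assumption: population standard deviation
    if sd == 0:
        return None
    return (value - mean(group_scores)) / sd


def z_by_evaluator_and_category(rows):
    """rows: iterable of (evaluator, category, score) tuples."""
    by_category = defaultdict(list)
    by_pair = defaultdict(list)
    for evaluator, category, score in rows:
        by_category[category].append(score)
        by_pair[(evaluator, category)].append(score)
    return {
        pair: _z(mean(scores), by_category[pair[1]])
        for pair, scores in by_pair.items()
    }


def z_by_evaluator_overall(rows):
    """Compare each evaluator's overall mean to the mean of all scores."""
    all_scores = []
    by_evaluator = defaultdict(list)
    for evaluator, _category, score in rows:
        all_scores.append(score)
        by_evaluator[evaluator].append(score)
    return {
        evaluator: _z(mean(scores), all_scores)
        for evaluator, scores in by_evaluator.items()
    }


# Made-up sample data: two evaluators, one question category.
rows = [
    ("Rater A", "Communication", 4),
    ("Rater A", "Communication", 4),
    ("Rater B", "Communication", 2),
    ("Rater B", "Communication", 2),
]
print(z_by_evaluator_and_category(rows))
print(z_by_evaluator_overall(rows))
```

In the sample data the category mean is 3 and the standard deviation is 1, so Rater A lands one standard deviation above the group and Rater B one below.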
What’s Included in the Excel Export
The exported Excel file includes:
- Question categories used in the applied filters
- Standardized patient graders
- Z‑Scores per grader by question category
- Mean and standard deviation for each category
- Overall Z‑Score and average score per grader
Excel File Layout
- Page 1: Report data
- Page 2: Applied filters used to generate the report
Use and Interpretation Notes
- Z‑Scores indicate relative scoring consistency, not scoring correctness.
- Occasional deviations can be expected, especially with low n values.
- Sustained or extreme deviations may indicate the need for rater review, training, or calibration.
Need Help?
If you have questions about interpreting this report or applying it to your assessment workflows, contact your SimCapture administrator or SimCapture Support.