Surgeons as Educators A Guide for Academic Development and Teaching Excellence


as being “nonproductive” time, and, specifically, “total time the movements of the
instruments do not make tissue contact” [32]. While this study only assessed total
idle time and did not examine the temporal location or duration of each idle event,
it succeeded in demonstrating initial construct validity of the idle time metric.
Significant (p = 0.0001) differences in idle time were reported among the three
experience levels: 8 experienced (M = 357 s), 8 intermediate (M = 654 s), and 24
inexperienced surgeons (M = 747 s) [32]. It is worth noting that idle time is a
relatively new metric that appears on few of the currently available VR simulators.
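
As a rough illustration of the total idle time metric quoted above, the sketch below
assumes a simulator log that records instrument-tissue contact as (start, end)
intervals in seconds. That log format, and the function name total_idle_time, are
hypothetical and not taken from the cited study; idle time is simply treated as the
task time not covered by any contact interval.

```python
def total_idle_time(task_duration_s, contact_intervals):
    """Return the seconds of the task with no instrument-tissue contact.

    contact_intervals is a list of (start_s, end_s) pairs (hypothetical log
    format). Intervals may overlap, e.g. when two instruments touch tissue
    at the same time, so they are merged before being summed.
    """
    merged = []
    for start, end in sorted(contact_intervals):
        # Clamp each interval to the task window [0, task_duration_s].
        start = max(0.0, start)
        end = min(task_duration_s, end)
        if end <= start:
            continue  # interval lies entirely outside the task window
        if merged and start <= merged[-1][1]:
            # Overlaps the previous interval: extend it instead of double counting.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    contact_time = sum(end - start for start, end in merged)
    return task_duration_s - contact_time


# Example: a 10 s task with overlapping contact from two instruments.
idle_s = total_idle_time(10.0, [(1.0, 3.0), (2.0, 5.0), (8.0, 9.0)])
```

Like the study described above, this sketch reports only the total; it does not
record when each idle event occurred or how long it lasted.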


Global Rating Scales


OSATS
The objective structured assessment of technical skill (OSATS) was one of the first
assessments to utilize a global rating scale and became widely accepted for
evaluating surgical residents during open procedures. Along with the global rating
scale represented in Table 5.1, OSATS also includes a detailed checklist. Checklists
are procedure specific and are individually validated for each additional surgery
evaluated. The global rating scale was first validated for an inferior vena cava repair
necessitated by a stab wound, on both a bench trainer and a live porcine model [33].
The global rating scale outperformed the checklist in consistency between the two
models (bench trainer and live porcine model) [33]. As a result, the MIS-specific
objective assessment tools discussed later are based on the global rating scale
portion of OSATS.
However, more recent research shows that the checklist may be a more valuable
tool than initially thought. Checklist items have defined yes or no answers and are
thus designed to have low ambiguity, making it easier for studies to establish
interrater reliability [34]. A direct comparison of checklists and global rating scales
found that interrater reliability was significantly higher for the checklist
evaluations, though this study was limited by having only two raters [34].
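
To make the idea of interrater reliability on a yes/no checklist concrete, the
sketch below computes Cohen's kappa for two raters. This is an assumption made for
illustration: the text does not state which reliability statistic the cited study
used, and the ratings shown are invented. Kappa compares the observed agreement
between two raters with the agreement expected by chance.

```python
def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters scoring the same checklist items.

    rater_a and rater_b are equal-length lists of categorical scores,
    e.g. 1 for "yes" and 0 for "no" on each checklist item.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty item list")
    n = len(rater_a)
    # Proportion of items on which the two raters gave the same answer.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's own answer frequencies.
    categories = set(rater_a) | set(rater_b)
    expected = sum(
        (rater_a.count(c) / n) * (rater_b.count(c) / n) for c in categories
    )
    if expected == 1.0:
        return 1.0  # both raters gave one identical answer for every item
    return (observed - expected) / (1.0 - expected)


# Invented example: two raters scoring the same eight checklist items.
rater_1 = [1, 1, 0, 1, 0, 0, 1, 1]
rater_2 = [1, 0, 0, 1, 0, 1, 1, 1]
kappa = cohens_kappa(rater_1, rater_2)
```

A kappa of 1 indicates perfect agreement, and a value near 0 indicates agreement no
better than chance.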
The global rating scale from OSATS is presented in Table 5.1. Each metric is graded
on a numerically anchored Likert scale. Rensis Likert, a psychologist at the University
of Michigan in the mid-1900s, developed the Likert scale as a way to uniformly measure
people’s attitudes toward a subject. A scale must meet several criteria to qualify as
a Likert scale. First, it must grade multiple metrics. A Likert scale is said to be
anchored when labeled integers are used as the score for each item. The descriptions
of the scores must also be arranged symmetrically and evenly. Summing the item scores
generates the overall score, though the scores can also be averaged. OSATS, and the
other global rating scales discussed here, anchor only points 1, 3, and 5, which
gives the assessor slightly more freedom.
Note that the “use of assistants” metric is not critical, or even relevant, in some
procedures, and thus it is frequently omitted. Depending on the task at hand,
assessors add metrics such as suture handling or scales of overall performance
and quality of final product, which are not found on the original global rating scale
[35]. Martin’s original scale also included a final pass/fail question, though this was
dropped in later versions due to poor reliability [33].
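
The scoring rule described above (each item graded from 1 to 5, with item scores
summed or averaged) can be sketched as follows. The item names in the example are
illustrative placeholders, and the function score_global_rating is hypothetical;
Table 5.1 remains the authority on the actual OSATS items. Because items may be
omitted or added per procedure, the sketch also reports the maximum possible total
and the mean, on the assumption that these help when forms with different item
counts are compared.

```python
SCALE_MIN, SCALE_MAX = 1, 5  # 5-point scale; OSATS labels only points 1, 3, and 5


def score_global_rating(ratings):
    """Score one completed global rating form.

    ratings maps an item name to an integer score from 1 to 5. Items that
    do not apply to the procedure are simply left out of the mapping.
    """
    if not ratings:
        raise ValueError("at least one rated item is required")
    for item, score in ratings.items():
        if not isinstance(score, int) or not SCALE_MIN <= score <= SCALE_MAX:
            raise ValueError(f"{item}: score must be an integer from 1 to 5")
    total = sum(ratings.values())
    return {
        "total": total,                             # summed overall score
        "max_possible": SCALE_MAX * len(ratings),   # depends on items used
        "mean": total / len(ratings),               # averaged alternative
    }


# Illustrative form with the "use of assistants" item omitted.
form = {
    "respect_for_tissue": 4,
    "time_and_motion": 3,
    "instrument_handling": 5,
    "flow_of_operation": 4,
}
result = score_global_rating(form)
```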


5 Performance Assessment in Minimally Invasive Surgery
