demonstrated that grit scores predicted retention among cadets at West Point, as well as success among contestants in the Scripps National Spelling Bee [5].

Throughout this chapter, we’ve also discussed operative performance ratings. Presumably, operative performance ratings are predictive of patient outcomes in future operations, but such data are rarely available to researchers. However, another real-world factor is readily measurable: experience. Multiple studies have found that the year of residency is strongly related to operative performance ratings: greater experience predicts higher performance, exactly as one would expect [3, 10].
Another test of validity is called concurrent validity: are scores on your assessment correlated with scores on other assessments that measure a similar construct? Grit has been shown in repeated studies to be related to self-reports of conscientiousness, as one would expect (e.g., [15]). A validation study of an operative performance rating system in urology showed that performance during a kidney stone procedure was correlated with performance during other urological operations. These are examples of concurrent validity.
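For readers who want to run the numbers, a concurrent-validity check ultimately reduces to a correlation between two sets of scores from the same trainees. The sketch below is purely illustrative: the tool names, the scores, and the choice of a Pearson correlation are assumptions for the example, not data from the studies cited above.

```python
# Minimal sketch of a concurrent-validity check: correlate scores from a
# new assessment with scores from an established tool administered to the
# same trainees. All numbers below are invented for illustration.
from scipy.stats import pearsonr

new_tool_scores = [3.2, 4.1, 2.8, 4.5, 3.9, 3.0, 4.8, 3.6, 2.5, 4.0]
established_tool_scores = [3.0, 4.3, 3.1, 4.6, 3.7, 2.9, 4.9, 3.4, 2.7, 4.2]

r, p_value = pearsonr(new_tool_scores, established_tool_scores)
print(f"Concurrent validity: r = {r:.2f}, p = {p_value:.3f}")
# A strong positive correlation is evidence (not proof) that the two
# instruments measure a similar construct.
```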
When piloting your assessment, it might be desirable to assess your students using other tools identified in your literature review. This might seem odd. Presumably, you developed a new assessment tool because you believed it was different from what was available. However, if measures similar to your own exist, it is worthwhile to deploy them during your pilot testing. This has two benefits: it allows you to test for concurrent validity, and it allows you to test whether your assessment has greater predictive validity than the alternatives.
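As a rough sketch of that second benefit, one could correlate pilot scores from each instrument with a later outcome, such as an operative performance rating from a subsequent rotation, and compare the strength of the two relationships. Everything here, including the outcome measure, the variable names, and the numbers, is hypothetical.

```python
# Minimal sketch comparing predictive validity: correlate each pilot
# assessment with a later outcome and see which relationship is stronger.
# All numbers and variable names are hypothetical.
import numpy as np

pilot_new_tool = np.array([3.2, 4.1, 2.8, 4.5, 3.9, 3.0, 4.8, 3.6])
pilot_existing_tool = np.array([3.0, 3.8, 3.2, 4.0, 3.5, 3.1, 4.2, 3.4])
later_performance = np.array([3.4, 4.2, 3.0, 4.4, 4.0, 3.1, 4.7, 3.5])

r_new = np.corrcoef(pilot_new_tool, later_performance)[0, 1]
r_existing = np.corrcoef(pilot_existing_tool, later_performance)[0, 1]
print(f"New tool vs. later performance:      r = {r_new:.2f}")
print(f"Existing tool vs. later performance: r = {r_existing:.2f}")
# If the new tool tracks the later outcome more closely, that is one piece
# of evidence that it adds predictive value beyond the existing measure.
```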
Beyond predictive and concurrent validity, there are other forms of validity evidence (e.g., divergent validity and curricular validity). These should be examined too, if relevant.
But more importantly, we must remind ourselves that no amount of evidence can prove that our assessment measures what it purports to measure. We are trying to measure something that we can’t see, and so we can never know for certain whether our assessment is truly valid. Stephen M. Downing puts it nicely: “Assessments are not valid or invalid; rather, the scores or outcomes of assessments have more or less evidence to support (or refute) a specific interpretation.” [4]


Question 9: Is Your Assessment Interesting and Useful to Other Researchers?


If you have found satisfactory answers to Questions 1 through 8, you have produced an assessment tool that you should be proud of. It can be the basis for a publishable paper. You have followed the same process used by leading researchers in assessment research. Don’t let your effort end there.

A final test of your assessment can come through peer review. And by peers, we mean “your co-researchers in surgical education.” If you can clearly answer Questions 1 through 8, then you have a clear framework for an academic publication. Present the work at conferences, and submit it to a journal. The feedback you

