input, and their effects can be hard to identify ex ante. But the quality of parametric updates depends largely on the quality of the associated underlying data. An adaptive system with continuously changing parameters is susceptible to data quality issues that can arise from, for example, errors by AI/ML users or intentional adversarial attacks (6). The latter can take many forms. Consider a hypothetical example (6): In response to the opioid crisis, many insurers now use patient- or provider-level overdose risk-prediction algorithms to deny OxyContin prescriptions. A physician, certain that a patient is in need of a prescription, may learn that the patient can avoid the algorithmic gatekeeper and secure a prescription by typing in a combination of codes that will guarantee a low overdose-risk score. Such a system incentivizes physicians to enter low-quality data. An unchecked dynamic algorithm would inappropriately adapt to this over time—considering all outcomes of prescriptions—and begin to falsely categorize low-risk patients as high risk. In this kind of situation, continuous oversight can provide a necessary check on adaptive AI/ML systems.
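The following toy simulation (a hypothetical construction, not drawn from the cited example) illustrates this feedback loop in Python. Here scikit-learn's SGDClassifier stands in for the insurer's adaptive risk model, and a single "entered code" score stands in for the gameable codes:

    import numpy as np
    from sklearn.linear_model import SGDClassifier

    rng = np.random.default_rng(0)
    model = SGDClassifier(loss="log_loss", random_state=0)  # logistic regression, fit incrementally

    # Honest history: the entered code score (x) tracks true overdose risk.
    x = rng.normal(size=(2000, 1))
    y = (rng.random(2000) < 1 / (1 + np.exp(-3 * x[:, 0]))).astype(int)
    model.partial_fit(x, y, classes=[0, 1])

    low_risk_record = np.array([[-2.0]])  # codes of a genuinely low-risk patient
    print("before gaming:", model.predict_proba(low_risk_record)[0, 1])  # near 0

    # Gaming: truly high-risk patients are entered with low-risk codes so the
    # prescription clears the gatekeeper; many later overdose, so records that
    # look low risk accumulate "overdose" labels (y = 1).
    for _ in range(30):
        x_gamed = np.full((100, 1), -2.0)
        model.partial_fit(x_gamed, np.ones(100, dtype=int))

    # The unchecked adaptive model now rates honest low-risk patients as high risk.
    print("after gaming:", model.predict_proba(low_risk_record)[0, 1])  # near 1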
In an attempt to steer between these two poles, the FDA released a discussion paper in April 2019 (1). Until now, the FDA has exclusively approved or cleared medical AI/ML-based software as a medical device—what the FDA calls "SaMD," which is software that is on its own a medical device and is not part of a hardware medical device (7)—with "locked" algorithms (1). A locked algorithm is defined by the FDA as "an algorithm that provides the same result each time the same input is applied to it and does not change with use" (1). Any AI/ML system can satisfy this definition provided it is fixed in advance.
However, most AI/ML algorithms are "adaptive," arguably their key strength. Even parameters in a simple model like a logistic regression will gradually evolve as the model is refit in response to new data.
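As a minimal illustration (assuming scikit-learn; the features and coefficients are invented), refitting even an ordinary logistic regression on accumulating data shifts the function the device computes away from the one that was reviewed:

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(42)
    true_w = np.array([1.0, -2.0, 0.5])  # hypothetical true effect sizes

    def draw_cases(n):
        X = rng.normal(size=(n, 3))
        y = (rng.random(n) < 1 / (1 + np.exp(-X @ true_w))).astype(int)
        return X, y

    # Data available at premarket review.
    X, y = draw_cases(300)
    model = LogisticRegression().fit(X, y)
    print("coefficients at approval:", model.coef_.round(2))

    # Each refit on accumulated data nudges the parameters further.
    for _ in range(3):
        X_new, y_new = draw_cases(300)
        X, y = np.vstack([X, X_new]), np.concatenate([y, y_new])
        model = LogisticRegression().fit(X, y)
        print("after refit:", model.coef_.round(2))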
For adaptive AI/ML-based SaMD, the FDA proposed a "total product lifecycle (TPLC) regulatory approach" that permits continuous improvement of such devices while maintaining their safety and effectiveness (1). The FDA's TPLC approach is a feature of the Software Precertification (Pre-Cert) Program that it is piloting on a small number of companies to determine its feasibility (2). One major idea in the FDA's April 2019 discussion paper is that AI/ML-based SaMD could be updated to a certain extent after marketing authorization; when seeking initial premarket review of an AI/ML-based SaMD, manufacturers would be given the option to submit a "predetermined change control plan," which would contain a description of anticipated modifications and an "Algorithm Change Protocol," including the associated methodology used to implement such changes (1).

UNDERSTANDING RISKS
Before considering adaptive algorithms, it is important to recognize that a "locked" algorithm, as defined, for example, by the FDA, could be more harmful than an "adaptive" one—and vice versa.
To begin with, the concept of "locked" is ambiguous. We focus on two definitions that we call "system lock" and "true function lock." Do we want the AI/ML system to continually use the locked estimate of the function, relating inputs and outputs, that was first approved? This is how the FDA has defined what it means to be "locked," a concept that we call "system lock." Merely achieving "system lock" will not guarantee that the system is safe for patients. An alternative, and perhaps preferable, goal is that the algorithm locks, as closely as possible, onto the true function that relates the inputs and outputs—which is unknown ex ante in practice and which emerges over time. We call this "true function lock."
For adaptive algorithms, it is especially important for regulators to assess whether the AI/ML system remains reliable overall as applied to new data—that is, whether it approaches "true function lock." Below, we identify several AI/ML features that, when not properly considered, can lead the AI/ML system to use a poor estimate of the true relationship between inputs and outputs and thereby possibly cause harm to patients (for example, through misdiagnosis). Regulators need to focus their attention on such issues in order to manage the risk that AI/ML systems learn and use a wrong input-output relation.

Concept drift
Concept drift describes a situation in which the true relation between inputs and outputs changes over time. This may happen because of a changing environment or because the model was misspecified (for example, the estimated function is linear when the actual relationship is quadratic, or relevant variables were omitted). Consider, for example, an AI/ML system trained to identify skin lesions as benign or malignant (8). The model presupposes an underlying distribution of these labels (benign versus malignant). However, the datasets that these AI/ML systems rely on typically do not track race or skin color, or may miss or underreport certain skin types. Yet the malignancy of skin lesions (the true relation between input and output/diagnosis) may vary across race and skin type. As a result, the same image can lead to two different probabilistic diagnoses, depending on the patient's underlying skin type or race—an omitted feature. This sort of problem is ubiquitous in medical AI/ML.
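A compact sketch of this omitted-variable failure (all features and coefficients are hypothetical, chosen only to make the effect visible):

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    n = 20_000
    lesion_size = rng.normal(size=n)        # observed, image-derived feature
    skin_type = rng.integers(0, 2, size=n)  # relevant feature NOT in the dataset

    # True relation: malignancy depends on both features.
    logit = 1.5 * lesion_size + 2.0 * skin_type - 1.0
    malignant = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

    # The model is trained on lesion_size alone; skin_type is omitted.
    model = LogisticRegression().fit(lesion_size.reshape(-1, 1), malignant)

    # For the very same observed input, the true risk differs by subgroup,
    # but the model can output only one blended probability.
    x = np.array([[0.0]])
    print("model's single estimate:", model.predict_proba(x)[0, 1].round(2))
    for s in (0, 1):
        print(f"true risk, skin_type={s}:", round(1 / (1 + np.exp(-(2.0 * s - 1.0))), 2))

If the deployed population later shifts toward one subgroup, the input-output relation the model learned drifts away from the one it now faces.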
A regulatory regime requiring a "system lock" is not immune to this problem. Indeed, a "system locked" algorithm can make matters worse by prohibiting the system from learning. Moreover, a regime focused on predetermined change control plans is likewise vulnerable to risks arising from concept drift. Any predetermined change control plan risks being either uninformative or impractical, depending on the level of detail at which a maker would be expected to describe future modifications. At one extreme, a maker might be required to describe proposed changes in very general terms; this would be uninformative. At the other extreme, the maker might be required to describe precisely the sorts of changes it anticipates. Such a task is not feasible without having seen all possible future data from all types of patients and conditions—especially for AI/ML algorithms that may have thousands or millions of parameters. Even if this kind of task could be accomplished in theory, or with future technologies, it would be extremely difficult and time-consuming—and thus impractical. Moreover, such a plan could be especially harmful when unanticipated problems are reported, in which case the proposed framework could require another round of review.

Covariate shift
Covariate shift occurs when the input distribution of new data differs from that of the data on which the algorithm was trained or tested for approval (9). This can occur in the absence of concept drift, although the two are not mutually exclusive. For example, training data may have come from geographically centralized clinical sites, but the device may be deployed beyond those regions and populations. When this occurs, "system locking" the algorithm hampers the maker's ability to address the problem. Further, describing how the distribution of patients may change is not something a maker can usually do ex ante, because makers generally do not know the distribution of the data to which the algorithm will be applied.
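A simple monitoring check, sketched here under the assumption of a single numeric input and using SciPy's two-sample Kolmogorov-Smirnov test, is to compare the distribution the device was trained on against what it receives after deployment:

    import numpy as np
    from scipy.stats import ks_2samp

    rng = np.random.default_rng(1)

    # Training inputs came from centralized trial sites (e.g., patient age).
    age_train = rng.normal(loc=50.0, scale=10.0, size=5_000)
    # After deployment, the device sees an older population elsewhere.
    age_deployed = rng.normal(loc=62.0, scale=12.0, size=5_000)

    stat, p_value = ks_2samp(age_train, age_deployed)
    if p_value < 0.01:
        # A "system locked" device cannot retrain; the alert can at least
        # trigger human review instead of silent degradation.
        print(f"covariate shift flagged: KS = {stat:.3f}, p = {p_value:.1e}")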

Instability
One major concern is treating similar patients similarly. That is, medically insignificant differences among patients should not lead to substantive differences in diagnosis or treatment. Suppose that when an AI/ML system is given a set of inputs, it produces one probabilistic output—for example, the probability that a particular skin lesion is malignant is 87%. Now suppose that very small changes are made to the set of inputs provided to the underlying algorithm. For ex-
