Nature - USA (2020-01-02)

Extended Data Table 4 | Potential use of the AI system in two clinical applications


a

                           Sensitivity (%)   Specificity (%)   Simulated reduction of
                           (n = 414)         (n = 25,422)      second reader workload (%)
AI as second reader (UK)   66.66             96.26             87.98
Existing workflow (UK)     67.39             96.24             -
95% CI on the difference   (-2.68, 1.23)     (-0.13, 0.17)     -


b

Triage status   Dataset   Sensitivity (%)                  Specificity (%)                     Reliability of triage decision (%)
                          (95% CI)                         (95% CI)                            (95% CI)
Negative        UK        99.63 (98.88, 100.0), n = 274    41.15 (40.57, 41.72), n = 25,443    99.99 (NPV) (99.97, 100.0), n = 10,471
Negative        USA       98.05 (96.12, 99.16), n = 359    34.79 (31.97, 37.60), n = 2,411     99.90 (NPV) (99.83, 99.96), n = 720
Positive        UK        41.24 (35.63, 47.08), n = 274    99.92 (99.89, 99.95), n = 25,443    85.69 (PPV) (79.66, 90.98), n = 132
Positive        USA       29.80 (25.21, 34.45), n = 359    99.90 (99.78, 99.97), n = 2,411     82.41 (PPV) (65.38, 94.71), n = 121

a, Simulation, using the UK test set, in which the AI system is used in place of the second reader when it concurs with the first reader. In cases of disagreement (12.02%) the consensus opinion
was invoked. The high performance of this combination of human and machine suggests that approximately 88% of the effort of the second reader can be eliminated while maintaining the
standard of care that is produced by double reading. The decision of the AI system was generated using the first reader operating point (i) shown in Fig. 2a. Confidence intervals are Wald
intervals computed with the Obuchowski correction for clustered data. b, Evaluation of the AI system for low-latency triage. Operating points were set to perform with high NPV and PPV for
detecting cancer in 12 months.
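
The decision rule described for panel a can be illustrated with a minimal sketch. This is not the authors' code: the `Case` record, its field names, and the function `simulate_ai_second_reader` are hypothetical, and the toy cases are invented purely to exercise the logic. The sketch assumes each case carries a binary recall decision from the first reader, from the AI system, and from the historical consensus process. The workload reduction is taken to be the share of cases in which the AI system concurs with the first reader, which matches the reported figures (100% - 12.02% disagreement = 87.98%).

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Case:
    # All three fields are hypothetical names for binary recall decisions.
    first_reader_recall: bool
    ai_recall: bool
    consensus_recall: bool


def simulate_ai_second_reader(cases: List[Case]) -> Tuple[List[bool], float]:
    """Return the final decision per case and the percentage of cases
    in which no human second read would be needed."""
    if not cases:
        raise ValueError("at least one case is required")
    decisions = []
    agreements = 0
    for case in cases:
        if case.ai_recall == case.first_reader_recall:
            # The AI concurs with the first reader: accept that decision
            # without involving a human second reader.
            decisions.append(case.first_reader_recall)
            agreements += 1
        else:
            # Disagreement: fall back to the consensus opinion.
            decisions.append(case.consensus_recall)
    reduction_pct = 100.0 * agreements / len(cases)
    return decisions, reduction_pct


# Toy data for illustration only (not from the study).
toy_cases = [
    Case(True, True, True),     # agreement, recall
    Case(False, False, False),  # agreement, no recall
    Case(False, True, False),   # disagreement, consensus decides
    Case(False, False, True),   # agreement, consensus not consulted
]
decisions, reduction_pct = simulate_ai_second_reader(toy_cases)
print(decisions, reduction_pct)  # [True, False, False, False] 75.0
```

Sensitivity and specificity for the combined workflow would then be computed from `decisions` against the ground-truth labels. The confidence intervals in the table additionally use the Obuchowski correction for clustered data, which this sketch does not attempt.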
