Science - USA (2022-04-08)


Dietlein et al., Science 376, eabg5601 (2022) 8 April 2022


[Figure 1, panels A to E. Recoverable panel titles and axes:
(B) Test 1, based on comparison with epigenomic data. Top: mutations / Mb (Lung) against epigenomic signal (H3K9me3, H3K36me3). Bottom: observed mutation rates and prediction of test 1 (mutations / Mb) against genomic position on Chr 1 (Mb).
(C) Test 2, based on comparison between tumor types. Top: mutations / Mb (Lung) against mutations / Mb (Esophagus, Liver). Bottom: observed mutation rates and prediction of test 2 (mutations / Mb) against genomic position on Chr 1 (Mb).
(D) Calibration of significance test 1 and (E) calibration of significance test 2: observed p-value against expected p-value for Bladder, Brain, Breast, Esophagus, Gastric, Kidney, Liver, and Prostate.]

Fig. 1. Genome-wide analysis of somatic mutation events in whole cancer
genomes. (A) Genome-wide detection of somatic mutation events in whole
cancer genome sequencing data. Step 1 combines three complementary test
strategies. Step 2 integrates the results of tests 1 to 3 into a joint, genome-wide
signal and identifies significant mutation events. Step 3 classifies mutation
events according to their genomic location. (B and C) Top: Boxplots comparing
mutation rates of a representative cancer type (lung cancer) against epigenomic
signals [(B), the rationale of test 1] and mutation rates of other cancer types [(C),
the rationale of test 2]. Boxes indicate 25/75% interquartile ranges, vertical lines
extend to 10/90% percentiles, and horizontal lines reflect distribution medians.
Bottom: Observed (teal dots) and predicted (continuous line) mutation rates
(10-kb intervals) plotted against their position on chromosome 1 (function smoothed
by Gaussian kernel). (D and E) Q-Q plots comparing observed (y-axis) and
expected (x-axis) P values for test 1 (D) and test 2 (E).
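
As an illustration of the two generic operations named in the caption, the following minimal Python sketch shows one common way to build the points of a Q-Q plot and to smooth binned mutation rates with a Gaussian kernel. It is not the authors' implementation: the function names `qq_points` and `gaussian_smooth`, the expected-quantile convention i / (n + 1), the kernel width `sigma_bins`, and the truncation at three standard deviations are all assumptions made for this example.

```python
import math


def qq_points(p_values):
    """Return (expected, observed) p-value pairs for a Q-Q plot.

    Assumption: under the null hypothesis the p-values are uniform on [0, 1],
    so the i-th smallest of n values has expectation i / (n + 1).
    """
    observed = sorted(p_values)
    n = len(observed)
    expected = [(i + 1) / (n + 1) for i in range(n)]
    return list(zip(expected, observed))


def gaussian_smooth(values, sigma_bins):
    """Smooth per-bin rates (e.g. 10-kb intervals) with a Gaussian kernel.

    The kernel is truncated at 3 * sigma_bins and renormalized at the
    edges of the sequence so that a constant signal stays constant.
    """
    radius = max(1, int(3 * sigma_bins))
    smoothed = []
    for i in range(len(values)):
        total = 0.0
        weight_sum = 0.0
        lo = max(0, i - radius)
        hi = min(len(values), i + radius + 1)
        for j in range(lo, hi):
            w = math.exp(-0.5 * ((j - i) / sigma_bins) ** 2)
            total += w * values[j]
            weight_sum += w
        smoothed.append(total / weight_sum)
    return smoothed
```

In a well-calibrated test, the pairs returned by `qq_points` lie close to the diagonal, which is the property that panels D and E are meant to show.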

RESEARCH | RESEARCH ARTICLE
