Science - USA (2022-04-22)


RESEARCH ARTICLE SUMMARY



CANCER GENOMICS


Substitution mutational signatures in whole-genome–sequenced cancers in the UK population


Andrea Degasperi, Xueqing Zou, Tauanne Dias Amarante, Andrea Martinez-Martinez, Gene Ching Chiek Koh,
João M. L. Dias, Laura Heskin, Lucia Chmelova, Giuseppe Rinaldi, Valerie Ya Wen Wang, Arjun S. Nanda,
Aaron Bernstein, Sophie E. Momen, Jamie Young, Daniel Perez-Gil, Yasin Memari, Cherif Badja, Scott Shooter,
Jan Czarnecki, Matthew A. Brown, Helen R. Davies, Genomics England Research Consortium, Serena Nik-Zainal*


INTRODUCTION: Mutational signatures, imprints of DNA damage and repair processes
that have been operative during tumorigenesis, provide insights into environmental
and endogenous causes of each patient's cancer. Cancer genome sequencing studies
permit exploration of mutational signatures. We investigated a very large number of
whole-genome–sequenced cancers of many tumor types, substantially more than in
previous efforts, to comprehensively reinforce our understanding of mutational
signatures.


RATIONALE: We present mutational signature analyses of 12,222 whole-genome–sequenced
cancers collected prospectively via the UK National Health Service (NHS) for the
100,000 Genomes Project. We identified single-base substitution (SBS) and double-base
substitution (DBS) signatures independently in each organ. Exploiting this unusually
large cohort, we developed a method to enhance discrimination of common mutational
processes from rare, lower-frequency mutagenic processes. We validated our findings
by independently performing analyses with data from two publicly available cohorts:
3001 primary cancers from the International Cancer Genome Consortium (ICGC) and
3417 metastatic cancers from the Hartwig Medical Foundation. We produced a set of
reference signatures by comparing and contrasting the independently derived
tissue-specific signatures and performing clustering analysis to unite mutational
signatures from different tissues that could be due to similar processes. We included
additional quality control measures, such as dimensionality reduction of mixed
signatures, and gathered evidence that could help elucidate mechanisms and etiologies,
such as transcriptional and replication strand bias, associations with somatic drivers,
and germline predisposition mutations. We also investigated additional mutation
context and examined past clinical and treatment histories when possible, to explore
potential etiologies.

RESULTS: Each organ contained a limited number of common SBS signatures (typically
between 5 and 10). The number of common signatures was independent of cohort size.
By contrast, the number of rare signatures was dependent on sample size, as the
likelihood of detecting a rare signature is a function of its population prevalence.
The same biological process produced slightly different signatures in diverse tissues,
reinforcing that mutational signatures are tissue specific.
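
The dependence of rare-signature discovery on cohort size can be illustrated with a simple calculation. The sketch below is not taken from the article: it assumes that samples are drawn independently and that each one carries a given rare signature with a fixed probability equal to its population prevalence (a binomial model). The function name `prob_detected`, the `min_carriers` parameter, and the example prevalence of 1 in 5000 are hypothetical; the cohort sizes are those of the ICGC, Hartwig, and GEL cohorts.

```python
from math import comb


def prob_detected(prevalence, cohort_size, min_carriers=1):
    """Probability that at least `min_carriers` samples in a cohort carry
    a signature, assuming independent samples (binomial model)."""
    # Probability of observing fewer carriers than required.
    p_too_few = sum(
        comb(cohort_size, k)
        * prevalence ** k
        * (1 - prevalence) ** (cohort_size - k)
        for k in range(min_carriers)
    )
    return 1.0 - p_too_few


# Hypothetical rare signature present in 1 of every 5000 cancers.
for n in (3001, 3417, 12222):
    print(n, round(prob_detected(1 / 5000, n), 3))
```

Under this simplified model the detection probability rises with cohort size for any fixed prevalence, which is consistent with more rare signatures being found in larger cohorts.
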
Across organs, we clustered all tissue-specific signatures to ascertain mutational
processes that were equivalent but occurring in different tissues (i.e., reference
signatures). We obtained 82 high-confidence SBS reference signatures and 27
high-confidence DBS reference signatures. We compared these with previously reported
mutational signatures, revealing 40 and 18 previously unidentified SBS and DBS
signatures, respectively.
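
One way to picture the grouping of tissue-specific signatures into reference signatures is sketched below. This summary does not specify the clustering procedure, so everything here is an assumption made for illustration: cosine similarity as the comparison measure, single-linkage grouping, the 0.9 threshold, the function names, and the toy four-channel signatures (real SBS signatures have many more channels).

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two signature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def cluster_signatures(signatures, threshold=0.9):
    """Group signatures whose similarity reaches `threshold`
    (single linkage, implemented with a small union-find)."""
    names = list(signatures)
    parent = {name: name for name in names}

    def find(name):
        # Follow parent links to the cluster representative.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if cosine_similarity(signatures[a], signatures[b]) >= threshold:
                parent[find(a)] = find(b)

    clusters = {}
    for name in names:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical tissue-specific signatures over four toy channels.
tissue_signatures = {
    "breast_A": [0.70, 0.10, 0.10, 0.10],
    "lung_A": [0.65, 0.15, 0.10, 0.10],
    "lung_B": [0.10, 0.10, 0.10, 0.70],
}
print(cluster_signatures(tissue_signatures))
```

In this toy input, `breast_A` and `lung_A` are near-identical profiles from different tissues and end up in one group, while `lung_B` stays on its own.
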
Because we are cognizant of increasing complexity in mutational signatures and want
to enable general users, we developed an algorithm called Signature Fit Multi-Step
(FitMS) that seeks signatures in new samples while taking advantage of our recent
findings. In a first step, FitMS detects common, organ-specific signatures; in a
second step, it determines whether an additional rare signature is also present.
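
The two-step idea can be sketched in a few lines of Python. This is not the published FitMS implementation: the function names, the multiplicative-update fitting routine, the relative error measure, the `min_improvement` threshold, and the toy four-channel signatures are all assumptions chosen to keep the example small.

```python
def reconstruct(signatures, exposures):
    """Mutation counts implied by a set of signatures and their exposures."""
    n_channels = len(signatures[0])
    return [sum(e * s[c] for e, s in zip(exposures, signatures))
            for c in range(n_channels)]


def fit_exposures(catalogue, signatures, iterations=500, eps=1e-12):
    """Non-negative exposures via multiplicative least-squares updates
    (an assumed fitting routine, not necessarily the one used by FitMS)."""
    exposures = [sum(catalogue) / len(signatures)] * len(signatures)
    for _ in range(iterations):
        recon = reconstruct(signatures, exposures)
        exposures = [
            e * sum(s[c] * catalogue[c] for c in range(len(catalogue)))
            / (sum(s[c] * recon[c] for c in range(len(catalogue))) + eps)
            for e, s in zip(exposures, signatures)
        ]
    return exposures


def relative_error(catalogue, signatures, exposures):
    """Sum of absolute residuals divided by the total mutation count."""
    recon = reconstruct(signatures, exposures)
    return sum(abs(a - b) for a, b in zip(catalogue, recon)) / sum(catalogue)


def fit_multi_step(catalogue, common, rare, min_improvement=0.05):
    # Step 1: fit only the common signatures of the organ.
    names = list(common)
    sigs = [common[n] for n in names]
    exposures = fit_exposures(catalogue, sigs)
    base_error = relative_error(catalogue, sigs, exposures)
    best = (base_error, names, exposures)
    # Step 2: try adding each rare signature, one at a time, and keep the
    # best one only if it improves the fit by at least `min_improvement`.
    for rare_name, rare_sig in rare.items():
        trial_sigs = sigs + [rare_sig]
        trial_exp = fit_exposures(catalogue, trial_sigs)
        error = relative_error(catalogue, trial_sigs, trial_exp)
        if base_error - error >= min_improvement and error < best[0]:
            best = (error, names + [rare_name], trial_exp)
    return dict(zip(best[1], best[2])), best[0]


# Hypothetical signatures over four toy channels.
common = {"A": [0.7, 0.1, 0.1, 0.1], "B": [0.1, 0.7, 0.1, 0.1]}
rare = {"R1": [0.05, 0.05, 0.85, 0.05], "R2": [0.05, 0.05, 0.05, 0.85]}
# Sample built as 100*A + 50*B + 80*R1.
sample = [79.0, 49.0, 83.0, 19.0]
fitted, error = fit_multi_step(sample, common, rare)
print(sorted(fitted), round(error, 4))
```

Fitting the common signatures first and admitting at most one extra rare signature keeps the number of candidate signatures per sample small, which is the motivation for a stepwise design in this sketch.
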

CONCLUSION: Mutational signature analysis of 18,640 cancers, the largest cohort of
whole-genome–sequenced samples to date, has required methodological advances,
permitting knowledge expansion. We have identified many previously unreported
signatures and established the concept of common and rare signatures. The FitMS
algorithm has been designed to exploit these advances to aid users in accurately
identifying mutational processes in new samples.

RESEARCH


368 22 APRIL 2022 • VOL 376 ISSUE 6591 science.org SCIENCE


The list of author affiliations is available in the full article online.
*Corresponding author. Email: [email protected]
Cite this article as A. Degasperi et al., Science 376, eabl9283 (2022).
DOI: 10.1126/science.abl9283

READ THE FULL ARTICLE AT
https://doi.org/10.1126/science.abl9283

[Figure. Top: analysis of 12,222 WGS cancers from UK NHS (GEL) across 19 tissue types, with common and rare signatures identified per tissue type; independent analysis of public WGS cancer cohorts (ICGC: 3,001 WGS, 19 tissue types; Hartwig: 3,417 WGS, 18 tissue types). Plots of number of signatures against number of samples for common and rare signatures (GEL, ICGC, Hartwig): more rare signatures were discovered in larger cohorts. Reference signatures relate samples across cohorts and tissue types. Example SBS116 (substitution classes C>A, C>G, C>T, T>A, T>C, T>G): found in 3 samples (GEL: 1 breast, 1 ovary; Hartwig: 1 prostate; ICGC: 0). Investigating new samples using FitMS: Step 1, fit common signatures; Step 2, attempt to find additional rare signatures.]

Discovery and application of common and rare mutational signatures. Analysis of three large whole-genome–sequenced
cancer cohorts revealed that per-organ common signatures are limited in number, whereas numbers
of rare signatures increase with increasing cohort size. Reference signatures permit comparisons across organs and
cohorts. Henceforth, a new algorithm, FitMS, which accounts for common and rare signatures, can be used to
analyze new samples. GEL, Genomics England cohort.
