Degasperi et al., Science 376, eabl9283 (2022) 22 April 2022 2 of 15
[Figure 1, panels A to G; graphic not reproduced. Recoverable labels:
(A) Total WGS samples: GEL 12,222; ICGC 3,001; Hartwig 3,417.
(B) Samples, SNVs, and DNVs per organ (log scale): Skin, Lung, Esophagus, Stomach, Colorectal, Bladder, Liver, Head_neck, Oral_Oropharyngeal, Uterus, Ovary, Biliary, Kidney, Pancreas, Breast, Prostate, Bone_SoftTissue, CNS, Lymphoid, NET, Myeloid.
(C) Workflow, for each organ in each cohort: clustering on mutational profiles; exclusion of atypical profiles; extraction of common signatures from samples with common profiles; identification of samples that contain additional rare signatures. Total SBS signatures across organs (757): GEL 135 common, 180 rare; ICGC 135 common, 58 rare; Hartwig 135 common, 114 rare.
(D) Number of common and rare SBS signatures in each organ, by cohort.
(E) Number of common SBS signatures versus samples in each organ (Pearson corr. 0.14).
(F) Number of rare SBS signatures versus samples in each organ (Pearson corr. 0.76).
(G) Organ-specific signatures of all three cohorts (757); distinct patterns (187): recurrent (88), mixed (29), singleton (70); recurrent reference signatures (74); variants of recurrent reference signatures (24); known signatures from other studies (8); additional previously unreported (38); reference signatures (120 SBS signatures): 82 QC green, 38 QC amber and red.]

Fig. 1. Whole-genome-sequenced cancers across three independent cohorts:
GEL, ICGC, and HMF. (A) WGS cases included in analyses. (B) Number of samples
and mutational burden of somatic single-nucleotide variants (SNVs) and
double-nucleotide variants (DNVs) across 21 whole-genome-sequenced tumor types from
the GEL, ICGC, and HMF cohorts. Not all tumor types (e.g., esophagus, head and
neck, oropharyngeal) are represented in all three cohorts. (C) Schematic
representation of the workflow of mutational signature analysis. Three cohorts (GEL,
ICGC, and HMF) were evaluated independently. For each organ in each cohort,
mutational catalogs were clustered and samples with atypical catalogs were
excluded from the extraction process. Samples with similar catalogs were subjected
to signature extraction to obtain a set of common organ-specific signatures. These
common signatures were fitted into all samples, highlighting samples with a high
error profile that were subsequently used to identify rare signatures. The pie
chart shows the total number of SBS signatures identified for each independent
extraction of each organ in all three cohorts. (D) Number of common and rare SBS
signatures in each cohort. (E) Common SBS signatures as a function of number
of samples analyzed. (F) Rare SBS signatures as a function of number of samples
analyzed. (G) Procedure to determine the reference signatures from all identified
cohort-organ signatures. Numbers refer to the SBS signature analysis. For details,
see materials and methods.
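
The fitting step described in (C), where common signatures are fitted to all samples and poorly explained samples are set aside for rare-signature discovery, can be illustrated with a minimal sketch. The caption does not specify the fitting algorithm or the error criterion, so everything here is an assumption chosen for illustration: the non-negative least-squares fit by multiplicative updates, the relative absolute error measure, the hypothetical threshold of 0.2, and the toy 6-channel signatures (real SBS catalogs have 96 channels).

```python
import numpy as np

def fit_exposures(catalog, signatures, n_iter=2000, eps=1e-12):
    """Fit non-negative exposures h so that signatures @ h approximates catalog.

    Uses multiplicative updates for non-negative least squares
    (an illustrative choice, not necessarily the method used in the paper).
    catalog: (channels,) mutation counts for one sample.
    signatures: (channels, k) matrix whose columns each sum to 1.
    """
    k = signatures.shape[1]
    h = np.full(k, catalog.sum() / k, dtype=float)
    wt_v = signatures.T @ catalog
    wt_w = signatures.T @ signatures
    for _ in range(n_iter):
        h *= wt_v / (wt_w @ h + eps)
    return h

def relative_error(catalog, signatures, exposures):
    """Fraction of mutations not explained by the fitted signatures."""
    residual = catalog - signatures @ exposures
    return np.abs(residual).sum() / catalog.sum()

def flag_high_error(catalogs, signatures, threshold=0.2):
    """Return a boolean array marking samples (columns) with high fit error.

    threshold is a hypothetical cutoff used only for this example.
    """
    flags = []
    for v in catalogs.T:
        h = fit_exposures(v, signatures)
        flags.append(relative_error(v, signatures, h) > threshold)
    return np.array(flags)

# Toy data: two "common" signatures over 6 mutation channels.
sig_a = np.array([0.5, 0.3, 0.2, 0.0, 0.0, 0.0])
sig_b = np.array([0.0, 0.0, 0.2, 0.3, 0.5, 0.0])
rare = np.array([0.0, 0.0, 0.0, 0.0, 0.0, 1.0])  # absent from the common set
common = np.column_stack([sig_a, sig_b])

catalogs = np.column_stack([
    100 * sig_a + 50 * sig_b,             # explained by common signatures
    20 * sig_a + 80 * sig_b,              # explained by common signatures
    40 * sig_a + 40 * sig_b + 60 * rare,  # carries an additional rare signature
])

flags = flag_high_error(catalogs, common)
print(flags)
```

In this toy setup only the third sample is flagged, because 60 of its 140 mutations fall in a channel that neither common signature can explain. Flagged samples would then be candidates for a second extraction aimed at rare signatures.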