and FGFR2 in lung cancer (P = 9 × 10⁻⁴ and 1 × 10⁻², respectively),
ARRDC3 in kidney cancer (P = 4 × 10⁻²), PIK3C2B in bladder cancer
(P = 8 × 10⁻³), BCL6 in leukemia (P = 1 × 10⁻²), and XBP1 in breast
cancer (P = 8 × 10⁻⁴) (fig. S44B).
These analyses further support the plausibility of some of the mutation
events in this category, beyond their location in regulatory regions and
their enrichment for canonical cancer genes (Fig. 2C).
Experimental evaluation of regulatory regions
and noncoding mutations around XBP1
Although many events in the regulatory category fell into the promoter
regions of known cancer genes (Figs. 2 and 3A), some events occurred
outside of canonical regulatory regions. For example, XBP1 mutations,
which were present in ~6% of the breast cancer patients in our WGS
cohort, did not primarily target the XBP1 promoter but rather clustered
in a narrow, noncoding region downstream of XBP1 (Fig. 3A and fig. S45A),
a pattern unlikely to occur by random chance (fig. S45B).
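One way to ask whether mutations cluster more tightly than chance would allow is a small Monte Carlo test. The sketch below is illustrative only and is not the analysis behind fig. S45B: the positions, region bounds, and window width are hypothetical, and the uniform background is an assumption that ignores local mutation-rate covariates.

```python
import random

def max_window_count(positions, window):
    """Largest number of mutations falling inside any window of this width."""
    pts = sorted(positions)
    best, left = 0, 0
    for right in range(len(pts)):
        # Advance the left edge until the span fits inside one window.
        while pts[right] - pts[left] >= window:
            left += 1
        best = max(best, right - left + 1)
    return best

def clustering_p_value(positions, region_start, region_end, window,
                       n_sim=2000, seed=0):
    """Empirical P value under uniform random placement (an assumption)."""
    rng = random.Random(seed)
    observed = max_window_count(positions, window)
    hits = 0
    for _ in range(n_sim):
        simulated = [rng.randrange(region_start, region_end)
                     for _ in positions]
        if max_window_count(simulated, window) >= observed:
            hits += 1
    # Add-one correction keeps the estimate away from an impossible P = 0.
    return observed, (hits + 1) / (n_sim + 1)

# Hypothetical positions (bp): eight clustered events plus four scattered ones.
mutations = [50_100, 50_250, 50_300, 50_420, 50_610, 50_700, 50_880, 50_950,
             3_000, 21_500, 77_000, 95_400]
observed, p = clustering_p_value(mutations, 0, 100_000, window=1_000)
print(observed, p)
```

A realistic background model would replace the uniform draw with one that reflects covariates such as replication timing or sequence context, which is why this version should be read as a lower bound on rigor.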
Previous studies have connected XBP1 to breast cancer (38, 39) and
estrogen receptor signaling (40, 41). Concordantly, Gene Set Enrichment
Analysis showed estrogen receptor–dependent signaling to be the most
differentially expressed pathway (FDR = 7 × 10⁻⁴) between breast cancer
samples with high versus low XBP1 expression (Fig. 5 and fig. S46).
Furthermore, XBP1 was only expressed in prediction analysis of
microarray 50 [PAM50 (42)] expression types related to hormone receptor
signaling (luminal A/B, HER2-enriched types) but not in other breast
tumors (basal-like type) (fig. S47). In addition, the average ATAC-seq
signal around XBP1 was 1.83-fold higher in receptor-positive versus
receptor-negative breast tumors (P < 0.001, basal-like versus
non–basal-like PAM50 subtype, Mann-Whitney U test) (fig. S46, D and E),
suggesting that regulatory regions around XBP1 exhibited primary
activity in the hormone receptor–related subtype. We confirmed somatic
mutations around XBP1 using Sanger sequencing in breast tumors from our
WGS cohort (fig. S48).
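The two-group comparison above (a fold change in mean signal plus a Mann-Whitney U test) can be sketched in a few lines. This is a minimal illustration using only the standard library and the normal approximation without continuity correction; the signal values are hypothetical and are not data from the study, and the function names are introduced here for illustration.

```python
import math
from statistics import mean

def rank_with_ties(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected)."""
    n1, n2 = len(x), len(y)
    pooled = list(x) + list(y)
    ranks = rank_with_ties(pooled)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    u = min(u1, n1 * n2 - u1)
    n = n1 + n2
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values())
    sigma = math.sqrt(n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1))))
    z = (u - n1 * n2 / 2) / sigma
    # Two-sided tail probability of the standard normal distribution.
    return u, math.erfc(abs(z) / math.sqrt(2))

# Hypothetical ATAC-seq signals (arbitrary units) for two tumor groups.
receptor_pos = [2.9, 3.4, 3.1, 3.8, 2.7, 3.6, 3.3, 3.0]
receptor_neg = [1.6, 1.9, 1.4, 2.1, 1.8, 1.7, 2.0, 1.5]

fold_change = mean(receptor_pos) / mean(receptor_neg)
u, p = mann_whitney_u(receptor_pos, receptor_neg)
print(f"fold change = {fold_change:.2f}, U = {u}, P = {p:.2g}")
```

For small samples an exact test is preferable to the normal approximation; library implementations such as `scipy.stats.mannwhitneyu` offer both.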
We used two experimental assays to further assess mutations near XBP1
and to provide proof-of-principle support for the possible

Dietlein et al., Science 376, eabg5601 (2022) 8 April 2022 8 of 12

[Figure 5, panels A to J. Panel A workflow: 1) 2,923 CRISPR sgRNAs; 2) transfection of 10⁷ cells; 3) expression in flow cytometry; 4) sort cells in equisized bins (low, middle, high); 5) score sgRNA effects; 6) map effects to target regions. Axes across panels: position on Chr 22 (29.16 to 29.24); % of sgRNAs with effect on XBP1 expression (local fraction of effective sgRNAs); effect score replicate 1 versus replicate 2; ATAC-seq signal; x-fold XBP1 expression per sgRNA; mutation density and ATAC-seq signal (% of max) with 3D promoter interactions (CCDC117, XBP1); log XBP1 expression in mut versus wt tumors (PCAWG, CCLE); max. GSEA score versus -log FDR; gene rank (×1000) versus GSEA score for the early and late estrogen response hallmarks.]

Fig. 5. Noncoding somatic mutations occur in regulatory regions around XBP1.
(A) CRISPRi screening of regions around XBP1 using a library of 2923 sgRNAs
in breast cancer cells (CAMA1). Regulatory regions were localized based on
sgRNAs for which KRAB-mediated silencing of the target region led to
decreased XBP1 expression in flow cytometry (orange). (B) Fractions of effective
sgRNAs (y-axis) plotted against their position around XBP1 (x-axis). Positions
of ATAC-seq peaks (teal, bottom), noncoding mutations (purple, bottom),
and target regions of the sgRNAs (top) are annotated. (C and D) Efficacies of
sgRNAs (sliding window of 10 adjacent sgRNAs) compared between experimental
replicates [x-axis versus y-axis (C)] and with the ATAC-seq signal of their target
regions in breast cancer [y-axis (D)]. (E) Bar graphs displaying the XBP1
expression ratio before and after CRISPRi in regulatory regions (orange) and
nonregulatory regions (gray) for individual sgRNAs. Error bars reflect the
SD across cells. (F) Mutation densities (purple), ATAC-seq signals (teal), and
three-dimensional interactions in the breast cancer genome of MCF7 (ChIA-PET, black)
plotted against their genomic position around XBP1 (x-axis). (G) XBP1 expression
compared between breast tumors with [purple, mutated (mut)] and without [gray,
wild-type (wt)] mutations around XBP1 in PCAWG (left) and CCLE (right). Boxes
indicate the interquartile range (25th to 75th percentiles), vertical lines extend
to the 10th and 90th percentiles, and horizontal lines reflect the medians of
XBP1 expression. Significant differences (Mann-Whitney U test) are annotated with
asterisks: *P < 0.05, **P < 0.01, ***P < 0.001. (H) Gene Set Enrichment Analysis
of expression differences between tumors with high versus low XBP1 expression,
computing an enrichment score (x-axis) and a significance value (y-axis) for each
hallmark signature. (I and J) Gene ranks (x-axis) are plotted against enrichment
scores (y-axis) for early (I) and late (J) estrogen response signatures (black).

RESEARCH | RESEARCH ARTICLE
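
The caption for Fig. 5 refers to a local fraction of effective sgRNAs and to a sliding window of 10 adjacent sgRNAs. A minimal sketch of that kind of smoothing is given here, assuming each sgRNA has already been labeled effective or not; the labeling rule, the positions, and the function name are hypothetical and are not taken from the study.

```python
def local_effective_fraction(sgrnas, window=10):
    """Slide over position-sorted sgRNAs and report, for each window of
    adjacent guides, the mean position and the fraction labeled effective.

    sgrnas: iterable of (position, is_effective) pairs.
    """
    ordered = sorted(sgrnas)
    profile = []
    for start in range(len(ordered) - window + 1):
        chunk = ordered[start:start + window]
        center = sum(pos for pos, _ in chunk) / window
        fraction = sum(1 for _, effective in chunk if effective) / window
        profile.append((center, fraction))
    return profile

# Hypothetical library: 30 guides spaced 100 bp apart, with an assumed
# regulatory element covering positions 1000 to 1999.
guides = [(pos, 1000 <= pos < 2000) for pos in range(0, 3000, 100)]
profile = local_effective_fraction(guides, window=10)

# Report the window with the highest local fraction of effective guides.
peak_center, peak_fraction = max(profile, key=lambda item: item[1])
print(peak_center, peak_fraction)
```

Windows are defined by guide adjacency rather than by a fixed genomic width, so unevenly spaced libraries yield windows of varying span.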