and FGFR2 in lung cancer (P = 9 × 10⁻⁴ and 1 ×
10⁻², respectively), ARRDC3 in kidney cancer
(P = 4 × 10⁻²), PIK3C2B in bladder cancer (P =
8 × 10⁻³), BCL6 in leukemia (P = 1 × 10⁻²), and
XBP1 in breast cancer (P = 8 × 10⁻⁴) (fig. S44B).
These analyses provide additional support for
the plausibility of some of the mutation events
in this category, in addition to their location in
regulatory regions and enrichment for canonical
cancer genes (Fig. 2C).
Experimental evaluation of regulatory regions
and noncoding mutations around XBP1
Although many events in the regulatory category
fell into the promoter regions of known
cancer genes (Figs. 2 and 3A), some events
occurred outside of canonical regulatory regions.
For example, XBP1 mutations, which
were present in ~6% of the breast cancer
patients in our WGS cohort, did not primarily
target the XBP1 promoter but rather clustered
in a narrow, noncoding region downstream of
XBP1 (Fig. 3A and fig. S45A), a pattern unlikely
to occur by random chance (fig. S45B).
Previous studies have connected XBP1 to
breast cancer (38, 39) and estrogen receptor
signaling (40, 41). Concordantly, Gene Set Enrichment
Analysis showed estrogen receptor–dependent
signaling to be the most differentially
expressed pathway (FDR = 7 × 10⁻⁴) between
breast cancer samples with high versus low
XBP1 expression (Fig. 5 and fig. S46). Furthermore,
XBP1 was only expressed in prediction
analysis of microarray 50 [PAM50 (42)] expression
types related to hormone receptor signaling
(luminal A/B, HER2-enriched types) but
not in other breast tumors (basal-like type)
(fig. S47). In addition, the average ATAC-seq
signal around XBP1 was 1.83-fold higher in
receptor-positive versus receptor-negative breast
tumors (P < 0.001, basal-like versus non–basal-like
PAM50 subtype, Mann-Whitney U test)
(fig. S46, D and E), suggesting that regulatory
regions around XBP1 exhibited primary activity
in the hormone receptor–related subtype.
We confirmed somatic mutations around XBP1
using Sanger sequencing in breast tumors
from our WGS cohort (fig. S48).
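To illustrate the kind of two-group comparison reported above (a fold change of mean signal together with a Mann-Whitney U test), the following minimal Python sketch may be helpful. It is not the pipeline used in this study: the function names `rank_with_ties` and `mann_whitney_u` are hypothetical, the input values are invented toy numbers rather than ATAC-seq data, and the P value uses the plain normal approximation without the tie or continuity corrections that statistical packages usually apply.

```python
import math
from statistics import mean


def rank_with_ties(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Return (U statistic for x, two-sided P) via the normal approximation.

    Assumption: no tie or continuity correction is applied, so the P value
    is only a rough guide for small or heavily tied samples.
    """
    n1, n2 = len(x), len(y)
    ranks = rank_with_ties(list(x) + list(y))
    rank_sum_x = sum(ranks[:n1])
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u_x - mu) / sigma
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return u_x, p_two_sided


if __name__ == "__main__":
    # Toy signal values (hypothetical, for illustration only).
    receptor_positive = [5.1, 6.4, 7.2, 8.0, 6.9]
    receptor_negative = [2.9, 3.5, 4.1, 3.8, 4.4]
    fold_change = mean(receptor_positive) / mean(receptor_negative)
    u, p = mann_whitney_u(receptor_positive, receptor_negative)
    print(f"fold change = {fold_change:.2f}, U = {u}, P = {p:.4f}")
```

Because the test operates on ranks, it makes no assumption about the shape of the signal distribution, which is one reason such a test suits skewed genomic signals; for real analyses an established implementation with exact or corrected P values would be preferable.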
We used two experimental assays to further
assess mutations near XBP1 and to provide
proof-of-principle support for the possible
Dietlein et al., Science 376, eabg5601 (2022) 8 April 2022 8 of 12
[Figure 5, panels A to J; image not reproduced. Recoverable panel labels:
(A) screen workflow: 1) 2,923 CRISPR sgRNAs; 2) transfection of 10⁷ cells; 3) expression in flow cytometry; 4) sort cells in equisized bins (low, middle, high); 5) score sgRNA effects; 6) map effects to target regions.
(B) local fraction of effective sgRNAs: % of sgRNAs with effect on XBP1 expression versus position on Chr 22 (29.16 to 29.24), with mutated regions, ATAC-seq peaks, and the region targeted by sgRNAs.
(C) effect score replicate 1 versus effect score replicate 2 (effective sgRNA, no effect).
(D) effect score replicate 1 versus ATAC-seq signal (effective sgRNA, no effect).
(E) x-fold XBP1 expression for individual sgRNAs (#429F to #2481R).
(F) ATAC-seq in XBP1-pos breast cancer (% of max), mutation density in breast cancer (% of max), and 3D promoter interactions versus position on Chr 22 (29.16 to 29.24); genes CCDC117 and XBP1.
(G) XBP1 levels, mut versus wt tumors: log expression in PCAWG (***) and CCLE (**).
(H) GSEA, XBP1-pos versus neg tumors: max. GSEA score versus -log FDR; hallmark estrogen response (early, late) highlighted.
(I and J) GSEA score versus gene rank (x1000) for early and late estrogen response.]
Fig. 5. Noncoding somatic mutations occur in regulatory regions around XBP1.
(A) CRISPRi screening of regions around XBP1 using a library of 2923 sgRNAs
in breast cancer cells (CAMA1). Regulatory regions were localized based on
sgRNAs for which KRAB-mediated silencing of their target region led to
decreased XBP1 expression in flow cytometry (orange). (B) Fractions of effective
sgRNAs (y-axis) plotted against their position around XBP1 (x-axis). Positions
of ATAC-seq peaks (teal, bottom), noncoding mutations (purple, bottom),
and target regions of the sgRNAs (top) are annotated. (C and D) Efficacies of
sgRNAs (sliding window of 10 adjacent sgRNAs) compared between experimental
replicates [x-axis versus y-axis (C)] and the ATAC-seq signal of their target
regions in breast cancer [y-axis (D)]. (E) Bar graphs displaying the XBP1
expression ratio before and after CRISPRi in regulatory regions (orange) and
nonregulatory regions (gray) for individual sgRNAs. Error bars reflect the
SD across cells. (F) Mutation densities (purple), ATAC-seq signals (teal), and
three-dimensional interactions in the breast cancer genome of MCF7 (ChIA-PET, black)
plotted against their genomic position around XBP1 (x-axis). (G) XBP1 expression
compared between breast tumors with [purple, mutated (mut)] and without [gray,
wild-type (wt)] mutations around XBP1 in PCAWG (left) and CCLE (right). Boxes
indicate the 25/75% interquartile range, vertical lines extend to 10/90% percentiles,
and horizontal lines reflect distribution medians of XBP1 expression. Significant
differences (Mann-Whitney U test) are annotated with asterisks: *P < 0.05, **P < 0.01,
***P < 0.001. (H) Gene Set Enrichment Analysis analyzing expression differences
in tumors with high versus low XBP1 expression by computing an enrichment score
(x-axis) and a significance value (y-axis) for each hallmark signature. (I and J) Gene
ranks (x-axis) are plotted against enrichment scores (y-axis) for early (I) and late
(J) estrogen response signatures (black).
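The sliding-window summary mentioned for panels B to D (a window of 10 adjacent sgRNAs) can be sketched as follows. This is only an illustration under stated assumptions: the function name `sliding_effective_fraction` is hypothetical, each sgRNA is reduced to a genomic position plus a precomputed true/false "effective" flag (how that flag is derived from the flow cytometry bins is not specified in this caption), and the window is placed at the mean position of its sgRNAs, which may differ from the authors' choice.

```python
from typing import List, Tuple


def sliding_effective_fraction(
    sgrnas: List[Tuple[int, bool]], window: int = 10
) -> List[Tuple[float, float]]:
    """Summarize sgRNA effects over runs of `window` adjacent sgRNAs.

    `sgrnas` holds (genomic position, effective flag) pairs. The result
    holds one (mean position, fraction effective) pair per window.
    """
    ordered = sorted(sgrnas, key=lambda s: s[0])  # adjacency = genomic order
    summary = []
    for start in range(len(ordered) - window + 1):
        chunk = ordered[start:start + window]
        center = sum(pos for pos, _ in chunk) / window
        fraction = sum(1 for _, effective in chunk if effective) / window
        summary.append((center, fraction))
    return summary


if __name__ == "__main__":
    # Hypothetical toy library: six sgRNAs, the first two flagged effective.
    toy = [(0, True), (100, True), (200, False),
           (300, False), (400, False), (500, False)]
    for center, fraction in sliding_effective_fraction(toy, window=4):
        print(center, fraction)
```

Averaging over adjacent guides smooths out the variable efficiency of individual sgRNAs, so a region is called regulatory only when several neighboring guides agree.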