ARTICLE RESEARCH
[Figure, panels a–e. Panels a, c, d and e are t-SNE projections (axes: t-SNE1, t-SNE2); # cells = 18,722.
Panel a: cells coloured by patient ID (ET01, ET02, ET03, ET04, ET05).
Panel b: heat map of relative expression (low to high) across clusters HSPC1, HSPC2, HSPC3, IMP1, IMP2, IMP-cc, NP1, NP2, M/D1, M/D2, PreB1, PreB2, E/B/M, MkP, EP1, EP2, EP-cc, MEP and MEP-cc; highlighted genes: CD79A, DNTT, IRF7, ELANE, AZU1, MPO, SPINK2, CD52, HOPX, AVP, CA1, PLEK, DAD1, ITGA2B, HDC, CLC, TK1, BLVRB, PDLIM1, HBB, FCER1A, TYMS, LYZ, IRF8.
Panel c: relative expression (low to high) of AVP, HOPX, CLC, IRF8, AZU1, ITGA2B, CA1, BLVRB, PLEK and DNTT.
Panel d: progenitor subsets HSPC, IMP, NP, MEP, EP, E/B/M, M/D, PreB and MkP.
Panel e: genotype WT (n = 9,338), MUT (n = 7,276), NA (n = 2,108).]

Extended Data Fig. 4 | Integration of samples from patients with
essential thrombocythaemia and assignment of progenitor subsets.
a, t-SNE projection of CD34+ progenitor cells from samples ET01–ET05,
after integration and batch correction using the Seurat package (Methods).
b, Heat map of top ten differentially expressed genes for clusters; lineage-
specific genes from a previous publication^26 are highlighted (Methods).
c, Representative lineage-specific genes projected onto the t-SNE
representation of CD34+ cells from samples from patients with essential
thrombocythaemia. d, t-SNE projection of CD34+ cells from samples
ET01–ET05 after applying a deep generative modelling approach for
the single-cell analysis using the scVI package (Methods)^19, showing
assignments of progenitor subsets as determined after clustering the cells
using the Seurat package. e, Genotyping data from GoT are projected onto
the t-SNE representation generated after the scVI analysis of progenitor
cells from samples ET01–ET05. Cells without any GoT data are labelled
NA (not assignable).
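
The labelling rule described for panel e (cells with no GoT data are marked NA) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function name `label_genotypes`, the cell barcodes and the toy genotype calls are all hypothetical and serve only to show the rule.

```python
from collections import Counter


def label_genotypes(cell_ids, got_calls):
    """Map every cell to 'WT', 'MUT' or 'NA'.

    cell_ids:  iterable of cell barcodes present in the embedding.
    got_calls: dict of barcode -> 'WT' or 'MUT' for cells with GoT data
               (hypothetical input format, assumed for illustration).
    Cells that have no entry in got_calls are labelled 'NA'.
    """
    return {cell: got_calls.get(cell, "NA") for cell in cell_ids}


# Toy data: hypothetical barcodes, not taken from the study.
cells = ["cell_1", "cell_2", "cell_3", "cell_4"]
got_calls = {"cell_1": "WT", "cell_2": "MUT", "cell_4": "MUT"}

labels = label_genotypes(cells, got_calls)
counts = Counter(labels.values())  # number of cells per genotype label
print(counts)
```

Because every cell receives exactly one label, the per-label counts sum to the number of cells in the embedding, as they do in the figure (9,338 + 7,276 + 2,108 = 18,722).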