was associated with 13.9 mg/dl lower LDL-C
(P = 4.1 × 10^-19; all single-variant regressions
used an additive model, and P values were
calculated with a t test unless otherwise
specified). This variant has a minor allele
frequency (MAF) of 6% in the OOA population
but is ultrarare across other human populations:
Only eight copies were identified
in 140,000 whole-genome sequences (WGS)
of non-Amish participants in the National
Heart, Lung, and Blood Institute (NHLBI)
Trans-Omics for Precision Medicine (TOPMed)
program (22).
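As an illustration of the additive model mentioned above, the following minimal sketch regresses a simulated LDL-C value on minor-allele dosage (0, 1, or 2 copies) and reports the per-allele effect with its t statistic. Everything in it is an assumption for illustration: the data are simulated, the baseline of 130 mg/dl and the residual standard deviation of 35 mg/dl are hypothetical, and the sketch ignores covariates and relatedness among subjects, which a real analysis would need to handle. The P value uses a normal approximation to the t distribution, which is reasonable only for large samples.

```python
import math
import random


def additive_regression(dosages, phenotypes):
    """Fit phenotype = a + b * dosage by ordinary least squares.

    Returns (slope b, standard error of b, t statistic, degrees of freedom).
    """
    n = len(dosages)
    mean_x = sum(dosages) / n
    mean_y = sum(phenotypes) / n
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosages, phenotypes))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((y - intercept - slope * x) ** 2 for x, y in zip(dosages, phenotypes))
    df = n - 2
    se = math.sqrt(rss / df / sxx)
    return slope, se, slope / se, df


def simulate(n, maf, effect, sd, seed):
    """Simulate dosages (two independent alleles per subject) and LDL-C values."""
    rng = random.Random(seed)
    dosages = [(rng.random() < maf) + (rng.random() < maf) for _ in range(n)]
    # 130.0 mg/dl is a hypothetical baseline, not a value from the study.
    phenotypes = [130.0 + effect * g + rng.gauss(0.0, sd) for g in dosages]
    return dosages, phenotypes


# Sample size, MAF, and effect size echo the text; sd and seed are assumptions.
dosages, ldl = simulate(n=6890, maf=0.06, effect=-13.9, sd=35.0, seed=1)
slope, se, t_stat, df = additive_regression(dosages, ldl)

# Two-sided P value from a normal approximation to the t distribution.
p_approx = math.erfc(abs(t_stat) / math.sqrt(2.0))

print(f"per-allele effect = {slope:.1f} mg/dl (SE {se:.2f}), t = {t_stat:.1f}, df = {df}")
print(f"approximate P = {p_approx:.2e}")
```

The slope is the estimated change in LDL-C per copy of the minor allele, which is how an effect such as 13.9 mg/dl lower LDL-C is expressed under an additive model.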
To more exhaustively investigate coding and
noncoding variants in this region, we performed
association analysis using 1083 OOA
subjects with WGS (21) as part of TOPMed.
Despite the smaller sample size, WGS analysis
identified the same variant (rs551564683) as
the top association with LDL-C in this region,
with a P value of 3.1 × 10^-6 and an effect size
of -16.9 mg/dl (Fig. 1B). In addition, WGS
analysis revealed 20 variants in the region
(Fig. 1A and fig. S2) in linkage disequilibrium
with rs551564683 (r^2 = 0.84 to 0.95), with
LDL-C association P values ranging from
6.3 × 10^-6 to 2.3 × 10^-5 (table S3). Of these
21 variants that comprise a 4-Mb OOA-specific
haplotype (fig. S2), rs551564683 was the only
protein-coding variant, and is classified as
damaging or deleterious by the in silico protein
function prediction algorithms SIFT
[deleterious (23)], Polyphen2 [possibly damaging
(24)], LRT [deleterious (25)], Mutation
Taster [disease-causing (26)], and PROVEAN
[deleterious (27)].
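The linkage disequilibrium values quoted above (r^2 = 0.84 to 0.95) measure how closely two variants track each other. A minimal sketch of one common way to estimate r^2 from unphased data is the squared Pearson correlation of genotype dosages. This is an approximation to the haplotype-based r^2, the genotype vectors below are made up for illustration, and nothing here is taken from the study's own pipeline.

```python
def ld_r2(dosages_a, dosages_b):
    """Squared Pearson correlation between two vectors of allele dosages (0, 1, 2)."""
    n = len(dosages_a)
    mean_a = sum(dosages_a) / n
    mean_b = sum(dosages_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(dosages_a, dosages_b))
    var_a = sum((a - mean_a) ** 2 for a in dosages_a)
    var_b = sum((b - mean_b) ** 2 for b in dosages_b)
    return cov * cov / (var_a * var_b)


# Hypothetical dosages for ten subjects at two nearby variants.
# The variants agree in every subject except the seventh.
variant_a = [0, 0, 0, 0, 1, 1, 1, 0, 0, 2]
variant_b = [0, 0, 0, 0, 1, 1, 0, 0, 0, 2]

r2 = ld_r2(variant_a, variant_b)
print(f"r^2 = {r2:.3f}")
```

An r^2 close to 1 means that the two variants carry nearly the same association signal, which is why sample size becomes the limiting factor in telling them apart.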
Because of its limited sample size (n = 1083),
WGS was not able to differentiate the top
missense variant (P = 3.1 × 10^-6) from the 20
other highly correlated variants (P = 6.3 × 10^-6
to 2.3 × 10^-5). To further differentiate among
these 21 highly linked variants, we imputed
genotypes in 5890 OOA subjects with genotype
chip data to the TOPMed WGS reference
panel (21). rs551564683 was the top associated
variant, with a P value of 3.6 × 10^-15 (Fig. 1C
and table S2), two or more orders of magnitude
smaller than any of the other variants
(P values of 9.4 × 10^-13 to 1.6 × 10^-9). Independent
direct genotyping for seven of these variants
gave similar results (fig. S3).
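The "two or more orders of magnitude" comparison can be checked directly from the P values reported in this paragraph. The short sketch below takes the base-10 logarithm of the ratio between each end of the range for the other variants and the P value of rs551564683. The variable names are illustrative only.

```python
import math

# P values quoted in the text for the imputed data set (n = 5890).
p_top = 3.6e-15                   # rs551564683
p_others_range = (9.4e-13, 1.6e-9)  # smallest and largest P among the other variants

# Orders of magnitude separating each end of the range from the top variant.
gaps = [math.log10(p / p_top) for p in p_others_range]

for p, gap in zip(p_others_range, gaps):
    print(f"P = {p:.1e} is {gap:.1f} orders of magnitude above the top variant")
```

The gap is about 2.4 orders of magnitude for the closest competing variant and about 5.6 for the most distant one.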
Conditional analysis adjusting for rs551564683
completely abolished the association of the
other 20 variants, whereas conditional analyses
adjusting for any of the other 20 variants
reduced the association of rs551564683 because
of the strong correlation (r^2 = 0.84 to
0.95) but did not abolish it (P = 1.0 × 10^-3 to
3.0 × 10^-7). Thus, rs551564683 is the most likely
causal variant in this region.
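The logic of the conditional analysis can be sketched on simulated data. In the example below, a hypothetical causal variant drives the phenotype and a hypothetical proxy variant is in strong linkage disequilibrium with it (haplotype r^2 of roughly 0.9 under the chosen settings). Each variant is then tested while adjusting for the other, using residualization, which gives the same coefficient as a joint regression on both variants. All parameters (allele frequency, effect size, noise, sample size, seed) are assumptions chosen for illustration, and the sketch ignores covariates and relatedness.

```python
import math
import random


def residualize(values, covariate):
    """Return residuals of values after a simple linear fit on covariate."""
    n = len(values)
    mean_v = sum(values) / n
    mean_c = sum(covariate) / n
    scc = sum((c - mean_c) ** 2 for c in covariate)
    svc = sum((c - mean_c) * (v - mean_v) for c, v in zip(covariate, values))
    slope = svc / scc
    return [(v - mean_v) - slope * (c - mean_c) for v, c in zip(values, covariate)]


def conditional_t(phenotype, test_variant, conditioning_variant):
    """t statistic for test_variant in phenotype ~ test_variant + conditioning_variant."""
    ry = residualize(phenotype, conditioning_variant)
    rx = residualize(test_variant, conditioning_variant)
    sxx = sum(x * x for x in rx)
    beta = sum(x * y for x, y in zip(rx, ry)) / sxx
    rss = sum((y - beta * x) ** 2 for x, y in zip(rx, ry))
    df = len(phenotype) - 3  # intercept plus two genotype terms
    se = math.sqrt(rss / df / sxx)
    return beta / se


def simulate(n, seed):
    """Simulate dosages for a causal variant and a correlated proxy, plus a phenotype."""
    rng = random.Random(seed)
    causal, proxy, ldl = [], [], []
    for _ in range(n):
        g = p = 0
        for _hap in range(2):  # two haplotypes per subject
            c = rng.random() < 0.06
            # The proxy allele usually travels with the causal allele.
            q = rng.random() < (0.95 if c else 0.003)
            g += c
            p += q
        causal.append(g)
        proxy.append(p)
        # Only the causal variant affects the simulated phenotype.
        ldl.append(130.0 - 14.0 * g + rng.gauss(0.0, 35.0))
    return causal, proxy, ldl


causal, proxy, ldl = simulate(n=20000, seed=7)
t_causal_given_proxy = conditional_t(ldl, causal, proxy)
t_proxy_given_causal = conditional_t(ldl, proxy, causal)

print(f"causal variant adjusted for proxy: t = {t_causal_given_proxy:.2f}")
print(f"proxy variant adjusted for causal: t = {t_proxy_given_causal:.2f}")
```

In this setting the causal variant is expected to keep a weakened but clear signal after adjustment for the proxy, whereas the proxy's signal should fall to noise level after adjustment for the causal variant. This mirrors the asymmetric pattern described above.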
rs551564683 has a strong association with
total cholesterol and non–high-density lipoprotein
cholesterol (HDL-C) levels, compared with a
moderate association with HDL-C and the total
cholesterol:HDL-C ratio and a lack of association
with triglycerides, indicating that LDL-C
is the main driver of this association (table
S4A). Similar results were found under various
models, indicating that this robust association
is not driven by outliers or body weight
(table S4B).
B4GALT1 is a member of the beta-1,4-galactosyltransferase
gene family that encodes
type II membrane-bound glycoproteins.
B4GALT1 is ubiquitously expressed and plays
a critical role in the processing of N-linked
oligosaccharide moieties in glycoproteins,
transferring the galactose from uridine diphosphate
galactose (UDP-Gal) to specific
glycoprotein substrates (28). Thus, impairment
of B4GALT1 activity has the potential
to alter the structure of N-linked oligosaccharides
and introduce aberrations in the
glycan structure that may alter glycoprotein
function.
A rare homozygous frameshift insertion in
B4GALT1 that predicts a nonfunctional truncated
protein was reported to cause congenital
disorder of glycosylation type 2 (CDGII) (28–31).
Three of the six reported CDGII patients exhibited,
among other traits, abnormal coagulation
and very high levels of aspartate
transaminase (AST). Interestingly, rs551564683
was associated with lower levels of fibrinogen
1222 3 DECEMBER 2021 • VOL 374 ISSUE 6572 science.org SCIENCE
Fig. 1. Single-variant association analyses identify B4GALT1 p.Asn352Ser as a new LDL-C–lowering
variant. (A) WES results (n = 6890). (B) WGS results (n = 1083). (C) Imputed data from genotyping chip
results (n = 5890). The blue line marks the suggestive threshold (P = 5.0 × 10^-6) and the red line the
significance threshold (P = 5.0 × 10^-8). All P values are based on a t test using the additive genetic model.
RESEARCH | RESEARCH ARTICLES