2.2.9 Solvent Accessible Surface Area
The solvent accessible surface area was predicted using RVP-Net
[40, 41], which gives the proportion of solvent accessible area for
each amino acid in a protein sequence. In PGlcS, the RVP-Net
outputs were normalized to a range of approximately 0–1 for every
amino acid close to the O-GlcNAcylation sites [10].
2.2.10 First- and Second-Order Composition Moment Vector (CMV)
Two encoding methods were used to obtain reduced amino acid
sequences according to the physicochemical properties of the
amino acids. Based on their acid–base properties, the 20 amino
acid residues were classified into three groups: acidic (AAaci), {D,E};
basic (AAbas), {K,H,R}; and neutral (AAneu), {A,C,F,G,I,L,M,N,P,
Q,S,T,V,W,Y} amino acids. Based on their hydrophobicity, the
20 amino acid residues were classified into three other groups:
internal (AAint), {F,I,L,M,V}; external (AAext), {D,E,H,K,N,Q,
R}; and ambivalent (AAamb), {A,C,G,P,S,T,W,Y} amino acids.
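The two reduced alphabets can be illustrated with a minimal Python sketch. The residue groupings are taken from the text above; the one-letter group labels ("a", "b", "n", "i", "e", "m"), the function names, and the toy peptide are hypothetical and are not part of the original method.

```python
# Residue groupings as listed in the text. The one-letter group labels are
# hypothetical and chosen only for this illustration.
ACID_BASE_GROUPS = {
    "a": "DE",               # acidic (AAaci)
    "b": "KHR",              # basic (AAbas)
    "n": "ACFGILMNPQSTVWY",  # neutral (AAneu)
}
HYDROPHOBICITY_GROUPS = {
    "i": "FILMV",            # internal (AAint)
    "e": "DEHKNQR",          # external (AAext)
    "m": "ACGPSTWY",         # ambivalent (AAamb)
}


def build_lookup(groups):
    """Invert {label: residues} into {residue: label}."""
    return {res: label for label, residues in groups.items() for res in residues}


ACID_BASE = build_lookup(ACID_BASE_GROUPS)
HYDROPHOBICITY = build_lookup(HYDROPHOBICITY_GROUPS)


def reduce_sequence(sequence, lookup):
    """Map each residue to its group label.

    Raises KeyError for characters outside the 20 standard residues.
    """
    return "".join(lookup[res] for res in sequence.upper())


peptide = "DKAF"  # toy peptide, not taken from the chapter
print(reduce_sequence(peptide, ACID_BASE))       # abnn
print(reduce_sequence(peptide, HYDROPHOBICITY))  # eemi
```

Each scheme covers all 20 standard residues exactly once, so every peptide yields two reduced sequences of the same length as the original.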
To reflect the order of acidic and hydrophobic amino acids in
the sequences surrounding candidate O-GlcNAcylation sites,
CMVs were defined as
\[
\mathrm{CMV}_{ik} = \frac{\sum_{j=1}^{x_i} n_{ij}^{k}}{\prod_{d=0}^{k} (N - d)}
\]
where $k = 0, 1$ is the order of the CMV; $N$ is the sequence length;
$i$ indexes the amino acid groups defined by the acid–base {AAaci,
AAbas, AAneu} and hydrophobicity {AAint, AAext, AAamb} properties;
$x_i$ is the total number of $i$ residues in the corresponding reduced
amino acid sequence; and $n_{ij}$ is the position of the $j$th $i$
residue in the corresponding reduced sequence. When $k = 0$, the CMV
reduces to the content of the $i$th amino acid group. Therefore,
Acontent/Hcontent (the first order of CMV, $k = 0$) and ACMV/HCMV (the
second order of CMV, $k = 1$) were defined to reflect the sequence
order of acidic and hydrophobic amino acids, respectively [8].
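As a minimal sketch of the CMV calculation, the Python function below sums the kth powers of the positions of one residue group and divides by the product of (N − d) for d = 0, ..., k, where N is the length of the reduced sequence. It assumes that positions are counted from 1 and that the denominator has this form. The function name `cmv` and the toy reduced sequence are hypothetical.

```python
from math import prod


def cmv(reduced_sequence, label, k):
    """Composition moment of order k for one group label.

    Assumptions: positions are 1-based, N is the length of the reduced
    sequence, and the denominator is the product of (N - d) for d = 0..k.
    """
    n = len(reduced_sequence)
    if n <= k:
        raise ValueError("sequence must be longer than k")
    # 1-based positions of every residue carrying the requested label.
    positions = [j for j, res in enumerate(reduced_sequence, start=1) if res == label]
    numerator = sum(p ** k for p in positions)
    denominator = prod(n - d for d in range(k + 1))
    return numerator / denominator


reduced = "abnna"  # toy reduced sequence with "a" at positions 1 and 5
print(cmv(reduced, "a", 0))  # 0.4 (content: 2 of 5 residues)
print(cmv(reduced, "a", 1))  # 0.3 (positions 1 + 5 = 6, divided by 5 * 4)
```

With k = 0 the value depends only on how many residues of the group occur, whereas with k = 1 it also depends on where they occur, which is how the second-order feature captures sequence order.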
2.3 Assessing Predictive Performance
The predictive performance of the O-GlcNAcylation site predictors
was assessed by calculating four commonly used measures:
sensitivity (Sn), specificity (Sp), accuracy (Acc), and the
Matthews correlation coefficient (MCC), as follows:
\[
\mathrm{Sn} = \frac{TP}{TP + FN}
\]
\[
\mathrm{Sp} = \frac{TN}{TN + FP}
\]
\[
\mathrm{Acc} = \frac{TP + TN}{TP + TN + FP + FN}
\]
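These measures can be computed directly from the counts of true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN), as in the minimal Python sketch below. The function name `performance` and the example counts are hypothetical. MCC is included using its standard definition, which is an assumption here because its formula is not reproduced in the lines above.

```python
from math import sqrt


def performance(tp, tn, fp, fn):
    """Return Sn, Sp, Acc, and MCC from confusion-matrix counts.

    Assumes at least one positive and one negative sample, so that the
    Sn and Sp denominators are non-zero.
    """
    sn = tp / (tp + fn)
    sp = tn / (tn + fp)
    acc = (tp + tn) / (tp + tn + fp + fn)
    # Standard MCC definition (an assumption; not shown in the text above).
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"Sn": sn, "Sp": sp, "Acc": acc, "MCC": mcc}


# Hypothetical counts, for illustration only.
scores = performance(tp=8, tn=6, fp=4, fn=2)
print(scores["Sn"], scores["Sp"], scores["Acc"])  # 0.8 0.6 0.7
print(round(scores["MCC"], 3))                    # 0.408
```

Returning 0.0 for MCC when its denominator is zero is a common convention rather than part of the definition.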