The progressive condensation of many molecules of amino acids gives rise to an
unbranched polypeptide chain. By convention, the N-terminal amino acid is taken as
the beginning of the chain and the C-terminal amino acid as the end of the chain
(proteins are biosynthesised in this direction). Polypeptide chains contain between
20 and 2 000 amino acid residues and hence have a relative molecular mass ranging
between about 2 000 and 200 000. Many proteins have a relative molecular mass
in the range 20 000 to 100 000. The distinction between a large peptide and a small
protein is not clear. Generally, chains of amino acids containing fewer than 50
residues are referred to as peptides, and those with more than 50 are referred to as
proteins. Most proteins contain many hundreds of amino acids (ribonuclease is an
extremely small protein with only 103 amino acid residues) and many biologically active
peptides contain 20 or fewer amino acids, for example oxytocin (9 amino acid
residues), vasopressin (9), enkephalins (5), gastrin (17), somatostatin (14) and luteinising
hormone-releasing hormone (10).
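The link between chain length and relative molecular mass quoted above is simple arithmetic, and can be sketched as follows. This is a minimal illustration only: the average residue mass of 110 is an assumed, commonly quoted approximation that is not stated in the text (the round-number range given above corresponds to roughly 100 per residue), and the function names are hypothetical.

```python
# Assumption (not from the text): an average amino acid residue contributes
# about 110 to the relative molecular mass of a polypeptide chain.
AVERAGE_RESIDUE_MASS = 110

# Working threshold quoted in the text: fewer than 50 residues is a peptide.
PEPTIDE_RESIDUE_LIMIT = 50


def estimate_relative_molecular_mass(n_residues,
                                     average_residue_mass=AVERAGE_RESIDUE_MASS):
    """Rough relative molecular mass of an unbranched chain of n_residues."""
    if n_residues < 1:
        raise ValueError("a chain needs at least one residue")
    return n_residues * average_residue_mass


def classify_chain(n_residues):
    """Apply the loose peptide/protein convention.

    The text does not assign a chain of exactly 50 residues to either
    class; it is treated as a protein here purely for illustration.
    """
    return "peptide" if n_residues < PEPTIDE_RESIDUE_LIMIT else "protein"


if __name__ == "__main__":
    # Residue counts taken from the examples in the text.
    for name, n in [("oxytocin", 9), ("gastrin", 17), ("ribonuclease", 103)]:
        print(name, n, classify_chain(n), estimate_relative_molecular_mass(n))
```

Passing `average_residue_mass=100` reproduces the rounded range in the text (20 residues gives 2 000; 2 000 residues gives 200 000). The estimate ignores the actual amino acid composition, so it is a guide to order of magnitude rather than a substitute for a measured value.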
The primary structure of a protein defines the sequence of the amino acid residues and
is dictated by the base sequence of the corresponding gene(s). Indirectly, the primary
structure also defines the amino acid composition (which of the possible 20 amino acids
are actually present) and content (the relative proportions of the amino acids present).
The peptide bonds linking the individual amino acid residues in a protein are both
rigid and planar, with no opportunity for rotation about the carbon–nitrogen bond, as
it has considerable double bond character due to the delocalisation of the lone pair of
electrons on the nitrogen atom; this, coupled with the tetrahedral geometry around
each α-carbon atom, profoundly influences the three-dimensional arrangement which
the polypeptide chain adopts.
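The distinction drawn above between amino acid composition (which residues are present) and content (their relative proportions) can be illustrated with a short sketch that derives both from a primary sequence. The function names are hypothetical, the sequence is assumed to be written in standard one-letter amino acid codes, and the oxytocin sequence used as input is supplied here for illustration and is not given in the text.

```python
from collections import Counter


def composition(sequence):
    """Return the distinct amino acids present in a one-letter-code sequence."""
    return sorted(set(sequence.upper()))


def content(sequence):
    """Return the fraction of the chain contributed by each amino acid."""
    residues = sequence.upper()
    if not residues:
        raise ValueError("sequence must contain at least one residue")
    counts = Counter(residues)
    total = len(residues)
    return {residue: n / total for residue, n in counts.items()}


if __name__ == "__main__":
    # Oxytocin written N-terminus to C-terminus (assumed input, 9 residues).
    oxytocin = "CYIQNCPLG"
    print("composition:", composition(oxytocin))
    for residue, fraction in sorted(content(oxytocin).items()):
        print(residue, round(fraction, 3))
```

Both quantities follow from the sequence, but the reverse is not true: many different sequences share the same composition and content, which is why the primary structure defines them only indirectly.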
Secondary structure defines the localised folding of a polypeptide chain due to
hydrogen bonding. It includes structures such as the α-helix and β-pleated sheet.
Certain of the 20 amino acids found in proteins, including proline, isoleucine,
tryptophan and asparagine, disrupt α-helical structures. Some proteins have up to 70%
secondary structure but others have none.
Tertiary structure defines the overall folding of a polypeptide chain. It is stabilised
by electrostatic attractions between oppositely charged ionic groups (–NH₃⁺, –COO⁻),
by weak van der Waals forces, by hydrogen bonding, hydrophobic interactions and, in
some proteins, by disulphide (–S–S–) bridges formed by the oxidation of spatially
adjacent sulphydryl groups (–SH) of cysteine residues (Fig. 8.1). The three-dimensional
folding of polypeptide chains is such that the interior consists predominantly of
non-polar, hydrophobic amino acid residues such as valine, leucine and phenylalanine.
The polar, ionised, hydrophilic residues are found on the outside of the
molecule, where they are compatible with the aqueous environment. However,
some proteins also have hydrophobic residues on their outside and the presence
of these residues is important in the processes of ammonium sulphate fractionation
and hydrophobic interaction chromatography (Section 8.3.4).
Quaternary structure is restricted to oligomeric proteins, which consist of two or
more associated polypeptide chains held together by electrostatic attractions,
hydrogen bonding, van der Waals forces and occasionally disulphide bridges.
Thus disulphide bridges may exist within a given polypeptide chain (intra-chain) or