the loss of N-terminal lysine was 20 072.8. Clearly mass spectrometry has the ability
to provide highly accurate molecular mass measurements for proteins and peptides,
which in turn can be used to deduce small changes made to the basic protein structure.
8.4.2 Amino acid analysis
The determination of which of the 20 possible amino acids are present in a particular
protein, and in what relative amounts, is achieved by hydrolysing the protein to yield
its component amino acids and identifying and quantifying them chromatographically.
Hydrolysis is achieved by heating the protein with 6 M hydrochloric acid
for 14 h at 110 °C in vacuo. Unfortunately, the hydrolysis procedure destroys or
chemically modifies the asparagine, glutamine and tryptophan residues. Asparagine
and glutamine are converted to their corresponding acids (Asp and Glu) and are
quantified with them. Tryptophan is completely destroyed and is best determined
spectrophotometrically on the unhydrolysed protein.
The amino acids in the protein hydrolysate are then separated chromatographically.
Nowadays this is normally done using the method of precolumn derivatisation,
followed by separation by reversed-phase HPLC. In this approach the amino acid
hydrolysate is first treated with a molecule that (i) reacts with amino groups in amino acids,
(ii) is hydrophobic, thus allowing separation of derivatised amino acids by reversed-
phase HPLC and (iii) is easily detected by its ultraviolet absorbance or fluorescence.
Reagents routinely used for precolumn derivatisation include o-phthalaldehyde and
6-aminoquinolyl-N-hydroxysuccinimidyl carbamate (AQC), which both produce
fluorescent derivatives, and phenylisothiocyanate, which produces a phenylthiocarbamyl
derivative that is detected by its absorbance at 254 nm. Analysis times can be as little as
20 min, and sensitivity is down to 1 pmol or less of amino acid.
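Once each amino acid peak has been quantified, the relative amounts follow from simple arithmetic. The short Python sketch below is illustrative only: the pmol values are hypothetical, not measured data, and the function names `mole_percent` and `residue_ratios` are invented for this example. Asx and Glx denote the combined Asp + Asn and Glu + Gln values, since acid hydrolysis converts the amides to the acids, and tryptophan is absent because it is destroyed.

```python
# Hypothetical amounts (pmol) read from a chromatogram of a protein hydrolysate.
# These numbers are made up for illustration and describe no real protein.
hydrolysate_pmol = {
    "Asx": 120.0,  # Asp + Asn (Asn is converted to Asp during acid hydrolysis)
    "Glx": 150.0,  # Glu + Gln (Gln is converted to Glu during acid hydrolysis)
    "Gly": 90.0,
    "Ala": 60.0,
    "Leu": 180.0,
}


def mole_percent(pmol_by_residue):
    """Return each amino acid as a percentage of the total moles recovered."""
    total = sum(pmol_by_residue.values())
    if total <= 0:
        raise ValueError("total amount of amino acid must be positive")
    return {name: 100.0 * amount / total
            for name, amount in pmol_by_residue.items()}


def residue_ratios(pmol_by_residue, reference):
    """Return molar ratios relative to a chosen reference amino acid."""
    ref_amount = pmol_by_residue[reference]
    if ref_amount <= 0:
        raise ValueError("reference amount must be positive")
    return {name: amount / ref_amount
            for name, amount in pmol_by_residue.items()}


if __name__ == "__main__":
    percentages = mole_percent(hydrolysate_pmol)
    ratios = residue_ratios(hydrolysate_pmol, "Ala")
    for name in hydrolysate_pmol:
        print(f"{name}: {percentages[name]:.1f} mol%, "
              f"{ratios[name]:.2f} relative to Ala")
```

Expressing the composition as ratios to one well-recovered amino acid is a common convenience, but the choice of reference here is arbitrary.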
8.4.3 Primary structure determination
For many years the amino acid sequence of a protein was determined from studies
made on the purified protein alone. This in turn meant that sequence data available
were limited to those proteins that could be purified in sufficiently large amounts.
Knowledge of the complete primary structure of the protein was (and still is) a
prerequisite for the determination of the three-dimensional structure of the protein,
and hence an understanding of how that protein functions. However, nowadays the
protein biochemist is normally satisfied with data from just a relatively short length of
sequence either from the N terminus of the protein or from an internal sequence,
obtained by sequencing peptides produced by cleavage of the native protein. The
sequence data will then most likely be used for one of three purposes:
- To search sequence databases to see whether the protein of interest has already been
isolated and can therefore be identified. For this type of search only extremely short
lengths of sequence (three to five residues), known as sequence tags, need to be used.
Examples of this type of data search are given in Sections 8.5.1 and 9.5.2.