Food Biochemistry and Food Processing (2 edition)




22 Application of Proteomics to Fish Processing and Quality 419

proteins such as tropomyosin (Lehrer et al. 2003). Proteomics
provides a highly versatile toolkit to identify and characterize
allergens. As yet, these tools have seen little use in the study
of seafood allergies, although an interesting and elegant
approach was reported by Yu and coworkers (Yu et al. 2003a) at
National Taiwan University. These authors, studying the cause of
shrimp allergy in humans, performed 2DE on crude protein extracts
from the tiger prawn, Penaeus monodon, blotted the
two-dimensional gel onto a PVDF membrane, and probed the membrane
with serum from confirmed shrimp-allergic patients. The allergen
was then identified by MALDI-TOF MS of tryptic digests as a
protein with close similarity to arginine kinase. The identity
was further corroborated by cloning and sequencing the relevant
cDNA. Final proof was obtained by purifying the protein and
demonstrating that it had arginine kinase activity, reacted with
serum IgE from shrimp-allergic patients, and, furthermore,
induced skin reactions in sensitized shrimp-allergic patients.
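
The tryptic digestion step that precedes MALDI-TOF MS can be illustrated in silico. The following is a minimal sketch, not the procedure used by Yu and coworkers: it assumes the common rule that trypsin cleaves after lysine (K) or arginine (R) unless the next residue is proline (P), and it uses standard monoisotopic residue masses. The function names and the example sequence are hypothetical and chosen only for illustration.

```python
# Minimal in silico tryptic digest (illustrative sketch only).
# Assumption: trypsin cleaves C-terminal to K or R, except before P.

# Monoisotopic residue masses in daltons for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # mass of H2O added to the summed residues of a peptide


def tryptic_digest(sequence):
    """Return the peptides produced by complete tryptic cleavage."""
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        # C-terminal peptide that does not end in K or R
        peptides.append(sequence[start:])
    return peptides


def peptide_mass(peptide):
    """Monoisotopic mass of a neutral, unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


# Hypothetical sequence, not taken from any real allergen.
for pep in tryptic_digest("MAGKRPLEKTR"):
    print(pep, round(peptide_mass(pep), 4))
```

In peptide mass fingerprinting, a list of masses of this kind, computed for each database sequence, is compared with the masses observed in the spectrum; real search engines additionally allow for missed cleavages and chemical modifications, which this sketch ignores.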

IMPACTS OF HIGH THROUGHPUT
GENOMIC AND PROTEOMIC
TECHNOLOGIES

In recent years, proteomics has moved on from technical issues
related to protein separation and protein identity to highly
reproducible gel-based or gel-free systems, as discussed earlier.
This increased ability to separate proteins and to perform
peptide sequence analysis by MS has meant that the volume of data
produced and the rate of protein identification are orders of
magnitude greater than only a few years ago (reviewed by Seidler
et al. 2010). As a direct result of this increased volume of
data, extracting the relevant information is no longer a simple
matter of discussing a list of protein identities, and
interpretation of the proteins and their function is central to
any medium- to large-scale proteome study (Malik et al. 2010).
The number of expressed sequence tags (ESTs) related to salmonid
fish is currently on the order of 800,000 sequences, representing
mRNAs encoding about 30,000–40,000 different proteins; other
commercial species, including cod, sea bass, sea bream, and
catfish, are quickly catching up (Martin et al. 2008). These
sequences can be used to help identify amino acid sequences
generated during proteomics studies, which means that the
majority of proteins can now be identified rather than the
minority.
A considerable number of fish species now have completely
sequenced genomes, including zebrafish, two species of puffer
fish, and stickleback. Recently, the genome of the sea bass, a
major aquacultured species, has been almost fully sequenced (Kuhl
et al. 2010). The technology and tools used for these projects
have led the way in expanding the sequence data and the
interpretation of sequences from commercially important species
(Forne et al. 2010), data that can be used directly for proteomic
studies. A good example of this is zebrafish, in which 5716
proteins were identified with an estimated false discovery rate
of 1.34% (De Souza et al. 2009).
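
To put the zebrafish figures in perspective, the arithmetic can be sketched as follows. The first function simply multiplies the number of identifications by the estimated false discovery rate (FDR). The second shows one widely used way of estimating an FDR, the target-decoy approach; whether the cited study used this approach is not stated here, so it should be read as an assumption, and the hit counts passed to it are hypothetical.

```python
# Illustrative FDR arithmetic (sketch; function names are hypothetical).

def expected_false_identifications(n_identified, fdr):
    """Expected number of incorrect identifications at a given FDR."""
    return n_identified * fdr


def target_decoy_fdr(decoy_hits, target_hits):
    """Simple target-decoy estimate: decoy hits divided by target hits.

    Assumption: decoy matches occur about as often as false target
    matches, so their count approximates the number of false positives.
    """
    if target_hits == 0:
        raise ValueError("target_hits must be greater than zero")
    return decoy_hits / target_hits


# Figures quoted in the text: 5716 proteins at an estimated FDR of 1.34%.
print(round(expected_false_identifications(5716, 0.0134)))  # about 77

# Hypothetical counts chosen only to reproduce a 1.34% estimate.
print(target_decoy_fdr(134, 10000))
```

On these figures, roughly 77 of the 5716 identifications would be expected to be incorrect, which leaves well over 5600 reliable protein identities.
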
As more and more gene and protein sequences are deposited in
databases, they are automatically annotated; the quality of these
annotations is probably one of the greatest hurdles in fully
interpreting the output of either a transcriptomic or a proteomic
study. Currently, annotations include the nucleotide sequence,
protein sequence, tissue distribution and abundance of mRNA, gene
ontology (GO) (Gaudet et al. 2009), and Kyoto Encyclopedia of
Genes and Genomes (KEGG) (Okuda et al. 2008) pathways. A high
proportion of the genes in humans, mice, and yeast, among others,
have many of their proteins annotated to this extent; in the near
future, this will be the case for fish as well.
GO and KEGG are two databases that are used to assign function
to a particular protein. For GO, there are three different
assignments: (1) biological process, (2) molecular function, and
(3) cellular component, with the same terms being used across all
species. This gives a vocabulary of definitions for a particular
protein; in fact, there are now thousands of GO terms. When
proteins are identified, they can be matched to these GO terms,
with glycolysis or protein metabolism as very basic examples. If
a data set contains tens or hundreds of protein identities, it is
possible to use GO to show the global changes occurring in that
tissue; if many of the GO terms relate to the same biological
process or function, this helps to interpret the data in a way
that a simple list of proteins cannot. If there are two samples
to compare, statistical analysis of the GO terms can indicate
whether there are major differences in cellular activity between
the two samples, and the GO cellular component terms can indicate
which particular organelles are being affected. The KEGG pathways
(http://www.genome.jp/kegg/) are similar to biochemical maps,
with interactive links to other nucleotide, protein, and
literature databases; as with GO, these help to interpret the
output data when a large number of proteins have been identified.
New pathway analysis programs are continually being developed,
including Reactome (Vastrik et al. 2007), which allows networks
of proteins to be explored, and integration of data with the
literature is possible through a platform called iHOP (Hoffmann
and Valencia 2004). Although significant manual analysis of the
data is still required when working with fish species (including
the model species), as users become more familiar with the
available tools and the databases become more integrated, the
knowledge gap between those working on species with
well-annotated genomes and those working on fish will decrease.
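
The kind of GO-based analysis described above can be sketched in a few lines. The example below is a minimal illustration under stated assumptions: it counts how many identified proteins carry each GO term and then applies a one-sided hypergeometric (over-representation) test, which is one common choice for asking whether a term occurs more often in a data set than expected from a background list. The protein identifiers, annotations, and counts are hypothetical, and dedicated GO analysis tools additionally correct for multiple testing and for the hierarchical structure of GO.

```python
# Sketch of GO term counting and an over-representation test.
# All identifiers and annotations below are hypothetical.
from collections import Counter
from math import comb


def count_go_terms(protein_ids, annotations):
    """Count how many proteins in protein_ids carry each GO term."""
    counts = Counter()
    for pid in protein_ids:
        # set() ensures a term is counted once per protein
        counts.update(set(annotations.get(pid, ())))
    return counts


def enrichment_p_value(k, n, K, N):
    """One-sided hypergeometric P(X >= k).

    N: proteins in the background list
    K: background proteins annotated with the term
    n: proteins in the data set of interest
    k: data set proteins annotated with the term
    """
    total = comb(N, n)
    upper = min(n, K)
    favourable = sum(comb(K, i) * comb(N - K, n - i)
                     for i in range(k, upper + 1))
    return favourable / total


annotations = {
    "P1": ["glycolysis"],
    "P2": ["glycolysis", "protein metabolism"],
    "P3": ["protein metabolism"],
}
print(count_go_terms(["P1", "P2", "P3"], annotations))

# Hypothetical case: 4 of 5 identified proteins are glycolytic, while
# only 5 of the 20 background proteins carry that term.
print(enrichment_p_value(k=4, n=5, K=5, N=20))
```

A small p-value for a term such as glycolysis would suggest that the process is over-represented in the data set; the same test applied to cellular component terms would point to the organelles most affected.
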
In future studies, data from other “omic” platforms should be
combined, that is, transcriptomic data from microarrays and deep
sequencing should be incorporated together with metabolomic data.
These complementary techniques, although often performed in
different laboratories, ask the same questions, and one of the
future steps will be to perform meta-analyses across these
high-throughput technologies.

REFERENCES


Ahmed FE. 2009. Sample preparation and fractionation for proteome
analysis and cancer biomarker discovery by mass spectrometry.
Journal of Separation Science 32: 771–798.
Ahmed N et al. 2003. An approach to remove albumin for the
proteomic analysis of low abundance biomarkers in human serum.
Proteomics 3: 1980–1987.