interacting proteins are being identified in individual organisms. This has led to the
development of the Database of Interacting Proteins (DIP), which can be found at
<http://dip.doe-mbi.ucla.edu>. Given the current fad for inventing new words
ending in ‘ome’, some refer to these maps of protein interactions as the interactome.
One of the most widely used, and successful, methods for investigating protein–
protein interaction is the yeast two-hybrid (Y2H) system, which exploits the modular
architecture of transcription factors. A transcription factor gene (GAL4) is split into
the coding regions for two domains, a DNA-binding domain and a trans-activation
domain. Both these domains are expressed, each linked to a different protein (one
being the unknown protein, the other a protein with which it may interact), in separate
yeast cells, which are then mated to produce diploid cells (the two proteins being
studied are often referred to as the bait and prey). If, in this diploid cell, the bait and
prey proteins bind to each other, they will bring together the two domains of the
transcription factor, which will then be active and will bind to the promoter of a
reporter gene (e.g. the his gene), inducing its expression. Identification of cells
expressing the reporter gene product is evidence that the bait and prey proteins
interact. In practice, following mating, diploids are selected on deficient medium (in
this case, medium deficient in histidine), so that only yeast cells expressing interacting
proteins survive (as they are capable of synthesising histidine). Once such a positive
interaction is identified, the two interacting open reading frames (ORFs) are
identified simply by sequencing a small part of each cloned gene.
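The selection logic described above can be illustrated with a short sketch. It is a toy model only: the protein names, the set of interacting pairs and the function names are hypothetical and are not taken from any real screen. The reporter is treated as expressed only when bait and prey bind and thereby reconstitute the transcription factor.

```python
from itertools import product

# Hypothetical toy data: pairs of proteins assumed to bind each other.
INTERACTIONS = {frozenset(("A", "B")), frozenset(("C", "D"))}


def reporter_expressed(bait, prey, interactions):
    """Return True if the bait and prey fusions bind each other.

    Binding brings the DNA-binding domain (on the bait) and the
    trans-activation domain (on the prey) together, so the reporter
    gene is induced.
    """
    return frozenset((bait, prey)) in interactions


def screen(baits, preys, interactions):
    """Mate every bait strain with every prey strain.

    Only diploids expressing the reporter survive on histidine-deficient
    medium, so only those bait-prey pairs are returned.
    """
    return [
        (bait, prey)
        for bait, prey in product(baits, preys)
        if reporter_expressed(bait, prey, interactions)
    ]


survivors = screen(["A", "C"], ["B", "D", "E"], INTERACTIONS)
print(survivors)
```

In this model, a prey such as E, which binds neither bait, never appears among the survivors, which mirrors the failure of non-interacting diploids to grow without histidine.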
Using this approach, all 6000 ORFs from S. cerevisiae were individually cloned
as both bait and prey. When the pool of 6000 prey clones was screened against each
of the 6000 bait clones, 691 interactions were identified, only 88 of which were
previously known. This therefore gave an indication of the function of over 600
proteins whose function was previously unknown. On a much larger scale, the same
approach was used to identify protein–protein interactions in the fruit fly, Drosophila
melanogaster. All 14 000 predicted D. melanogaster ORFs were amplified using the
polymerase chain reaction (PCR) and each cloned into two-hybrid bait and prey
vectors. A total of 45 417 two-hybrid positive colonies were obtained, from which
10 021 protein interactions involving 4500 proteins were identified. The yeast
two-hybrid system is described in greater detail in Section 6.8.3.
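The scale of these two screens is easier to appreciate with some simple arithmetic. The sketch below uses only the figures quoted above. The variable names are illustrative, and the derived quantities (such as the mean number of partners per protein) are back-of-envelope values rather than results reported by the original studies.

```python
# Yeast (S. cerevisiae) screen, figures as quoted in the text.
yeast_orfs = 6000
yeast_interactions = 691
yeast_previously_known = 88

# Every bait against every prey gives the number of pairs tested in principle.
yeast_pairs_tested = yeast_orfs * yeast_orfs
yeast_novel = yeast_interactions - yeast_previously_known
yeast_known_fraction = yeast_previously_known / yeast_interactions

# Fruit fly (D. melanogaster) screen, figures as quoted in the text.
fly_positive_colonies = 45417
fly_interactions = 10021
fly_proteins = 4500

# Several colonies can report the same interaction.
colonies_per_interaction = fly_positive_colonies / fly_interactions
# Each interaction involves two proteins, hence the factor of 2.
mean_partners_per_protein = 2 * fly_interactions / fly_proteins

print(f"Yeast pairs tested in principle: {yeast_pairs_tested:,}")
print(f"Novel yeast interactions: {yeast_novel}")
print(f"Fraction previously known: {yeast_known_fraction:.1%}")
print(f"Fly colonies per interaction: {colonies_per_interaction:.2f}")
print(f"Mean partners per fly protein: {mean_partners_per_protein:.2f}")
```

On these figures, the 603 novel yeast interactions came from 36 million possible bait–prey combinations, which underlines how sparse detectable interactions are among all possible pairs.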
8.5.5 Protein arrays
A newly developing area for studying protein–protein interactions is the use of protein
arrays (chips). Although the basic principle for screening and identifying interacting
molecules is much the same as for DNA arrays (Section 6.8.8), the production of
protein arrays is more technically demanding owing mainly to the difficulty of
binding proteins to a surface and ensuring that the protein is not denatured at any
stage of the assay procedure.
In a protein array, proteins are immobilised as small spots (150–200 µm) onto a
solid support (typically glass or a nitrocellulose membrane), using high precision
contact printing (not unlike a dot-matrix printer) at a spot density of the order of
1500 spots per cm^2. A solution of the protein of unknown function is then incubated on