Computational Drug Discovery and Design

the analysis. Instead, it uses predicted structural features of the
sequence, combined with the evolutionary information [15].
One more approach to the prediction of functional sites is worth
mentioning: the coevolution-based approach. At the residue level,
coevolution refers to correlated changes across proteins that are
important for the maintenance of protein structure, function, and
stability [16]. These approaches also use an MSA of a protein
family to search for coevolved amino acids [17–19]. Among them,
CoeViz provides a web server to analyze coevolved residues in a
protein [20].
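To make the idea of correlated changes concrete, the sketch below scores a pair of MSA columns with mutual information, one common way to quantify covariation. It is an illustrative assumption, not the method used by CoeViz or by the works cited above. The function names and the toy alignment are hypothetical, gaps are treated as ordinary symbols, and none of the corrections that practical tools apply are included.

```python
from collections import Counter
from math import log2


def column(msa, i):
    """Return the residues found at alignment column i (0-based)."""
    return [seq[i] for seq in msa]


def mutual_information(msa, i, j):
    """Mutual information (in bits) between alignment columns i and j.

    msa is a list of equal-length aligned sequences. High values mean the
    residues at the two positions tend to change together.
    """
    n = len(msa)
    col_i, col_j = column(msa, i), column(msa, j)
    count_i = Counter(col_i)
    count_j = Counter(col_j)
    count_ij = Counter(zip(col_i, col_j))

    mi = 0.0
    for (a, b), count in count_ij.items():
        p_ab = count / n
        p_a = count_i[a] / n
        p_b = count_j[b] / n
        mi += p_ab * log2(p_ab / (p_a * p_b))
    return mi


# Hypothetical toy alignment: columns 1 and 3 always change together,
# while column 0 is fully conserved.
toy_msa = [
    "AKLE",
    "AKLE",
    "ARLD",
    "ARLD",
]

covarying = mutual_information(toy_msa, 1, 3)
conserved_vs_variable = mutual_information(toy_msa, 0, 1)
print(covarying, conserved_vs_variable)
```

In this toy case the covarying pair scores 1 bit, whereas a fully conserved column carries no covariation signal with any other column, which is why conservation-based and coevolution-based analyses are complementary.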
In this chapter, we describe a procedure to predict functional
sites of proteins using phylogenetic analysis. We detail all steps
starting from searching for homologous sequences of the query
protein to performing MSA and finally predicting functionally
important residues, with alternative tools offered at each step
(see Subheading 2). We discuss how to determine the parameters
or options of each external tool (BLAST, Clustal Omega, etc.) and
what challenges might be experienced at different stages of the
functional site prediction procedure.

2 Methods


The procedure below describes a path, with alternative solutions,
that can be followed to identify the functional sites of a protein.
We illustrate the whole pipeline in Fig. 1 as a flowchart where the
input is the query protein sequence and the output is the active
site information. Note that the methods reviewed in this chapter
are predictive approaches; therefore, they may produce false
positives and false negatives in addition to true positives. Also,
the results of different predictive methods may not overlap
strongly in some cases, because the methods employ different
approaches.
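One way to quantify how far two predictions agree, and how a prediction compares with experimentally known sites, is sketched below. This is a minimal illustration and not part of the procedure in this chapter. The function names and all residue positions are hypothetical.

```python
def jaccard(sites_a, sites_b):
    """Overlap between two sets of predicted positions (1.0 = identical)."""
    a, b = set(sites_a), set(sites_b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def precision_recall(predicted, known):
    """Precision and recall of predicted positions against known sites."""
    predicted, known = set(predicted), set(known)
    true_positives = len(predicted & known)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(known) if known else 0.0
    return precision, recall


# Hypothetical residue positions reported by two tools for the same protein,
# and a hypothetical set of experimentally confirmed sites.
tool_a = {57, 102, 195, 214}
tool_b = {57, 102, 189}
known_sites = {57, 102, 195}

print(jaccard(tool_a, tool_b))
print(precision_recall(tool_a, known_sites))
print(precision_recall(tool_b, known_sites))
```

Positions reported by several independent methods can be treated as higher-confidence candidates, while positions reported by a single method may still be genuine and are worth inspecting individually.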
A given protein sequence (PQ) can be computationally annotated
in terms of active/functional sites (i.e., the main focus of this
chapter), highly conserved functional regions (e.g., domains and
motifs) and sequence-wide generic annotations (e.g., protein
families, subcellular locations, biological processes). Below, we
describe the methods for functional site annotation:

2.1 Sequence Homology Search (BLAST)


Search for homologs of PQ using the UniProt BLAST tool
(http://www.uniprot.org/blast/) by entering PQ in FASTA format in
the sequence window and clicking the run BLAST button. The
parameters of the BLAST tool are explained in Note 1. The
algorithm scans the entire target database to find sequences
similar to PQ and displays the results (i.e., homologous protein
sequences in the target database) ranked by similarity. The
homologous sequences with the

Phylogenetics-Based Prediction of Functional Sites 53