Computational Drug Discovery and Design

on proteins can be determined by experimental approaches; however, it is challenging to identify those sites for all proteins due to practical issues and the high costs associated with experimental procedures. Therefore, computational approaches have emerged for the prediction of functional sites (extensively reviewed in [2]). Most of the frequently used computational approaches rely on the observation that functional sites are evolutionarily more conserved than the rest of the protein surface. However, other sequence- and structure-based features can also be used to distinguish functional sites, such as secondary structure information, solvent accessibility, and structural conservation [3, 4]. Given that protein structure is more conserved than sequence, structural comparison can recover more distant relationships between proteins. Previous studies applied a large-scale comparison to all known protein binding sites and showed that, although the global structures of some protein complexes differ, their binding regions are structurally similar [2, 5, 6]. Sequence conservation has also been used in combination with geometric features of functional sites to improve prediction performance [7]. Determination of the conserved regions in a protein sequence to predict functional residues starts with a multiple sequence alignment (MSA) of the query protein sequence and its homologs [4]. The MSA reveals highly conserved positions in the input sequences. Some methods first construct a phylogenetic tree on the basis of the MSA results, instead of analyzing sequence conservation directly from the MSA [8]. A phylogenetic tree represents the evolutionary relationships between protein sequences and thereby reveals subfamily-specific mutations within protein families [9].
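
As a rough illustration of how an MSA can reveal conserved positions, the minimal sketch below scores each alignment column by its Shannon entropy, rescaled so that 1 means fully conserved. It is not the scoring scheme of any method discussed in this chapter. The function name `column_conservation`, the toy alignment, the choice to ignore gap characters, and the normalization by log2(20) are all assumptions made for illustration.

```python
import math
from collections import Counter


def column_conservation(msa, gap="-"):
    """Return one score in [0, 1] per alignment column (1 = fully conserved).

    Illustrative assumption: conservation = 1 - H / log2(20), where H is the
    Shannon entropy of the residue frequencies in the column. Gaps are ignored.
    """
    if not msa:
        return []
    length = len(msa[0])
    if any(len(seq) != length for seq in msa):
        raise ValueError("all aligned sequences must have the same length")

    max_entropy = math.log2(20)  # upper bound for 20 standard amino acids
    scores = []
    for column in zip(*msa):  # iterate over alignment columns
        residues = [r for r in column if r != gap]
        if not residues:
            scores.append(0.0)  # column made only of gaps: no evidence
            continue
        counts = Counter(residues)
        total = len(residues)
        entropy = -sum(
            (n / total) * math.log2(n / total) for n in counts.values()
        )
        scores.append(1.0 - entropy / max_entropy)
    return scores


# Hypothetical toy alignment (query plus three made-up homologs).
toy_msa = [
    "MKTAY",
    "MKSAF",
    "MRTAW",
    "MKT-Y",
]

scores = column_conservation(toy_msa)
for position, score in enumerate(scores, start=1):
    print(f"column {position}: conservation = {score:.3f}")
```

In this toy alignment, the first and fourth columns contain a single residue type and therefore receive the maximum score, while the last column, which contains three different residues, scores lowest. Tree-based methods go a step further by weighting such variation according to where it occurs in the phylogeny.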
The evolutionary trace (ET) method was developed as the first implementation of this tree-based idea; it considers not only identical residues but also amino acid similarities [10]. As an improved version of the ET method, ConSurf also generates phylogenetic trees of homologous sequences using the neighbor-joining algorithm based on the MSA results and computes position-specific conservation scores for each amino acid in the sequence. It also retrieves structural information on proteins from the PDB, if available [11, 12]. INTREPID, another functional site prediction method, performs phylogenetic tree analysis in combination with a Jensen–Shannon divergence-based positional conservation score [4]. INTREPID has been extensively compared to the ET and ConSurf methods in predicting functional residues [13]. The latest release of the ET method, as a database and web server, is called Universal Evolutionary Trace (UET) [14]. Apart from these methods, there are also machine learning-based approaches such as PROFisis, which identifies residues at protein–protein interaction (PPI) interfaces. This method uses PPI information obtained from experimentally known 3D structures; however, it does not require the 3D structure of the query protein for

52 Heval Atas et al.
