AMPK Methods and Protocols


3.4 Assessing the Feature Architecture Similarity (FAS) Between Proteins


This protocol analyzes and compares feature architectures between
the identified orthologs and the corresponding seed protein. The
procedure yields a similarity measure that helps to decide whether
an identified ortholog is likely to have diverged in function from
the seed (see Note 9). One can manually compare feature
architectures between proteins using InterProScan, as explained in
step 1. For an automated comparison of feature architectures, use
FAS-S, a software tool that assesses and scores the feature
architecture similarity (FAS) between two proteins. FAS scores
range from 0, when two architectures share no features, to 1, when
all features of the seed protein are represented in the ortholog.
The generation of FAS scores in combination with phyletic profiling
with HaMStR_OneSeq is described in steps 2–5. The later FAS scoring
of existing orthologous protein pairs is outlined in steps 6–11.
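
To make the two endpoints of the score range concrete, the following minimal Python sketch computes the fraction of seed features that also occur in the ortholog. This is a deliberately simplified stand-in and not the scoring scheme implemented in FAS-S, which is not specified in this section. The function name `seed_feature_coverage` and the feature labels are hypothetical.

```python
def seed_feature_coverage(seed_features, ortholog_features):
    """Fraction of distinct seed features also annotated in the ortholog.

    Toy illustration only; this is NOT the FAS-S scoring scheme.
    """
    seed = set(seed_features)
    if not seed:
        raise ValueError("seed protein has no annotated features")
    return len(seed & set(ortholog_features)) / len(seed)


# Hypothetical feature labels, not real InterProScan output.
seed = ["domain_A", "domain_B", "low_complexity"]

# All seed features are represented in the ortholog (extra features are ignored).
print(seed_feature_coverage(seed, ["domain_A", "domain_B", "low_complexity", "domain_C"]))  # 1.0

# The two architectures share no features.
print(seed_feature_coverage(seed, ["domain_C"]))  # 0.0
```

Like the score range described above, this toy measure is asymmetric: it is computed from the perspective of the seed protein, so additional features in the ortholog do not lower the value.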
Manual comparison of feature architectures between proteins.


  1. For the analysis of individual seed–ortholog pairs by eye,
    annotate the feature architectures with InterProScan [27]. Open
    the web interface of InterProScan, paste the protein sequence
    into the search field, and submit. Only a single sequence can be
    analyzed at a time. Compare the resulting architectures by eye.
    Look out for a conspicuous absence of seed protein features in
    the ortholog and for features of the ortholog that are missing
    in the seed protein. Figures 5 and 6 display the InterProScan
    annotation for the human PRKAA1 protein (Q13131) and for its
    ortholog in the fruit fly, D. melanogaster (O18645),
    respectively. All features of the human protein are present in
    the Drosophila ortholog as well. Thus, there is no


(a)

geneID  H. sapiens  M. musculus  D. melanogaster  S. cerevisiae  A. thaliana  M. jannaschii  C. aurantiacus
PRKAA1  1           1            1                1              1            0              1
PRKAA2  1           1            1                1              1            0              1
PRKAB1  1           1            1                1              1            0              0
PRKAB2  1           1            1                1              1            0              0
PRKAG1  1           1            0                1              1            0              1
PRKAG2  1           1            0                0              0            0              1
PRKAG3  1           1            0                0              1            1              1

(b)

geneID  ncbi9606    ncbi10090      ncbi7227       ncbi559292     ncbi3702       ncbi243232     ncbi324602
PRKAA1  Q13131#1.0  Q5EG47#0.9992  O18645#0.9760  P06782#0.9754  Q38997#0.9939  NA             A9WD85#0.1306
PRKAA2  P54646#1.0  Q8BRK8#0.9984  O18645#0.9760  P06782#0.9750  Q38997#0.9934  NA             A9WD85#0.1315
PRKAB1  Q9Y478#1.0  Q9R078#1.0     A1Z7Q8#0.9758  Q04739#0.9865  Q9SCY5#0.9950  NA             NA
PRKAB2  O43741#1.0  Q6PAM0#0.9996  A1Z7Q8#0.9758  P34164#0.9875  Q9SCY5#0.9944  NA             NA
PRKAG1  P54619#1.0  O54950#0.9999  NA             P12904#0.9997  Q944A6#0.9868  NA             A9WBL7#0.9919
PRKAG2  Q9UGJ0#1.0  Q91WG5#0.9996  NA             NA             NA             NA             A9WHY8#0.4073
PRKAG3  Q9UGI9#1.0  Q8BGM7#0.9998  NA             NA             Q8LBB2#0.9902  Q58799#0.9367  A9WGZ6#0.7253

Fig. 4 Different types of phylogenetic profile. (a) General format of a phylogenetic profile. The first row and first
column represent taxa and gene ids, respectively. The “1” and “0” in the cells represent the presence and
absence of orthologs, respectively. (b) Phylogenetic profile in the PhyloProfile format. The first row and first
column correspond to taxa and proteins, respectively. The cells have values in the format “OrthologId#FAS,”
where OrthologId is the protein sequence identifier, and FAS is the Feature Architecture Similarity score
between the seed and the ortholog. “NA” indicates that no ortholog was found
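
The relationship between the two panels can be sketched in a few lines of Python: each cell of the profile in (b) is split into its ortholog identifier and FAS score, and a row is then collapsed to the presence/absence encoding of (a). This is a minimal sketch that assumes the cells are already available as a list of strings; the function names `parse_cell` and `to_presence_absence` are hypothetical and not part of PhyloProfile or HaMStR_OneSeq.

```python
def parse_cell(cell):
    """Split an 'OrthologId#FAS' cell into (ortholog_id, fas); 'NA' becomes None."""
    cell = cell.strip()
    if cell == "NA":
        return None
    ortholog_id, fas = cell.rsplit("#", 1)
    return ortholog_id, float(fas)


def to_presence_absence(row):
    """Collapse one row of 'OrthologId#FAS' cells to the 1/0 encoding of panel (a)."""
    return [0 if parse_cell(cell) is None else 1 for cell in row]


# PRKAG1 row of Fig. 4b; taxa in the order ncbi9606, ncbi10090, ncbi7227,
# ncbi559292, ncbi3702, ncbi243232, ncbi324602.
row = ["P54619#1.0", "O54950#0.9999", "NA", "P12904#0.9997",
       "Q944A6#0.9868", "NA", "A9WBL7#0.9919"]

print(to_presence_absence(row))  # [1, 1, 0, 1, 1, 0, 1]
print(parse_cell(row[1]))        # ('O54950', 0.9999)
```

Collapsing to 1/0 discards the FAS scores, so an ortholog with a low score (for example, 0.1306) is indistinguishable from one with a score near 1 in the format of panel (a).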


124 Arpit Jain et al.
