AMPK Methods and Protocols


  1. If the alignment contains many gaps, you can add an optional
    post-processing step with Guidance [40] or TCS [41]. This
    reduces noise in the data caused by incorrectly aligned
    residues. Alternatively, you can remove alignment columns in
    which the fraction of sequences represented by an amino acid
    does not exceed a certain threshold. A threshold of 50% is
    common; however, you can increase this value if a more
    stringent analysis is desired.

  2. Maximum likelihood tree reconstruction requires the a priori
    definition of a model of protein sequence evolution. Use the
    (post-processed) alignment as input for a ProtTest run
    (v3.4.1) [34]. Let ProtTest define its own tree for the analysis
    and choose AIC as the model selection criterion. Make sure to
    enable testing for substitution rate heterogeneity across
    sites (option G), invariant sites (option I), and empirical
    amino acid frequencies (option F). The top-ranked model
    according to the AIC is the one that provides the best fit to
    your data and should be used for downstream maximum
    likelihood tree reconstruction (see Note 15).

  3. Use the multiple sequence alignment and the model of
    sequence evolution as input for the maximum likelihood
    (ML) tree reconstruction with RAxML [35]. When you add
    the option -f a, RAxML will compute the maximum likelihood
    tree together with rapid bootstrap support values. Specify the
    number of bootstrap replicates with the option -N. A complete
    program call could read something like this: raxml -n
    outputFileName -s alignmentFileName -m PROTGAMMAILGF
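
The column filter described in Note 1 can be sketched in a few lines of Python. This is only a minimal illustration and not the procedure used by Guidance or TCS; the function name filter_gappy_columns, the parameter min_occupancy, and the toy alignment are assumptions made for this example. A column is kept only if the fraction of sequences carrying an amino acid (rather than a gap) in that column exceeds the threshold.

```python
def filter_gappy_columns(alignment, min_occupancy=0.5, gap_chars="-"):
    """Return a copy of the alignment without sparsely occupied columns.

    alignment: dict mapping sequence identifier -> aligned sequence string.
    A column is kept only if the fraction of sequences with a non-gap
    character in that column is strictly greater than min_occupancy.
    """
    names = list(alignment)
    seqs = [alignment[name] for name in names]
    if not seqs:
        return {}
    if len({len(seq) for seq in seqs}) != 1:
        raise ValueError("all aligned sequences must have the same length")

    n_seqs = len(seqs)
    keep = [
        col
        for col in range(len(seqs[0]))
        if sum(seq[col] not in gap_chars for seq in seqs) / n_seqs > min_occupancy
    ]
    return {
        name: "".join(seq[col] for col in keep)
        for name, seq in zip(names, seqs)
    }


# Hypothetical toy alignment (not taken from the chapter's data).
toy = {
    "seqA": "MK-LV",
    "seqB": "M--LV",
    "seqC": "MR-L-",
    "seqD": "-R-LI",
}

filtered = filter_gappy_columns(toy)             # 50% threshold
stricter = filter_gappy_columns(toy, 0.75)       # more stringent threshold
for name in filtered:
    print(name, filtered[name], stricter[name])
```

With the 50% threshold only the all-gap third column is dropped; raising min_occupancy to 0.75 keeps only the fully occupied column, which mirrors the more stringent setting mentioned in Note 1.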


(A) FASTA format

>A1_YEAST
MSSNNNTNT-------------APANANSQRDKMSEQE-ARRFFQQIISAVEY-------
CHRHKIVHRDLKPENLLLDEHLNVKIADFGLSNIMTDGNFLKTSCGSPNYAAPEVISGKL
YAGPEVDVWSCGVIL----------------------
>PRKAA1_SORCE
MARCL-----------------VCNAENPGSARFCVAC-GASLTAKEAGAGAATSAPPAP
GPRTTVPGQPLLGALVPEDPAHRASL------SALAAGGANGPAANV---HAPHIVPAPL
AGSLPRRA--PGGHLP---------------------
>A1_ENTHI
-------------------------------MSQCYRV-GQFIIGKKLGEGMC-------
G-KVYLAFHEKTGVKVAIKIVDKTKL----MRKPEMKRKVEREIAFLKIINHRNVMQLYT
VYETTRYLFLVMELLEGGELFDYISSKGKLEIEEVLV
>A1_ARATH
MFKRVDEFNLVSSTIDHRIFKSRMDGSGTGSRSGVESILPNYKLGRTLGIGSF-------
G-RVKIAEHALTGHKVAIKILNRRKI-----KNMEMEEKVRREIKILRLFMHPHIIRLYE
VIETPTDIY----------------------------
>PRKAA1_HUMAN
-MRRLSSWR-------------KMATAEKQKHDGRVKI-GHYILGDTLGVGTF-------
G-KVKVGKHELTGHKVAVKILNRQKI-----RSLDVVGKIRREIQNLKLFRHPHIIKLYQ
VISTPSDIFMVMEYVSRAR------------------
>PRKAA1_MOUSE
-MRRLSSWR-------------KMATAEKQKHDGRVKI-GHYILGDTLGVGTF-------
G-KVKVGKHELTGHKVAVKILNRQKI-----RSLDVVGKIRREIQNLKLFRHPHIIKLYQ
VISTPSDI-----------------------------

(B) Phylip format

6 157
A1_YEAST   MSSNNNTNT- ---------- --APANANSQ RDKMSEQE-A RRFFQQIISA
PRKAA1_SOR MARCL----- ---------- --VCNAENPG SARFCVAC-G ASLTAKEAGA
A1_ENTHI   ---------- ---------- ---------- -MSQCYRV-G QFIIGKKLGE
A1_ARATH   MFKRVDEFNL VSSTIDHRIF KSRMDGSGTG SRSGVESILP NYKLGRTLGI
PRKAA1_HUM -MRRLSSWR- ---------- --KMATAEKQ KHDGRVKI-G HYILGDTLGV
PRKAA1_MOU -MRRLSSWR- ---------- --KMATAEKQ KHDGRVKI-G HYILGDTLGV

           VEY------- CHRHKIVHRD LKPENLLLDE HLNVKIADFG LSNIMTDGNF
           GAATSAPPAP GPRTTVPGQP LLGALVPEDP AHRASL---- --SALAAGGA
           GMC------- G-KVYLAFHE KTGVKVAIKI VDKTKL---- MRKPEMKRKV
           GSF------- G-RVKIAEHA LTGHKVAIKI LNRRKI---- -KNMEMEEKV
           GTF------- G-KVKVGKHE LTGHKVAVKI LNRQKI---- -RSLDVVGKI
           GTF------- G-KVKVGKHE LTGHKVAVKI LNRQKI---- -RSLDVVGKI

           LKTSCGSPNY AAPEVISGKL YAGPEVDVWS CGVIL----- ----------
           NGPAANV--- HAPHIVPAPL AGSLPRRA-- PGGHLP---- ----------
           EREIAFLKII NHRNVMQLYT VYETTRYLFL VMELLEGGEL FDYISSKGKL
           RREIKILRLF MHPHIIRLYE VIETPTDIY- ---------- ----------
           RREIQNLKLF RHPHIIKLYQ VISTPSDIFM VMEYVSRAR- ----------
           RREIQNLKLF RHPHIIKLYQ VISTPSDI-- ---------- ----------

           -------
           -------
           EIEEVLV
           -------
           -------
           -------

Fig. 9 Commonly used multiple sequence alignment formats: (a) FASTA and (b) Phylip. Note that
sequence identifiers in Phylip format are typically limited to a maximum of ten characters. Format converters
therefore shorten longer identifiers in the FASTA format (e.g., “PRKAA1_SORCE”) to the maximum length of
ten characters (“PRKAA1_SOR”) in Phylip format. Pay particular attention that the
identifiers are still unique after the conversion.
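
The uniqueness check recommended in the caption can be automated before converting an alignment. The following Python sketch assumes that the converter simply cuts each identifier after ten characters, as in the example above; actual converters may behave differently. The function name find_truncation_clashes and the additional identifier "PRKAA1_SORAR" are hypothetical and serve only to demonstrate a clash.

```python
from collections import defaultdict


def find_truncation_clashes(identifiers, max_len=10):
    """Group identifiers that become identical when cut to max_len characters.

    Returns a dict mapping each ambiguous truncated name to the list of
    full identifiers that collapse onto it. An empty dict means that all
    identifiers stay unique after truncation.
    """
    groups = defaultdict(list)
    for ident in identifiers:
        groups[ident[:max_len]].append(ident)
    return {short: full for short, full in groups.items() if len(full) > 1}


# Identifiers of the six sequences shown in Fig. 9.
fig9_ids = [
    "A1_YEAST", "PRKAA1_SORCE", "A1_ENTHI",
    "A1_ARATH", "PRKAA1_HUMAN", "PRKAA1_MOUSE",
]
print(find_truncation_clashes(fig9_ids))

# Adding a hypothetical identifier that shares the first ten characters
# with "PRKAA1_SORCE" produces a clash.
print(find_truncation_clashes(fig9_ids + ["PRKAA1_SORAR"]))
```

If the returned dictionary is not empty, rename the affected sequences in the FASTA file before the conversion so that their first ten characters differ.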


132 Arpit Jain et al.
