AMPK Methods and Protocols

-refspec=seedSpecies -coreOrth=4 -minDist=genus
-maxDist=kingdom -strict -checkCoorthologsRef -representative
-cleanup. “minDist” and “maxDist” control the phylogenetic
diversity of the core ortholog set. In the example call, no two
sequences from the same genus will be considered for training
the pHMM, nor will any sequence from a kingdom other than
that of your seed species. “coreOrth” sets the maximum number
of orthologs that are added to the seed sequence for the final
pHMM training. “cleanup” removes the temporary files created
during the HaMStR-OneSeq run. Refer to the HaMStR-OneSeq help
for the meaning of the other options. By default, HaMStR-OneSeq
searches for orthologs in all species listed in the directory
pathToHaMStR/genome_dir.
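
When several seed proteins are processed, it can help to assemble the option string programmatically so that every run uses identical settings. The following is a minimal sketch that only reproduces the options shown in the example call above. The function name and its parameters are hypothetical, and the executable name and the seed-sequence arguments are deliberately left out because they are not part of this excerpt.

```python
def build_oneseq_options(seed_species, core_orth=4,
                         min_dist="genus", max_dist="kingdom"):
    """Return the options of the example call as a list of tokens.

    Only options named in the example call are emitted; anything else
    (executable name, seed sequence arguments) must be added by the caller.
    """
    return [
        f"-refspec={seed_species}",
        f"-coreOrth={core_orth}",    # maximum number of core orthologs
        f"-minDist={min_dist}",      # e.g. no two sequences from one genus
        f"-maxDist={max_dist}",      # e.g. stay within the seed's kingdom
        "-strict",
        "-checkCoorthologsRef",
        "-representative",
        "-cleanup",                  # remove temporary files after the run
    ]


options = build_oneseq_options("seedSpecies")
print(" ".join(options))
```

Keeping the options in a list rather than a single string makes it straightforward to pass them to a process launcher without shell quoting issues.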


  1. After a successful HaMStR-OneSeq run, a file with the
    extension “.extended.fa” will be generated. This file contains
    all orthologs for the query protein, with each sequence header
    having the following format: “proteinName|targetTaxaName|
    targetProteinId|number” (see step 16 in Subheading 3.3). As a
    last step, combine the contents of all files ending with
    “.extended.fa” obtained from all seed proteins into a single
    file. Name this file AMPK.extended.fa.
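
The concatenation described in this step can be sketched in a few lines of Python. This is an illustration under assumptions, not part of the HaMStR-OneSeq distribution: the function names are hypothetical, all “.extended.fa” files are assumed to sit in one directory, and the header is assumed to follow the four-field format quoted above (a target protein identifier that itself contains “|” is tolerated).

```python
import glob
import os


def parse_header(header):
    """Split 'proteinName|targetTaxaName|targetProteinId|number' into fields."""
    protein, taxon, rest = header.lstrip(">").strip().split("|", 2)
    protein_id, number = rest.rsplit("|", 1)
    return {"protein": protein, "taxon": taxon,
            "protein_id": protein_id, "number": number}


def combine_extended_fasta(input_dir, output_path):
    """Concatenate every '*.extended.fa' file in input_dir into output_path.

    Returns the parsed headers of all sequences that were written.
    """
    parsed = []
    out_abs = os.path.abspath(output_path)
    with open(output_path, "w") as out:
        for path in sorted(glob.glob(os.path.join(input_dir, "*.extended.fa"))):
            if os.path.abspath(path) == out_abs:
                continue  # never read the combined file back into itself
            with open(path) as handle:
                for line in handle:
                    line = line.rstrip("\n")
                    if not line:
                        continue
                    if line.startswith(">"):
                        parsed.append(parse_header(line))
                    out.write(line + "\n")
    return parsed


# Pure example on a made-up header (identifiers are placeholders).
print(parse_header(">geneA|taxonX|prot001|1"))
```

A call such as `combine_extended_fasta("results", "AMPK.extended.fa")`, where `results` is a placeholder directory name, would then produce the combined file and return the parsed headers for later use.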

Building the phylogenetic profile.

  2. Once you have compiled the orthologs for every gene in your
    pathway of interest, the next step is to build a phylogenetic
    profile. In a nutshell, a phylogenetic profile is a
    tab-delimited text file in which the first row specifies the
    species names and the first column specifies the gene names.
    The remaining cells of the phylogenetic profile matrix are
    filled with either “1”, denoting the presence, or “0”, denoting
    the absence of an ortholog (Fig. 4a). For the dynamic
    visualization and analysis of phylogenetic profiles using the
    PhyloProfile application (see Subheading 3.5), we recommend the
    format displayed in Fig. 4b. Here, you provide the NCBI taxon
    id instead of the species name. For example, “Homo sapiens”
    will be replaced by ncbi9606. In the cell corresponding to the
    seed protein, we note the presence of an ortholog in the format
    “targetProteinId#FAS_score”, where “targetProteinId” is the
    sequence identifier of the ortholog. We will discuss the FAS
    score in detail in step 2 of Subheading 3.4. If no ortholog was
    found in a given species, fill the corresponding cell with “NA”.
    Repeat this for all species–gene combinations and save the
    phylogenetic profile matrix file as
    AMPK_phylogeneticprofile-matrix. Generating a phylogenetic
    profile by hand requires a fair amount of text processing.
    However, if you identified orthologs with HaMStR-OneSeq, you
    are already in possession of a ready-to-use phylogenetic
    profile. It is stored in the file ending with “.matrix”.
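
If the “.matrix” file is not at hand, both profile layouts described above can be derived from the sequence headers. The sketch below is illustrative only: the function names, the `geneID` label in the top-left cell, and the example identifiers and FAS scores are assumptions, and when a gene has several orthologs in one species the first one encountered is kept. Taxon labels are taken from the headers as they are; converting them to the “ncbi” plus taxon id form would be a separate lookup step.

```python
import csv


def collect_orthologs(headers):
    """Map (gene, taxon) to the first ortholog identifier seen.

    Headers follow 'proteinName|targetTaxaName|targetProteinId|number'.
    """
    hits = {}
    for header in headers:
        gene, taxon, rest = header.lstrip(">").strip().split("|", 2)
        protein_id = rest.rsplit("|", 1)[0]
        hits.setdefault((gene, taxon), protein_id)
    return hits


def profile_rows(hits, fas_scores=None):
    """Build the profile as a list of rows (first row: taxa, first column: genes).

    Without fas_scores, cells are '1'/'0' (presence/absence).
    With fas_scores (ortholog id -> score), cells are 'id#score' or 'NA'.
    """
    genes = sorted({gene for gene, _ in hits})
    taxa = sorted({taxon for _, taxon in hits})
    rows = [["geneID"] + taxa]  # 'geneID' is a placeholder corner label
    for gene in genes:
        row = [gene]
        for taxon in taxa:
            protein_id = hits.get((gene, taxon))
            if fas_scores is None:
                row.append("1" if protein_id is not None else "0")
            elif protein_id is None:
                row.append("NA")
            else:
                row.append(f"{protein_id}#{fas_scores[protein_id]}")
        rows.append(row)
    return rows


def write_profile(rows, path):
    """Write the rows as a tab-delimited text file."""
    with open(path, "w", newline="") as handle:
        csv.writer(handle, delimiter="\t", lineterminator="\n").writerows(rows)


# Made-up headers and scores, for illustration only.
example_headers = ["geneA|taxonX|x1|1", "geneA|taxonY|y1|1", "geneB|taxonX|x2|1"]
example_hits = collect_orthologs(example_headers)
for row in profile_rows(example_hits):
    print("\t".join(row))
for row in profile_rows(example_hits, {"x1": 1.0, "y1": 0.8, "x2": 0.9}):
    print("\t".join(row))
```

Separating the collection of hits from the formatting of cells means the same parsed data can feed either the binary layout or the identifier-plus-score layout.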


Tracing AMPK Evolution 123