AMPK Methods and Protocols

-refspec=seedSpecies -coreOrth=4 -minDist=genus -maxDist=kingdom -strict -checkCoorthologsRef -representative -cleanup. “minDist” and “maxDist” control the phylogenetic diversity of the core ortholog set. In the example call, no two sequences from the same genus will be considered for training the pHMM, and no sequence from a kingdom other than that of your seed species will be considered. “coreOrth” sets the maximum number of orthologs added to the seed sequence for the final pHMM training. “cleanup” removes the temporary files created during the HaMStR-OneSeq run. Refer to the HaMStR-OneSeq help for the meaning of the other options. By default, HaMStR-OneSeq searches for orthologs in all species listed in the directory pathToHaMStR/genome_dir.

After a successful HaMStR-OneSeq run, a file with the extension
“.extended.fa” is generated. This file contains all orthologs
for the query protein, with each sequence header having the
following format: “proteinName|targetTaxaName|targetProteinId|number”
(see step 16 in Subheading 3.3). As a last step, combine the
contents of all files ending with “.extended.fa” obtained from
all seed proteins into a single file. Name this file
AMPK.extended.fa.
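
The concatenation step can be scripted. The following is a minimal Python sketch, not part of HaMStR-OneSeq; it assumes that all “.extended.fa” files sit in a single directory, and the function name and its parameters are hypothetical.

```python
from pathlib import Path


def combine_extended_fasta(input_dir, output_name="AMPK.extended.fa",
                           suffix=".extended.fa"):
    """Concatenate every file ending in `suffix` in input_dir into one file.

    Assumption: all per-seed result files were collected in input_dir.
    Returns the output path and the number of files that were combined.
    """
    input_dir = Path(input_dir)
    output_path = input_dir / output_name
    # The combined file also ends in ".extended.fa", so exclude it;
    # otherwise a second run would append the old output to itself.
    sources = sorted(p for p in input_dir.glob("*" + suffix)
                     if p.name != output_name)
    with output_path.open("w") as out:
        for source in sources:
            text = source.read_text()
            out.write(text)
            # Guard against a missing final newline, which would fuse the
            # last sequence line with the next file's first header.
            if text and not text.endswith("\n"):
                out.write("\n")
    return output_path, len(sources)
```

Calling combine_extended_fasta("oneseq_results"), where the directory name is only a placeholder, would write AMPK.extended.fa into that directory.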
Building the phylogenetic profile.

Once you have compiled the orthologs for every gene in your
pathway of interest, the next step is to build a phylogenetic
profile. In a nutshell, a phylogenetic profile is a tab-delimited
text file in which the first row specifies the species names and
the first column specifies the gene names. The remaining cells in
the phylogenetic profile matrix are filled with either “1,”
denoting the presence, or “0,” denoting the absence of an
ortholog (Fig. 4a). For the dynamic visualization and analysis of
phylogenetic profiles using the PhyloProfile application (see
Subheading 3.5), we recommend the format displayed in Fig. 4b.
Here, you provide the NCBI taxon id instead of the species name.
For example, “Homo sapiens” will be replaced by ncbi9606. In the
cell for a given seed protein and species, we note the presence
of an ortholog in the format “targetProteinId#FAS_score,” where
“targetProteinId” is the sequence identifier of the ortholog. We
will discuss the FAS score in detail in step 2 of Subheading 3.4.
If no ortholog was found in a given species, fill the
corresponding cell with “NA.” Repeat this for all species–gene
combinations and save the phylogenetic profile matrix file as
AMPK_phylogeneticprofile-matrix. Generating a phylogenetic
profile by hand requires a fair amount of text processing.
However, if you identified orthologs with HaMStR-OneSeq, you are
already in possession of a ready-to-use phylogenetic profile. It
is stored in the file whose name ends with “.matrix”.
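
If you need to build the profile yourself, the text processing can be sketched in Python as follows. This is an illustrative sketch, not the HaMStR-OneSeq implementation. It produces the simple presence/absence layout (“1”/“0”) and assumes that every FASTA header follows the “proteinName|targetTaxaName|targetProteinId|number” format, that “proteinName” identifies the seed gene, and that none of the fields contains a “|” character. The function names and the “geneID” label in the top-left cell are hypothetical. Note that a species appears as a column only if at least one ortholog was found in it.

```python
import csv


def read_orthologs(fasta_path):
    """Return {gene: {taxon: [protein_id, ...]}} parsed from FASTA headers."""
    orthologs = {}
    with open(fasta_path) as handle:
        for line in handle:
            if not line.startswith(">"):
                continue  # sequence line, not a header
            fields = line[1:].strip().split("|")
            if len(fields) < 3:
                continue  # header does not follow the assumed format
            gene, taxon, protein_id = fields[0], fields[1], fields[2]
            orthologs.setdefault(gene, {}).setdefault(taxon, []).append(protein_id)
    return orthologs


def write_presence_absence(orthologs, out_path):
    """Write a tab-delimited matrix: species in row 1, genes in column 1."""
    genes = sorted(orthologs)
    taxa = sorted({taxon for hits in orthologs.values() for taxon in hits})
    with open(out_path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerow(["geneID"] + taxa)  # "geneID" is a placeholder label
        for gene in genes:
            row = ["1" if taxon in orthologs[gene] else "0" for taxon in taxa]
            writer.writerow([gene] + row)
    return genes, taxa
```

For the PhyloProfile-style layout, the same dictionary could be used to fill each cell with the ortholog identifier and its FAS score, or with “NA” where no ortholog was found.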

