AMPK Methods and Protocols

particular tool over the one that we have listed, it is easy to
substitute it in the analysis workflow (Fig. 1). All analyses can be
run on a typical desktop computer with at least 8 GB RAM and 500 GB
disk space. We suggest any Linux distribution or, alternatively,
Mac OS X as the operating system. Make sure that Perl (v5.22.1),
Python (v2.7), Java (v1.7), and R (v3.3.3) are installed.
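
A quick way to confirm the prerequisites is to query each interpreter for its version. The following is a minimal sketch, not part of the original protocol: it assumes that the tools are on the PATH under the command names shown, that a Python 3 interpreter is available to run the check itself, and that each tool prints a dotted version number when called with the flag given. It only reports what it finds next to the versions listed above.

```python
import re
import shutil
import subprocess

# Versions as listed in the text; the query commands are assumptions
# about a typical installation.
LISTED = {
    "perl": ("5.22.1", ["perl", "-v"]),
    "python": ("2.7", ["python", "--version"]),
    "java": ("1.7", ["java", "-version"]),
    "R": ("3.3.3", ["R", "--version"]),
}


def parse_version(text):
    """Return the first dotted version number in text as a tuple of ints, or None."""
    match = re.search(r"(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups() if part is not None)


def installed_version(command):
    """Run a version query and parse its output; None if the tool is unavailable."""
    if shutil.which(command[0]) is None:
        return None
    try:
        # Some tools (e.g. java) print their version to stderr, so merge streams.
        result = subprocess.run(
            command,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
            timeout=10,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    return parse_version(result.stdout)


if __name__ == "__main__":
    for name, (listed, command) in LISTED.items():
        found = installed_version(command)
        shown = ".".join(str(n) for n in found) if found else "not found"
        print(f"{name}: listed v{listed}, installed {shown}")
```

Whether a newer version than the one listed is acceptable depends on the individual tools in the workflow, so the sketch deliberately makes no pass or fail decision.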

2.1 Protein Sequence Databases


You can generate a data collection optimized for your analysis from
a plethora of different public web pages and data repositories. Note
that almost all data is generated from draft genome sequences.
Thus, a fraction of the genome is not represented in the assembly,
and the annotation of genes is likely to be incomplete. Therefore, a
fraction of genes that are encoded in a species’ genome might not
be represented in the reconstructed gene set.

2.1.1 Joint Genome Institute (JGI) Genome Portal [14]


The JGI genome portal (http://genome.jgi.doe.gov) provides
access to all JGI genome databases. The complete gene sets for
individual species from all areas of the tree of life are available for
download through an FTP server (see Note 1). The JGI Genomes
OnLine Database (GOLD, https://gold.jgi.doe.gov/) provides an
excellent starting point for compiling a data set.
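
Once gene sets have been downloaded, counting the protein sequences in each file is a simple sanity check, since draft gene sets can be incomplete. The sketch below assumes the gene sets are stored as FASTA files, which is an assumption about the download format rather than something stated here. The function name and the example records are hypothetical.

```python
def read_fasta(lines):
    """Parse FASTA-formatted lines into a dict mapping sequence ID to sequence."""
    sequences = {}
    current_id = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the ID.
            header = line[1:].split()
            current_id = header[0] if header else ""
            sequences[current_id] = []
        elif current_id is not None:
            sequences[current_id].append(line)
    return {seq_id: "".join(parts) for seq_id, parts in sequences.items()}


# Hypothetical two-record example; real input would come from a file,
# e.g. `with open(path) as handle: proteins = read_fasta(handle)`.
example = [">prot1 hypothetical protein", "MKT", "AAV", ">prot2", "MM"]
proteins = read_fasta(example)
print(len(proteins), "sequences read")
```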

2.1.2 National Center for Biotechnology Information (NCBI)


NCBI (https://www.ncbi.nlm.nih.gov/) provides an open-source
web server for searching and retrieving protein and nucleotide

[Figure 1: workflow diagram. Panel labels: A, KEGG Pathway (proteins); B, OMA and OrthoDB; B′, HaMStR-OneSeq; C, FAS-S; D, PhyloProfile (proteins by species); E, Tree Reconstruction (RAxML); F, Phylogenetic Hypothesis Testing (SH Test).]

Fig. 1 Example workflow for tracing the evolutionary history of a metabolic pathway. (a) Identify the pathway
of interest and retrieve the protein sequences for the individual pathway components from the KEGG
database. (b) For every protein, retrieve orthologs in a selection of target taxa from public databases. (b′)
Alternatively, identify orthologs using a targeted ortholog search with HaMStR-OneSeq. (c) Compute for each
ortholog its pairwise feature architecture similarity (FAS) to the seed protein. Note that this step is already
integrated into the HaMStR-OneSeq search. (d) Create a phylogenetic profile displaying orthologs and FAS
information for all pathway components with PhyloProfile. (e) Reconstruct the evolutionary relationships for
individual seed proteins or collections of seed proteins with RAxML. (f) Optionally, test the resulting sequence
tree against an expected phylogeny, e.g., the species tree.
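
Step (d) of the workflow can be pictured as a matrix of seed proteins by taxa, where each cell holds the FAS score of the ortholog found in that taxon. The following is a minimal sketch of that idea only. It is not PhyloProfile's implementation or its input format, and the record layout (seed protein, taxon, ortholog ID, FAS score), the function names, and the example values are all hypothetical.

```python
def build_profile(records):
    """Map each seed protein to {taxon: best FAS score}.

    records: iterable of (seed, taxon, ortholog_id, fas) tuples.
    """
    profile = {}
    for seed, taxon, _ortholog_id, fas in records:
        row = profile.setdefault(seed, {})
        # Keep the highest-scoring ortholog when a taxon has several.
        if taxon not in row or fas > row[taxon]:
            row[taxon] = fas
    return profile


def format_profile(profile, taxa):
    """Render the profile as tab-separated text; '-' marks a missing ortholog."""
    lines = ["\t".join(["seed"] + list(taxa))]
    for seed in sorted(profile):
        cells = [
            f"{profile[seed][taxon]:.2f}" if taxon in profile[seed] else "-"
            for taxon in taxa
        ]
        lines.append("\t".join([seed] + cells))
    return "\n".join(lines)


# Hypothetical records for illustration only.
records = [
    ("protA", "taxon1", "orth1", 0.9),
    ("protA", "taxon1", "orth2", 0.5),
    ("protB", "taxon2", "orth3", 1.0),
]
print(format_profile(build_profile(records), ["taxon1", "taxon2"]))
```

Keeping only the best-scoring ortholog per taxon is one possible design choice; a profile could equally retain all co-orthologs.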


Tracing AMPK Evolution 113