AMPK Methods and Protocols

To increase the precision of evolutionary conclusions that can be
drawn from phylogenetic profiles, it is common to use the subset of
homologs that descended from a common ancestor through a
speciation event, the so-called orthologs [5]. Orthologs serve as an
indispensable resource to reconstruct the evolutionary relationships
of species [6]. At the same time, two orthologs represent the mutually
closest related sequences in the genomes of two species. Orthologs
are therefore considered the best guess when searching for
functional equivalents in the genome of a newly
sequenced species [7–9]. However, orthologs too can diverge in
their function, and this change becomes more likely the more time
has passed since their separation from a shared ancestral sequence.
As a consequence, the function a protein exerts in a pathway might
be younger than the protein itself. This makes a careful curation
necessary before one can assume a functional equivalence between
two orthologous sequences [9–11].
In the light of the above, we present a comprehensive workflow
to study the evolution of a functional pathway that has been
described in a given model species. This elucidates the progressive
construction of this pathway over evolutionary time and, at the
same time, facilitates the identification of the corresponding pathway,
or of parts of it, in the genomes of non-model organisms.
We illustrate our approach by tracing the AMPK pathway across the
three domains of life [12]. We collect the AMPK pathway compo-
nents (121 proteins) from the KEGG database [13] and perform a
targeted ortholog search using each of the 121 proteins to seed its
phylogenetic profile. We then infer for each of the detected ortho-
logs the extent of protein feature architecture similarity to the seed
protein. Specifically, we calculate a feature architecture similarity score
ranging between 0 (no similarity) and 1 (ortholog displays all
features of the seed), and assess whether the ortholog is likely to
share the same function with the seed. Lastly, we provide a
novel R/Shiny-based application to dynamically visualize multilay-
ered phylogenetic profiles.
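
To make the idea of a score between 0 and 1 concrete, the following Python sketch computes a deliberately simplified stand-in for a feature architecture similarity score: the fraction of the seed's feature instances that are also present in the ortholog. This is an assumption made for illustration only and is not the scoring scheme used in this workflow. The function names, the feature labels, and the example identifiers are all hypothetical.

```python
from collections import Counter


def architecture_similarity(seed_features, ortholog_features):
    """Simplified stand-in score: share of seed feature instances found in the ortholog.

    Returns 0.0 if no seed feature is found and 1.0 if the ortholog
    displays all features of the seed. Extra features in the ortholog
    are ignored, and feature order is not considered (assumption).
    """
    seed = Counter(seed_features)
    orth = Counter(ortholog_features)
    if not seed:
        # Convention of this sketch: a seed without features cannot miss any.
        return 1.0
    shared = sum(min(count, orth[feature]) for feature, count in seed.items())
    return shared / sum(seed.values())


def build_profile(seed_architectures, ortholog_hits):
    """Collect scores into a nested dict: seed id -> species -> score."""
    profile = {}
    for seed_id, species, features in ortholog_hits:
        score = architecture_similarity(seed_architectures[seed_id], features)
        profile.setdefault(seed_id, {})[species] = score
    return profile


# Hypothetical example data: one seed protein with two annotated features.
seeds = {"seed_1": ["kinase_domain", "regulatory_domain"]}
hits = [
    ("seed_1", "species_A", ["kinase_domain", "regulatory_domain", "extra_domain"]),
    ("seed_1", "species_B", ["kinase_domain"]),
]
profile = build_profile(seeds, hits)
print(profile)
```

In this toy layout, each seed protein yields one row of the phylogenetic profile, and the score adds a second layer of information on top of the mere presence of an ortholog.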

2 Materials


The following collection provides all relevant data sources and
bioinformatics tools for analyzing proteins in a functional and
evolutionary context. We assume basic knowledge in bioinformat-
ics sequence data processing and management. Prior experiences
with working in a Unix shell, such as thebash, will be helpful;
however, it is considerably straightforward to acquire this knowl-
edge in the course of the analysis. To keep the protocol concise, we
do not list all alternative software choices for the individual analysis
steps. Instead, we have selected, as examples, those tools that we have
successfully applied in our daily routine. If the reader prefers one

112 Arpit Jain et al.
