- Likewise, you can apply a cutoff for the minimum FAS score
(default is 0). This option serves to reduce the impact of
orthologs displaying deviating feature architectures and
might be helpful in tracing functionality rather than only
orthology.
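As a minimal sketch of such a cutoff, the following Python snippet filters a list of ortholog records by their FAS score. The record layout, the identifiers, and the scores are hypothetical placeholders, and treating the cutoff as inclusive is an assumption; the actual tool may handle the threshold differently.

```python
# Hypothetical records: (seed protein, species, ortholog ID, FAS score).
# All identifiers and scores are made up for illustration only.
profile = [
    ("seedA", "Mus musculus", "orthoA_mouse", 0.95),
    ("seedA", "Arabidopsis thaliana", "orthoA_arath", 0.40),
    ("seedB", "Saccharomyces cerevisiae", "orthoB_yeast", 0.72),
]


def filter_by_fas(records, min_fas=0.0):
    """Keep only orthologs whose FAS score is at least min_fas.

    The inclusive comparison (>=) is an assumption of this sketch.
    """
    return [rec for rec in records if rec[3] >= min_fas]


# With the default cutoff of 0, every record is kept.
kept = filter_by_fas(profile, min_fas=0.7)
for seed, species, ortholog, fas in kept:
    print(seed, species, ortholog, fas)
```

Raising `min_fas` removes orthologs whose feature architecture deviates from that of the seed, which is the effect described above.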
3.6 Integration of Phylogenetic Profiles and KEGG Pathway Maps
The following steps will result in a graphical pathway representation
where proteins are labeled if an ortholog was identified in a (set of)
species.
- Decide on a set of species for which you want to display the
information. For a start, use Mus musculus, Drosophila melanogaster,
Saccharomyces cerevisiae, Arabidopsis thaliana, Methanocaldococcus
jannaschii, and Chloroflexus aurantiacus.
- Transfer the KO identifier of the seed protein to its orthologs
in the species selected in the previous step. Use the earlier
generated cross-reference file AMPK-hsa-xref.txt (see step 11
in Subheading 3.3) to obtain the KO identifiers.
- Create an input file for the KEGG Mapper
(http://www.genome.jp/kegg/tool/map_pathway.html). You will find
details about the file format on the KEGG web page
(http://www.genome.jp/kegg/tool/example/genelist2.txt).
- Open the KEGG Mapper web page, upload the input file, and
execute the mapping.
- The KEGG Mapper will present you with a list of pathways to
which your KO identifiers match. Choose the appropriate one
(AMPK) for display. Figure 8 shows the representation of the
human AMPK-TOR pathways in four eukaryotes, one
archaeon, and one bacterium.
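The transfer of KO identifiers and the creation of the mapper input can be sketched in Python as follows. Everything specific here is an assumption: the in-memory dictionaries stand in for the contents of AMPK-hsa-xref.txt and the phylogenetic profile, the identifiers KO_A and KO_B are placeholders rather than real KO numbers, and the "identifier color" line layout is only a guess at the input format, which should be checked against the example file on the KEGG web page.

```python
# Hypothetical cross-reference: seed protein -> KO identifier.
# In practice this would come from AMPK-hsa-xref.txt; KO_A and KO_B
# are placeholders, not real KO numbers.
seed_to_ko = {"seed1": "KO_A", "seed2": "KO_B"}

# Hypothetical phylogenetic profile: seed protein -> species in which
# an ortholog was identified.
orthologs = {
    "seed1": {"Mus musculus", "Saccharomyces cerevisiae"},
    "seed2": {"Mus musculus"},
}


def mapper_lines(species, seed_to_ko, orthologs, color="red"):
    """Return one 'KO color' line per seed with an ortholog in species.

    The two-column layout is an assumption of this sketch.
    """
    lines = []
    for seed, ko in sorted(seed_to_ko.items()):
        if species in orthologs.get(seed, set()):
            lines.append(f"{ko} {color}")
    return lines


for species in ("Mus musculus", "Saccharomyces cerevisiae"):
    print(f"# {species}")
    print("\n".join(mapper_lines(species, seed_to_ko, orthologs)))
```

Generating one list per species keeps the presence or absence of each pathway component separate, so that each species can be mapped and inspected on its own.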
3.7 Phylogenetic Analysis to Explore the Evolutionary History of Individual Pathway Components
The aim of the following steps is to assign either a speciation
event or a gene duplication event to each split in the reconstructed
phylogeny. Given the plethora of tools that can be used for
phylogeny reconstruction, we refrain from providing a tool-specific
step-by-step guide wherever we trust that the use of a tool is
self-explanatory. Instead, we put more emphasis on the
interpretation of the phylogenies.
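To illustrate what assigning an event to each split means, the sketch below applies the species-overlap rule, one common heuristic that is not necessarily the procedure followed in this chapter: a split is labeled a duplication if the two child clades share at least one species, and a speciation otherwise. The nested-tuple tree, the "species|gene" leaf naming, and the gene names are hypothetical.

```python
def label_splits(node, events):
    """Return the set of species below node.

    For every internal node (visited in post-order) one tuple
    (left species, right species, event) is appended to events.
    Leaves are strings of the hypothetical form "species|gene";
    internal nodes are 2-tuples of child nodes.
    """
    if isinstance(node, str):
        return {node.split("|")[0]}
    left, right = node
    left_species = label_splits(left, events)
    right_species = label_splits(right, events)
    # Species-overlap rule: shared species imply a gene duplication.
    kind = "duplication" if left_species & right_species else "speciation"
    events.append((sorted(left_species), sorted(right_species), kind))
    return left_species | right_species


# Made-up gene tree: two paralogous human/mouse clades plus a yeast gene.
tree = (
    (("human|geneA1", "mouse|geneA1"), ("human|geneA2", "mouse|geneA2")),
    "yeast|geneA",
)
events = []
label_splits(tree, events)
for left_species, right_species, kind in events:
    print(left_species, right_species, kind)
```

In this toy tree the split separating the two human/mouse clades is labeled a duplication, because both clades contain human and mouse genes, whereas the remaining splits are labeled speciations. The rule ignores branch support and gene loss, so its labels are a starting point for interpretation rather than a final answer.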
- For phylogenetic tree reconstruction, assemble the orthologous
sequences for one protein, or several related proteins, of
interest into a single text file. The FASTA format is the most
widely used format for storing sequence information and is
accepted by almost all sequence analysis tools.
- As all files generated in the course of this analysis contain
human-readable text, it is handy to use file extensions that
indicate the file format, such as yourname.fa for sequence
files in FASTA format or yourname.phy for the Phylip format.
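A minimal Python sketch of assembling sequences into a FASTA file is given below. The headers and sequences are made-up placeholders, and the function names and the line width of 60 characters are choices of this sketch, not requirements of the format.

```python
def format_fasta(records, width=60):
    """Return (header, sequence) pairs as FASTA-formatted text."""
    lines = []
    for header, seq in records:
        lines.append(f">{header}")
        # Wrap the sequence into lines of at most `width` characters.
        for start in range(0, len(seq), width):
            lines.append(seq[start:start + width])
    return "\n".join(lines) + "\n"


def write_fasta(records, path, width=60):
    """Write the records to path, e.g. a file ending in .fa."""
    with open(path, "w") as handle:
        handle.write(format_fasta(records, width))


# Placeholder records; replace them with the orthologous sequences.
records = [
    ("human|geneA", "MSTAVLENPGLGRKLSDFGQETSYIEDNCNQNGAISLIFSLKEEVG"),
    ("mouse|geneA", "MSTAVLENPGLGRKLSDFGQETSYIEDNSNQNGAISLIFSLKEEVG"),
]
print(format_fasta(records), end="")
```

Calling `write_fasta(records, "yourname.fa")` would store the same text in a file whose extension indicates the FASTA format.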