now further that for the same protein, you do not find any
orthologs in nonmammalian eukaryotes, but a few bacterial
species are predicted to have an ortholog. Now the situation
becomes complex, as you then need to choose
between at least three alternative explanations: First, the
corresponding gene is as old as the last common ancestor of
bacteria and eukaryotes and was independently lost multiple
times on the eukaryotic lineage such that it is nowadays represented
only in mammals. Second, you are facing the result of a
horizontal gene transfer from bacteria into the mammalian
lineage. Third, the orthology assignment is wrong. Further
analyses are then required to resolve this issue. Among other
approaches, comparing the feature architecture similarity of the
proteins might indicate which explanation fits the observation
best.
- There is no gold standard for a sequence alignment program.
We suggest MUSCLE [32], because it facilitates the alignment
of large data collections containing hundreds to thousands of
sequences. To rule out any influence of the alignment program
on the outcome of the phylogenetic analysis, you may consider
repeating the analysis with different aligners and monitoring
any changes.
- It is always advisable to keep sequence headers concise, making
sure that the first ten characters identify the sequence uniquely.
Moreover, whitespace should be avoided.
- Sequence headers should be designed such that the information
provided is consistent across all sequences and taxa. It might be
helpful to think about the header as a row in a spreadsheet,
where each cell contains a particular kind of information.
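The two header recommendations above can be checked automatically before running an alignment. The following is a minimal sketch, not a prescribed workflow: the function names, the field order (identifier, taxon, gene), the underscore separator, and the example identifiers are all hypothetical choices made for illustration.

```python
import re
from collections import Counter


def make_header(seq_id, taxon, gene, sep="_"):
    """Build a header with a fixed field order (hypothetical layout).

    Whitespace and other characters outside A-Z, a-z, 0-9, '.', '-'
    are removed from every field, so the result contains no spaces.
    """
    fields = [re.sub(r"[^A-Za-z0-9.-]+", "", f) for f in (seq_id, taxon, gene)]
    return sep.join(fields)


def check_headers(headers, prefix_len=10):
    """Return a list of problems found in a collection of headers.

    Reported problems: whitespace inside a header, and headers whose
    first `prefix_len` characters are not unique.
    """
    problems = []
    for header in headers:
        if re.search(r"\s", header):
            problems.append(f"whitespace in header: {header!r}")
    prefix_counts = Counter(header[:prefix_len] for header in headers)
    for prefix, count in prefix_counts.items():
        if count > 1:
            problems.append(f"prefix {prefix!r} shared by {count} headers")
    return problems


if __name__ == "__main__":
    # Placeholder headers: the first ten characters of both are identical.
    print(check_headers(["HOMSA_AMPKa1", "HOMSA_AMPKa2"]))
    # Placing a unique identifier first avoids the collision.
    print(check_headers([make_header("ID0001", "Homo sapiens", "AMPK a1"),
                         make_header("ID0002", "Homo sapiens", "AMPK a2")]))
```

Putting the unique identifier in the first field is one simple way to keep the first ten characters distinct. The remaining fields then behave like spreadsheet columns that hold the same kind of information in every header.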
- ProtTest [34] comes with other selection criteria (BIC, AICc,
and LnL). The criteria penalize model complexity slightly differently,
and thus the favored model might differ between the
criteria. If you are unsure which selection criterion should be
used, and if the highest ranked model differs between them,
you can compute trees using alternative models and check for
differences in the branching order.
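To see why the criteria can disagree, it may help to compute them by hand from the standard textbook formulas. The sketch below is not ProtTest's implementation. The log-likelihoods, parameter counts, and sample size are invented purely for illustration, and using the alignment length as the sample size `n` is an assumption.

```python
import math


def information_criteria(lnL, k, n):
    """Compute AIC, AICc, and BIC from the standard formulas.

    lnL: maximized log-likelihood of the model
    k:   number of free parameters
    n:   sample size (assumed here to be the alignment length)
    """
    if n - k - 1 <= 0:
        raise ValueError("AICc is undefined unless n > k + 1")
    aic = 2 * k - 2 * lnL
    aicc = aic + (2 * k * (k + 1)) / (n - k - 1)
    bic = k * math.log(n) - 2 * lnL
    return {"AIC": aic, "AICc": aicc, "BIC": bic}


def best_model(models, criterion, n):
    """Return the name of the model with the lowest score for `criterion`.

    models: dict mapping a model name to a (lnL, k) tuple
    """
    scores = {name: information_criteria(lnL, k, n)[criterion]
              for name, (lnL, k) in models.items()}
    return min(scores, key=scores.get)


if __name__ == "__main__":
    # Invented values: model B fits better but needs more parameters.
    models = {"A": (-1000.0, 10), "B": (-990.0, 18)}
    for criterion in ("AIC", "AICc", "BIC"):
        print(criterion, best_model(models, criterion, n=200))
```

With these invented numbers, AIC and AICc favor the parameter-rich model B, whereas BIC, whose penalty grows with log(n), favors the simpler model A. This is the kind of disagreement that motivates comparing trees computed under both models.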
- Midpoint rooting makes perfect sense when sequences evolve
approximately clocklike, i.e., accumulate similar amounts of
sequence change per unit of time on all branches. However,
quickly evolving sequences, represented by long branches in the
tree, tend to attract the midpoint root, leaving the root
placement without evolutionary meaning. You can identify such
instances when the root placement renders a known monophyletic
group (single evolutionary origin) such as animals, plants,
or fungi as paraphyletic (not including all descendants of its
common ancestor).
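The check described above can be sketched in a few lines. This is an illustrative example only: it assumes the rooted tree is written as nested Python tuples with taxon names as strings, and the taxon names and both topologies are hypothetical. In practice you would read the tree from a Newick file with a phylogenetics library.

```python
def leaves(node):
    """Return the set of leaf names below a node.

    A leaf is a string; an internal node is a tuple of child nodes.
    """
    if isinstance(node, str):
        return {node}
    result = set()
    for child in node:
        result |= leaves(child)
    return result


def is_monophyletic(tree, group):
    """True if some subtree of the rooted tree contains exactly `group`."""
    group = set(group)
    if leaves(tree) == group:
        return True
    if isinstance(tree, str):
        return False
    return any(is_monophyletic(child, group) for child in tree)


if __name__ == "__main__":
    animals = {"animalA", "animalB"}
    # Hypothetical tree with a meaningful root: the animals form a clade.
    plausible = ((("animalA", "animalB"), "fungus"), "plant")
    # Same unrooted topology, but rooted on the branch leading to animalB,
    # as could happen if that sequence sits on a very long branch.
    misrooted = ("animalB", ("animalA", ("fungus", "plant")))
    print(is_monophyletic(plausible, animals))
    print(is_monophyletic(misrooted, animals))
```

If a group that is known to be monophyletic fails this check after midpoint rooting, the root placement is suspect, and rooting with an outgroup is worth considering instead.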