AMPK Methods and Protocols


  1. In a nutshell, a gene missed in the annotation of a draft genome
    does little harm to the analysis if a second genome from a
    species in the same clade is available in which the gene has
    been correctly identified.
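The reasoning in this note can be illustrated with a minimal sketch, assuming that presence is summarized per clade rather than per species. The function name, the species-to-clade mapping, and the example data below are hypothetical and not part of HaMStR or PhyloProfile.

```python
from collections import defaultdict


def clade_presence(profile, clade_of):
    """Summarize a presence/absence profile at the clade level.

    profile:  dict mapping species name -> True if an ortholog was found
    clade_of: dict mapping species name -> clade name (hypothetical grouping)
    A clade counts as 'present' if at least one member species has an ortholog.
    """
    present = defaultdict(bool)  # every clade starts as absent (False)
    for species, has_ortholog in profile.items():
        clade = clade_of[species]
        present[clade] = present[clade] or has_ortholog
    return dict(present)


# Hypothetical example data: two clades with two species each.
clade_of = {
    "dog": "Carnivora",
    "cat": "Carnivora",
    "human": "Primates",
    "chimp": "Primates",
}

# Profile with a complete annotation for every species.
complete = {"dog": True, "cat": True, "human": True, "chimp": True}

# Same profile, but the gene was missed in the (draft) dog annotation.
draft_miss = dict(complete, dog=False)

# The clade-level summary is unchanged by the single missed gene.
print(clade_presence(complete, clade_of) == clade_presence(draft_miss, clade_of))
```

In this sketch, the missed gene changes the clade-level result only when every species in the clade lacks the ortholog.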

  2. Extending precompiled orthologous groups with HaMStR [25] is,
    in principle, straightforward. However, the naming conventions
    for sequences in this package are very strict, and an
    inexperienced user may find it difficult to meet all
    requirements. We therefore recommend starting with
    HaMStR_OneSeq instead.

  3. Please be aware that file names used here are only examples and
    may differ in the actual version of the program you are using.

  4. For a more stringent ortholog identification, replace the
    -refspec option with -strict and omit any specification of a
    reference species. The “-strict” option in HaMStR tells the
    program to confirm orthology of a candidate sequence from
    the target species for each sequence and species represented in
    the core ortholog set. Refer to the HaMStR manual for further
    details.

  5. Remember that orthology describes only the evolutionary
    relationship between two sequences. It does not tell you
    whether the two sequences also perform the same function.

  6. When you run the PhyloProfile application for the first time, it may
    perform some preprocessing on your data, such as mapping
    NCBI taxonomy IDs to species names. Simply follow the
    instructions given by the tool. Once the preprocessing is
    complete, a restart of the application might be required.

  7. Exploring phylogenetic profiles for the first time is not easy. You
    need to keep in mind the evolutionary relationships of the
    analyzed species, together with all evolutionary events that
    could explain the presence/absence pattern of proteins in these
    species. Only then will the phylogenetic profile start to make
    sense. As an example, imagine you find orthologs of a particular
    protein in all mammals except, say, the dog. It is then safe to
    assume that the corresponding protein was present in the last
    common ancestor of all mammals. It follows that the
    corresponding gene was either lost on the dog lineage or
    erroneously missed in the annotation of the dog genome. You
    would need to look into the dog genome assembly to
    distinguish between the two possibilities. If, however, a second
    species that is more closely related to the dog than to any
    other species in your collection also lacks the protein, the “loss
    hypothesis” gains weight, because it is less likely that the
    same gene was missed in two independent genome
    reconstructions. Of course, you could ask to what
    extent the reconstructions are indeed independent. Imagine


138 Arpit Jain et al.
