E-value    P-value
10         0.9999546
5          0.9932621
1          0.6321206
0.1        0.0951626
0.05       0.0487706
0.001      0.0009995 (about 0.001)
0.0001     0.0001000
Table 7.1 Comparison between E-values and their corresponding P-values.
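The values in Table 7.1 are consistent with the relation P = 1 - exp(-E), which holds if the number of chance hits is assumed to follow a Poisson distribution with mean E. The following minimal sketch recomputes the table under that assumption; the function name e_to_p is chosen here for illustration only.

```python
import math


def e_to_p(e_value: float) -> float:
    """Probability of at least one chance hit, given E expected chance hits.

    Assumes the number of chance hits is Poisson distributed with mean E,
    so that P = 1 - exp(-E).
    """
    # -expm1(-E) equals 1 - exp(-E) but stays accurate for very small E.
    return -math.expm1(-e_value)


# Recompute the rows of the table, printed to seven decimal places.
for e in (10, 5, 1, 0.1, 0.05, 0.001, 0.0001):
    print(f"{e:<10g} {e_to_p(e):.7f}")
```

For small E the two quantities are nearly identical, which is why E-values below about 0.01 can be read as if they were P-values.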
the homologous sequences, it is necessary to compare them with each other
and to organize them to show important features such as highly conserved
regions and other subtle similarities. This is known as multiple sequence
alignment (MSA). We have already mentioned MSAs in section 5.4 where they are
used in the PROSITE database. MSAs are also used in some BLAST variants.
Summary
- The raw score S for a search can be normalized so that the results of
different searches can be compared.
- The expectation value E can be used to test whether the result of a search
has statistical significance.
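The two summary points can be illustrated with the standard Karlin-Altschul formulas: the bit score S' = (lambda * S - ln K) / ln 2 and the expectation value E = m * n * 2^(-S'). The sketch below is not taken from the text above; the function names are hypothetical, and the values of lambda, K, the query length m, and the database length n are illustrative assumptions.

```python
import math


def bit_score(raw_score: float, lam: float, k: float) -> float:
    """Normalize a raw score S to a bit score S' = (lambda*S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def expect_value(bits: float, query_len: float, db_len: float) -> float:
    """Expected number of chance hits with at least this bit score: E = m*n*2^(-S')."""
    return query_len * db_len * 2.0 ** (-bits)


# Illustrative parameters only (assumed, not from the text):
# lam and k depend on the scoring system that was used for the search.
lam, k = 0.267, 0.041
m, n = 250, 1_000_000_000  # assumed query length and total database length

for raw in (50, 100, 150):
    bits = bit_score(raw, lam, k)
    print(f"S={raw:<4d} bits={bits:8.2f}  E={expect_value(bits, m, n):.3g}")
```

Because lambda and K are absorbed into the bit score, two bit scores can be compared directly even when the searches used different scoring systems, whereas raw scores cannot.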
7.4.4 BLAST Variants
Since the launch of the original version of BLAST, many BLAST variants have
been developed. The major variants are presented below.
bl2seq http://www.ncbi.nlm.nih.gov/blast/bl2seq/bl2.html
In many cases, biologists only want to compare two sequences, rather than
embarking on a time-consuming journey of a full database search. The BLAST
2 SEQUENCES program is specifically designed for pairwise comparisons of
DNA or amino acid sequences (Tatusova and Madden 1999).
PSI-BLAST bioinfo.bgu.ac.il/blast/psiblast_cs.html
Queries using PSSMs differ from queries using substitution matrices in some
important ways. Unlike substitution matrices, there are no default or
standard PSSMs. In fact, the PSSM is an important part of the query itself.
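To make the contrast concrete, the following sketch builds a small PSSM from a toy alignment and scores two segments against it. It is a deliberate simplification: it uses a four-letter alphabet, a uniform background, and a fixed pseudocount, all of which are assumptions made for illustration, and it omits the sequence weighting that PSI-BLAST applies. The names build_pssm and score are hypothetical.

```python
import math
from collections import Counter


def build_pssm(aligned, background, pseudocount=1.0):
    """Return one dict of log-odds scores (in bits) per alignment column."""
    alphabet = sorted(background)
    pssm = []
    for column in zip(*aligned):  # iterate over alignment columns
        counts = Counter(column)
        total = len(column) + pseudocount * len(alphabet)
        pssm.append({
            a: math.log2(((counts[a] + pseudocount) / total) / background[a])
            for a in alphabet
        })
    return pssm


def score(pssm, segment):
    """Sum the position-specific scores of an ungapped segment."""
    return sum(col[res] for col, res in zip(pssm, segment))


# Toy data (assumed): four aligned sequences over a reduced alphabet.
background = {a: 0.25 for a in "ACDE"}
aligned = ["ACDE", "ACDA", "ACEE", "CCDE"]
pssm = build_pssm(aligned, background)

# The same residue scores differently depending on its position.
print("A at column 1:", pssm[0]["A"], " A at column 4:", pssm[3]["A"])
print("ACDE:", score(pssm, "ACDE"), " EDCA:", score(pssm, "EDCA"))
```

A substitution matrix would assign the residue A the same scores wherever it occurs. Here the scores depend on the column, and the whole matrix is derived from the alignment supplied with the query, which is why no standard PSSM can exist.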