
164 7 Sequence Similarity Searching Tools


7.4.2 BLAST Search Types


BLAST can be used to perform the following types of sequence similarity
searches.
blastn
The query is compared against a nucleotide sequence database, using pa-
rameters appropriate for nucleotides.
blastp
The query is compared against an amino acid sequence database, using
parameters appropriate for amino acid sequences.
blastx
The query is a nucleotide sequence. It is first translated into each of the six
possible reading frames, then compared against amino acid sequence databases.
tblastn
The query is taken to be an amino acid sequence, but it is compared against
a nucleotide sequence database after translating each database entry into an
amino acid sequence using all six reading frames.
tblastx
The query is a nucleotide sequence. It is first translated into each of the six
possible reading frames, then compared against nucleotide databases, with each
database sequence also translated into an amino acid sequence in each of its
six possible reading frames.
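The six-frame translation that blastx, tblastn, and tblastx rely on can be sketched in a few lines of Python. This is only an illustration of the idea, not how BLAST itself is implemented: it assumes the standard genetic code, an unambiguous DNA alphabet (A, C, G, T), and the hypothetical function names shown here.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
_BASES = "TCAG"
_AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(_BASES, repeat=3), _AMINO_ACIDS)
}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons only; unrecognized codons become 'X'."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translations(dna):
    """Map frame labels (+1..+3, -1..-3) to translated sequences."""
    frames = {}
    for strand, seq in (("+", dna.upper()), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


if __name__ == "__main__":
    for label, protein in six_frame_translations("ATGGCC").items():
        print(label, protein)
```

Stop codons appear as `*` in the output. A translated search compares each of these six strings, rather than the original nucleotide string, against the other sequence.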
The advantage of using one of the blastx, tblastn, or tblastx search
methods is that it allows one to find matches to distantly related sequences.
The disadvantage is that the searches are computationally intensive and may
take an inordinate length of time. An example of the use of blastx for a
DNA sequence similarity search is shown in figure 7.2.
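The five search types and the extra work implied by translation can be summarized in a small lookup table. The sketch below is a bookkeeping aid, not part of BLAST. The names `SearchType`, `SEARCH_TYPES`, `programs_for`, and `frame_pairings` are hypothetical, and the count of frame pairings is only a rough indication of relative effort, not a timing model.

```python
from typing import NamedTuple

NUCLEOTIDE, PROTEIN = "nucleotide", "protein"


class SearchType(NamedTuple):
    query: str                  # kind of sequence supplied as the query
    database: str               # kind of sequence stored in the database
    query_translated: bool      # query translated in six reading frames?
    database_translated: bool   # database entries translated in six frames?


SEARCH_TYPES = {
    "blastn": SearchType(NUCLEOTIDE, NUCLEOTIDE, False, False),
    "blastp": SearchType(PROTEIN, PROTEIN, False, False),
    "blastx": SearchType(NUCLEOTIDE, PROTEIN, True, False),
    "tblastn": SearchType(PROTEIN, NUCLEOTIDE, False, True),
    "tblastx": SearchType(NUCLEOTIDE, NUCLEOTIDE, True, True),
}


def programs_for(query, database):
    """List the search types that accept this query/database combination."""
    return [
        name for name, st in SEARCH_TYPES.items()
        if st.query == query and st.database == database
    ]


def frame_pairings(program):
    """Number of query-frame/database-frame combinations compared."""
    st = SEARCH_TYPES[program]
    return (6 if st.query_translated else 1) * (6 if st.database_translated else 1)


if __name__ == "__main__":
    for name in SEARCH_TYPES:
        print(name, frame_pairings(name))
```

Under this simple count, tblastx considers 36 frame combinations for every query and database sequence pair, which is one way to see why the translated searches are the most expensive.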
BLAST searches can be performed either by using a publicly available web
service (e.g., www.ncbi.nlm.nih.gov/blast/) or by downloading the
BLAST program and running it locally. Both of these approaches require that
the queries be in FASTA format. The web services are convenient but accept
only a single query at a time, so running a large number of queries can take
a long time. They also offer only a limited number of customization options.
For example, there are typically only a few choices for the substitution
matrix used in an amino acid sequence query.
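Since both approaches expect FASTA input, it can help to see how little structure the format has: a header line beginning with `>` followed by one or more lines of sequence. The following is a minimal sketch of reading and writing such records, assuming well-formed input. The function names are hypothetical, and a production pipeline would more likely use an established library.

```python
def parse_fasta(text):
    """Return a list of (header, sequence) tuples from FASTA-formatted text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequence data may span several lines
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def format_fasta(header, sequence, width=60):
    """Format one record as FASTA text, wrapping the sequence at `width`."""
    lines = [f">{header}"]
    lines += [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    example = ">seq1 demo\nATGC\nGGTA\n>seq2\nMKV\n"
    for header, sequence in parse_fasta(example):
        # Each record can be submitted as its own single-sequence query.
        print(format_fasta(header, sequence), end="")
```

Splitting a multi-sequence file into individual records in this way is one means of preparing queries for a service that accepts only one query at a time.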
Because of the large computational requirements of BLAST, it is becom-
ing increasingly common to run BLAST searches on a cluster of computers.
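
One simple way to spread a large set of queries over several machines is to divide the queries into batches and run one local BLAST job per batch. The sketch below only prepares the batches and the command lines; it does not run anything. It assumes the NCBI BLAST+ command-line programs, which the text does not specify, and the file names and the database name `my_protein_db` are hypothetical placeholders.

```python
def split_into_batches(records, n_batches):
    """Distribute records round-robin into at most n_batches non-empty lists."""
    if n_batches < 1:
        raise ValueError("n_batches must be at least 1")
    batches = [[] for _ in range(n_batches)]
    for i, record in enumerate(records):
        batches[i % n_batches].append(record)
    return [batch for batch in batches if batch]


def blast_command(program, query_path, database, out_path, matrix=None):
    """Build an argument list for a BLAST+ program (assumed to be installed).

    The -matrix option selects the substitution matrix and applies only to
    searches that compare amino acid sequences.
    """
    cmd = [program, "-query", query_path, "-db", database, "-out", out_path]
    if matrix is not None:
        cmd += ["-matrix", matrix]
    return cmd


if __name__ == "__main__":
    queries = [f"query_{i}" for i in range(5)]  # stand-ins for FASTA records
    for n, batch in enumerate(split_into_batches(queries, 2)):
        # Hypothetical file names: one FASTA file and one output file per batch.
        cmd = blast_command(
            "blastp", f"batch_{n}.fasta", "my_protein_db",
            f"batch_{n}.out", matrix="BLOSUM45",
        )
        print(len(batch), "queries:", " ".join(cmd))
```

Each command could then be handed to whatever job scheduler the cluster provides. Round-robin assignment is used here only because it is simple; batches balanced by total sequence length may give more even run times.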