CBMG 688I   Spring 2010


National Center for Biotechnology Information (NCBI)

BLAST

Basic Local Alignment Search Tool.
An online reference is available at NCBI (The NCBI Handbook).
point: - NCBI offers several ways of searching with BLAST (blastn, blastp, tblastn, etc.).
point: - It is no longer possible to search everything in a single database.
Be careful to search the relevant database. Limiting the search to the species of interest will speed it up.
You must add species one at a time, but text completion is available.
There are several nucleotide databases, including wgs, htgs and gss.
tip: - BLink gives you a precomputed BLAST search; in many cases there is no need to run BLAST at all.
tip: - Search the refseq database or specific species to limit output.
Scores: There are bit scores and E values.
point: - BLAST: Bit scores don't depend on the database; E values do.
blastn tip: - BLAST: the blastn default settings are most useful for nearly exact matches.
For more distantly related matches, use blastn and, under Algorithm parameters, select a word size of 7 and match/mismatch scores of 4 and -5.
blastp tip: - BLAST: consider turning off the conditional compositional score matrix adjustment, in which case you should select “mask for lookup table only” under “choose filter.”
tip: - BLAST: always search genomic sequence with an amino acid query (tblastn).
point: - BLAST: PSI-BLAST is used to generate a PSSM (Position-Specific Scoring Matrix), which can be used instead of a BLOSUM matrix.
point: - BLAST: You can search the Conserved Domain Database (CDD) of precomputed PSSMs with RPS-BLAST.
point: - Conserved domains are curated at Pfam, SMART and UniProt.
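
The point above about bit scores and E values can be made concrete with the standard relation E = m * n * 2^(-S'), where S' is the bit score, m the query length and n the total length of the database. The sketch below is illustrative only: the query length, database sizes and bit score are made-up numbers, and real BLAST uses effective (edge-corrected) lengths, so the E values it reports will differ somewhat.

```python
def expect_value(bit_score, query_len, db_len):
    """Approximate E value from a bit score: E = m * n * 2**(-S')."""
    return query_len * db_len * 2.0 ** (-bit_score)


# Hypothetical numbers chosen only for illustration.
QUERY_LEN = 400             # length of the query
SMALL_DB = 10_000_000       # e.g. a database limited to one species
LARGE_DB = 10_000_000_000   # e.g. a much larger all-species database
BIT_SCORE = 50.0            # the same alignment, hence the same bit score

e_small = expect_value(BIT_SCORE, QUERY_LEN, SMALL_DB)
e_large = expect_value(BIT_SCORE, QUERY_LEN, LARGE_DB)

print(f"bit score {BIT_SCORE}: E = {e_small:.2e} in the small database")
print(f"bit score {BIT_SCORE}: E = {e_large:.2e} in the large database")
```

The bit score is identical in both cases, while the E value grows in proportion to the size of the database searched. This is one reason that limiting a search to the relevant database makes weak hits easier to interpret.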

Note the NCBI tutorial course (offered a few times each year): ncbi.nlm.nih.gov/Class/FieldGuide/nlm.html

As an exercise, compare the following searches:

Use NP_788665 as a query for blastp, with defaults and no filtering.
Use NP_788665 as a query for blastp, turning off the conditional compositional score matrix adjustment.
Use NP_788665 as a query for blastp, with the query filtered.
Use NP_788665 as a query for blastp, filtering with the "lookup table only" option.
Look at the BLink report for NP_788665.

Compare those results with those obtained using the corresponding nucleotide entry, NM_176488.
Compare the default blastn parameters with -r 5 -q -4 -G 10 -E 6 and a word size of 7 (-W 7).
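
To see why the match/mismatch values matter for diverged sequences, it helps to work out the raw score of an ungapped alignment under different schemes. The sketch below assumes a +1/-3 scheme as the baseline for comparison (the defaults of your BLAST version may differ) and sets it against -r 5 -q -4. The alignment length and identity levels are arbitrary illustrative values, and gaps are ignored.

```python
from fractions import Fraction


def ungapped_score(length, identities, reward, penalty):
    """Raw score of an ungapped alignment: matches * reward + mismatches * penalty."""
    mismatches = length - identities
    return identities * reward + mismatches * penalty


def break_even_identity(reward, penalty):
    """Fraction of identical positions at which the ungapped score is exactly zero."""
    return Fraction(-penalty, reward - penalty)


SCHEMES = {
    "+1/-3 (assumed baseline)": (1, -3),
    "+5/-4 (-r 5 -q -4)": (5, -4),
}

for name, (reward, penalty) in SCHEMES.items():
    threshold = break_even_identity(reward, penalty)
    print(f"{name}: score is positive only above {float(threshold):.1%} identity")
    for identities in (95, 75, 60):
        score = ungapped_score(100, identities, reward, penalty)
        print(f"  100 columns, {identities} identical -> raw score {score}")
```

Raw scores from different schemes are not directly comparable with each other; the point is only where the score changes sign. An alignment with 75% identity scores zero under +1/-3 but stays clearly positive under +5/-4, so it can still be extended and reported.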

Compare searching a more distant database (human nucleotide) with each of the following:
NP_788665 (tblastn); NM_176488 (blastn); NM_176488 (blastn, but with -r 5 -q -4 -G 10 -E 6 and a word size of 7, -W 7).
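
The three searches above could also be scripted with the legacy blastall program, which is where the -r, -q, -G, -E and -W flags come from. The sketch below only assembles and prints the argument lists; it does not run anything. The database name human_nt and the FASTA file names are hypothetical placeholders, and it assumes the two query sequences have already been saved locally.

```python
import shlex

# Hypothetical placeholders: a local human nucleotide database and query files.
DATABASE = "human_nt"
PROTEIN_QUERY = "NP_788665.fasta"
NUCLEOTIDE_QUERY = "NM_176488.fasta"

# Parameters suggested in the exercise for more divergent nucleotide matches.
SENSITIVE_BLASTN = ["-r", "5", "-q", "-4", "-G", "10", "-E", "6", "-W", "7"]


def blastall_command(program, query, database, extra=()):
    """Build a blastall argument list (-p program, -d database, -i query file)."""
    return ["blastall", "-p", program, "-d", database, "-i", query, *extra]


commands = [
    blastall_command("tblastn", PROTEIN_QUERY, DATABASE),
    blastall_command("blastn", NUCLEOTIDE_QUERY, DATABASE),
    blastall_command("blastn", NUCLEOTIDE_QUERY, DATABASE, SENSITIVE_BLASTN),
]

for command in commands:
    print(shlex.join(command))
```

Once blastall and a formatted database are available, each list could be passed to subprocess.run, and the three outputs compared side by side.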

Note the NCBI tutorial course (Oct. 23 and 24): ncbi.nlm.nih.gov/Class/FieldGuide/nlm.html


Page by Steve Mount last modified March 25, 2010.
Please report any bad links or other problems 