finding similar nucleotide sequences (high throughput?)

p_rogers wrote:

So I have some sequences that don't appear to have ORFs, but I want to find out what they are similar to. I have been BLASTing them with the NCBI web BLAST tool, but this is slow and I would like to run the searches more quickly on my own computer. Problem: I can't find the "nr" database that the web tool uses. I downloaded the only nr database on the NCBI FTP site, but it contains protein sequences. Can someone point me in the right direction?

ryan_m wrote:

It seems you want the "nonredundant" nucleotide database, which NCBI calls nt (and which actually contains some redundancy due to technical issues). You can download the FASTA file from the NCBI FTP site here. Be warned that this is a very large file and may cause memory issues if you try to run formatdb or BLAST against it on your home computer. You can also download the pre-formatted database files from the FTP site (one directory up).
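
For what it's worth, here is a rough Python sketch of how the local search could be scripted once the database is in place. It assumes the newer BLAST+ command-line tools are installed and on your PATH (blastn for the search; makeblastdb is the BLAST+ replacement for formatdb). The file names my_sequences.fasta and hits.tsv, the helper function names, and the default e-value and thread count are placeholders I made up for illustration, so adjust them to your setup.

```python
import shutil
import subprocess


def build_blastn_command(query_fasta, db_name, out_path, evalue=1e-5, threads=4):
    """Return the argument list for a blastn search with tabular output.

    All file names and default values here are illustrative assumptions.
    """
    return [
        "blastn",
        "-query", query_fasta,        # FASTA file with your nucleotide sequences
        "-db", db_name,               # database name, e.g. "nt" (path without extension)
        "-out", out_path,             # where the hits are written
        "-outfmt", "6",               # tab-separated, one hit per line
        "-evalue", str(evalue),       # discard hits weaker than this e-value
        "-num_threads", str(threads), # use several CPU cores
    ]


def run_blastn(query_fasta, db_name, out_path):
    """Run blastn locally; fail early if BLAST+ is not installed."""
    if shutil.which("blastn") is None:
        raise RuntimeError("blastn not found on PATH; install NCBI BLAST+ first")
    cmd = build_blastn_command(query_fasta, db_name, out_path)
    subprocess.run(cmd, check=True)  # raises CalledProcessError if blastn fails
    return out_path


if __name__ == "__main__":
    # Print the command only, so nothing runs until you have checked it.
    print(" ".join(build_blastn_command("my_sequences.fasta", "nt", "hits.tsv")))
```

If you use the pre-formatted database files rather than the FASTA, you can skip the formatting step and point -db at the database name directly. All of your sequences can go in a single multi-FASTA query file, which is usually much faster than submitting them one at a time through the web form.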

Good luck!