Scientist Solutions: Life Science Discussions


Help with bioinformatics for next-generation sequencing

ryan_m

Frog Laureate

Group: Moderators
Posts: 284
Joined: May 06, 2006

I came across this forum (SeqAnswers) which focuses on the bioinformatics for handling/using data from next-generation sequencing platforms. Of course, anyone with problems or questions in that area could post here and find more than a few people working in this area who could probably advise them!

Ryan

.........................

Posted Feb 18, 2008, 6:10 AM
khan

Frog Egg

Group: Member
Posts: 2
Joined: Dec 01, 2007

I am new to SNP genotyping and would appreciate help understanding it. For example, if you discover 13,000 to 15,000 SNPs in EST sequences through 454 sequencing, how do you then genotype them in a population of 50 individuals? Do you need to design 13,000 to 15,000 primers, and which genotyping method would be the most cost-effective?

Thank you for your time and effort.

.........................

Posted Feb 21, 2008, 12:13 PM
ryan_m


Hi Khan.
Once you have your SNPs of interest, it is better to use a highly parallel genotyping assay, such as those provided by Illumina. Many companies will design and run the arrays for you; see, for example, this site.

Regards,

Ryan

.........................

Posted Feb 21, 2008, 15:52 PM
khan


Thanks for your reply. I would appreciate your help in understanding how it works. Suppose I found three thousand SNPs in EST sequences. Would I then have to design three thousand primers and use them in multiplex, as in the CMMT genotyping assays?
Thanks

.........................

Posted Feb 21, 2008, 21:01 PM
ryan_m


khan said:
Thanks for your reply. I would appreciate your help in understanding how it works. Suppose I found three thousand SNPs in EST sequences. Would I then have to design three thousand primers and use them in multiplex, as in the CMMT genotyping assays?
Thanks


That is the basic idea, yes. But many companies would design the probes for you, so probably all you would need to supply would be the positions of the polymorphisms and the two alleles.

.........................

Posted Feb 21, 2008, 22:11 PM
JayM

Frog Egg

Group: Member
Posts: 4
Joined: May 12, 2008

I have worked with 454 data for transcriptome analysis and SNPs, and working with that data is not a big problem. The shift from 454 to Solexa, however, seems daunting (we recently acquired the sequencer). Can anyone recommend assembly software? For 454 I used CodonCode Aligner, and it did what I wanted, but its memory limitations make it a problem for Solexa data.

I am looking for software that is robust enough to handle Solexa data without having to stretch its capabilities too much.

.........................

Posted May 12, 2008, 8:12 AM
zee

Frog Egg

Group: Member
Posts: 1
Joined: Sep 03, 2008

We have written some software specifically for this. You can get it for free research/nonprofit use at www.novocraft.com

.........................

Posted Sep 03, 2008, 3:54 AM
ryan_m


zee said:
We have written some software specifically for this. You can get it for free research/nonprofit use at www.novocraft.com


In my experience, novoalign can consume upwards of 15 gigabytes of RAM when mapping solexa reads to human, though you can change some parameters when creating your index to reduce this. For the people reading this thread, zee, could you let us know what the suggested lower RAM limit is for running novoalign with solexa reads against the human reference genome?

Thanks,

Ryan

.........................

Posted Sep 04, 2008, 12:43 PM
sparks

Frog Egg

Group: Member
Posts: 4
Joined: Sep 07, 2008

Hi Ryan,
Using a 14-mer index with a step of 3, the human genome can be indexed in approximately 6 GB of RAM, and novoalign and novopaired will then run quite happily on a workstation with 8 GB of RAM. By default, novoindex looks at how much RAM the server has and then chooses a k-mer length and step size to give optimum performance on that server. If your server has 16 GB of RAM, that might mean building a 12 GB index, but you can always specify k and s yourself and have a 6 GB index on a 16 GB server.
Up to a limit (4^k < genome length / s), larger k and smaller s will improve performance.
Colin
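
As a rough illustration of the limit Colin mentions (4^k < genome length / s), the following minimal sketch finds the largest k that still satisfies it for a given step size. The genome length of about 3.1 billion bases is an assumption for illustration, not a figure from this thread, and the function name is made up for this example. It says nothing about how much RAM the resulting index would need.

```python
def max_useful_k(genome_length: int, step: int) -> int:
    """Return the largest k with 4**k < genome_length / step.

    Returns 0 if no k >= 1 satisfies the inequality.
    """
    k = 0
    # 4**(k+1) * step < genome_length is the same inequality, kept in integers.
    while 4 ** (k + 1) * step < genome_length:
        k += 1
    return k


# Assumption: approximate human genome length in bases (not from the thread).
HUMAN_GENOME_BP = 3_100_000_000

for s in (1, 2, 3):
    print(f"step s={s}: limit allows k up to {max_useful_k(HUMAN_GENOME_BP, s)}")
```

Under that assumed genome length, the limit works out to k = 15 for s = 1 or s = 2, and k = 14 for s = 3.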


.........................

Posted Sep 07, 2008, 20:26 PM
ryan_m


Thanks for the details, Colin. Just to confirm: the parameters used when the index is created do not affect the quality of the results (i.e., they do not cause missed alignments); they only change the runtime needed to complete the process, correct?

Thanks again.
Ryan

.........................

Posted Sep 07, 2008, 23:35 PM
sparks


Hi Ryan,
You're right, the index k-mer length and step size really only affect runtime performance. They shouldn't affect the alignment location of a read.
Colin

.........................

Posted Sep 08, 2008, 2:51 AM
G_nome

Frog Egg

Group: Member
Posts: 11
Joined: Feb 13, 2007

Hi Sparks
I have a dual quad-core MacPro (64-bit Xeon processors) with 16G of RAM. This should be able to run novoalign just fine on the human genome, but I am having trouble getting novoindex to run. It seems that novoindex 'thinks' it is on a 32-bit machine and is complaining about memory limitations. Here is a piece of the error output:

Error: Sequence Index cannot fit in available RAM
Error: RAM available: 2048Mb
Error: Minimum RAM req'd: 4027Mb

Is there some way to get around this? Have others out there had success running novoindex and/or novoalign on a Mac?

.........................

Posted Oct 20, 2008, 16:57 PM
sparks


Hi G_nome,

The problem is in determining how much memory is available. Can you tell me what version you are using?

If you specify the k and s parameters, then it shouldn't be a problem. On a 16 GB server with the human reference genome, you could use either -k14 -s1 or -k15 -s2.


Best Regards, Colin
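
For readers who script their indexing, here is a minimal sketch of how the explicit -k and -s options above might be assembled into a command line. The flag spelling (-k15 -s2) is taken from Colin's post; the order of the positional arguments (index file first, then FASTA files) and both file names are assumptions for illustration, so check them against the novoindex usage message for your version. The snippet only builds and prints the command; it does not run novoindex.

```python
import shlex


def novoindex_command(index_path, fasta_paths, k, s):
    """Build an argument list with explicit k-mer length and step size.

    Assumption: positional arguments are the index file followed by the
    reference FASTA file(s). Verify against your novoindex version.
    """
    return ["novoindex", f"-k{k}", f"-s{s}", index_path, *fasta_paths]


# Hypothetical file names, used only for illustration.
cmd = novoindex_command("hg_reference.nix", ["hg_reference.fa"], k=15, s=2)
print(shlex.join(cmd))
```

Setting k and s explicitly sidesteps the automatic RAM detection that appears to be misreporting available memory in the error output above.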

.........................

Posted Oct 20, 2008, 20:41 PM
G_nome


sparks said:
Hi G_nome,

The problem is in determining how much memory is available. Can you tell me what version you are using?

If you specify the k and s parameters, then it shouldn't be a problem. On a 16 GB server with the human reference genome, you could use either -k14 -s1 or -k15 -s2.


Best Regards, Colin


Thank you for your fast reply, Colin. The novoindex version is 1.5. As per your suggestion, it seems to work OK with -k15 -s2.

Regards,
Sean

.........................

Posted Oct 21, 2008, 12:17 PM
G_nome


Hi Again.
I am not getting alignments as quickly as I would have expected from the rough benchmarks (and comparisons to Maq and Eland). I started 8 novopaired jobs (on an 8-cpu machine) a week ago (one lane of data for each job). Some jobs are 42-bp reads and some are 76-bp reads. Currently, each job has aligned between 1 and 5 million reads. Each lane has about 20 million reads (10 million pairs), so it is looking like I have many more weeks to wait. Am I doing something wrong? The only non-default option I am using is (-Q 30), hoping that would provide a speed-up by ignoring low quality alignments. By the way, this is on a MacPro with 16G of memory.

I appreciate your help.

Sean
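
The "many more weeks" estimate above can be checked with a quick back-of-the-envelope calculation. This is a minimal sketch that assumes each job keeps aligning at the same rate it managed in its first week, which is an assumption and may not hold in practice; the function name is illustrative only.

```python
def weeks_remaining(total_reads, aligned_reads, weeks_elapsed):
    """Estimate weeks left, assuming a constant alignment rate."""
    rate_per_week = aligned_reads / weeks_elapsed
    return (total_reads - aligned_reads) / rate_per_week


# Figures from the post: about 20 million reads per lane, with between
# 1 and 5 million reads aligned per job after one week.
TOTAL_READS_PER_LANE = 20_000_000

for aligned in (1_000_000, 5_000_000):
    remaining = weeks_remaining(TOTAL_READS_PER_LANE, aligned, weeks_elapsed=1)
    print(f"{aligned:,} reads aligned so far: about {remaining:.0f} more weeks")
```

At those rates the slowest jobs would need about 19 more weeks and the fastest about 3.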

.........................

Posted Nov 07, 2008, 11:04 AM