Just read this paper in PLoS Computational Biology:
Gene-Boosted Assembly of a Novel Bacterial Genome from Very Short Reads
The authors offer access to free software for assembling sequencing fragments from Solexa 33 bp sequences of a whole bacterial genome. Not my exact field but the data look strong to me. Anyone care to comment?
1. Genomic DNA was extracted by SDS lysis, proteinase K digest, and phenol/chloroform extraction.
2. Sequencing was performed by Illumina using the 1G Genome Analyzer, also known as the Solexa sequencer. The 8.6 million reads represent 1/4 of the current capacity of a flow cell.
3. For sequence trimming in step 1, we mapped all reads to the initial assembly and then trimmed up to three bases from the 3' end when those bases failed to match a contig.
The AMOScmp pipeline for trimming and short read assembly is described at Sr- assembly.
4. Contig merging in step 2 of our algorithm used the merger program from the EMBOSS package. The Edena, Velvet, and SSAKE assemblers were run with a wide range of parameters in order to optimize them for the data used in this study, with the best results coming from Velvet with a minimum overlap requirement of 24 bases. (The other methods created more numerous, shorter contigs.)
5. The ABBA assembler has been added to the free, open-source AMOS assembler package, which also includes the AMOScmp assembler. ABBA can be found at http://amos.sourceforge.net/docs/pipeline/abba.html.
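To make the 3' trimming in note 3 concrete, here is a rough Python sketch of how I read it. This is not the authors' code or the AMOScmp pipeline: the function name `trim_3prime`, the `max_trim` parameter, and the idea of passing in the contig segment the read was mapped to (same length, same strand) are all my own assumptions. I also assume trimming stops at the first matching base counted from the 3' end, which the paper does not spell out.

```python
def trim_3prime(read: str, contig_segment: str, max_trim: int = 3) -> str:
    """Drop up to max_trim trailing bases of a read that disagree with the contig.

    Hypothetical helper: contig_segment is assumed to be the stretch of the
    contig the read was mapped to, ungapped and the same length as the read.
    """
    if len(read) != len(contig_segment):
        raise ValueError("read and contig segment must be the same length")

    end = len(read)
    trimmed = 0
    # Walk back from the 3' end while bases mismatch, up to max_trim bases.
    while trimmed < max_trim and end > 0 and read[end - 1] != contig_segment[end - 1]:
        end -= 1
        trimmed += 1
    return read[:end]


if __name__ == "__main__":
    # The last base disagrees with the contig, so one base is trimmed.
    print(trim_3prime("ACGTAG", "ACGTAC"))
```

A read whose final base matches the contig is returned unchanged, even if there is a mismatch further in; only a run of mismatches at the very 3' end is removed, and never more than three bases.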
Salzberg SL, Sommer DD, Puiu D, Lee VT (2008) Gene-Boosted Assembly of a Novel Bacterial Genome from Very Short Reads. PLoS Comput Biol 4(9): e1000186. doi:10.1371/journal.pcbi.1000186