EMBL format

huda

Hi,
I'm new to the DNA sequence field.
I got the following sequence in EMBL format,
but I don't know what the numbers at the right of each line mean.

Also, why are there spaces between the substrings?

acaagatgcc attgtccccc ggcctcctgc tgctgctgct ctccggggcc acggccaccg 60
ctgccctgcc cctggagggt ggccccaccg gccgagacag cgagcatatg caggaagcgg 120
caggaataag gaaaagcagc ctcctgactt tcctcgcttg gtggtttgag tggacctccc 180
aggccagtgc cgggcccctc ataggagagg aagctcggga ggtggccagg cggcaggaag 240
gcgcaccccc ccagcaatcc gcgcgccggg acagaatgcc ctgcaggaac ttcttctgga 300
agaccttctc ctcctgcaaa taaaacctca cccatgaatg ctcacgcaag tttaattaca 360
gacctgaa 368
//
thanks

xhanix
Hi Huda

I understand that this can be confusing to a novice, but it's really simple once you get used to it. Sequences are generally written 5' to 3', from left to right, so your sequence starts with "aca" at the top left. The first base is 'a', and the 60th base is the last letter on the top line. The number at the right of each line is the position of the last base on that line, so the 368 on the final line is also the total length of the sequence. The bases are shown in groups of 10, separated by spaces. The group size is just a display convention (a sequence could equally be shown in groups of 3 or 5); it only makes it easier to locate a particular base or stretch within the sequence. For example, you can use the numbers on the right to find the sequence between bases 50 and 75, or to work out what the 90th base is. You could count the bases one by one from the top line until you reach the 90th, but it is much easier to count in groups of 10. Hope this is clear.
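
If you would rather let a script do the counting, here is a minimal Python sketch. It assumes the sequence lines look like the ones in your post (groups of bases separated by spaces, a running count at the end of each line, and "//" as the terminator). The function names parse_embl_sequence, base_at and subsequence are made up for this example and are not part of any library.

```python
def parse_embl_sequence(lines):
    """Join EMBL-style sequence lines into one string of bases.

    Spaces between groups, the trailing position counts, and the
    "//" terminator are all discarded.
    """
    groups = []
    for line in lines:
        line = line.strip()
        if not line or line == "//":
            continue
        # Tokens made only of digits are the running position counts.
        groups.extend(tok for tok in line.split() if not tok.isdigit())
    return "".join(groups)


def base_at(seq, position):
    """Return the base at a 1-based position, as biologists count."""
    return seq[position - 1]


def subsequence(seq, start, end):
    """Return bases from start to end, both 1-based and inclusive."""
    return seq[start - 1:end]


# The first two lines of the sequence from the question.
lines = [
    "acaagatgcc attgtccccc ggcctcctgc tgctgctgct ctccggggcc acggccaccg 60",
    "ctgccctgcc cctggagggt ggccccaccg gccgagacag cgagcatatg caggaagcgg 120",
    "//",
]

seq = parse_embl_sequence(lines)
print(len(seq))                  # 120, matching the count on the second line
print(base_at(seq, 90))          # g
print(subsequence(seq, 50, 75))  # cacggccaccgctgccctgcccctgg
```

Note that Python counts from 0 while the EMBL numbers count from 1, which is why the helpers subtract 1 from the start position.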
