Alignment of peptides with protein

Guy Sovak

Hi All,
I encountered a problem that maybe one of you can help me resolve.
I got mass spec results from a sample that I prepared a couple of days ago.
The results matched what I anticipated. I know it should be possible, using some kind of software that maybe you know about, to find out where the matches are in the whole protein.
I looked on the Swiss ExPASy web site but was not able to find anything that would help me.
Now for the main thing, what am I looking for:
I got 24 peptides that match 13% of CFTR, a transmembrane channel protein.
I would like to find out, using some kind of alignment software, which regions the peptides cover. I did it manually using an old program, but I guess there is a much easier way to do it. Do you know one?
Thank you in advance
Guy

bgood

Hi Guy,

Let me see if I understand this correctly,

- You have a list of 24 short amino acid sequences (peptides).
- These 24 sequences cover a total of 13% of the CFTR gene
- You want to know where on the gene these sequences match.

Is this correct?

Guy Sovak

Hi Ben,
Thanks for the quick response.
That is exactly why I like scientist solutions.

Yes: 24 different peptide sequences that cover 13% of the CFTR gene. The peptides range from 10 (shortest) to 30 (longest) amino acids in length.
This 13% does not fully cover the cytosolic portion, which makes sense, as the transmembrane portion would be harder to shoot through the mass spec.
I am looking to find exactly where these sequences match.
Thanks
Guy

bgood

What do you think of this?

1) get the amino acid sequence for your gene - perhaps from the UniProt record

2) create a single FASTA-formatted text file containing the gene and all of your peptides (sort of like this):
>P13569
MQRSPLEKASVVSKL
>PEP1
MQR
>PEP2
VV

3) submit the text file to a multiple alignment tool like ClustalW

4) you should be able to see where they line up in the output.
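
Step 2 above can be sketched in Python. This is only an illustration: the helper name `to_fasta` and the 60-character line width are assumptions, and the sequences are the truncated placeholders from the example, not the real CFTR sequence or peptides.

```python
def to_fasta(records, width=60):
    """Format (name, sequence) pairs as FASTA text, wrapping sequences at `width`."""
    lines = []
    for name, seq in records:
        lines.append(f">{name}")
        # Break long sequences into fixed-width lines, as FASTA files usually do.
        for i in range(0, len(seq), width):
            lines.append(seq[i:i + width])
    return "\n".join(lines) + "\n"


# Placeholder records copied from the example above (not real data).
records = [
    ("P13569", "MQRSPLEKASVVSKL"),
    ("PEP1", "MQR"),
    ("PEP2", "VV"),
]

fasta_text = to_fasta(records)
print(fasta_text, end="")
```

The resulting text can be saved to a file and pasted into or uploaded to the alignment tool.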

Do you need to see where they sit in terms of the protein structure, or is sequence position good enough?

Guy Sovak

Hi again, and thanks for the answer.
First of all, I would prefer to get it in terms of protein structure.
I tried it just now and it seems to work only partially;
see below.
It does not match against the whole sequence.
If I do it one by one it is OK.
Guy

PEP9 ------------------------------KVSLAPQANLT-ELDIYSR----------- 18
PEP24 ------------------------------KVSLAPQANLT-ELDIYSRR---------- 19
PEP12 ------------------------------KVSLAPQANLT-ELDIYSR----------- 18
PEP3 -------------------------------VSLAPQANLT-ELDIYSR----------- 17
PEP7 ---------------------------------FSLEGDAP-VSWTETKK---------- 16
PEP22 ---------------------------------FSLEGDAP-VSWTETKK---------- 16
PEP5 ---------------------------------FSLEGDAP-VSWTETK----------- 15
PEP2 -----------------------NSILTETLHRFSLEGDAP-VSWTETK----------- 25
PEP14 -------------------------------------RNSI-LTETLHR----------- 11
PEP15 --------------------------------------NSI-LTETLHR----------- 10
PEP10 -----------------------------QRLELSDIYQIP-SVDSADNLSEK------- 23
PEP11 -------------------------------LELSDIYQIP-SVDSADNLSEK------- 21
PEP16 ----------------------------------NSILN-P--INSIR------------ 11
PEP21 --------------------------------RKNSILN-P--INSIR------------ 13
PEP18 ----------------------------------NSILN-P--INSIRK----------- 12
PEP19 ---------------------------------LFFSWT----RPILR------------ 11
PEP23 ---------------------------------LFFSWT----RPILR------------ 11
PEP1 ---------------------------------ISVIST----GPTLQAR---------- 13
PEP6 ---------------------------------LSLVPDSE-QGEAILPR---------- 16
PEP8 --------------------------------RLSLVPDSE-QGEAILPR---------- 17
PEP20 --------------------------------RLSLVPDSE-QGEAILPR---------- 17
PEP4 ----------------------------YTEGGNAILENIS-FSISPGQR---------- 21
P13569 MQRSPLEKASVVSKLFFSWTRPILRKGYRQRLELSDIYQIP-SVDSADNLSEKLEREWDR 59
PEP13 --------------------------AVYKDADLYLLDSPFGYLDVLTEK---------- 24
PEP17 ------------------------------DADLYLLDSPFGYLDVLTEK---------- 20

Guy Sovak

Now I tried another application, T-Coffee. It works with 2 sequences on top of the CFTR one, but the moment I add another sequence it cannot handle it.
Any ideas?
Guy

bgood

Ya sorry.. I was thinking about it last night and a multiple alignment isn't really what you want. As you said, you need to align them one by one to the primary sequence of the gene.
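
Since each peptide is expected to be an exact fragment of the protein, aligning them one by one can be reduced to a plain substring search. The sketch below assumes exact matches only (no modified residues or sequencing ambiguities). The function name `find_peptide` is made up for illustration, and the protein string is just the first 59 residues of P13569 as they appear in the alignment output above; the full sequence would have to come from the UniProt record.

```python
def find_peptide(protein, peptide):
    """Return 1-based inclusive (start, end) positions of every exact match."""
    hits = []
    idx = protein.find(peptide)
    while idx != -1:
        hits.append((idx + 1, idx + len(peptide)))
        idx = protein.find(peptide, idx + 1)  # step by one to allow overlapping matches
    return hits


# First 59 residues of P13569, taken from the alignment output (gap character removed).
CFTR_NTERM = "MQRSPLEKASVVSKLFFSWTRPILRKGYRQRLELSDIYQIPSVDSADNLSEKLEREWDR"

# A few of the peptides from the thread.
peptides = {
    "PEP19": "LFFSWTRPILR",
    "PEP10": "QRLELSDIYQIPSVDSADNLSEK",
    "PEP11": "LELSDIYQIPSVDSADNLSEK",
    "PEP1": "ISVISTGPTLQAR",  # not found within this 59-residue fragment
}

positions = {name: find_peptide(CFTR_NTERM, seq) for name, seq in peptides.items()}
for name, hits in positions.items():
    print(name, hits if hits else "no match in this fragment")
```

A peptide that returns no hits against the full sequence would be worth checking by hand, since an exact search will miss anything that differs by even one residue.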

Let me look around a bit.. There must be a tool that will do this for you. I would have assumed your mass-spec software would incorporate it in its protein-calling software. E.g. the people who use mass spec in our group use the MASCOT database/software to process the output of their machine (though they are always complaining about the complexity of the output). MASCOT is generally more for figuring out which protein was identified than for precisely characterizing the hits, but it might have something that could help?

Once you do get the peptides aligned correctly, you could use the Swiss-PdbViewer to see where they sit on the structure.

Guy Sovak

Thanks once again,
I did it 2 sequences at a time. As for the structure alignment, it is more difficult, as not all of the domains have been crystallized. So I did it the old way: found the regions at Swiss-Prot and did it manually.
Great help, Ben.
Guy

bgood

Glad I could help.

If this is something you are going to have to do again, let me know and I'll write/find a Workflow so you don't have to do it manually.
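
One piece such a workflow might include is turning the matched positions into covered regions and a coverage percentage. The following is a rough sketch under stated assumptions, not an existing tool: the function names are hypothetical, and the input intervals are the 1-based positions of PEP19, PEP10 and PEP11 within the first 59 residues shown in the alignment output earlier in the thread, so the percentage refers to that fragment only and not to full-length CFTR.

```python
def merge_intervals(intervals):
    """Merge overlapping or adjacent 1-based inclusive (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1] + 1:
            # Overlaps or touches the previous region: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def coverage(intervals, protein_length):
    """Return (merged regions, residues covered, percent of protein covered)."""
    merged = merge_intervals(intervals)
    covered = sum(end - start + 1 for start, end in merged)
    return merged, covered, 100.0 * covered / protein_length


# Peptide positions within the 59-residue N-terminal fragment (illustrative input).
hits = [(15, 25), (30, 52), (32, 52)]
regions, covered, percent = coverage(hits, 59)
print(regions, covered, round(percent, 1))
```

Merging first avoids counting residues twice when several peptides overlap, which happens often with missed-cleavage variants of the same peptide.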

Guy Sovak

Appreciate your help,
I sent a couple more samples, so I guess I will need your further help in creating some kind of workflow for the next couple of samples.
Thank you in advance.
Guy