ATGGACACTAAATTAACACCACCCTGGTTGCGTCAAGAAAATAACCTGCTCAACCCCGAAGTTACCCAACGTTTGTTTAT CAACCAGGACAACATTCCGAACTTACCCACTGAAGCCCCCCAAGTTTATTTAGCCCGTTTTTTGGAATGAGTTGAACCAC TAGTCTCGAAAGTAAAATTTGTGAAAGCACAAGCAGCAGTTCAGGATTACCTCAATTCTAAAGCCTGTGCACAAATTGAA ACCATTATTGCAGAACGGGCGCAAAACACCCATTCGAGTTGGTTAGCTAACTGGTGGGTGCAGTACGCTTATTTAACTTC GACTGGCCCGGTTAGTCCGGAAGTTAATGCACCCTATTATTTAGAGCTCCCGACTGTTGGTTGAAGTCAAGCCGAGTTAG CAGCAGCATTGTCAGCCCAGCTATGACACATATACCAGCAAGTCCAAAAACGGCAGTTAACGAGCTTTAGTGTCAAGGAC AAGCTGTTTTCACTCGATACCTTGCAATCAATCTTTGCTAGTTGCCAGATTCACCGCGCTGATGGGGATGTGTACTTTGT TAATGATCAACCTGCTAACTTTATTGTGGTGATTAAGAACAATGTCTTCTATAAGTTAGTAATAGATAACAGTGGTAGCT TAGAACAACTGCAAGCGCAACTGCAGTTAAGCTTTGTGCAAATCTTAGACAATGAGCTAAGTCACCCACCGCACTGGAAC TTACTCACTGCTACCACAACCAAAGCGGAGTCACAAAGTTTACTCGATCAGTTGTGAGCGCAAAACCCAGAAATGTTGTT GGACATCTATAACAGTGCCTTTATTGTGAACTTGGACAACGTCGAATTAACGACTCCCTTACAACTGTTGCGCAATTCGA CATGAACCCCTAACTTTAACCGTTGGCACGCTAAGGGGATTCAGTTGGTCATTACTAAGAACGCTCAGTTAGTCATTCTA GCGGATCACACTAGTTTTGACGGTTCTAGTGTAGCTACCCTAGCCAACATCTTTGTCTCTAAACTCCAAAAGGTCAACAC TGAGGGTGCCAGCGCTTTAACACCGACCATGCTTTCCTTTCCAACAGTAGACCAGGATAAGCAAAAGTATTTTAAGCAAT TGAGCAAAAATTTTAAGGACTATGTGTACAATGCCGTTATGTTTGAACTGAAATGGGACTGGTTTACGAAGCCACTTATC AAAGCCAAAGGGATTAAGAACTCCGAAGCCTTTATCCATCTCTGTTACCAAATTGCCCAGTACCAAACGAACAAAAAGTT GCAGAACACCTATGTGGCCGTGGACATGCGTCAGTACTTCCGGGGACGCACCGAGTGTCTTCGTCCTTTAAGTAAACAAT CAGTTGCCTTTGTCAAACGCTACTGCAAAGATCCGAAAGGTACCTTAAAGCAGTTCCGAAAGTACTACCCAGCGATTGAA AGTCTCCACTTTGAAAAAACCCGGTTAGCGCAAAAGGGTAGTGGGGTCAACCGTCACTTGCTGGGAGCATATTTGGCTTG GAATGAACACCAAGACACAATAGCTAAACCCGCTTTATTTGAAACAAAGGCTTGAAAAACGATTGCTGCCAACCCACTGT CCACATCGAGCATTGTGGACAAGTATTTGCGCAACTTTTCCTTCGACCCCGTAGAGCCAAACGGCATTGGTATTGCCTAT GCCATTGATGACACTAATTTCCGTGCCATTCTCAGTGTGTATCAGCACAACTTACAGTACCTCAAAGATTGGATGAAACA CTTTGAACAAACAGTGAAAACAATCCTTAAAACTCTTAAATAA
Codon usage (601 codons). Each cell gives: triplet, frequency per thousand, (number).

| Second base U | Second base C | Second base A | Second base G |
|---|---|---|---|
| UUU 39.9 (24) | UCU 5.0 (3) | UAU 21.6 (13) | UGU 5.0 (3) |
| UUC 8.3 (5) | UCC 6.7 (4) | UAC 20.0 (12) | UGC 3.3 (2) |
| UUA 38.3 (23) | UCA 8.3 (5) | UAA 1.7 (1) | UGA 10.0 (6) |
| UUG 26.6 (16) | UCG 8.3 (5) | UAG 0.0 (0) | UGG 16.6 (10) |
| CUU 8.3 (5) | CCU 5.0 (3) | CAU 3.3 (2) | CGU 13.3 (8) |
| CUC 18.3 (11) | CCC 13.3 (8) | CAC 18.3 (11) | CGC 8.3 (5) |
| CUA 8.3 (5) | CCA 15.0 (9) | CAA 44.9 (27) | CGA 1.7 (1) |
| CUG 13.3 (8) | CCG 11.6 (7) | CAG 30.0 (18) | CGG 6.7 (4) |
| AUU 33.3 (20) | ACU 20.0 (12) | AAU 18.3 (11) | AGU 25.0 (15) |
| AUC 13.3 (8) | ACC 21.6 (13) | AAC 46.6 (28) | AGC 10.0 (6) |
| AUA 5.0 (3) | ACA 16.6 (10) | AAA 41.6 (25) | AGA 0.0 (0) |
| AUG 10.0 (6) | ACG 8.3 (5) | AAG 31.6 (19) | AGG 0.0 (0) |
| GUU 16.6 (10) | GCU 25.0 (15) | GAU 20.0 (12) | GGU 11.6 (7) |
| GUC 18.3 (11) | GCC 30.0 (18) | GAC 25.0 (15) | GGC 3.3 (2) |
| GUA 8.3 (5) | GCA 16.6 (10) | GAA 30.0 (18) | GGA 3.3 (2) |
| GUG 21.6 (13) | GCG 11.6 (7) | GAG 11.6 (7) | GGG 6.7 (4) |
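The per-thousand figures in the codon usage listing follow from counting in-frame triplets and scaling by the total number of codons; for example, 24 UUU codons out of 601 give 39.9. The following is a minimal Python sketch of that calculation, assuming the input is an in-frame coding sequence. The function name `codon_usage` is illustrative and is not part of any tool used to build this page.

```python
from collections import Counter


def codon_usage(dna):
    """Return {codon: (count, per_thousand)} for an in-frame DNA sequence.

    Whitespace is ignored and T is reported as U, as in the listing above.
    """
    seq = "".join(dna.split()).upper().replace("T", "U")
    usable = len(seq) - len(seq) % 3          # drop any trailing partial codon
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(codons)
    total = len(codons)
    return {c: (n, round(1000 * n / total, 1)) for c, n in counts.items()}


# First nine codons of the gene shown on this page.
usage = codon_usage("ATGGACACTAAATTAACACCACCCTGG")
```

Running the same function over the full 1803-nucleotide sequence should give counts comparable to the listing, although that has not been checked here.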
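Note: The translation below was generated using the standard codon table, so UGA codons in MP/MP genes show up as terminators ("*"). In the protein sequence that follows the translation, these positions are shown as W (Trp).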
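The effect described in the note can be illustrated with a small translator that optionally reads UGA (TGA in DNA) as tryptophan. This is a sketch under the assumption that the only difference that matters here is the UGA reassignment; `translate` and its `uga_as_trp` flag are hypothetical names, and the 15-nucleotide fragment is the window around the first in-frame TGA in the translation on this page.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
STANDARD = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), STANDARD)}


def translate(dna, uga_as_trp=False):
    """Translate an in-frame DNA sequence; optionally read TGA as Trp (W)."""
    seq = "".join(dna.split()).upper()
    table = dict(TABLE, TGA="W") if uga_as_trp else TABLE
    usable = len(seq) - len(seq) % 3          # ignore a trailing partial codon
    return "".join(table[seq[i:i + 3]] for i in range(0, usable, 3))


fragment = "ttggaatgagttgaa"                  # codons: ttg gaa tga gtt gaa
standard = translate(fragment)                # "LE*VE"
reassigned = translate(fragment, uga_as_trp=True)  # "LEWVE"
```

With the standard table the fragment contains an apparent stop; with UGA read as Trp it matches the corresponding stretch (`LEWVE`) of the protein sequence listed on this page.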
atggacactaaattaacaccaccctggttgcgtcaagaaaataacctgctcaaccccgaa
M D T K L T P P W L R Q E N N L L N P E

gttacccaacgtttgtttatcaaccaggacaacattccgaacttacccactgaagccccc
V T Q R L F I N Q D N I P N L P T E A P

caagtttatttagcccgttttttggaatgagttgaaccactagtctcgaaagtaaaattt
Q V Y L A R F L E * V E P L V S K V K F

gtgaaagcacaagcagcagttcaggattacctcaattctaaagcctgtgcacaaattgaa
V K A Q A A V Q D Y L N S K A C A Q I E

accattattgcagaacgggcgcaaaacacccattcgagttggttagctaactggtgggtg
T I I A E R A Q N T H S S W L A N W W V

cagtacgcttatttaacttcgactggcccggttagtccggaagttaatgcaccctattat
Q Y A Y L T S T G P V S P E V N A P Y Y

ttagagctcccgactgttggttgaagtcaagccgagttagcagcagcattgtcagcccag
L E L P T V G * S Q A E L A A A L S A Q

ctatgacacatataccagcaagtccaaaaacggcagttaacgagctttagtgtcaaggac
L * H I Y Q Q V Q K R Q L T S F S V K D

aagctgttttcactcgataccttgcaatcaatctttgctagttgccagattcaccgcgct
K L F S L D T L Q S I F A S C Q I H R A

gatggggatgtgtactttgttaatgatcaacctgctaactttattgtggtgattaagaac
D G D V Y F V N D Q P A N F I V V I K N

aatgtcttctataagttagtaatagataacagtggtagcttagaacaactgcaagcgcaa
N V F Y K L V I D N S G S L E Q L Q A Q

ctgcagttaagctttgtgcaaatcttagacaatgagctaagtcacccaccgcactggaac
L Q L S F V Q I L D N E L S H P P H W N

ttactcactgctaccacaaccaaagcggagtcacaaagtttactcgatcagttgtgagcg
L L T A T T T K A E S Q S L L D Q L * A

caaaacccagaaatgttgttggacatctataacagtgcctttattgtgaacttggacaac
Q N P E M L L D I Y N S A F I V N L D N

gtcgaattaacgactcccttacaactgttgcgcaattcgacatgaacccctaactttaac
V E L T T P L Q L L R N S T * T P N F N

cgttggcacgctaaggggattcagttggtcattactaagaacgctcagttagtcattcta
R W H A K G I Q L V I T K N A Q L V I L

gcggatcacactagttttgacggttctagtgtagctaccctagccaacatctttgtctct
A D H T S F D G S S V A T L A N I F V S

aaactccaaaaggtcaacactgagggtgccagcgctttaacaccgaccatgctttccttt
K L Q K V N T E G A S A L T P T M L S F

ccaacagtagaccaggataagcaaaagtattttaagcaattgagcaaaaattttaaggac
P T V D Q D K Q K Y F K Q L S K N F K D

tatgtgtacaatgccgttatgtttgaactgaaatgggactggtttacgaagccacttatc
Y V Y N A V M F E L K W D W F T K P L I

aaagccaaagggattaagaactccgaagcctttatccatctctgttaccaaattgcccag
K A K G I K N S E A F I H L C Y Q I A Q

taccaaacgaacaaaaagttgcagaacacctatgtggccgtggacatgcgtcagtacttc
Y Q T N K K L Q N T Y V A V D M R Q Y F

cggggacgcaccgagtgtcttcgtcctttaagtaaacaatcagttgcctttgtcaaacgc
R G R T E C L R P L S K Q S V A F V K R

tactgcaaagatccgaaaggtaccttaaagcagttccgaaagtactacccagcgattgaa
Y C K D P K G T L K Q F R K Y Y P A I E

agtctccactttgaaaaaacccggttagcgcaaaagggtagtggggtcaaccgtcacttg
S L H F E K T R L A Q K G S G V N R H L

ctgggagcatatttggcttggaatgaacaccaagacacaatagctaaacccgctttattt
L G A Y L A W N E H Q D T I A K P A L F

gaaacaaaggcttgaaaaacgattgctgccaacccactgtccacatcgagcattgtggac
E T K A * K T I A A N P L S T S S I V D

aagtatttgcgcaacttttccttcgaccccgtagagccaaacggcattggtattgcctat
K Y L R N F S F D P V E P N G I G I A Y

gccattgatgacactaatttccgtgccattctcagtgtgtatcagcacaacttacagtac
A I D D T N F R A I L S V Y Q H N L Q Y

ctcaaagattggatgaaacactttgaacaaacagtgaaaacaatccttaaaactcttaaa
L K D W M K H F E Q T V K T I L K T L K

taa
*
MDTKLTPPWLRQENNLLNPEVTQRLFINQDNIPNLPTEAPQVYLARFLEWVEPLVSKVKFVKAQAAVQDYLNSKACAQIE TIIAERAQNTHSSWLANWWVQYAYLTSTGPVSPEVNAPYYLELPTVGWSQAELAAALSAQLWHIYQQVQKRQLTSFSVKD KLFSLDTLQSIFASCQIHRADGDVYFVNDQPANFIVVIKNNVFYKLVIDNSGSLEQLQAQLQLSFVQILDNELSHPPHWN LLTATTTKAESQSLLDQLWAQNPEMLLDIYNSAFIVNLDNVELTTPLQLLRNSTWTPNFNRWHAKGIQLVITKNAQLVIL ADHTSFDGSSVATLANIFVSKLQKVNTEGASALTPTMLSFPTVDQDKQKYFKQLSKNFKDYVYNAVMFELKWDWFTKPLI KAKGIKNSEAFIHLCYQIAQYQTNKKLQNTYVAVDMRQYFRGRTECLRPLSKQSVAFVKRYCKDPKGTLKQFRKYYPAIE SLHFEKTRLAQKGSGVNRHLLGAYLAWNEHQDTIAKPALFETKAWKTIAANPLSTSSIVDKYLRNFSFDPVEPNGIGIAY AIDDTNFRAILSVYQHNLQYLKDWMKHFEQTVKTILKTLK
This information was obtained using the Protein Parameters tool on the ExPASy Molecular Biology Server.
Number of amino acids: 600
Molecular weight: 68886.8
Theoretical pI: 9.01
Amino acid composition:
Ala (A) 50 8.3%
Arg (R) 18 3.0%
Asn (N) 39 6.5%
Asp (D) 27 4.5%
Cys (C) 5 0.8%
Gln (Q) 45 7.5%
Glu (E) 25 4.2%
Gly (G) 15 2.5%
His (H) 13 2.2%
Ile (I) 31 5.2%
Leu (L) 68 11.3%
Lys (K) 44 7.3%
Met (M) 6 1.0%
Phe (F) 29 4.8%
Pro (P) 27 4.5%
Ser (S) 38 6.3%
Thr (T) 40 6.7%
Trp (W) 16 2.7%
Tyr (Y) 25 4.2%
Val (V) 39 6.5%
Asx (B) 0 0.0%
Glx (Z) 0 0.0%
Xaa (X) 0 0.0%
Total number of negatively charged residues (Asp + Glu): 52
Total number of positively charged residues (Arg + Lys): 62
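The composition percentages and the two charged-residue totals are simple counts over the protein sequence: each percentage is the residue count divided by the sequence length (for example 68 Leu out of 600 gives 11.3%). A minimal Python sketch follows; the function names are illustrative, and the example input is only the first 20 residues of the protein rather than the full sequence.

```python
from collections import Counter


def composition(protein):
    """Return {residue: (count, percent)} for a one-letter protein sequence."""
    counts = Counter(protein.upper())
    n = sum(counts.values())
    return {aa: (c, round(100 * c / n, 1)) for aa, c in sorted(counts.items())}


def charged_residues(protein):
    """Return (Asp + Glu, Arg + Lys) counts, as in the summary lines above."""
    counts = Counter(protein.upper())
    return counts["D"] + counts["E"], counts["R"] + counts["K"]


n_term = "MDTKLTPPWLRQENNLLNPE"                # first 20 residues
negative, positive = charged_residues(n_term)  # (3, 2) for this fragment
leu = composition(n_term)["L"]                 # (4, 20.0) for this fragment
```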
Atomic composition:
Carbon C 3149
Hydrogen H 4860
Nitrogen N 824
Oxygen O 892
Sulfur S 11
Formula: C3149H4860N824O892S11
Total number of atoms: 9736
Extinction coefficients:
Conditions: 6.0 M guanidinium hydrochloride, 0.02 M phosphate buffer, pH 6.5.
Extinction coefficients are in units of M⁻¹ cm⁻¹.
Values computed assuming ALL Cys residues appear as half cystines:

| Wavelength (nm) | 276 | 278 | 279 | 280 | 282 |
|---|---|---|---|---|---|
| Ext. coefficient | 122940 | 124854 | 124425 | 123280 | 119840 |
| Abs 0.1% (=1 g/l) | 1.785 | 1.812 | 1.806 | 1.790 | 1.740 |

Values computed assuming that NONE of the Cys residues appear as half cystines:

| Wavelength (nm) | 276 | 278 | 279 | 280 | 282 |
|---|---|---|---|---|---|
| Ext. coefficient | 122650 | 124600 | 124185 | 123040 | 119600 |
| Abs 0.1% (=1 g/l) | 1.780 | 1.809 | 1.803 | 1.786 | 1.736 |
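The 280 nm values can be reproduced from the residue counts in the composition table (16 Trp, 25 Tyr, 5 Cys) and the reported molecular weight. The sketch below assumes per-residue molar coefficients at 280 nm of 5690 for Trp, 1280 for Tyr and 120 for each cystine; these numbers are not stated on this page and are an assumption chosen because they are consistent with the reported figures. The function names are illustrative.

```python
# Assumed molar extinction coefficients at 280 nm (M^-1 cm^-1).
TRP_280 = 5690
TYR_280 = 1280
CYSTINE_280 = 120


def extinction_280(n_trp, n_tyr, n_cys, all_half_cystines):
    """Estimate the 280 nm extinction coefficient from residue counts.

    If all_half_cystines is True, Cys residues are paired into cystines
    (an odd residue is left unpaired); otherwise no cystines are counted.
    """
    n_cystine = n_cys // 2 if all_half_cystines else 0
    return n_trp * TRP_280 + n_tyr * TYR_280 + n_cystine * CYSTINE_280


def abs_01_percent(ext_coefficient, molecular_weight):
    """Absorbance of a 1 g/l (0.1%) solution in a 1 cm path."""
    return ext_coefficient / molecular_weight


with_cystines = extinction_280(16, 25, 5, all_half_cystines=True)      # 123280
without_cystines = extinction_280(16, 25, 5, all_half_cystines=False)  # 123040
a280 = round(abs_01_percent(with_cystines, 68886.8), 3)                # 1.79
```

The other wavelengths would need their own coefficient sets, which are not reconstructed here.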
Estimated half-life:
The N-terminal of the sequence considered is M (Met).
The estimated half-life is: 30 hours (mammalian reticulocytes, in vitro).
>20 hours (yeast, in vivo).
>10 hours (Escherichia coli, in vivo).
Instability index:
The instability index (II) is computed to be 34.63
This classifies the protein as stable.
Aliphatic index: 91.53
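Both the aliphatic index and the GRAVY value reported here can be recomputed from the amino acid composition table. The sketch below assumes the usual definitions: aliphatic index = X(Ala) + 2.9 × X(Val) + 3.9 × (X(Ile) + X(Leu)) with X in mole percent, and GRAVY as the mean Kyte-Doolittle hydropathy over all residues. The hydropathy scale in `KD` is an assumption typed in for illustration, not taken from this page.

```python
# Residue counts copied from the amino acid composition table above.
COUNTS = {"A": 50, "R": 18, "N": 39, "D": 27, "C": 5, "Q": 45, "E": 25,
          "G": 15, "H": 13, "I": 31, "L": 68, "K": 44, "M": 6, "F": 29,
          "P": 27, "S": 38, "T": 40, "W": 16, "Y": 25, "V": 39}

# Kyte-Doolittle hydropathy values (assumed scale).
KD = {"A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5,
      "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9,
      "M": 1.9, "F": 2.8, "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9,
      "Y": -1.3, "V": 4.2}


def aliphatic_index(counts):
    """Aliphatic index from residue counts, using mole percentages."""
    n = sum(counts.values())

    def mole_pct(aa):
        return 100 * counts[aa] / n

    return mole_pct("A") + 2.9 * mole_pct("V") + 3.9 * (mole_pct("I") + mole_pct("L"))


def gravy(counts):
    """Grand average of hydropathicity: mean hydropathy per residue."""
    n = sum(counts.values())
    return sum(KD[aa] * c for aa, c in counts.items()) / n


ai = round(aliphatic_index(COUNTS), 2)   # 91.53
g = round(gravy(COUNTS), 3)              # -0.28
```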
Grand average of hydropathicity (GRAVY): -0.280

| Model | Percentage of MP Gene |
|---|---|
| Coiled Coil (CCP) | 0 |
| Disordered (SEG) | 4 |
| Transmembrane (PHDhtm) | 0 |
| Transmembrane (TMHMM) | 0 |
| Homologous to known structure (PSIBLAST) | 0 |
Sequence: Amino acid sequence.
CCP (C or -): coiled coil prediction from the ccp program (NCBI toolkit)
SEG (D or -): low complexity regions (possibly disordered) from SEG
PHDhtm (H or -): Transmembrane prediction from PHDhtm
TMHMM (H or -): Transmembrane prediction from TMHMM
PSIBLAST (3 or -): Regions potentially homologous to a protein of known 3D structure, according to PSIBLAST
Pred2ary (H, E, or -): Secondary structure prediction from Pred2ary
Residues 1-80:
Sequence MDTKLTPPWLRQENNLLNPEVTQRLFINQDNIPNLPTEAPQVYLARFLEWVEPLVSKVKFVKAQAAVQDYLNSKACAQIE
CCP --------------------------------------------------------------------------------
SEG --------------------------------------------------------------------------------
PHDhtm --------------------------------------------------------------------------------
TMHMM --------------------------------------------------------------------------------
PSIBLAST --------------------------------------------------------------------------------
Pred2ary ---------------------------------------HHHHHHHHHHHH-----HHHHHHHHHHHHHHH----HHHHH
Residues 81-160:
Sequence TIIAERAQNTHSSWLANWWVQYAYLTSTGPVSPEVNAPYYLELPTVGWSQAELAAALSAQLWHIYQQVQKRQLTSFSVKD
CCP --------------------------------------------------------------------------------
SEG ------------------------------------------------DDDDDDDDDDDDD-------------------
PHDhtm --------------------------------------------------------------------------------
TMHMM --------------------------------------------------------------------------------
PSIBLAST --------------------------------------------------------------------------------
Pred2ary HHHHHHHH-----HHHHHHHHHHHHH-----EE--------------HHHHHHHHHHHHHHHHHHHHHH-----------
Residues 161-240:
Sequence KLFSLDTLQSIFASCQIHRADGDVYFVNDQPANFIVVIKNNVFYKLVIDNSGSLEQLQAQLQLSFVQILDNELSHPPHWN
CCP --------------------------------------------------------------------------------
SEG -----------------------------------------------------DDDDDDDDDD-----------------
PHDhtm --------------------------------------------------------------------------------
TMHMM --------------------------------------------------------------------------------
PSIBLAST --------------------------------------------------------------------------------
Pred2ary ----HHHHHHHHH--------------------EEEEEE---EEEEEE---------HHHHHHHHHHHH-----------
Residues 241-320:
Sequence LLTATTTKAESQSLLDQLWAQNPEMLLDIYNSAFIVNLDNVELTTPLQLLRNSTWTPNFNRWHAKGIQLVITKNAQLVIL
CCP --------------------------------------------------------------------------------
SEG --------------------------------------------------------------------------------
PHDhtm --------------------------------------------------------------------------------
TMHMM --------------------------------------------------------------------------------
PSIBLAST --------------------------------------------------------------------------------
Pred2ary ------HHHHHHHHHHH-----HHHHHHHHH--EEEEE------HHHHHHHHH-------------EEEEEE----EEEE
Residues 321-400:
Sequence ADHTSFDGSSVATLANIFVSKLQKVNTEGASALTPTMLSFPTVDQDKQKYFKQLSKNFKDYVYNAVMFELKWDWFTKPLI
CCP --------------------------------------------------------------------------------
SEG --------------------------------------------------------------------------------
PHDhtm --------------------------------------------------------------------------------
TMHMM --------------------------------------------------------------------------------
PSIBLAST --------------------------------------------------------------------------------
Pred2ary --------HHHHHHHHHHHHH-----------------------HHHHHHHHHHHHHHHHHHH---EEEEEE--------
Residues 401-480:
Sequence KAKGIKNSEAFIHLCYQIAQYQTNKKLQNTYVAVDMRQYFRGRTECLRPLSKQSVAFVKRYCKDPKGTLKQFRKYYPAIE
CCP --------------------------------------------------------------------------------
SEG --------------------------------------------------------------------------------
PHDhtm --------------------------------------------------------------------------------
TMHMM --------------------------------------------------------------------------------
PSIBLAST --------------------------------------------------------------------------------
Pred2ary --------HHHHHHHHHHHHHHH-------HHHHHHHHHH------------HHHHHHHHHH------HHHHHHHHHHHH
Residues 481-560:
Sequence SLHFEKTRLAQKGSGVNRHLLGAYLAWNEHQDTIAKPALFETKAWKTIAANPLSTSSIVDKYLRNFSFDPVEPNGIGIAY
CCP --------------------------------------------------------------------------------
SEG --------------------------------------------------------------------------------
PHDhtm --------------------------------------------------------------------------------
TMHMM --------------------------------------------------------------------------------
PSIBLAST --------------------------------------------------------------------------------
Pred2ary HHHHHHHHHHH----HHHHHHHHHHHHHHH---------------------EEE-----------------------EEE
Residues 561-600:
Sequence AIDDTNFRAILSVYQHNLQYLKDWMKHFEQTVKTILKTLK
CCP ----------------------------------------
SEG ----------------------------------------
PHDhtm ----------------------------------------
TMHMM ----------------------------------------
PSIBLAST ----------------------------------------
Pred2ary EE----EEEEEE-----HHHHHHHHHHHHHHHHHHHHHH-
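The "Percentage of MP Gene" figures in the model table appear to be the share of residues flagged in each per-residue track. As a rough check, the two SEG segments shown above cover 13 and 10 residues by my count, and 23 of 600 residues rounds to the 4 reported for SEG. A minimal Python sketch, assuming a track is a string with `-` for unflagged positions; `percent_annotated` is an illustrative name, and the tracks below are rebuilt from those counts rather than copied character by character.

```python
def percent_annotated(track_blocks, blank="-"):
    """Percentage of positions in a prediction track that are not blank.

    track_blocks is a list of strings, one per block of the alignment.
    """
    track = "".join(track_blocks)
    flagged = sum(1 for ch in track if ch != blank)
    return 100 * flagged / len(track)


# SEG track rebuilt from segment lengths (assumed counts: 13 and 10 residues).
seg_blocks = [
    "-" * 80,                              # residues 1-80
    "-" * 48 + "D" * 13 + "-" * 19,        # residues 81-160
    "-" * 53 + "D" * 10 + "-" * 17,        # residues 161-240
    "-" * 360,                             # residues 241-600
]
seg_percent = round(percent_annotated(seg_blocks))   # 4
```

The same function applied to the all-blank CCP, PHDhtm, TMHMM and PSIBLAST tracks gives 0, in line with the table.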