This gene has the annotation "conserved hypothetical protein".
The target was selected as part of JMC's third set.
| Experiment | Result | User | Experiment Date |
|---|---|---|---|
| Selected | Done | jmc | 2002-11-07 |
| Cloned | Done | r_kim | 2003-02-11 |
| Expression tested | Done | r_kim | 2003-02-11 |
| Solubility tested | S | r_kim | 2003-02-11 |
| Purified | Done | dhshin | 2003-07-14 |
| Crystallized | Done | dhshin | 2003-07-14 |
| Diffraction quality crystals | Done | dhshin | 2003-07-14 |
| Native diffraction data | Done | dhshin | 2003-07-14 |
| Phasing diffraction data | Done | dhshin | 2003-08-29 |
| Traceable map | Done | dhshin | 2003-08-29 |
| Crystal structure | Done | dhshin | 2004-10-04 |
| In PDB | 1YTD 1YTE 1YTK 2I1O | jmc | 2006-08-16 |
| MP gene | log10(PSIBLAST E) | % identical | % coverage (of MP gene) |
|---|---|---|---|
| MP107 / MPN047 | -27.00 | 18.27 | 76.79 |
ATGAATGTGTTCAACACAGCAAGTGATGAAGATATAAAAAAGGGGCTGGCTTCGGATGTCTATTTTGAGAGGACAATATC GGCTATAGGCGATAAATGCAATGATCTGAGGGTCGCCATGGAGGCCACCGTTTCCGGGCCATTGGATACATGGATAAATT TCACCGGTCTGGACGAGGTCCTGAAGCTTCTGGAAGGGCTTGATGTGGATCTGTACGCAATCCCGGAGGGCACAATTCTA TTCCCCAGAGATGCAAACGGGCTGCCGGTTCCCTTCATCAGGGTAGAAGGAAGGTACTGCGACTTCGGTATGTACGAGAC GGCCATACTCGGATTCATATGCCAGGCTTCTGGAATATCCACAAAGGCTTCCAAGGTGAGGCTGGCTGCAGGAGATTCGC CGTTCTTCTCCTTCGGAATAAGGAGAATGCATCCGGCCATATCGCCGATGATCGATCGATCCGCATATATCGGGGGAGCA GACGGTGTTTCCGGCATCCTTGGTGCAAAGCTGATAGATCAGGATCCGGTCGGTACCATGCCGCACGCGCTATCCATAAT GCTTGGCGATGAGGAAGCGTGGAAGCTCACCCTTGAAAACACAAAGAATGGACAGAAATCGGTACTTCTTATCGATACGT ACATGGACGAAAAGTTCGCAGCCATAAAGATCGCTGAAATGTTCGATAAGGTTGATTATATAAGATTGGACACGCCCTCA TCCAGAAGGGGAAACTTCGAAGCACTCATAAGAGAGGTCAGGTGGGAACTGGCCCTGCGTGGAAGGAGCGACATAAAGAT TATGGTCTCTGGCGGTCTCGATGAGAATACCGTCAAGAAGCTAAGAGAAGCCGGAGCAGAAGCCTTTGGAGTCGGTACTT CCATATCATCGGCAAAGCCTTTCGACTTCGCCATGGACATAGTTGAGGTGAATGGAAAGCCCGAGACGAAGCGCGGGAAG ATGTCGGGCAGGAAGAACGTCCTGAGGTGCACATCCTGCCACAGGATTGAGGTCGTGCCTGCCAACGTTCAGGAAAAGAC ATGCATCTGCGGCGGTAGTATGCAGAATCTGCTCGTGAAGTATCTATCTCATGGTAAGAGAACATCCGAGTACCCAAGGC CAAAGGAGATAAGATCCAGGTCCATGAAGGAGCTGGAATATTTCAAAGATATTTCATAA
(393 codons) fields: [triplet] [frequency: per thousand] ([number])
UUU  5.1 ( 2)   UCU  7.6 ( 3)   UAU 12.7 ( 5)   UGU  0.0 ( 0)
UUC 38.2 (15)   UCC 33.1 (13)   UAC 12.7 ( 5)   UGC 17.8 ( 7)
UUA  0.0 ( 0)   UCA  7.6 ( 3)   UAA  2.5 ( 1)   UGA  0.0 ( 0)
UUG  5.1 ( 2)   UCG 17.8 ( 7)   UAG  0.0 ( 0)   UGG  7.6 ( 3)
CUU 17.8 ( 7)   CCU  5.1 ( 2)   CAU  5.1 ( 2)   CGU  2.5 ( 1)
CUC 12.7 ( 5)   CCC 10.2 ( 4)   CAC  5.1 ( 2)   CGC  2.5 ( 1)
CUA 10.2 ( 4)   CCA  7.6 ( 3)   CAA  0.0 ( 0)   CGA  2.5 ( 1)
CUG 35.6 (14)   CCG 17.8 ( 7)   CAG 12.7 ( 5)   CGG  0.0 ( 0)
AUU 10.2 ( 4)   ACU  2.5 ( 1)   AAU 17.8 ( 7)   AGU  5.1 ( 2)
AUC 20.4 ( 8)   ACC 12.7 ( 5)   AAC 15.3 ( 6)   AGC  2.5 ( 1)
AUA 45.8 (18)   ACA 22.9 ( 9)   AAA 10.2 ( 4)   AGA 20.4 ( 8)
AUG 35.6 (14)   ACG 10.2 ( 4)   AAG 58.5 (23)   AGG 35.6 (14)
GUU 15.3 ( 6)   GCU 15.3 ( 6)   GAU 48.3 (19)   GGU 22.9 ( 9)
GUC 25.4 (10)   GCC 25.4 (10)   GAC 20.4 ( 8)   GGC 17.8 ( 7)
GUA  5.1 ( 2)   GCA 28.0 (11)   GAA 33.1 (13)   GGA 30.5 (12)
GUG 15.3 ( 6)   GCG  5.1 ( 2)   GAG 35.6 (14)   GGG 15.3 ( 6)
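The per-thousand frequencies above follow from the raw codon counts: each count is divided by the total number of codons (393, including the stop codon) and multiplied by 1000. A minimal sketch of that calculation, assuming an in-frame coding sequence as input; the function name `codon_usage` is hypothetical and is not the tool that produced the table:

```python
from collections import Counter


def codon_usage(dna):
    """Return {codon: (frequency_per_thousand, count)} for an in-frame CDS.

    Codons are reported in RNA notation (U instead of T), as in the table.
    """
    rna = "".join(dna.split()).upper().replace("T", "U")
    usable = len(rna) - len(rna) % 3  # ignore any trailing partial codon
    codons = [rna[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(codons)
    total = len(codons)
    return {c: (round(1000 * n / total, 1), n) for c, n in counts.items()}


# Toy example: four codons, two of them AAG.
usage = codon_usage("ATGAAGAAGTAA")
print(usage["AAG"])
```

For this target, 15 UUC codons out of 393 gives 1000 × 15 / 393 ≈ 38.2, matching the UUC entry.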
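Details (restriction sites inside the target that are a problem for cloning are shown in upper case, with a key number on the line below each one):
Key: 1 = NdeI site (CATATG); 2 = BamHI site (GGATCC)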
atgaatgtgttcaacacagcaagtgatgaagatataaaaaaggggctggcttcggatgtc
tattttgagaggacaatatcggctataggcgataaatgcaatgatctgagggtcgccatg
gaggccaccgtttccgggccattggatacatggataaatttcaccggtctggacgaggtc
ctgaagcttctggaagggcttgatgtggatctgtacgcaatcccggagggcacaattcta
ttccccagagatgcaaacgggctgccggttcccttcatcagggtagaaggaaggtactgc
gacttcggtatgtacgagacggccatactcggattCATATGccaggcttctggaatatcc
1
acaaaggcttccaaggtgaggctggctgcaggagattcgccgttcttctccttcggaata
aggagaatgcatccggccatatcgccgatgatcgatcgatccgcatatatcgggggagca
gacggtgtttccggcatccttggtgcaaagctgatagatcaGGATCCggtcggtaccatg
2
ccgcacgcgctatccataatgcttggcgatgaggaagcgtggaagctcacccttgaaaac
acaaagaatggacagaaatcggtacttcttatcgatacgtacatggacgaaaagttcgca
gccataaagatcgctgaaatgttcgataaggttgattatataagattggacacgccctca
tccagaaggggaaacttcgaagcactcataagagaggtcaggtgggaactggccctgcgt
ggaaggagcgacataaagattatggtctctggcggtctcgatgagaataccgtcaagaag
ctaagagaagccggagcagaagcctttggagtcggtacttccatatcatcggcaaagcct
ttcgacttcgccatggacatagttgaggtgaatggaaagcccgagacgaagcgcgggaag
atgtcgggcaggaagaacgtcctgaggtgcacatcctgccacaggattgaggtcgtgcct
gccaacgttcaggaaaagacatgcatctgcggcggtagtatgcagaatctgctcgtgaag
tatctatctcatggtaagagaacatccgagtacccaaggccaaaggagataagatccagg
tccatgaaggagctggaatatttcaaagatatttcataa
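The upper-case marking in the listing above can be reproduced by scanning the target for the recognition sequences of the cloning enzymes. The following is a sketch under assumptions: the site list is limited to the three enzymes named on this page, and the helper names `find_sites` and `mark_sites` are hypothetical, not part of the original pipeline.

```python
import re

# Recognition sequences of the enzymes mentioned on this page.
SITES = {"NdeI": "CATATG", "BamHI": "GGATCC", "BglII": "AGATCT"}


def find_sites(target):
    """Return sorted (1-based position, enzyme name) pairs for every site found."""
    seq = "".join(target.split()).upper()
    hits = []
    for name, site in SITES.items():
        # A lookahead lets overlapping occurrences be reported as well.
        for match in re.finditer(f"(?={site})", seq):
            hits.append((match.start() + 1, name))
    return sorted(hits)


def mark_sites(target):
    """Return the sequence in lower case with every site in upper case."""
    seq = list("".join(target.split()).lower())
    for start, name in find_sites(target):
        for i in range(start - 1, start - 1 + len(SITES[name])):
            seq[i] = seq[i].upper()
    return "".join(seq)


print(mark_sites("ggattcatatgccagg"))
```

Run on the full target sequence, this approach should flag the same two internal sites shown in upper case above.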
Note: These were generated using Hisao's algorithm and do not take into account possible mispriming, self-complementarity, or requirements for mutations anywhere in the sequences. The -R1 primers are for the N-terminal end and begin with an NdeI restriction site (CATATG). The -R2 primers are for the C-terminal end and begin with a BamHI site (GGATCC) if the target has no internal BamHI sites. If the target does have a BamHI site, the -R2 primer will begin with a BglII site (AGATCT) instead, provided the target has no BglII sites. If the target has both sites, the design program gives up and doesn't prepend a restriction site to the -R2 primer!
Predicted Tm was obtained using the Tm determination tool at the Virtual Genome Center.
| Primer ID | Predicted Tm | sequence |
|---|---|---|
| 1350B-R1 | 60.70 | CATATGAATGTGTTCAACACAGCAAGT |
| 1350B-R2 | 61.30 | AGATCTTTATGAAATATCTTTGAAATATTCCAGCTC |
| 1350B-R10 | 63.30 | GGCGGTGGTGGCGGCATGAATGTGTTCAACACAGCAAGTG |
| 1350B-R11 | 63.40 | GTTCTTCTCCTTTGCGCCCCTATGAAATATCTTTGAAATATTCCAGCTCCT |
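The restriction-site rule described in the note can be sketched as follows. This is an illustration rather than Hisao's algorithm: the annealing length is passed in as a hypothetical parameter (`anneal_len`) instead of being chosen from a target Tm, and the function names are assumptions. The NdeI site is taken to overlap the ATG start codon, as the -R1 sequence in the table suggests.

```python
NDEI, BAMHI, BGLII = "CATATG", "GGATCC", "AGATCT"
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA string (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def r2_site(target):
    """Pick the site for the -R2 primer: BamHI, else BglII, else none."""
    t = "".join(target.split()).upper()
    if BAMHI not in t:
        return BAMHI
    if BGLII not in t:
        return BGLII
    return ""  # target contains both sites: no site is prepended


def r1_primer(target, anneal_len):
    """NdeI site whose ATG doubles as the start codon, then the next bases."""
    t = "".join(target.split()).upper()
    return NDEI + t[3:3 + anneal_len]


def r2_primer(target, anneal_len):
    """Chosen site followed by the reverse complement of the 3' end."""
    t = "".join(target.split()).upper()
    return r2_site(t) + reverse_complement(t[-anneal_len:])


# Toy target: an internal BamHI site plus the last 30 bases of this gene.
toy = "GGATCC" + "GAGCTGGAATATTTCAAAGATATTTCATAA"
print(r2_primer(toy, 30))
```

Because the toy target contains a BamHI site but no BglII site, the sketch falls back to AGATCT, the same prefix carried by 1350B-R2 in the table.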
Note: This was generated using the standard codon usage table; UGA codons in MP/MP genes will show up as terminators ("*")
atgaatgtgttcaacacagcaagtgatgaagatataaaaaaggggctggcttcggatgtc
M  N  V  F  N  T  A  S  D  E  D  I  K  K  G  L  A  S  D  V
tattttgagaggacaatatcggctataggcgataaatgcaatgatctgagggtcgccatg
Y  F  E  R  T  I  S  A  I  G  D  K  C  N  D  L  R  V  A  M
gaggccaccgtttccgggccattggatacatggataaatttcaccggtctggacgaggtc
E  A  T  V  S  G  P  L  D  T  W  I  N  F  T  G  L  D  E  V
ctgaagcttctggaagggcttgatgtggatctgtacgcaatcccggagggcacaattcta
L  K  L  L  E  G  L  D  V  D  L  Y  A  I  P  E  G  T  I  L
ttccccagagatgcaaacgggctgccggttcccttcatcagggtagaaggaaggtactgc
F  P  R  D  A  N  G  L  P  V  P  F  I  R  V  E  G  R  Y  C
gacttcggtatgtacgagacggccatactcggattcatatgccaggcttctggaatatcc
D  F  G  M  Y  E  T  A  I  L  G  F  I  C  Q  A  S  G  I  S
acaaaggcttccaaggtgaggctggctgcaggagattcgccgttcttctccttcggaata
T  K  A  S  K  V  R  L  A  A  G  D  S  P  F  F  S  F  G  I
aggagaatgcatccggccatatcgccgatgatcgatcgatccgcatatatcgggggagca
R  R  M  H  P  A  I  S  P  M  I  D  R  S  A  Y  I  G  G  A
gacggtgtttccggcatccttggtgcaaagctgatagatcaggatccggtcggtaccatg
D  G  V  S  G  I  L  G  A  K  L  I  D  Q  D  P  V  G  T  M
ccgcacgcgctatccataatgcttggcgatgaggaagcgtggaagctcacccttgaaaac
P  H  A  L  S  I  M  L  G  D  E  E  A  W  K  L  T  L  E  N
acaaagaatggacagaaatcggtacttcttatcgatacgtacatggacgaaaagttcgca
T  K  N  G  Q  K  S  V  L  L  I  D  T  Y  M  D  E  K  F  A
gccataaagatcgctgaaatgttcgataaggttgattatataagattggacacgccctca
A  I  K  I  A  E  M  F  D  K  V  D  Y  I  R  L  D  T  P  S
tccagaaggggaaacttcgaagcactcataagagaggtcaggtgggaactggccctgcgt
S  R  R  G  N  F  E  A  L  I  R  E  V  R  W  E  L  A  L  R
ggaaggagcgacataaagattatggtctctggcggtctcgatgagaataccgtcaagaag
G  R  S  D  I  K  I  M  V  S  G  G  L  D  E  N  T  V  K  K
ctaagagaagccggagcagaagcctttggagtcggtacttccatatcatcggcaaagcct
L  R  E  A  G  A  E  A  F  G  V  G  T  S  I  S  S  A  K  P
ttcgacttcgccatggacatagttgaggtgaatggaaagcccgagacgaagcgcgggaag
F  D  F  A  M  D  I  V  E  V  N  G  K  P  E  T  K  R  G  K
atgtcgggcaggaagaacgtcctgaggtgcacatcctgccacaggattgaggtcgtgcct
M  S  G  R  K  N  V  L  R  C  T  S  C  H  R  I  E  V  V  P
gccaacgttcaggaaaagacatgcatctgcggcggtagtatgcagaatctgctcgtgaag
A  N  V  Q  E  K  T  C  I  C  G  G  S  M  Q  N  L  L  V  K
tatctatctcatggtaagagaacatccgagtacccaaggccaaaggagataagatccagg
Y  L  S  H  G  K  R  T  S  E  Y  P  R  P  K  E  I  R  S  R
tccatgaaggagctggaatatttcaaagatatttcataa
S  M  K  E  L  E  Y  F  K  D  I  S  *
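A translation like the one above can be reproduced with the standard genetic code, in which UGA is a stop codon (hence the "*" warning in the note). The sketch below builds the standard table from the conventional TCAG ordering; the names `CODON_TABLE` and `translate` are illustrative and do not come from the tool that generated this page.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate an in-frame DNA sequence; stop codons appear as '*'."""
    seq = "".join(dna.split()).upper()
    usable = len(seq) - len(seq) % 3  # ignore any trailing partial codon
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


# First eight codons of this target.
print(translate("atgaatgtgttcaacacagcaagt"))  # MNVFNTAS
```

Translating a gene that uses UGA to encode tryptophan with this table would truncate the apparent protein at the first UGA, which is the situation the note warns about.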
MNVFNTASDEDIKKGLASDVYFERTISAIGDKCNDLRVAMEATVSGPLDTWINFTGLDEVLKLLEGLDVDLYAIPEGTIL FPRDANGLPVPFIRVEGRYCDFGMYETAILGFICQASGISTKASKVRLAAGDSPFFSFGIRRMHPAISPMIDRSAYIGGA DGVSGILGAKLIDQDPVGTMPHALSIMLGDEEAWKLTLENTKNGQKSVLLIDTYMDEKFAAIKIAEMFDKVDYIRLDTPS SRRGNFEALIREVRWELALRGRSDIKIMVSGGLDENTVKKLREAGAEAFGVGTSISSAKPFDFAMDIVEVNGKPETKRGK MSGRKNVLRCTSCHRIEVVPANVQEKTCICGGSMQNLLVKYLSHGKRTSEYPRPKEIRSRSMKELEYFKDIS
This information was obtained using the Protein Parameters tool on the ExPASy Molecular Biology Server.
Number of amino acids: 392
Molecular weight: 43296.8
Theoretical pI: 6.20
Amino acid composition:
Ala (A) 29 7.4%
Arg (R) 25 6.4%
Asn (N) 13 3.3%
Asp (D) 27 6.9%
Cys (C) 7 1.8%
Gln (Q) 5 1.3%
Glu (E) 27 6.9%
Gly (G) 34 8.7%
His (H) 4 1.0%
Ile (I) 30 7.7%
Leu (L) 32 8.2%
Lys (K) 27 6.9%
Met (M) 14 3.6%
Phe (F) 17 4.3%
Pro (P) 16 4.1%
Ser (S) 29 7.4%
Thr (T) 19 4.8%
Trp (W) 3 0.8%
Tyr (Y) 10 2.6%
Val (V) 24 6.1%
Asx (B) 0 0.0%
Glx (Z) 0 0.0%
Xaa (X) 0 0.0%
Total number of negatively charged residues (Asp + Glu): 54
Total number of positively charged residues (Arg + Lys): 52
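The composition figures above are simple residue counts expressed as a percentage of the sequence length, and the charged-residue totals are sums over Asp + Glu and Arg + Lys. A minimal sketch, assuming a one-letter protein sequence as input; the function name `composition` is hypothetical:

```python
from collections import Counter


def composition(protein):
    """Return (counts, percentages, n_negative, n_positive) for a sequence."""
    seq = "".join(protein.split()).upper()
    counts = Counter(seq)
    n = len(seq)
    percent = {aa: round(100 * c / n, 1) for aa, c in counts.items()}
    negative = counts["D"] + counts["E"]  # Asp + Glu
    positive = counts["R"] + counts["K"]  # Arg + Lys
    return counts, percent, negative, positive


# First 20 residues of this target.
counts, percent, negative, positive = composition("MNVFNTASDEDIKKGLASDV")
print(percent["D"], negative, positive)
```

For the full protein, 29 Ala out of 392 residues gives 100 × 29 / 392 ≈ 7.4%, as listed.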
Atomic composition:
Carbon C 1913
Hydrogen H 3065
Nitrogen N 523
Oxygen O 577
Sulfur S 21
Formula: C1913H3065N523O577S21
Total number of atoms: 6099
Extinction coefficients:
Conditions: 6.0 M guanidinium hydrochloride, 0.02 M phosphate buffer, pH 6.5
Extinction coefficients are in units of M⁻¹ cm⁻¹.
The first table lists values computed assuming ALL Cys
residues appear as half cystines, whereas the second table
assumes that NONE do.
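Assuming ALL Cys residues appear as half cystines:

| Wavelength | 276 nm | 278 nm | 279 nm | 280 nm | 282 nm |
|---|---|---|---|---|---|
| Ext. coefficient | 31135 | 31181 | 30790 | 30230 | 29160 |
| Abs 0.1% (=1 g/l) | 0.719 | 0.720 | 0.711 | 0.698 | 0.673 |

Assuming NO Cys residues appear as half cystines:

| Wavelength | 276 nm | 278 nm | 279 nm | 280 nm | 282 nm |
|---|---|---|---|---|---|
| Ext. coefficient | 30700 | 30800 | 30430 | 29870 | 28800 |
| Abs 0.1% (=1 g/l) | 0.709 | 0.711 | 0.703 | 0.690 | 0.665 |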
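The 280 nm values can be checked by hand from the residue counts: the extinction coefficient is a weighted sum over Trp, Tyr and cystine, and Abs 0.1% is that coefficient divided by the molecular weight. The per-residue coefficients in the sketch below (5690, 1280 and 120 M⁻¹ cm⁻¹) are an assumption, since the page does not state them; they were chosen because they reproduce the tabulated 280 nm figures for this protein.

```python
# Assumed molar extinction coefficients at 280 nm (M^-1 cm^-1); not stated
# on this page, but consistent with the tabulated values.
EPS_TRP, EPS_TYR, EPS_CYSTINE = 5690, 1280, 120


def extinction_280(n_trp, n_tyr, n_cys, all_half_cystines):
    """Extinction coefficient at 280 nm from residue counts."""
    # Two Cys form one cystine; an odd Cys is left unpaired.
    n_cystine = n_cys // 2 if all_half_cystines else 0
    return n_trp * EPS_TRP + n_tyr * EPS_TYR + n_cystine * EPS_CYSTINE


def abs_01_percent(eps, mol_weight):
    """Absorbance of a 1 g/l solution in a 1 cm path."""
    return eps / mol_weight


# This target: 3 Trp, 10 Tyr, 7 Cys, molecular weight 43296.8.
for paired in (True, False):
    eps = extinction_280(3, 10, 7, paired)
    print(paired, eps, round(abs_01_percent(eps, 43296.8), 3))
```

With 7 Cys, at most 3 cystines can form, which accounts for the difference of 360 between the two 280 nm values.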
Estimated half-life:
The N-terminal of the sequence considered is M (Met).
The estimated half-life is: 30 hours (mammalian reticulocytes, in vitro).
>20 hours (yeast, in vivo).
>10 hours (Escherichia coli, in vivo).
Instability index:
The instability index (II) is computed to be 35.96
This classifies the protein as stable.
Aliphatic index: 86.84
Grand average of hydropathicity (GRAVY): -0.185
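Both the aliphatic index and GRAVY can be recomputed from the amino acid composition listed earlier. The sketch below assumes the usual definitions, which this page does not spell out: aliphatic index = X(Ala) + 2.9 × X(Val) + 3.9 × (X(Ile) + X(Leu)) with X in mole percent, and GRAVY as the mean Kyte-Doolittle hydropathy over all residues. The function names are hypothetical.

```python
# Residue counts taken from the amino acid composition on this page.
COUNTS = {
    "A": 29, "R": 25, "N": 13, "D": 27, "C": 7, "Q": 5, "E": 27, "G": 34,
    "H": 4, "I": 30, "L": 32, "K": 27, "M": 14, "F": 17, "P": 16, "S": 29,
    "T": 19, "W": 3, "Y": 10, "V": 24,
}

# Kyte-Doolittle hydropathy scale (assumed to be the scale behind GRAVY).
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5,
    "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9,
    "M": 1.9, "F": 2.8, "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9,
    "Y": -1.3, "V": 4.2,
}


def aliphatic_index(counts):
    """Relative volume occupied by aliphatic side chains (Ala, Val, Ile, Leu)."""
    n = sum(counts.values())

    def mole_percent(aa):
        return 100 * counts.get(aa, 0) / n

    return (mole_percent("A") + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))


def gravy(counts):
    """Grand average of hydropathicity: mean hydropathy per residue."""
    total = sum(HYDROPATHY[aa] * c for aa, c in counts.items())
    return total / sum(counts.values())


print(round(aliphatic_index(COUNTS), 2), round(gravy(COUNTS), 3))
```

Under these assumptions the sketch reproduces the reported values of 86.84 and -0.185.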