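Author (Chinese): 陳昱廷
Author (English): Chen, Yu-Ting
Thesis title (Chinese): 利用貝氏推論分析SPOUT扭結蛋白家族的結構演化及其先祖蛋白序列的重構
Thesis title (English): The structural phylogenetic analysis of SPOUT trefoil-knotted protein family by Bayesian inference and the reconstruction of ancestral sequences
Advisor (Chinese): 呂平江
Advisor (English): Lyu, Ping-Chiang
Committee members (Chinese): 徐尚德; 羅惟正
Committee members (English): Hsu, Shang-Te; Lo, Wei-Cheng
Degree: Master's
Institution: National Tsing Hua University
Department: Institute of Bioinformatics and Structural Biology
Student ID: 106080587
Year of publication (ROC calendar): 110
Academic year of graduation: 109
Language: English
Number of pages: 124
Keywords (Chinese): 扭結蛋白 (knotted protein); 譜系分析 (phylogenetic analysis); 先祖序列重建 (ancestral sequence reconstruction)
Keywords (English): knotted protein; phylogenetic analysis; ancestral sequence reconstruction; SPOUT superfamily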
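Knotted proteins adopt a complex folding topology, and the SPOUT protein family is one of the best-known examples. In a previous study, a phylogenetic tree of the SPOUT family was built from a limited number of protein structures. More SPOUT structures have since been solved, allowing us to analyze the evolution of the SPOUT family further. In this study, we used PSI-CD-HIT to cluster the known SPOUT structures into subfamilies and then used DALI to group the subfamilies by structural similarity. For multiple sequence alignment, we used both structure-based and sequence-based methods: structural alignment was performed with Swiss-PDB viewer and sequence alignment with MAFFT, and domains other than the SPOUT domain were removed. In the structural alignment, we checked the structural consistency of the SPOUT domain and used secondary structure as a reference to improve the multiple structure alignment. We then built the structure-based phylogenetic tree with MrBayes and the sequence-based phylogenetic tree with RAxML. Finally, we used GRASP to reconstruct possible ancestral SPOUT protein sequences and MODELLER to model their possible structures. Overall, we improved the structure-based phylogenetic tree of the SPOUT family and provide the reconstructed ancestral sequences and their models as references for protein synthesis in further studies.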
Knotted proteins have intricate folds. Among them, the SPOUT superfamily is one of the best-known examples of knotted protein structures. A previous study built a phylogenetic tree of the SPOUT superfamily from a limited number of structures. More SPOUT protein structures have since been solved, allowing us to improve the phylogenetic analysis.
In this study, we use PSI-CD-HIT to cluster the SPOUT structures into subfamilies and DALI to group the subfamilies by structural similarity. Both structural and sequence alignment methods are used to generate the multiple sequence alignment (MSA): Swiss-PDB viewer for the structural alignment and MAFFT for the sequence alignment. Domains other than the SPOUT domain are removed. We examine the structural conservation of the SPOUT domain and use secondary structures as references to improve the structural MSA. The structure-based phylogenetic tree is built with MrBayes and the sequence-based phylogenetic tree with RAxML. Ancestral sequences of the SPOUT superfamily are reconstructed with GRASP, and models of the ancestral sequences are built with MODELLER.
Collectively, we improved the structure-based phylogenetic tree of the SPOUT superfamily and provide the reconstructed ancestral sequences and their predicted models as references for protein synthesis in further studies.
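As a minimal illustration of the alignment-preparation step described in the abstract, the following Python sketch trims a toy protein alignment to a chosen column range (standing in for the SPOUT domain) and renders it as a NEXUS data block, the input format read by MrBayes. The sequences, taxon names, column boundaries, and the helper functions `trim_alignment` and `to_nexus` are hypothetical and are not taken from the thesis.

```python
def _alignment_length(aligned):
    """Return the common sequence length, or raise if sequences differ."""
    lengths = {len(seq) for seq in aligned.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    return lengths.pop()


def trim_alignment(aligned, start, end):
    """Keep only alignment columns [start, end) for every sequence."""
    _alignment_length(aligned)  # validate before slicing
    return {name: seq[start:end] for name, seq in aligned.items()}


def to_nexus(aligned):
    """Render an aligned protein set as a NEXUS data block."""
    nchar = _alignment_length(aligned)
    width = max(len(name) for name in aligned)
    lines = [
        "#NEXUS",
        "begin data;",
        f"  dimensions ntax={len(aligned)} nchar={nchar};",
        "  format datatype=protein gap=- missing=?;",
        "  matrix",
    ]
    for name, seq in aligned.items():
        lines.append(f"    {name.ljust(width)}  {seq}")
    lines += ["  ;", "end;"]
    return "\n".join(lines) + "\n"


# Toy alignment (hypothetical sequences, not real SPOUT proteins).
toy = {
    "seqA": "MKV-LTAGE",
    "seqB": "MRVALT-GE",
    "seqC": "MKIALSAGD",
}

# Assume columns 2..7 represent the domain of interest (illustrative only).
core = trim_alignment(toy, 2, 8)
nexus_text = to_nexus(core)
print(nexus_text)
```

The resulting text could be saved to a file and loaded in MrBayes with its `execute` command. Model and MCMC settings are omitted here because this record does not state them.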
Abstract
Contents
1. Introduction
1-1. Knotted proteins
1-2. Categories of knotted proteins
1-3. SPOUT family
1-4. Bayesian inference and MCMC chains
1-5. The phylogenetic analysis of SPOUT
1-6. Motivation
2. Methods
2-1. Collecting the known SPOUT PDB dataset
2-2. Clustering with CD-HIT and PSI-CD-HIT
2-2-1. CD-HIT
2-2-2. PSI-CD-HIT
2-3. Identifying knots with the KnotProt 2.0 database
2-3-1. Knot regions identified by KnotProt
2-3-2. Processing custom structures in KnotProt
2-4. Structure search by DALI
2-4-1. DALI database search
2-4-2. DALI pairwise comparison
2-5. BLAST
2-5-1. BLAST database search
2-5-2. BLAST pairwise comparison
2-6. Swiss-PDB viewer
2-6-1. Multiple structure alignment by Swiss-PDB viewer v4.1 [50]
2-6-2. Manual improvement
2-7. Multiple sequence alignment by MAFFT
2-8. Phylogenetic analysis by MrBayes and RAxML
2-8-1. MrBayes
2-8-2. RAxML
2-9. Ancestral sequence prediction by GRASP
2-10. Protein modeling
2-10-1. SWISS-MODEL
2-10-2. MODELLER
2-10-3. Removing the indel domain and adding a modified loop
3. Results
3-1. The SPOUT PDB dataset clustering
3-2. Knot identification by KnotProt
3-3. The DALI search in the PDB database
3-4. The pairwise comparison of DALI and BLAST
3-5. The multiple structure alignment
3-6. MrBayes phylogenetic tree results
3-7. The sequence phylogenetic analysis
3-8. Ancestral sequence reconstruction
3-9. The modeling of reconstructed ancestral proteins
4. Discussion
4-1. The comparison of the phylogenetic trees
4-2. The importance of the crossing loop in the protein knot
4-3. The common ancestor of SPOUT proteins
5. Conclusion
References
Appendix
1. Adams, C.C., The knot book. An elementary introduction to the mathematical theory of knots. 2004, WH Freeman and Company, USA.
2. Richardson, J.S., beta-Sheet topology and the relatedness of proteins. Nature, 1977. 268(5620): p. 495-500.
3. Sonnhammer, E.L., S.R. Eddy, and R. Durbin, Pfam: a comprehensive database of protein domain families based on seed alignments. Proteins, 1997. 28(3): p. 405-20.
4. El-Gebali, S., et al., The Pfam protein families database in 2019. Nucleic Acids Res, 2019. 47(D1): p. D427-D432.
5. Dabrowski-Tumanski, P., et al., KnotProt 2.0: a database of proteins with knots and other entangled structures. Nucleic Acids Res, 2019. 47(D1): p. D367-D375.
6. Mishra, R. and S. Bhushan, Knot theory in understanding proteins. J Math Biol, 2012. 65(6-7): p. 1187-213.
7. Mansfield, M.L., Are there knots in proteins? Nat Struct Biol, 1994. 1(4): p. 213-4.
8. Taylor, W.R. and K. Lin, Protein knots: A tangled problem. Nature, 2003. 421(6918): p. 25.
9. Lou, S.C., et al., The Knotted Protein UCH-L1 Exhibits Partially Unfolded Forms under Native Conditions that Share Common Structural Features with Its Kinetic Folding Intermediates. J Mol Biol, 2016. 428(11): p. 2507-2520.
10. Lee, Y.C. and S.D. Hsu, A Natively Monomeric Deubiquitinase UCH-L1 Forms Highly Dynamic but Defined Metastable Oligomeric Folding Intermediates. J Phys Chem Lett, 2018. 9(9): p. 2433-2437.
11. Schmidberger, J.W., et al., The crystal structure of DehI reveals a new alpha-haloacid dehalogenase fold and active-site mechanism. J Mol Biol, 2008. 378(1): p. 284-94.
12. Bolinger, D., et al., A Stevedore's protein knot. PLoS Comput Biol, 2010. 6(4): p. e1000731.
13. Wang, I., S.Y. Chen, and S.T. Hsu, Folding analysis of the most complex Stevedore's protein knot. Sci Rep, 2016. 6: p. 31514.
14. Chiang, P.K., et al., S-Adenosylmethionine and methylation. FASEB J, 1996. 10(4): p. 471-80.
15. Toyooka, T. and H. Hori, Differences in substrate selectivities of the SPOUT superfamily of methyltransferases. Nucleic Acids Symp Ser (Oxf), 2007(51): p. 445-6.
16. Taylor, A.B., et al., The crystal structure of Nep1 reveals an extended SPOUT-class methyltransferase fold and a pre-organized SAM-binding site. Nucleic Acids Res, 2008. 36(5): p. 1542-54.
17. Kim, D.J., et al., Crystal structure of Thermotoga maritima SPOUT superfamily RNA methyltransferase Tm1570 in complex with S-adenosyl-L-methionine. Proteins, 2009. 74(1): p. 245-9.
18. Chen, H.Y. and Y.A. Yuan, Crystal structure of Mj1640/DUF358 protein reveals a putative SPOUT-class RNA methyltransferase. J Mol Cell Biol, 2010. 2(6): p. 366-74.
19. Tkaczuk, K.L., et al., Structural and evolutionary bioinformatics of the SPOUT superfamily of methyltransferases. BMC Bioinformatics, 2007. 8: p. 73.
20. Anantharaman, V., E.V. Koonin, and L. Aravind, SPOUT: a class of methyltransferases that includes spoU and trmD RNA methylase superfamilies, and novel superfamilies of predicted prokaryotic RNA methylases. J Mol Microbiol Biotechnol, 2002. 4(1): p. 71-5.
21. Hori, H., Transfer RNA methyltransferases with a SpoU-TrmD (SPOUT) fold and their modified nucleosides in tRNA. Biomolecules, 2017. 7(1).
22. Jackman, J.E., et al., Identification of the yeast gene encoding the tRNA m1G methyltransferase responsible for modification at position 9. RNA, 2003. 9(5): p. 574-85.
23. Purta, E., et al., The yfhQ gene of Escherichia coli encodes a tRNA:Cm32/Um32 methyltransferase. BMC Mol Biol, 2006. 7: p. 23.
24. Purta, E., et al., YbeA is the m3Psi methyltransferase RlmH that targets nucleotide 1915 in 23S rRNA. RNA, 2008. 14(10): p. 2234-44.
25. Kempenaers, M., et al., New archaeal methyltransferases forming 1-methyladenosine or 1-methyladenosine and 1-methylguanosine at position 9 of tRNA. Nucleic Acids Res, 2010. 38(19): p. 6533-43.
26. Somme, J., et al., Characterization of two homologous 2'-O-methyltransferases showing different specificities for their tRNA substrates. RNA, 2014. 20(8): p. 1257-71.
27. Liu, R.J., et al., tRNA recognition by a bacterial tRNA Xm32 modification enzyme from the SPOUT methyltransferase superfamily. Nucleic Acids Res, 2015. 43(15): p. 7489-503.
28. Lv, F., et al., Structural basis for Sfm1 functioning as a protein arginine methyltransferase. Cell Discov, 2015. 1: p. 15037.
29. Swinehart, W.E. and J.E. Jackman, Diversity in mechanism and function of tRNA methyltransferases. RNA Biol, 2015. 12(4): p. 398-411.
30. Elkins, P.A., et al., Insights into catalysis by a knotted TrmD tRNA methyltransferase. J Mol Biol, 2003. 333(5): p. 931-49.
31. Sakaguchi, R., et al., A divalent metal ion-dependent N(1)-methyl transfer to G37-tRNA. Chem Biol, 2014. 21(10): p. 1351-1360.
32. Krishnamohan, A. and J.E. Jackman, Mechanistic features of the atypical tRNA m1G9 SPOUT methyltransferase, Trm10. Nucleic Acids Res, 2017. 45(15): p. 9019-9029.
33. Krishnamohan, A. and J.E. Jackman, A Family Divided: Distinct Structural and Mechanistic Features of the SpoU-TrmD (SPOUT) Methyltransferase Superfamily. Biochemistry, 2019. 58(5): p. 336-345.
34. Young, B.D., et al., Identification of methylated proteins in the yeast small ribosomal subunit: a role for SPOUT methyltransferases in protein arginine methylation. Biochemistry, 2012. 51(25): p. 5091-104.
35. Shao, Z., et al., Crystal structure of tRNA m1G9 methyltransferase Trm10: insight into the catalytic mechanism and recognition of tRNA substrate. Nucleic Acids Res, 2014. 42(1): p. 509-25.
36. Van Laer, B., et al., Structural and functional insights into tRNA binding and adenosine N1-methylation by an archaeal Trm10 homologue. Nucleic Acids Res, 2016. 44(2): p. 940-53.
37. Oerum, S., et al., Structural insight into the human mitochondrial tRNA purine N1-methyltransferase and ribonuclease P complexes. J Biol Chem, 2018. 293(33): p. 12862-12876.
38. Singh, R.K., et al., Structural and biochemical analysis of the dual-specificity Trm10 enzyme from Thermococcus kodakaraensis prompts reconsideration of its catalytic mechanism. RNA, 2018. 24(8): p. 1080-1092.
39. Lemey, P., M. Salemi, and A.-M. Vandamme, The Phylogenetic Handbook: A Practical Approach to Phylogenetic Analysis and Hypothesis Testing. 2nd ed. 2009: Cambridge University Press.
40. Altekar, G., et al., Parallel Metropolis coupled Markov chain Monte Carlo for Bayesian phylogenetic inference. Bioinformatics, 2004. 20(3): p. 407-15.
41. Lewis, P.O., M.T. Holder, and K.E. Holsinger, Polytomies and Bayesian phylogenetic inference. Syst Biol, 2005. 54(2): p. 241-53.
42. Kozbial, P.Z. and A.R. Mushegian, Natural history of S-adenosylmethionine-binding proteins. BMC Struct Biol, 2005. 5: p. 19.
43. Ouzounis, C.A., et al., A minimal estimate for the gene content of the last universal common ancestor--exobiology from a terrestrial perspective. Res Microbiol, 2006. 157(1): p. 57-68.
44. Jamroz, M., et al., KnotProt: a database of proteins with knots and slipknots. Nucleic Acids Res, 2015. 43(Database issue): p. D306-14.
45. Chuang, Y.C., et al., Untying a Protein Knot by Circular Permutation. J Mol Biol, 2019. 431(4): p. 857-863.
46. Ko, K.T., et al., Untying a Knotted SPOUT RNA Methyltransferase by Circular Permutation Results in a Domain-Swapped Dimer. Structure, 2019. 27(8): p. 1224-1233.e4.
47. Fu, L., et al., CD-HIT: accelerated for clustering the next-generation sequencing data. Bioinformatics, 2012. 28(23): p. 3150-2.
48. Holm, L., Benchmarking fold detection by DaliLite v.5. Bioinformatics, 2019. 35(24): p. 5326-5327.
49. Camacho, C., et al., BLAST+: architecture and applications. BMC Bioinformatics, 2009. 10: p. 421.
50. Guex, N. and M.C. Peitsch, SWISS-MODEL and the Swiss-PdbViewer: an environment for comparative protein modeling. Electrophoresis, 1997. 18(15): p. 2714-23.
51. Katoh, K. and D.M. Standley, MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol Biol Evol, 2013. 30(4): p. 772-80.
52. Ronquist, F., et al., MrBayes 3.2: efficient Bayesian phylogenetic inference and model choice across a large model space. Syst Biol, 2012. 61(3): p. 539-42.
53. Ronquist, F., J.P. Huelsenbeck, M. Teslenko, C. Zhang, and J.A.A. Nylander, MrBayes version 3.2 manual: tutorials and model summaries. 2020: https://github.com/NBISweden/MrBayes.
54. Rambaut, A., et al., Posterior Summarization in Bayesian Phylogenetics Using Tracer 1.7. Systematic Biology, 2018. 67(5): p. 901-904.
55. Stamatakis, A., RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics, 2014. 30(9): p. 1312-3.
56. Stamatakis, A., RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics, 2006. 22(21): p. 2688-90.
57. Foley, G., et al., Identifying and engineering ancient variants of enzymes using Graphical Representation of Ancestral Sequence Predictions (GRASP). bioRxiv, 2020: p. 2019.12.30.891457.
58. Guex, N., M.C. Peitsch, and T. Schwede, Automated comparative protein structure modeling with SWISS-MODEL and Swiss-PdbViewer: a historical perspective. Electrophoresis, 2009. 30 Suppl 1: p. S162-73.
59. Marti-Renom, M.A., et al., Comparative protein structure modeling of genes and genomes. Annu Rev Biophys Biomol Struct, 2000. 29: p. 291-325.
60. Webb, B. and A. Sali, Comparative Protein Structure Modeling Using MODELLER. Current Protocols in Bioinformatics, 2016. 54(1).
61. Janson, G. and A. Paiardini, PyMod 3: a complete suite for structural bioinformatics in PyMOL. Bioinformatics, 2020.
62. Liu, J., et al., Crystal structure of tRNA (m1G37) methyltransferase from Aquifex aeolicus at 2.6 A resolution: a novel methyltransferase fold. Proteins, 2003. 53(2): p. 326-8.
63. Waterhouse, A., et al., SWISS-MODEL: homology modelling of protein structures and complexes. Nucleic Acids Res, 2018. 46(W1): p. W296-W303.
64. Holm, L., et al., Searching protein structure databases with DaliLite v.3. Bioinformatics, 2008. 24(23): p. 2780-1.
65. Altschul, S.F., et al., Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res, 1997. 25(17): p. 3389-402.
66. Remmert, M., et al., HHblits: lightning-fast iterative protein sequence searching by HMM-HMM alignment. Nat Methods, 2011. 9(2): p. 173-5.
67. Crooks, G.E., et al., WebLogo: a sequence logo generator. Genome Res, 2004. 14(6): p. 1188-90.
68. Ronquist, F., J.P. Huelsenbeck, M. Teslenko, C. Zhang, and J.A.A. Nylander, MrBayes version 3.2 Manual, in Tutorials and Model Summaries. 2020: https://github.com/NBISweden/MrBayes/blob/develop/doc/manual/Manual_MrBayes_v3.2.pdf.
69. Katoh, K., et al., MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res, 2002. 30(14): p. 3059-66.
70. Hutchinson, E.G. and J.M. Thornton, HERA--a program to draw schematic diagrams of protein secondary structures. Proteins, 1990. 8(3): p. 203-12.
71. Laskowski, R.A., et al., PDBsum: Structural summaries of PDB entries. Protein Sci, 2018. 27(1): p. 129-134.
72. Nureki, O., et al., Deep knot structure for construction of active site and cofactor binding site of tRNA modification enzyme. Structure, 2004. 12(4): p. 593-602.
73. Thomas, S.R., et al., Structural insight into the functional mechanism of Nep1/Emg1 N1-specific pseudouridine methyltransferase in ribosome biogenesis. Nucleic Acids Res, 2011. 39(6): p. 2445-57.
74. Koh, C.S., et al., Small methyltransferase RlmH assembles a composite active site to methylate a ribosomal pseudouridine. Sci Rep, 2017. 7(1): p. 969.
75. Zarembinski, T.I., et al., Deep trefoil knot implicated in RNA binding found in an archaebacterial protein. Proteins, 2003. 50(2): p. 177-83.
76. Perlinska, A.P., et al., Restriction of S-adenosylmethionine conformational freedom by knotted protein binding sites. PLoS Comput Biol, 2020. 16(5): p. e1007904.
77. Schubert, H.L., R.M. Blumenthal, and X. Cheng, Many paths to methyltransfer: a chronicle of convergence. Trends Biochem Sci, 2003. 28(6): p. 329-35.

 
 
 
 