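Author (Chinese): 蔡俊彥
Author (English): Tsai, Jyun-yan
Title (Chinese): 解決多目標密碼子優化的有效率演算法
Title (English): Efficient Algorithms for Solving Multi-objective Codon Optimization
Advisor (Chinese): 盧錦隆
Advisor (English): Lu, Chin-Lung
Committee members (Chinese): 邱顯泰; 林苕吟
Committee members (English): Chiu, Hsien-Tai; Lin, Tiao-Yin
Degree: Master's
Institution: National Tsing Hua University
Department: Department of Computer Science
Student ID: 108062594
Year of publication: 2021 (ROC year 110)
Academic year of graduation: 109 (ROC calendar)
Language: Chinese
Number of pages: 53
Keywords: codon optimization; dynamic programming; integer linear programming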
Because the genetic code is degenerate, a given amino acid sequence can be translated from many different mRNA sequences. Codon optimization aims to increase protein expression by designing an mRNA sequence without changing the target amino acid sequence. Many factors affect protein expression, including codon usage bias, RNA secondary structure, and motifs. In this thesis, we define a multi-objective codon optimization problem that considers three of these factors: motif appearance, the stability of the RNA secondary structure, and codon usage bias. We first propose a dynamic programming (DP) method to solve this problem when the stability of the RNA secondary structure is to be maximized; for convenience, we denote this problem by MCOmax. We then design an integer linear programming (ILP) method to efficiently obtain an approximate solution to the MCOmax problem. When the stability of the RNA secondary structure is instead to be minimized, the multi-objective codon optimization problem, denoted by MCOmin, is NP-hard. We therefore design an ILP method to efficiently solve the MCOmin problem. Our experimental results show that, for the MCOmax problem, most of the sequences obtained by the ILP method have more stable RNA secondary structures than the wild-type sequences. In addition, the quality of the sequences obtained by our ILP method is close to that of the sequences found by our DP method, while the running time of our ILP method is much shorter than that of our DP method. For the MCOmin problem, most of the sequences obtained by our ILP method have less stable RNA secondary structures than the wild-type sequences.
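To make the notion of codon degeneracy in the abstract concrete, the following minimal Python sketch counts and enumerates the mRNA sequences that encode a short peptide. It is an illustration only and not the DP or ILP method of the thesis. The codon table is a small subset of the standard genetic code, and the names `SYNONYMOUS_CODONS`, `count_encodings`, and `enumerate_encodings` are hypothetical helpers introduced here.

```python
from itertools import product
from math import prod

# Subset of the standard genetic code (RNA alphabet):
# one-letter amino acid code -> list of synonymous codons.
# Only a few amino acids are included to keep the sketch short.
SYNONYMOUS_CODONS = {
    "M": ["AUG"],
    "W": ["UGG"],
    "F": ["UUU", "UUC"],
    "K": ["AAA", "AAG"],
    "L": ["UUA", "UUG", "CUU", "CUC", "CUA", "CUG"],
}


def count_encodings(peptide):
    """Number of distinct mRNA sequences that translate to `peptide`."""
    return prod(len(SYNONYMOUS_CODONS[aa]) for aa in peptide)


def enumerate_encodings(peptide):
    """List every mRNA sequence that translates to `peptide` (brute force)."""
    choices = [SYNONYMOUS_CODONS[aa] for aa in peptide]
    return ["".join(codons) for codons in product(*choices)]


if __name__ == "__main__":
    print(count_encodings("MKL"))      # 1 * 2 * 6 candidate sequences
    print(enumerate_encodings("MKF"))  # all candidates for a 3-residue peptide
```

The number of candidates is the product of the per-residue codon counts, so it grows multiplicatively with peptide length. Brute-force enumeration is therefore only feasible for very short peptides, which is one reason structured methods such as dynamic programming or integer linear programming are of interest for this kind of problem.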
Chinese Abstract
Abstract
Acknowledgement
Contents
List of figures
List of tables
Chapter 1 Introduction
Chapter 2 Methods
2.1 Multi-objective codon optimization problem
2.1.1 Preliminaries
2.1.2 Maximizing Stability of RNA 2D Structures
2.1.3 Minimizing Stability of RNA 2D Structures
2.2 Dynamic programming
2.2.1 Notations
2.2.2 Recursive Formula
2.2.3 Time Complexity
2.3 ILP Formulation
2.3.1 Notations
2.3.2 ILP Variables
2.3.3 ILP Constraints
2.3.4 ILP Objective Function
Chapter 3 Experiment Results and Discussion
3.1 Experimental Dataset
3.2 Experiments of Maximizing Stability of RNA 2D Structures
3.3 Experiments of Minimizing Stability of RNA 2D Structures
3.4 Discussion
Chapter 4 Conclusion
References


(The full text will be open to external access after 2026-08-17.)