
Detailed Record

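Author (Chinese): 巴達古
Author (English): Dagoberto Josue Vaquedano Garrido
Thesis title (English): A Genetic Algorithm Based On Maximum Likelihood and Normalized Mutual Information to Infer Haplotypes from Genotypes
Thesis title (Chinese): 基於最大可能性與正規化交互資訊之基因演算法來從基因型推論單倍體
Advisor (Chinese): 蘇豐文
Advisor (English): Soo, Von Wun
Committee members (Chinese): 陳朝欽; 陳宜欣
Committee members (English): Chen, Chaur Chin; Chen, Yi Shin
Degree: Master's
Institution: National Tsing Hua University
Department: Institute of Information Systems and Applications
Student ID: 101065422
Year of publication (ROC calendar): 103 (2014)
Academic year of graduation: 102
Language: English
Number of pages: 54
Keywords: Genetic Algorithm; Haplotype Inference Problem; Hardy-Weinberg Equilibrium; Linkage Disequilibrium; Maximum Likelihood Estimates; Normalized Mutual Information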
Haplotypes consist of blocks of single nucleotide polymorphisms (SNPs). As units of inheritance, haplotypes are widely used in association studies and candidate-gene studies. However, obtaining these blocks of SNPs through in vitro methods is both time-consuming and expensive, so in silico studies try to infer haplotypes from genotypic data instead. This thesis uses a genetic algorithm (a heuristic approach) guided by two genetic models: the Hardy-Weinberg equilibrium and linkage disequilibrium. These are assessed statistically by maximum likelihood estimates and by normalized mutual information, respectively. The technique generates an adequate solution in polynomial time to an inherently NP-hard problem. The results show that the algorithm achieves a higher accuracy rate than a genetic algorithm that uses only the Hardy-Weinberg equilibrium.
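Abstract (translated from the Chinese): Haplotypes contain multiple groups of single nucleotide polymorphisms (SNPs). As a unit in genetic research, haplotypes have been used extensively in association studies and candidate-gene studies. However, obtaining this SNP information through in vitro experiments is both very time-consuming and very costly. Instead, the author takes an in silico approach that uses and reasons over genotype data to recover the haplotype information. The study develops a new genetic algorithm based on two genetic models: the Hardy-Weinberg equilibrium and linkage disequilibrium. The two models are evaluated statistically with maximum likelihood estimates and normalized mutual information, respectively. For this NP-hard problem, the genetic algorithm produces an adequate solution in polynomial time. The results show that the genetic algorithm used in this study achieves a higher accuracy rate than one that uses only the Hardy-Weinberg equilibrium.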
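The ingredients named in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation. The genotype coding (0 and 1 for the two homozygous states, 2 for heterozygous), all function names, and the choice to normalize mutual information by the geometric mean of the two entropies are assumptions made for this example. This record does not say how the thesis combines the two scores into one fitness function, so the sketch keeps them separate.

```python
from collections import Counter
from itertools import product
from math import log, sqrt


def compatible_pairs(genotype):
    """List the unordered haplotype pairs that could explain one genotype.

    Assumed coding per SNP site: 0 or 1 = homozygous, 2 = heterozygous.
    """
    het = [i for i, g in enumerate(genotype) if g == 2]
    pairs = set()
    for bits in product((0, 1), repeat=len(het)):
        h1, h2 = list(genotype), list(genotype)
        for i, b in zip(het, bits):
            h1[i], h2[i] = b, 1 - b  # heterozygous site: alleles differ
        pairs.add(tuple(sorted((tuple(h1), tuple(h2)))))
    return sorted(pairs)


def hwe_log_likelihood(resolution):
    """Log-likelihood of one haplotype pair per individual under
    Hardy-Weinberg equilibrium, with haplotype frequencies estimated
    by counting (their maximum likelihood estimate for phased data)."""
    counts = Counter(h for pair in resolution for h in pair)
    total = sum(counts.values())
    ll = 0.0
    for h1, h2 in resolution:
        p1, p2 = counts[h1] / total, counts[h2] / total
        ll += log(p1 * p2 if h1 == h2 else 2 * p1 * p2)
    return ll


def entropy(values):
    """Shannon entropy (natural log) of an observed sample."""
    n = len(values)
    return -sum(c / n * log(c / n) for c in Counter(values).values())


def normalized_mutual_information(haplotypes, i, j):
    """NMI between SNP sites i and j, used here as a stand-in measure of
    linkage disequilibrium. Normalization by sqrt(H(X) * H(Y)) is an
    assumption; other normalizations exist."""
    xs = [h[i] for h in haplotypes]
    ys = [h[j] for h in haplotypes]
    hx, hy = entropy(xs), entropy(ys)
    if hx == 0 or hy == 0:
        return 0.0  # a monomorphic site carries no information
    mi = hx + hy - entropy(list(zip(xs, ys)))
    return mi / sqrt(hx * hy)
```

A genotype with k heterozygous sites (k ≥ 1) has 2^(k-1) compatible pairs, which is why exhaustive search over all individuals quickly becomes infeasible and a heuristic such as a genetic algorithm is attractive. In a genetic algorithm of this kind, a candidate solution would pick one compatible pair per individual and be scored by quantities like the two above.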
Table of Contents
Abstract
Abstract (Chinese)
Acknowledgement
Chapter 1. Introduction
Chapter 2. Haplotype Inference Problem
Chapter 3. Existing Methods
Chapter 4. Genetic Models
  Hardy-Weinberg Equilibrium
  Linkage Disequilibrium
Chapter 5. Method
  Stage 1
  Initialization
  Stage 2
  Partial Haplotype Generator
  Stage 3
  Partial Haplotype Pair Generator
  Stage 4
  Final Haplotype Pair Generator
  Stage 5
  Fitness Function
  Genetic Algorithm Settings
Chapter 6. Materials
  β2-Adrenergic Receptors
  Gene CYP19
  Apolipoprotein E
  Lipoprotein Lipase Gene
Chapter 7. Results and Comparisons
  Accuracy Rate
  β2-Adrenergic Receptors
  Gene CYP19
  Apolipoprotein E
  Lipoprotein Lipase Gene
Chapter 8. Future Work
Chapter 9. Conclusions
References
Figures & Tables