
Detailed Record

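Author (Chinese): 陳冠宇
Author (English): Chen, Kuan-Yu
Thesis title (Chinese): 運用一體化醫學語言系統之病歷文字自動轉換ICD-10-CM代碼
Thesis title (English): A Method for Automatic ICD-10-CM Coding from Clinical Free Text by using UMLS
Advisor (Chinese): 林華君
Advisor (English): Lin, Hwa-Chun
Oral examination committee (Chinese): 陳俊良, 蔡榮宗
Degree: Master's
Institution: National Tsing Hua University
Department: Department of Computer Science
Student ID: 104064538
Year of publication (ROC calendar): 107 (2018)
Academic year of graduation: 106
Language: Chinese
Number of pages: 41
Chinese keywords: 一體化醫學語言; 疾病分類; 疾病分類代碼; 自動化; 自然語言處理
English keywords: UMLS; Medical Classification; Medical Codes; Medical Coding; Automation; Natural Language Processing; NLP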
Medical classification, or medical coding, is the process of converting disease diagnoses or symptoms into medical codes according to an established standard. The work requires extensive background knowledge and training, so automating medical classification and coding has become an active research area for medical and information experts.
In this thesis we propose a method that automatically converts clinical free text into ICD-10-CM (International Classification of Diseases, Tenth Revision, Clinical Modification) codes using the vocabularies of the UMLS (Unified Medical Language System). The method consists of three stages: Pre-processing, Filtering and Extension, and Matching. Given the text of a medical record, it recommends several related ICD-10-CM codes in real time for the user's reference, which speeds up classification. The method does not need a large set of electronic medical records to build a model in advance, which eases the difficulty of obtaining such data. Using 93 discharge summaries as reference data, the method reaches more than 50% accuracy on ICD-10-CM category codes. On full ICD-10-CM codes there is still room for improvement: 0.346 accuracy and an F-measure of 0.33.
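The three stages named in the abstract can be illustrated with a minimal sketch. This is not the thesis implementation: the record does not describe the stages in detail, so the function names, the tiny term-to-code table, the synonym table, and the stopword list below are all hypothetical stand-ins for the UMLS-derived dictionary and the spaCy and WordNet processing listed in the table of contents. The `f_measure` helper assumes the balanced F1 form, which the abstract does not specify.

```python
import re

# Hypothetical toy lexicon standing in for a UMLS-derived dictionary:
# normalized term -> ICD-10-CM code. A real dictionary would be far larger.
TERM_TO_CODE = {
    "type 2 diabetes mellitus": "E11.9",
    "essential hypertension": "I10",
    "pneumonia": "J18.9",
}
# Hypothetical synonym table used by the extension step.
SYNONYMS = {"high blood pressure": "essential hypertension"}
# Hypothetical stopword list.
STOPWORDS = {"the", "a", "an", "of", "with", "and", "was", "for", "has"}


def preprocess(text):
    """Stage 1 (assumed): lowercase, tokenize, and drop stopwords."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


def filter_and_extend(tokens, max_len=4):
    """Stage 2 (assumed): build candidate n-grams, then add known synonyms."""
    candidates = set()
    for n in range(1, max_len + 1):
        for i in range(len(tokens) - n + 1):
            candidates.add(" ".join(tokens[i:i + n]))
    candidates |= {SYNONYMS[c] for c in candidates if c in SYNONYMS}
    return candidates


def match(candidates):
    """Stage 3 (assumed): look up candidates in the term-to-code dictionary."""
    return sorted({TERM_TO_CODE[c] for c in candidates if c in TERM_TO_CODE})


def suggest_codes(text):
    """Run the three stages in order and return suggested codes."""
    return match(filter_and_extend(preprocess(text)))


def f_measure(predicted, gold):
    """Balanced F-measure (F1) between predicted and gold code sets."""
    predicted, gold = set(predicted), set(gold)
    tp = len(predicted & gold)
    if tp == 0:
        return 0.0
    precision, recall = tp / len(predicted), tp / len(gold)
    return 2 * precision * recall / (precision + recall)
```

Because every step is a dictionary lookup, a sketch like this needs no training data, which mirrors the abstract's point that the method works without a large corpus of medical records. Calling `suggest_codes("Patient has high blood pressure and pneumonia.")` on the toy tables returns both the hypertension and pneumonia codes.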
1 Introduction
1.1 Contribution
1.2 Thesis Organization

2 Background
2.1 International Statistical Classification of Diseases and Related Health Problems, 10th Revision (ICD-10)
2.2 International Statistical Classification of Diseases and Related Health Problems, 10th Revision, Clinical Modification (ICD-10-CM)
2.3 Unified Medical Language System (UMLS)
2.3.1 Metathesaurus
2.3.2 Semantic Network
2.3.3 SPECIALIST Lexicon
2.4 spaCy
2.5 WordNet

3 Related Work
3.1 Automated Medical Classification Coding
3.2 Converting Clinical Text to ICD-9-CM Codes
3.3 Converting Clinical Text to ICD-10 Codes

4 Model Architecture
4.1 Data Structures
4.1.1 Set
4.1.2 List
4.1.3 Dictionary
4.2 Definitions of Terms
4.3 Building the Dictionary
4.4 Pre-Processing
4.5 Filtering and Extension
4.6 Matching

5 Evaluations
5.1 Experimental Data
5.2 Experimental Procedure
5.3 Experimental Results

6 Conclusion
6.1 Summary
6.2 Future Work