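Author (Chinese): 陳玉仁
Author (English): Chen, Yu-Jen
Thesis title (Chinese): 以機器學習進行 1-5 年中風預測—以 MIMIC-IV 資料庫為例
Thesis title (English): Predicting 1 to 5-Year Stroke Risk Using Machine Learning and the MIMIC-IV Database
Advisor (Chinese): 陳鴻文
Advisor (English): Chen, Hung-Wen
Oral defense committee (Chinese): 黃書葦; 劉耕谷
Oral defense committee (English): Huang, Shu-Wei; Liu, Keng-Ku
Degree: Master's
Institution: National Tsing Hua University (國立清華大學)
Department: Cross-College Executive Master's In-Service Degree Program in Smart Manufacturing (智慧製造跨院高階主管碩士在職學位學程)
Student ID: 110005522
Year of publication (ROC calendar): 113 (2024)
Academic year of graduation: 112
Language: Chinese
Number of pages: 47
Keywords (Chinese): 機器學習; 中風預測; MIMIC 資料庫
Keywords (English): Stroke Prediction; Machine Learning; MIMIC-IV Database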
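Stroke (cerebral stroke) is a sudden-onset disease and the second leading cause of death worldwide. Although its onset usually cannot be predicted, warning signs often precede it, so prevention is possible. Early preventive screening can identify high-risk individuals and direct them to appropriate health management and treatment, which can markedly reduce stroke incidence and disability. This eases patients' suffering, lightens the burden on families and society, and saves substantial national healthcare resources, which underscores the importance of preventive stroke screening.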

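This study uses the MIMIC-IV (Medical Information Mart for Intensive Care) database to predict stroke risk over time horizons of one to five years. Because obtaining and processing clinical data is complex, most earlier studies analyzed public datasets, and few used electronic health records (EHR) directly to predict the risk of a first stroke. Since MIMIC-IV contains records of critical-care hospitalizations, this study mines, organizes, and cleans the EHR data directly, then predicts the likelihood of a future stroke in patients who have a hospitalization record but no history of stroke, with the aim of building a highly reliable stroke prevention model.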
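
The one-to-five-year horizons described above imply one binary label per horizon for each patient. The record does not give the thesis's exact cohort or labeling rules, so the following is only a minimal sketch under assumptions: `label_horizons`, the index-admission date, and the 365.25-day year are all hypothetical choices made for illustration.

```python
from datetime import date

def label_horizons(index_date, first_stroke_date, horizons=(1, 2, 3, 4, 5)):
    """Return {years: 0 or 1}: 1 if a first stroke occurs within that many
    years after the index admission (hypothetical labeling rule)."""
    labels = {}
    for years in horizons:
        cutoff_days = 365.25 * years
        if first_stroke_date is None:
            # No stroke observed at all: negative for every horizon.
            labels[years] = 0
        else:
            elapsed = (first_stroke_date - index_date).days
            # Strokes on or before the index date are excluded, since the
            # target population has no prior stroke history.
            labels[years] = int(0 < elapsed <= cutoff_days)
    return labels

# Made-up patient: admitted 2015-03-01, first stroke 2017-09-15.
example = label_horizons(date(2015, 3, 1), date(2017, 9, 15))
print(example)
```

This sketch ignores censoring: a patient whose follow-up ends before a horizon is reached would be labeled negative here, which a real cohort definition would need to handle explicitly.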

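This study uses 13 common machine learning models, including Naive Bayes, Random Forest, support vector machine (SVM), and extreme gradient boosting (XGBoost), and applies a hybrid sampling technique to address class imbalance. Random Forest and the gradient boosting classifier performed best across all prediction horizons, with accuracy of 73-77% and AUC between 0.82 and 0.84. The study also found that marital status, systolic blood pressure, diastolic blood pressure, sex, age, and blood glucose are the main factors affecting stroke risk. The results are intended to support clinical judgment and enable earlier intervention to prevent stroke.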
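
Hybrid sampling combines oversampling of the minority class with undersampling of the majority class. The abstract does not say which variant the thesis uses, so the sketch below swaps in the simplest one, random oversampling plus random undersampling, using only the Python standard library. The function name `hybrid_resample` and the `target_per_class` parameter are assumptions made for illustration.

```python
import random

def hybrid_resample(rows, labels, target_per_class, seed=0):
    """Bring every class to target_per_class samples: undersample larger
    classes without replacement, oversample smaller ones with replacement."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)

    out_rows, out_labels = [], []
    for label, members in by_class.items():
        if len(members) >= target_per_class:
            # Majority class: keep a random subset.
            sampled = rng.sample(members, target_per_class)
        else:
            # Minority class: keep all originals, then add random duplicates.
            extra = rng.choices(members, k=target_per_class - len(members))
            sampled = members + extra
        out_rows.extend(sampled)
        out_labels.extend([label] * target_per_class)
    return out_rows, out_labels

# Made-up data: 10 positive (stroke) rows and 90 negative rows.
rows = [[i] for i in range(100)]
labels = [1 if i < 10 else 0 for i in range(100)]
bal_rows, bal_labels = hybrid_resample(rows, labels, target_per_class=30)
print(bal_labels.count(1), bal_labels.count(0))
```

Resampling of this kind is normally applied to the training split only, so that the test set keeps the original class ratio.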

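The main contribution of this study is a model that predicts stroke within one to five years from EHR records, with SHAP used to explain the factors that influence stroke risk. This can support early intervention in clinical practice, thereby reducing stroke incidence and disability, easing the burden on patients and society, and saving healthcare resources.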
Stroke is the second leading cause of death globally. Early screening can identify high-risk individuals, allowing for effective prevention and reducing incidence and disability rates. This study uses the MIMIC-IV database to predict stroke risk over one to five years. Unlike previous studies relying on public datasets, this research uses EHR data to predict first-time stroke risk in hospitalized patients, aiming to create a reliable prevention model.

Thirteen machine learning models, including Naive Bayes, Random Forest, SVM, and XGBoost, were used with hybrid sampling to address class imbalance. Random Forest and Gradient Boosting performed best, with accuracy of 73-77% and AUC values of 0.82-0.84. Key factors influencing stroke risk include marital status, blood pressure, gender, age, and blood glucose. These findings can help in clinical decision-making for early intervention.
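
Accuracy and AUC measure different things, which matters for an imbalanced outcome such as stroke. Accuracy depends on a decision threshold, while AUC is the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one. The sketch below computes both from scratch on made-up scores. The data and the 0.5 threshold are assumptions, not values from the thesis.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def roc_auc(y_true, scores):
    """AUC via pairwise comparison: share of (positive, negative) pairs in
    which the positive scores higher; ties count as half a win."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Made-up labels and model scores for six patients.
y_true = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.4, 0.35, 0.2, 0.6, 0.8]
y_pred = [int(s >= 0.5) for s in scores]  # assumed 0.5 threshold

print(round(accuracy(y_true, y_pred), 3), round(roc_auc(y_true, scores), 3))
```

The pairwise loop is quadratic and only meant to make the definition visible; a library routine would be the usual choice for real data.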

The study's main contribution is developing a stroke prediction model using EHR data and SHAP to interpret risk factors, aiding early intervention, reducing stroke incidence and disability, and saving medical resources.
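
SHAP attributes a single prediction to its input features using Shapley values. To show what that attribution means, the sketch below computes exact Shapley values by enumerating feature subsets for a toy scoring function. Everything in it is an assumption for illustration: `toy_score` is a hypothetical model, the zero baseline stands in for "feature absent", and practical SHAP tooling relies on faster approximations rather than full enumeration.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, x, baseline):
    """Exact Shapley values for one input x, replacing absent features
    with the corresponding baseline values."""
    n = len(x)

    def value(subset):
        mixed = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Weight of a coalition of this size in the Shapley average.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi

def toy_score(features):
    """Hypothetical risk score; the third feature is deliberately unused."""
    age, sbp, glucose = features
    return 2 * age + 3 * sbp + age * sbp

x = [1.0, 1.0, 5.0]         # made-up standardized inputs
baseline = [0.0, 0.0, 0.0]  # assumed reference patient
phi = shapley_values(toy_score, x, baseline)
print(phi)
```

The attributions sum to the difference between the prediction for `x` and the prediction for the baseline, and the unused feature receives zero, which are the properties that make Shapley values useful for ranking risk factors.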
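Table of Contents
List of Tables
List of Figures
Chapter 1 Introduction
1.1 Research Background and Motivation
1.2 Research Objectives
1.3 Research Scope
1.4 Thesis Structure and Workflow
Chapter 2 Literature Review
2.1 Stroke (Cerebral Stroke)
2.2 The MIMIC Database
2.3 Current State of Machine Learning in Stroke Prediction
2.4 Class Imbalance
2.5 Introduction to SHAP
Chapter 3 Methodology
3.1 Research Design
3.2 Study Population
3.3 Data Cleaning and Preprocessing
3.4 Model Selection
3.5 Model Evaluation
Chapter 4 Results
4.1 Data Analysis
4.2 Comparison of Prediction Models
4.3 Interpretability Analysis
4.4 Regression Model Prediction
Chapter 5 Conclusion
5.1 Conclusions
5.2 Limitations
5.3 Directions for Future Research
References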
