Author (Chinese): 鄭明哲
Author (English): Cheng, Ming-Che
Title (Chinese): 深度隨機時間序列補值法於電子醫療病例的應用
Title (English): Deep STI: Deep Stochastic Time-series Imputation on Electronic Medical Records
Advisors (Chinese): 林澤
翁詠祿
Advisors (English): Lin, Che
Ueng, Yeong-Luh
Committee Members (Chinese): 蘇東弘
王偉仲
簡仁宗
Committee Members (English): Su, Tung-Hung
Wang, Weichung
Chien, Jen-Tzung
Degree: Master
Institution: National Tsing Hua University
Department: Institute of Communications Engineering
Student ID: 107064520
Year of Publication (ROC calendar): 110 (2021)
Academic Year of Graduation: 110
Language: English
Number of Pages: 54
Keywords (Chinese): 深度學習、電子病歷、補值、癌症預測、肝細胞癌
Keywords (English): Deep learning, electronic health record, imputation, cancer prediction, hepatocellular carcinoma
Abstract (Chinese):
In recent years, deep learning has been applied to the medical field and has brought many advances. One such application is the study of electronic health records (EHRs). Deep neural networks can extract disease information from medical records and have the potential to provide novel and distinctive insights. However, the incompleteness of EHRs hinders the development of deep models and limits their prediction performance in practice. The laboratory tests in EHRs are sampled at different frequencies, and each item is ordered under different conditions, so no single record can contain every item. When the records are integrated, the untested items therefore become a large number of missing values, leaving the EHRs incomplete. How to replace these missing values is the key to whether a deep model can predict disease accurately. This study proposes Deep Stochastic Time-series Imputation (Deep STI) to address the challenge posed by missing values in EHRs. Its core concept is to replace missing values with a deep imputation network while using the imputed EHRs to predict disease. By combining the imputation network with the prediction network, the proposed model can be applied to variable-length EHR sequences, generating the missing values within them while simultaneously producing disease predictions. We evaluated the prediction model on predicting whether patients with liver disease would develop hepatocellular carcinoma (HCC) within one year, and we show in this study that our model indeed improves prediction performance, achieving a mean AUPRC of 40.74% and a mean AUROC of 89.46%. Compared with logistic regression, which achieved only a mean AUPRC of 22.68% and a mean AUROC of 83.39%, our prediction model shows a clear advantage. In addition, we found that the imputation network improves the model's stability and reduces the influence of randomness. We believe this study will be an important step in applying deep time-series imputation to EHR research.
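The abstracts report performance as mean AUPRC and mean AUROC averaged over repeated evaluations and compared against logistic regression. Below is a minimal sketch of how such averaged metrics could be computed; it is not the thesis code, and the scikit-learn usage and the helper name mean_auroc_auprc are assumptions made for illustration.

import numpy as np
from sklearn.metrics import average_precision_score, roc_auc_score

def mean_auroc_auprc(y_true_runs, y_score_runs):
    # Average AUROC / AUPRC over repeated runs (e.g., random seeds or CV folds).
    # y_true_runs:  list of arrays of binary labels (1 = HCC within one year).
    # y_score_runs: list of arrays of predicted risk scores from a model.
    aurocs = [roc_auc_score(y, s) for y, s in zip(y_true_runs, y_score_runs)]
    auprcs = [average_precision_score(y, s) for y, s in zip(y_true_runs, y_score_runs)]
    return float(np.mean(aurocs)), float(np.mean(auprcs))

# Call pattern with synthetic data (five hypothetical evaluation runs):
rng = np.random.default_rng(0)
labels = [rng.integers(0, 2, size=200) for _ in range(5)]
scores = [rng.random(200) for _ in range(5)]
print(mean_auroc_auprc(labels, scores))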
Abstract (English):
In recent years, deep learning has brought advances to various medical applications. One of them is exploring electronic health records (EHRs) via deep learning. Deep neural networks (DNNs) possess the ability to extract abstract information from patients' medical histories and the potential to provide new and intelligent insights. However, the lack of complete EHRs hinders model development and limits model performance. Features in EHRs are observed at different frequencies or under different conditions, so not all features are recorded in every EHR. After feature alignment, the unobserved features in each EHR turn into numerous missing values. Determining a good representation of missing values is one of the keys to improving prediction performance. This study proposes the Deep Stochastic Time-series Imputation (Deep STI) model to address this challenge. The central concept is to infer missing values from observed values with an imputation network and simultaneously predict the target disease from the imputed data with a prediction network. By integrating both the imputation network and the prediction network into an end-to-end architecture, our model can generate missing values in variable-length EHR sequences and predict the target probability at the same time. We evaluated our model on predicting hepatocellular carcinoma (HCC) within one year for real-world patients. Numerical experiments showed that Deep STI improves prediction performance, yielding the top mean AUROC of 89.46% and mean AUPRC of 40.74%. Our model significantly outperformed logistic regression, which achieved a mean AUROC of 83.39% and a mean AUPRC of 22.68%. In addition, our imputation mechanism efficiently improves model stability and reduces the variance caused by randomness. We believe that this study is an important piece of research in developing deep learning models for time-series EHR imputation.
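To make the end-to-end idea described above concrete, the following is a minimal PyTorch-style sketch written under assumptions: an imputation network (here a bidirectional GRU) reconstructs all features from the observed values and an observation mask, the reconstructed values fill only the missing entries, and a prediction network (a GRU with a linear classifier) scores HCC risk from the imputed sequence. The class name ImputePredict, the layer choices, and the loss in the final comment are illustrative, not the thesis's exact architecture; the stochastic, VAE-based components of Deep STI listed in the table of contents are omitted for brevity.

import torch
import torch.nn as nn

class ImputePredict(nn.Module):
    def __init__(self, n_features, hidden=64):
        super().__init__()
        # Imputation network: bidirectional GRU over [values, mask] -> feature estimates.
        self.impute_rnn = nn.GRU(n_features * 2, hidden, batch_first=True,
                                 bidirectional=True)
        self.impute_out = nn.Linear(2 * hidden, n_features)
        # Prediction network: GRU over the imputed sequence + binary classifier.
        self.pred_rnn = nn.GRU(n_features, hidden, batch_first=True)
        self.classifier = nn.Linear(hidden, 1)

    def forward(self, x, mask):
        # x:    (batch, time, n_features), missing entries set to 0
        # mask: (batch, time, n_features), 1 where a value was observed
        h, _ = self.impute_rnn(torch.cat([x, mask], dim=-1))
        x_hat = self.impute_out(h)              # reconstructed features
        x_imp = mask * x + (1 - mask) * x_hat   # keep observed values, fill the rest
        out, _ = self.pred_rnn(x_imp)
        logit = self.classifier(out[:, -1])     # last time step summarizes the sequence
        return x_hat, torch.sigmoid(logit).squeeze(-1)

# Joint (end-to-end) training would combine a classification loss with a
# reconstruction loss restricted to the observed entries, for example:
#   loss = bce(prob, label) + lam * (((x_hat - x) * mask) ** 2).mean()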
Acknowledgements i
Chinese Abstract ii
Abstract iii
List of Figures vii
List of Tables viii
Chapter 1. Introduction 1
Chapter 2. Data 4
2.1 Data Description 4
2.2 The Prediction Task 4
2.3 Feature 4
2.4 Preprocessing 5
2.4.1 Data Cleaning 6
2.4.2 Summarization 7
Chapter 3. Method 11
3.1 Deep Neural Network 11
3.2 Recurrent Neural Network 13
3.2.1 Long Short-term Memory 15
3.2.2 Gated Recurrent Unit 16
3.2.3 Bidirectional Recurrent Neural Network 17
3.3 Variational Autoencoder 18
3.4 RNN-based VAE 20
3.5 KL vanishing 21
3.5 Normalizing Flow 22
3.6 Inverse Autoregressive Flow 23
3.7 Deep Stochastic Time-series Imputation 24
3.7.1 Imputation Block 25
3.7.2 Prediction Block 27
3.7.3 Training 28
Chapter 4. Experiment 30
4.1 Baseline Model 30
4.2 The Subgroup Patients 31
4.3 Evaluation Metrics 32
4.4 Result 34
4.4.1 All Patients 34
4.4.2 HBV(\CIR) Patients 36
4.4.3 HCV(\CIR) Patients 38
4.4.4 CIR Patients 39
4.4.5 OTHER Patients 41
Chapter 5. Discussion 44
5.1 Subgroup Analysis via Visualization 44
5.2 Risk Score Comparison 46
5.3 Stability Analysis 46
5.4 Influence of LOCF 48
5.5 Future Enhancement 49
Chapter 6. Conclusion 51
References 52
(The full text will be open to external access after 2026-11-04.)