[1] Samsa GP, Bian J, Lipscomb J, Matchar DB. Epidemiology of recurrent cerebral infarction: a Medicare claims-based comparison of first and recurrent strokes on 2-year survival and cost. Stroke. 1999;30:338–349.
[2] CAPRIE Steering Committee. A randomised, blinded, trial of clopidogrel versus aspirin in patients at risk of ischaemic events (CAPRIE). Lancet. 1996;348:1329–1339.
[3] Diener HC, Ringleb PA, Savi P. Clopidogrel for the secondary prevention of stroke. Expert Opin Pharmacother. 2005;6:755–764.
[4] Kernan WN, Viscoli CM, Brass LM, Makuch RW, Sarrel PM, Roberts RS, Gent M, Rothwell P, Sacco RL, Liu RC, Boden-Albala B, Horwitz RI. The Stroke Prognosis Instrument II (SPI-II): a clinical prediction instrument for patients with transient ischemia and nondisabling ischemic stroke. Stroke. 2000;31:456–462.
[5] Hankey GJ, Slattery JM, Warlow CP. Can the long term outcome of individual patients with transient ischaemic attacks be predicted accurately? J Neurol Neurosurg Psychiatry. 1993;56:752–759.
[6] van Wijk I, Kappelle LJ, van Gijn J, Koudstaal PJ, Franke CL, Vermeulen M, Gorter JW, Algra A. Long-term survival and vascular event risk after transient ischaemic attack or minor ischaemic stroke: a cohort study. Lancet. 2005;365:2098–2104.
[7] Weimar C, Bennemann J, Michalski D, Muller M, Luckner K, Katsarava Z, Weber R, Diener HC. Prediction of recurrent stroke and vascular death in patients with transient ischemic attack or nondisabling stroke: a prospective comparison of validated prognostic scores. Stroke. 2010;41:487–493.
[8] Ay H, Gungor L, Arsava EM, et al. A score to predict early risk of recurrence after ischemic stroke. Neurology. 2010;74:128–135.
[9] Liu J, Zhang Z, Razavian N. Deep EHR: chronic disease prediction using medical notes. arXiv preprint arXiv:1808.04928, 2018.
[10] Geraci J, Wilansky P, Luca VD, Roy A, Kennedy JL, Strauss J. Applying deep neural networks to unstructured text notes in electronic medical records for phenotyping youth depression. EBMH. 2017;20:83–87.
[11] Lipton ZC. The mythos of model interpretability. arXiv e-prints, June 2016.
[12] Csáji BC. Approximation with artificial neural networks. Faculty of Sciences, Eötvös Loránd University, Hungary, 2001.
[13] Krizhevsky A, Sutskever I, Hinton GE. ImageNet classification with deep convolutional neural networks. In: Advances in Neural Information Processing Systems 25, 2012:1106–1114.
[14] Werbos PJ. Beyond regression: new tools for prediction and analysis in the behavioral sciences. PhD thesis, Harvard University, 1975.
[15] Cho K, van Merriënboer B, Gulcehre C, Bahdanau D, Bougares F, Schwenk H, Bengio Y. Learning phrase representations using RNN encoder-decoder for statistical machine translation. In: Proceedings of EMNLP, 2014.
[16] Rumelhart DE, Hinton GE, Williams RJ. Learning representations by back-propagating errors. Nature. 1986;323(6088):533–536.
[17] Bahdanau D, Cho K, Bengio Y. Neural machine translation by jointly learning to align and translate. In: International Conference on Learning Representations, 2015.
[18] Yang Z, Yang D, Dyer C, He X, Smola AJ, Hovy EH. Hierarchical attention networks for document classification. In: Proceedings of NAACL-HLT 2016.
[19] Cheng H-T, Koc L, Harmsen J, Shaked T, Chandra T, Aradhye H, Anderson G, Corrado G, Chai W, Ispir M, Anil R, Haque Z, Hong L, Jain V, Liu X, Shah H. Wide & deep learning for recommender systems. In: Proceedings of the 1st Workshop on Deep Learning for Recommender Systems (DLRS), 2016:7–10.
[20] Pearson K. On the criterion that a given system of deviations from the probable in the case of a correlated system of variables is such that it can be reasonably supposed to have arisen from random sampling. Philosophical Magazine Series 5. 1900;50(302):157–175.
[21] Mikolov T, Chen K, Corrado G, Dean J. Efficient estimation of word representations in vector space. International Conference on Learning Representations, 2013.
[22] Harris Z. Distributional structure. Word. 1954;10(2-3):146–162.
[23] Cox DR. The regression analysis of binary sequences. Journal of the Royal Statistical Society, Series B (Methodological). 1958;20(2):215–242.
[24] Ho TK. Random decision forests. In: Proceedings of the 3rd International Conference on Document Analysis and Recognition, Montreal, QC, 14–16 August 1995:278–282.
[25] Breiman L. Random forests. Machine Learning. 2001;45(1):5–32.
[26] Fisher A, Rudin C, Dominici F. Model class reliance: variable importance measures for any machine learning model class, from the "Rashomon" perspective. 2018.
[27] Friedman JH. Greedy function approximation: a gradient boosting machine. Annals of Statistics. 2001;29(5):1189–1232.
[28] DeLong ER, DeLong DM, Clarke-Pearson DL. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics. 1988;44:837–845.
[29] Chiu B, Crichton G, Pyysalo S, Korhonen A. How to train good word embeddings for biomedical NLP. In: Proceedings of BioNLP'16, 2016.
[30] Ratcliff J, Metzener D. Pattern matching: the Gestalt approach. Dr. Dobb's Journal, July 1988.
[31] Srivastava N, Hinton G, Krizhevsky A, Sutskever I, Salakhutdinov R. Dropout: a simple way to prevent neural networks from overfitting. Journal of Machine Learning Research. 2014;15:1929–1958.