[1] 2016 cost of data center outages report. https://goo.gl/OeNM4U.
[2] Google cluster data - discussions, 2011. https://groups.google.com/forum/#!forum/googleclusterdata-discuss.
[3] L. A. Barroso, J. Clidaras, and U. Hölzle. The datacenter as a computer: An introduction to the design of warehouse-scale machines. Synthesis Lectures on Computer Architecture, 8(3):1–154, 2013.
[4] C. Bishop. Pattern Recognition and Machine Learning. Information Science and Statistics. Springer, 2006.
[5] M. Botezatu, I. Giurgiu, J. Bogojeska, and D. Wiesmann. Predicting disk replacement towards reliable data centers. 2016.
[6] G. E. Box, G. M. Jenkins, G. C. Reinsel, and G. M. Ljung. Time series analysis: forecasting and control. John Wiley & Sons, 2015.
[7] L. Breiman. Random forests. Machine Learning, 45(1):5–32, 2001.
[8] X. Chen, C.-D. Lu, and K. Pattabiraman. Failure analysis of jobs in compute clouds: A Google cluster case study. In 2014 IEEE 25th International Symposium on Software Reliability Engineering, pages 167–177. IEEE, 2014.
[9] S. Di, D. Kondo, and W. Cirne. Characterization and comparison of cloud versus grid workloads. In 2012 IEEE International Conference on Cluster Computing, pages 230–238. IEEE, 2012.
[10] Q. Guan and S. Fu. Adaptive anomaly identification by exploring metric subspace in cloud computing infrastructures. In Reliable Distributed Systems (SRDS), 2013 IEEE 32nd International Symposium on, pages 205–214. IEEE, 2013.
[11] G. Hamerly, C. Elkan, et al. Bayesian approaches to failure prediction for disk drives. In ICML, pages 202–209. Citeseer, 2001.
[12] D.-C. Juan, L. Li, H.-K. Peng, D. Marculescu, and C. Faloutsos. Beyond Poisson: Modeling inter-arrival time of requests in a datacenter. In Advances in Knowledge Discovery and Data Mining, pages 198–209. Springer, 2014.
[13] Z. Liu and S. Cho. Characterizing machines and workloads on a Google cluster. In 2012 41st International Conference on Parallel Processing Workshops, pages 397–403. IEEE, 2012.
[14] T. D. Miller and I. L. Crawford Jr. Terminating a non-clustered workload in response to a failure of a system with a clustered workload, Jan. 26 2010. US Patent 7,653,833.
[15] J. F. Murray, G. F. Hughes, and K. Kreutz-Delgado. Machine learning methods for predicting failures in hard drives: A multiple-instance application. Journal of Machine Learning Research, 6(May):783–816, 2005.
[16] D. M. Powers. Evaluation: from precision, recall and F-measure to ROC, informedness, markedness and correlation. 2011.
[17] C. Reiss, A. Tumanov, G. R. Ganger, R. H. Katz, and M. A. Kozuch. Heterogeneity and dynamicity of clouds at scale: Google trace analysis. In SOCC, page 7. ACM, 2012.
[18] C. Reiss, J. Wilkes, and J. L. Hellerstein. Google cluster-usage traces: format + schema. Technical report, Google Inc., Mountain View, CA, USA, Nov. 2011. Revised 2014-11-17 for version 2.1. Posted at https://github.com/google/cluster-data.
[19] B. Schölkopf, K.-K. Sung, C. J. Burges, F. Girosi, P. Niyogi, T. Poggio, and V. Vapnik. Comparing support vector machines with Gaussian kernels to radial basis function classifiers. IEEE Transactions on Signal Processing, 45(11):2758–2765, 1997.
[20] Y. Tan and X. Gu. On predictability of system anomalies in real world. In 2010 IEEE International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems, pages 133–140. IEEE, 2010.
[21] C. Van-Rijsbergen. Information Retrieval. Butterworths, London, England, 2nd edition, 1979.
[22] A. Verma, L. Pedrosa, M. Korupolu, D. Oppenheimer, E. Tune, and J. Wilkes. Large-scale cluster management at Google with Borg. In Proceedings of the Tenth European Conference on Computer Systems, page 18. ACM, 2015.
[23] Q. Zhang, J. Hellerstein, and R. Boutaba. Characterizing task usage shapes in Google compute clusters. 2011.