[1] L. van der Maaten and G. Hinton, “Visualizing data using t-SNE,” Journal of Machine Learning Research, vol. 9, no. 86, pp. 2579–2605, 2008.
[2] F.-C. Chen, K.-D. Chen, and Y.-W. Liu, “Domestic sound event detection by shift consistency mean-teacher training and adversarial domain adaptation,” in International Congress on Acoustics, Oct. 2022.
[3] R. Giri, S. V. Tenneti, K. Helwani, F. Cheng, U. Isik, and A. Krishnaswamy, “Unsupervised anomalous sound detection using self-supervised classification and group masked autoencoder for density estimation,” tech. rep., DCASE2020 Challenge, July 2020.
[4] M. Deng, T. Meng, J. Cao, S. Wang, J. Zhang, and H. Fan, “Heart sound classification based on improved MFCC features and convolutional recurrent neural networks,” Neural Networks, vol. 130, pp. 22–32, 2020.
[5] V. Srinivasan, V. Ramalingam, and P. Arulmozhi, “Artificial neural network based pathological voice classification using MFCC features,” International Journal of Science, Environment and Technology, vol. 3, no. 1, pp. 291–302, 2014.
[6] Y. Liu, J. Guan, Q. Zhu, and W. Wang, “Anomalous sound detection using spectral-temporal information fusion,” in ICASSP 2022 – 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 816–820, IEEE, 2022.
[7] J. Lopez, G. Stemmer, and P. Lopez-Meyer, “Ensemble of complementary anomaly detectors under domain shifted conditions,” tech. rep., DCASE2021 Challenge, July 2021.
[8] Y. Su, K. Zhang, J. Wang, and K. Madani, “Environment sound classification using a two-stream CNN based on decision-level fusion,” Sensors, vol. 19, no. 7, 2019.
[9] Y. Su, K. Zhang, J. Wang, D. Zhou, and K. Madani, “Performance analysis of multiple aggregated acoustic features for environment sound classification,” Applied Acoustics, vol. 158, p. 107050, 2020.
[10] R. Cohn and E. Holm, “Unsupervised machine learning via transfer learning and k-means clustering to classify materials image data,” Integrating Materials and Manufacturing Innovation, vol. 10, pp. 231–244, Apr. 2021.
[11] M. Li, S. Gururangan, T. Dettmers, M. Lewis, T. Althoff, N. A. Smith, and L. Zettlemoyer, “Branch-Train-Merge: Embarrassingly parallel training of expert language models,” in International Conference on Learning Representations, May 2023.
[12] Y. Koizumi, Y. Kawaguchi, K. Imoto, T. Nakamura, Y. Nikaido, R. Tanabe, H. Purohit, K. Suefusa, T. Endo, M. Yasuda, and N. Harada, “Description and discussion on DCASE2020 challenge task 2: Unsupervised anomalous sound detection for machine condition monitoring,” in Proceedings of the Detection and Classification of Acoustic Scenes and Events 2020 Workshop (DCASE2020), pp. 81–85, Nov. 2020.
[13] P. Daniluk, M. Gozdziewski, S. Kapka, and M. Kosmider, “Ensemble of autoencoder based systems for anomaly detection,” tech. rep., DCASE2020 Challenge, July 2020.
[14] J. Yamashita, H. Mori, S. Tamura, and S. Hayamizu, “VAE-based anomaly detection with domain adaptation,” tech. rep., DCASE2021 Challenge, July 2021.
[15] A. Ribeiro, L. Matos, P. Pereira, E. Nunes, A. Ferreira, P. Cortez, and A. Pilastri, “Deep dense and convolutional autoencoders for unsupervised anomaly detection in machine condition sounds,” tech. rep., DCASE2020 Challenge, July 2020.
[16] M.-H. Nguyen, D.-Q. Nguyen, D.-Q. Nguyen, C.-N. Pham, D. Bui, and H.-D. Han, “Deep convolutional variational autoencoder for anomalous sound detection,” in 2020 IEEE Eighth International Conference on Communications and Electronics (ICCE), pp. 313–318, Jan. 2021.
[17] P. Primus, “Reframing unsupervised machine condition monitoring as a supervised classification task with outlier-exposed classifiers,” tech. rep., DCASE2020 Challenge, July 2020.
[18] Y. Deng, J. Liu, and J. Ma, “AITHU system for unsupervised anomalous sound detection,” tech. rep., DCASE2021 Challenge, July 2021.
[19] Y. Zeng, H. Liu, L. Xu, Y. Zhou, and L. Gan, “Robust anomaly sound detection framework for machine condition monitoring,” tech. rep., DCASE2022 Challenge, July 2022.
[20] B. McFee, C. Raffel, D. Liang, D. P. Ellis, M. McVicar, E. Battenberg, and O. Nieto, “librosa: Audio and music signal analysis in Python,” in Proceedings of the 14th Python in Science Conference, vol. 8, pp. 18–25, 2015.
[21] K. Palanisamy, D. Singhania, and A. Yao, “Rethinking CNN models for audio classification,” CoRR, vol. abs/2007.11154, 2020.
[22] P. Pedersen, “The mel scale,” Journal of Music Theory, vol. 9, no. 2, pp. 295–308, 1965.
[23] M. Sahidullah and G. Saha, “Design, analysis and experimental evaluation of block based transformation in MFCC computation for speaker recognition,” Speech Communication, vol. 54, no. 4, pp. 543–565, 2012.
[24] M. Müller and S. Ewert, “Chroma Toolbox: MATLAB implementations for extracting variants of chroma-based audio features,” in Proceedings of the 12th International Society for Music Information Retrieval Conference, pp. 215–220, ISMIR, Oct. 2011.
[25] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei, “ImageNet: A large-scale hierarchical image database,” in 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp. 248–255, 2009.
[26] K. Simonyan and A. Zisserman, “Very deep convolutional networks for large-scale image recognition,” in International Conference on Learning Representations, 2015.
[27] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” CoRR, vol. abs/1512.03385, 2015.
[28] C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens, and Z. Wojna, “Rethinking the Inception architecture for computer vision,” CoRR, vol. abs/1512.00567, 2015.
[29] F. Chollet, “Xception: Deep learning with depthwise separable convolutions,” CoRR, vol. abs/1610.02357, 2016.
[30] G. Huang, Z. Liu, and K. Q. Weinberger, “Densely connected convolutional networks,” CoRR, vol. abs/1608.06993, 2016.
[31] M. Tan and Q. V. Le, “EfficientNetV2: Smaller models and faster training,” CoRR, vol. abs/2104.00298, 2021.
[32] P. J. Rousseeuw, “Silhouettes: A graphical aid to the interpretation and validation of cluster analysis,” Journal of Computational and Applied Mathematics, vol. 20, pp. 53–65, 1987.
[33] D. L. Davies and D. W. Bouldin, “A cluster separation measure,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. PAMI-1, no. 2, pp. 224–227, 1979.
[34] T. Kanungo, D. Mount, N. Netanyahu, C. Piatko, R. Silverman, and A. Wu, “An efficient k-means clustering algorithm: Analysis and implementation,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 7, pp. 881–892, 2002.
[35] F. Murtagh and P. Contreras, “Algorithms for hierarchical clustering: An overview,” Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, vol. 2, no. 1, pp. 86–97, 2012.
[36] Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, “Image quality assessment: From error visibility to structural similarity,” IEEE Transactions on Image Processing, vol. 13, no. 4, pp. 600–612, 2004.
[37] M. P. Sampat, Z. Wang, S. Gupta, A. C. Bovik, and M. K. Markey, “Complex wavelet structural similarity: A new image similarity index,” IEEE Transactions on Image Processing, vol. 18, no. 11, pp. 2385–2401, 2009.
[38] M. Abadi, A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro, G. S. Corrado, A. Davis, J. Dean, M. Devin, S. Ghemawat, I. Goodfellow, A. Harp, G. Irving, M. Isard, Y. Jia, R. Jozefowicz, L. Kaiser, M. Kudlur, J. Levenberg, D. Mané, R. Monga, S. Moore, D. Murray, C. Olah, M. Schuster, J. Shlens, B. Steiner, I. Sutskever, K. Talwar, P. Tucker, V. Vanhoucke, V. Vasudevan, F. Viégas, O. Vinyals, P. Warden, M. Wattenberg, M. Wicke, Y. Yu, and X. Zheng, “TensorFlow: Large-scale machine learning on heterogeneous systems,” 2015. Software available from tensorflow.org.
[39] A. Radford, L. Metz, and S. Chintala, “Unsupervised representation learning with deep convolutional generative adversarial networks,” in International Conference on Learning Representations, 2016.
[40] M. Arjovsky, S. Chintala, and L. Bottou, “Wasserstein GAN,” CoRR, vol. abs/1701.07875, 2017.
[41] I. Haloui, J. S. Gupta, and V. Feuillard, “Anomaly detection with Wasserstein GAN,” CoRR, vol. abs/1812.02463, 2018.
[42] J. Ho, A. Jain, and P. Abbeel, “Denoising diffusion probabilistic models,” in Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020 (NeurIPS 2020), December 6–12, 2020, virtual, 2020.
[43] J. Wolleb, F. Bieder, R. Sandkühler, and P. C. Cattin, “Diffusion models for medical anomaly detection,” in International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 35–45, Springer, 2022.
[44] A. van den Oord, S. Dieleman, H. Zen, K. Simonyan, O. Vinyals, A. Graves, N. Kalchbrenner, A. W. Senior, and K. Kavukcuoglu, “WaveNet: A generative model for raw audio,” in The 9th ISCA Speech Synthesis Workshop, p. 125, ISCA, 2016.
[45] E. Tzeng, J. Hoffman, K. Saenko, and T. Darrell, “Adversarial discriminative domain adaptation,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7167–7176, 2017.
[46] K. Zhou, Z. Liu, Y. Qiao, T. Xiang, and C. C. Loy, “Domain generalization: A survey,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 45, no. 4, pp. 4396–4415, 2022.