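Enhancing the Performance of Anomalous Sound Detection Models with Compound Scaling - National Tsing Hua University Theses and Dissertations Full-Text System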


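Author (Chinese):	王昌聞
Author (English):	Wang, Chang-Wen
Thesis title (Chinese):	利用複合式縮放強化異常聲音檢測模型之性能
Thesis title (English):	Extended STgram: Improved Anomalous Sound Detection by Compound Scaling
Advisor (Chinese):	林裕訓
Advisor (English):	Lin, Yu-Hsun
Committee members (Chinese):	劉建良、林春成
Committee members (English):	Liu, Chien-Liang; Lin, Chun-Cheng
Degree:	Master's
Institution:	National Tsing Hua University
Department:	Department of Industrial Engineering and Engineering Management
Student ID:	110034525
Year of publication (ROC calendar):	112
Academic year of graduation:	111
Language:	English
Number of pages:	26
Keywords (Chinese):	異常聲音檢測、自監督學習
Keywords (English):	Anomalous Sound Detection, Self-Supervised Learning, STgram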

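Anomalous sound detection is the task of using machine learning or deep learning algorithms to identify abnormal sounds from a specific source (such as machines in a factory).
By detecting abnormal signals early, we can perform predictive maintenance on machines and prevent possible failures. Intelligent analysis of acoustic signals also helps ensure that the manufacturing process runs smoothly. Improving the performance of anomalous sound detection models therefore plays an important role in smart manufacturing (Industry 4.0). To enhance anomalous sound detection performance, we apply compound scaling, an important concept from computer vision. By introducing compound scaling, our proposed detection model outperforms other AI models in detection performance at low false-alarm rates. Moreover, applying a computer vision concept (namely compound scaling) to acoustic data is an important contribution to future research on cross-domain applications.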
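
To make the idea of compound scaling concrete, the following is a minimal sketch of the scaling rule as described for EfficientNet in computer vision, where depth, width, and input resolution are grown together by a single coefficient. It is not the thesis's own procedure: the names `ScalingCoefficients`, `compound_scale`, and `flops_multiplier` are hypothetical, and the default values 1.2, 1.1, and 1.15 are the ones reported for EfficientNet, not values taken from this record.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScalingCoefficients:
    """Base multipliers per dimension (assumed: EfficientNet's reported values)."""
    alpha: float = 1.2   # network depth
    beta: float = 1.1    # network width
    gamma: float = 1.15  # input resolution


def compound_scale(phi, coeffs=ScalingCoefficients()):
    """Return (depth, width, resolution) multipliers for compound coefficient phi."""
    return (coeffs.alpha ** phi, coeffs.beta ** phi, coeffs.gamma ** phi)


def flops_multiplier(phi, coeffs=ScalingCoefficients()):
    """Approximate cost growth: (alpha * beta^2 * gamma^2) ** phi, roughly 2 ** phi."""
    return (coeffs.alpha * coeffs.beta ** 2 * coeffs.gamma ** 2) ** phi


if __name__ == "__main__":
    for phi in range(4):
        depth, width, resolution = compound_scale(phi)
        print(phi, round(depth, 3), round(width, 3), round(resolution, 3),
              round(flops_multiplier(phi), 3))
```

The point of the single coefficient is that no one dimension is enlarged in isolation. How the thesis maps these dimensions onto audio input, model, and feature tensor is described in the full text, not in this sketch.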

The anomalous sound detection (ASD) task is to identify anomalous sounds emitted by a target source (e.g., factory machines).
By detecting abnormal signals in advance, we can conduct predictive maintenance and prevent possible machine failures.
Factory operations can thus be safeguarded by incorporating intelligent processing of acoustic sensor data.
Hence, improving anomalous sound detection plays an important role in smart manufacturing (e.g., Industry 4.0).
To improve the performance of anomalous sound detection, we adopt compound scaling, an influential concept from computer vision research. By integrating compound scaling, our proposed method outperforms other AI models in terms of AUC at low false-positive rates.
Beyond this performance gain, the findings from applying a computer vision concept (i.e., compound scaling) to acoustic data are also valuable for future studies of cross-domain applications.
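
As an illustration of what "AUC at low false-positive rates" can mean, here is a minimal pure-Python sketch of a partial AUC: the area under the ROC curve restricted to false-positive rates in [0, max_fpr], divided by max_fpr. The function names, the cutoff `max_fpr=0.1`, and the simple normalization are assumptions made for this example. The record does not state the exact metric definition used in the thesis.

```python
def roc_points(labels, scores):
    """Return ROC points (fpr, tpr), starting at (0, 0).

    labels: 1 for anomalous, 0 for normal. Higher score means more anomalous.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one normal and one anomalous clip")
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        tp += label
        fp += 1 - label
        # Emit a point only after the last clip in a group of tied scores.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            points.append((fp / n_neg, tp / n_pos))
    return points


def partial_auc(labels, scores, max_fpr=0.1):
    """Trapezoidal area under the ROC curve for FPR in [0, max_fpr], over max_fpr."""
    if not 0.0 < max_fpr <= 1.0:
        raise ValueError("max_fpr must be in (0, 1]")
    pts = roc_points(labels, scores)
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        if x0 >= max_fpr:
            break
        if x1 > max_fpr:
            # Clip the segment at max_fpr by linear interpolation.
            y1 = y0 + (y1 - y0) * (max_fpr - x0) / (x1 - x0)
            x1 = max_fpr
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area / max_fpr


if __name__ == "__main__":
    labels = [0, 0, 1, 1]
    scores = [0.1, 0.4, 0.35, 0.8]
    print(partial_auc(labels, scores, max_fpr=1.0))  # full AUC
    print(partial_auc(labels, scores, max_fpr=0.1))  # low-FPR region only
```

With `max_fpr=1.0` the function reduces to the ordinary AUC. A small cutoff rewards detectors that catch anomalies while raising few false alarms, which is the operating regime the abstract emphasizes.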

Abstract (Chinese)
Abstract
1 Introduction
2 Related Work
2.1 Unsupervised Learning for ASD
2.2 Self-supervised Learning for ASD
2.3 Compound Scaling
3 Proposed Method
3.1 STgram: Baseline Model
3.2 Extended STgram: Compound Scaling of STgram
3.2.1 Scaling of Input Data
3.2.2 Scaling of Model
3.2.3 Scaling of Feature Tensor
4 Experiments and Results
4.1 Dataset
4.2 Training Configurations
4.3 Evaluation Metrics
4.4 Performance Comparison
5 Ablation Study
5.1 Sampling Rate of Audio Clip
5.2 Number of Mel-bins
5.3 Dilated Convolution in TgramNet
5.4 Number of Tgram
6 Conclusion
References

