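Author (Chinese): 林士智
Author (English): Lin, Shih-Chih
Thesis title (Chinese): 俱動態加權多重合成的遮罩式注意力 ConvNeXt Unet 模型於異常偵測與分割
Thesis title (English): Masked Attention ConvNeXt Unet with Multi-Synthesis Dynamic Weighting for Anomaly Detection and Localization
Advisor (Chinese): 賴尚宏
Advisor (English): Lai, Shang-Hong
Committee members (Chinese): 李哲榮; 林彥宇; 鄭嘉珉
Committee members (English): Lee, Che-Rung; Lin, Yen-Yu; Cheng, Chia-Ming
Degree: Master's
Institution: National Tsing Hua University
Department: Cross-College Executive In-Service Master's Degree Program in Intelligent Manufacturing (智慧製造跨院高階主管碩士在職學位學程)
Student ID: 108005508
Year of publication (ROC calendar): 112 (2023)
Academic year of graduation: 111
Language: English
Number of pages: 49
Keywords (translated from Chinese): anomaly detection and segmentation; self-supervised learning
Keywords (English): anomaly detection and localization; self-supervised learning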
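In recent years, self-supervised models have attracted considerable attention in anomaly detection, and prior work has shown that features learned through self-supervision benefit the task. This study adopts a new model, the Masked Attention ConvNeXt UNet with Multi-Synthesis dynamic Weighting (MACoW), which uses the Multi-Synthesis dynamic Weighting (MSdW) algorithm to combine multiple synthesis methods for maximum performance. These methods cover recently proposed self-supervised approaches such as CutPaste, Mask, NSA, and Perlin. The study also adopts the latest ConvNeXtV2 and UNet networks as the reconstructive subnetwork and embeds a Self-Supervised Predictive Convolutional Block with Multi-Attention (SSPCBMA) mechanism to strengthen feature extraction. Finally, the model is evaluated on anomaly detection and segmentation tasks using three datasets: MVTecAD, BTAD, and KSDD2. The results show that the proposed model outperforms existing methods on metrics including pixel AP and PRO, achieving the best performance reported to date.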
In recent years, self-supervised models have gained much attention in anomaly detection. Previous studies have demonstrated that representations learned through self-supervision can benefit anomaly detection. In this work, we introduce a novel pipeline, named Masked Attention ConvNeXt UNet with Multi-Synthesis dynamic Weighting (MACoW), which combines multiple popular self-supervised synthesis methods, such as CutPaste, Mask, NSA, and Perlin, through the Multi-Synthesis dynamic Weighting (MSdW) algorithm to maximize performance. Furthermore, we employ the latest ConvNeXtV2 and UNet networks as our reconstructive subnetwork and embed a Self-Supervised Predictive Convolutional Block with Multi-Attention (SSPCBMA) mechanism to enhance feature extraction capability. We evaluate the proposed approach on anomaly detection and segmentation tasks using three datasets: MVTecAD, BTAD, and KSDD2. The results demonstrate that the proposed model outperforms state-of-the-art methods, especially on the Pixel AP and PRO metrics.
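The abstract does not spell out how MSdW computes its weights. As a rough illustration of the general idea of dynamically weighting several training signals, the sketch below weights each synthesis method by the coefficient of variation (standard deviation divided by mean) of its observed losses, tracked with Welford's online algorithm. This is an assumed scheme chosen for illustration only, not the thesis's algorithm. The method names reuse those in the abstract, while the loss values and the helpers `RunningStats` and `cov_weights` are hypothetical.

```python
import math


class RunningStats:
    """Welford's online algorithm for the mean and standard deviation."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, value):
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (value - self.mean)

    @property
    def std(self):
        # Population standard deviation; 0.0 until two samples are seen.
        if self.count < 2:
            return 0.0
        return math.sqrt(self.m2 / self.count)


def cov_weights(stats_by_method):
    """Return weights proportional to std/mean of each method's losses."""
    raw = {}
    for name, stats in stats_by_method.items():
        raw[name] = stats.std / stats.mean if stats.mean > 0 else 0.0
    total = sum(raw.values())
    if total == 0.0:
        # Fall back to uniform weights when no variation has been observed.
        return {name: 1.0 / len(raw) for name in raw}
    return {name: value / total for name, value in raw.items()}


# Hypothetical per-step losses for two synthesis methods (made-up numbers).
loss_history = {
    "cutpaste": [1.0, 1.0, 1.0, 1.0],
    "perlin": [1.0, 3.0, 1.0, 3.0],
}
stats = {name: RunningStats() for name in loss_history}
for name, losses in loss_history.items():
    for loss in losses:
        stats[name].update(loss)

weights = cov_weights(stats)
```

Under this assumed scheme, a method whose loss has stopped changing receives little weight, while one whose loss still fluctuates receives more. Other choices, such as weighting by loss magnitude or by gradient norm, would fit the same interface.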
Contents
1 Introduction
1.1 Motivation
1.2 Problem Statement
1.3 Contributions
1.4 Thesis Organization
2 Related Work
2.1 Reconstruction-Based Approaches
2.2 Multi-Loss Dynamic Weighting Methods
3 Proposed Method
3.1 Reconstructive Subnetwork
3.2 Discriminative Subnetwork
3.3 Self-Supervised Predictive Convolutional Block with Multi-Attention (SSPCBMA)
3.4 Simulated Anomaly Generation
3.5 Multi-Synthesis dynamic Weighting
3.6 Loss function explanation
3.6.1 Smooth L1
3.6.2 SSIM Loss
3.6.3 Focal Loss
3.6.4 Dice Loss
3.7 Training Procedure
4 Experiments
4.1 Implementation Details
4.1.1 Dataset
4.1.2 Evaluation Metrics
4.2 Anomaly Detection and Localization on MVTecAD
4.3 Qualitative Examples on MVTecAD
4.4 Qualitative Comparison on MVTecAD
4.5 Visualization of Weight Adjustment for Multiple Synthesis Methods
5 Ablation study
5.1 Comparison of SSPCBMA and SSPCAB
5.2 Comparison of Multi-Synthesis dynamic Weighting (MSdW) and Multi-Loss dynamic Weighting
5.3 The Impact Analysis of Components Capability on Our MACoW Model
5.4 The Loss Function for Localization
6 Conclusions
References