Author (Chinese): 游靖雯
Author (English): Yu, Ching-Wen
Title (Chinese): 基於Transformer之單張影像去雨
Title (English): Efficient Transformer for Single Image Deraining
Advisor (Chinese): 林嘉文
Advisor (English): Lin, Chia-Wen
Committee members (Chinese): 林彥宇, 許志仲, 陳駿丞
Committee members (English): Lin, Yen-Yu; Hsu, Chih-Chung; Chen, Jun-Cheng
Degree: Master's
Institution: National Tsing Hua University
Department: Department of Electrical Engineering
Student ID: 109061619
Publication year: 2023 (ROC 112)
Graduation academic year: 111
Language: English
Pages: 28
Keywords (Chinese): 單張圖像去雨, 變壓器網路
Keywords (English): Single Image Deraining, Transformer Network
Building on the rapid progress of deep learning, many image restoration approaches employ attention mechanisms to strengthen the model's extraction of image features before reconstructing the image. However, for low-quality, unclear photographs, or for images occluded by rain and fog, such methods often fail to recover high-quality, clear images with fine detail. Moreover, most existing single image deraining algorithms are trained on synthetic rain datasets, so applying them directly to real rain yields unsatisfactory results; training a model with limited samples to achieve good deraining performance has therefore become a challenge.
Conventional image restoration methods mostly employ Convolutional Neural Networks (CNNs) to learn and extract both the overall contour features and the detailed information of images at multiple scales, in order to improve restoration performance. With the rapid development of deep learning, CNNs have effectively dominated computer vision, achieving considerable success across vision tasks of different levels, including image restoration. Recently, however, Transformer-based models have shown impressive performance, even surpassing CNN-based approaches and becoming the state of the art for high-level vision tasks.
In this thesis, we first propose a Transformer-based restoration architecture for image deraining, which uses a U-Net architecture as its backbone and replaces the original convolutional layers with Transformer layers as its basic building blocks.
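The idea described above can be sketched in a few lines: a U-shaped pipeline where each scale processes pixel tokens with a self-attention (Transformer) layer instead of a convolution, with a U-Net skip connection between encoder and decoder. This is a minimal, hypothetical illustration, not the thesis implementation; all function names, the single attention head, and the tiny feature sizes are assumptions made for clarity.

```python
import numpy as np

def self_attention(x, wq, wk, wv):
    # x: (n_tokens, c). Single-head scaled dot-product attention.
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ k.T / np.sqrt(k.shape[1])
    scores -= scores.max(axis=1, keepdims=True)      # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=1, keepdims=True)
    return attn @ v

def downsample(img):
    # 2x2 average pooling on an (h, w, c) feature map.
    h, w, c = img.shape
    return img.reshape(h // 2, 2, w // 2, 2, c).mean(axis=(1, 3))

def upsample(img):
    # Nearest-neighbour 2x upsampling back to the encoder resolution.
    return img.repeat(2, axis=0).repeat(2, axis=1)

def transformer_unet(img, weights):
    # One encoder level, one bottleneck level, one decoder level.
    h, w, c = img.shape
    tokens = img.reshape(h * w, c)
    enc = tokens + self_attention(tokens, *weights)   # residual Transformer layer
    skip = enc.reshape(h, w, c)
    mid = downsample(skip)
    mt = mid.reshape(-1, c)
    mid = (mt + self_attention(mt, *weights)).reshape(mid.shape)
    return upsample(mid) + skip                       # U-Net skip connection

rng = np.random.default_rng(0)
c = 8
weights = [rng.standard_normal((c, c)) * 0.1 for _ in range(3)]
rainy = rng.standard_normal((4, 4, c))               # toy rainy feature map
out = transformer_unet(rainy, weights)
print(out.shape)  # → (4, 4, 8)
```

The residual form `tokens + self_attention(...)` mirrors the standard Transformer layer; in the actual architecture each level would also include a feed-forward network, normalization, and learned down/upsampling rather than pooling.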
Contents

Abstract (Chinese)
Abstract
Contents
1 Introduction
2 Related Work
  2.1 Image Deraining
  2.2 Vision Transformer
3 Methodology
  3.1 Overall Pipeline
  3.2 Multi-Dconv Feature Attention
  3.3 Dconv Feed-Forward Network
  3.4 Selective Kernel Feature Fusion
  3.5 Progressive Learning
  3.6 Loss Function
4 Experiments
  4.1 Datasets and Evaluation Protocol
  4.2 Implementation Details
  4.3 Single Image Deraining Results
  4.4 Ablation Studies
5 Conclusion
References