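Author (Chinese): 黃崇誌
Author (English): Huang, Chung-Chih
Thesis title (Chinese): 基於人臉先驗與感知損失之人臉強化影像超分辨率
Thesis title (English): Face-Enhanced Single Image Super-Resolution Based on Facial Priors and Perceptual Loss
Advisor (Chinese): 林嘉文
Advisor (English): Lin, Chia-Wen
Oral defense committee (Chinese): 蔡文錦
胡敏君
施皇嘉
Oral defense committee (English): Tsai, Wen-Jiin
Hu, Min-Chun
Shih, Huang-Chia
Degree: Master's
Institution: National Tsing Hua University
Department: Department of Electrical Engineering
Student ID: 107061527
Year of publication (ROC calendar): 109 (2020 CE)
Academic year of graduation: 109
Language: English
Number of pages: 31
Keywords (Chinese): 單圖像超分辨率, 人臉超分辨率, 人臉先驗知識, 感知損失
Keywords (English): Single Image Super-Resolution, Face Super-Resolution, Facial Prior Knowledge, Perceptual Loss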
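In recent years, deep learning has achieved excellent performance on image super-resolution tasks, including single-image super-resolution for general objects and face super-resolution for human faces. Face super-resolution is regarded as a special case of the super-resolution task, so the methods it uses differ from those of general super-resolution. In real-world scenes, however, many pictures contain both faces and other objects, such as pictures taken in meeting rooms or uploaded to social networks like Facebook and Instagram. Although faces are an important part of a picture, other regions such as the background or general objects are sometimes important too. In other words, we want to enhance the face regions while preserving the quality of the rest. In most super-resolution methods, faces are handled differently from general objects, so each region must be processed separately. In practical applications, however, processing the same picture with two different models is not a good approach, because it costs double the computational resources for the same goal. In other words, a single model can in fact handle these two similar tasks.
In our method, we take facial priors into account, use a model pretrained on face-related tasks to compute the perceptual loss, and combine the two in a single-image super-resolution method. The facial prior helps us enhance the face regions of the image; computing the perceptual loss with FaceNet is more effective than the previously used VGG network; and using a single-image super-resolution method as the base model helps us extract features from the low-resolution input image. The results show that we enhance the face regions of the image while maintaining performance on general objects, and that a single image can be processed in real time.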
Recently, deep learning has achieved strong performance in image super-resolution (SR), including single-image super-resolution (SISR) for general images and face super-resolution for human face images. Face super-resolution is usually treated as a special case of general super-resolution, so different methods are used for the two tasks. In real-world scenarios, however, many images contain both human faces and other objects, such as photos taken in a meeting room or posted to social media like Facebook and Instagram. Although the faces are an important part of such an image, other regions such as the background or objects are sometimes important as well. In other words, we want to enhance the quality of the faces while preserving the quality of the remaining regions. Most SR methods handle faces and general objects as separate tasks, so enhancing the faces requires a face SR method for the face regions and a general SR method for everything else. In practice, running two models on the same image is a poor solution because it uses almost double the computational resources for a single goal. We argue instead that these two similar tasks can be handled by a single model.
In our work, we take facial prior knowledge into consideration, use a model pretrained on face-related tasks to calculate a perceptual loss, and combine both with a general single-image super-resolution method. The facial prior helps refine the human faces in the image; a perceptual loss computed with FaceNet [1] is more effective and efficient than one computed with the VGG [2] network; and the single-image super-resolution method, used as our backbone model, helps extract features from the low-resolution input image. Results show that our method enhances the face regions of an image, maintains quality in the other regions, and processes a single image in real time.
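The abstract does not give the loss formula, so the following is only a minimal sketch of how a face-weighted pixel loss and a perceptual loss might be combined. Everything in it is an assumption made for illustration: the binary face mask (standing in for the output of a face segmentation step), the names `masked_pixel_loss`, `perceptual_loss`, `toy_features`, and `total_loss`, and the weights `face_weight` and `perceptual_weight`. In particular, `toy_features` is a hypothetical stand-in for a pretrained face network such as FaceNet [1] and is not the thesis's feature extractor.

```python
from typing import Callable, List, Sequence

Image = List[List[float]]  # grayscale image: rows of intensities in [0, 1]
FeatureFn = Callable[[Image], Sequence[float]]


def masked_pixel_loss(sr: Image, hr: Image, face_mask: Image,
                      face_weight: float) -> float:
    """Weighted mean squared pixel error; face pixels (mask = 1) count more."""
    total, norm = 0.0, 0.0
    for sr_row, hr_row, mask_row in zip(sr, hr, face_mask):
        for s, h, m in zip(sr_row, hr_row, mask_row):
            w = 1.0 + (face_weight - 1.0) * m  # 1 off-face, face_weight on-face
            total += w * (s - h) ** 2
            norm += w
    return total / norm


def perceptual_loss(sr: Image, hr: Image, features: FeatureFn) -> float:
    """Mean squared distance between feature vectors of the two images."""
    f_sr, f_hr = features(sr), features(hr)
    return sum((a - b) ** 2 for a, b in zip(f_sr, f_hr)) / len(f_sr)


def toy_features(img: Image) -> List[float]:
    """Hypothetical stand-in for a pretrained face network: row and column means."""
    row_means = [sum(row) / len(row) for row in img]
    col_means = [sum(col) / len(col) for col in zip(*img)]
    return row_means + col_means


def total_loss(sr: Image, hr: Image, face_mask: Image,
               features: FeatureFn = toy_features,
               face_weight: float = 2.0,          # assumed value
               perceptual_weight: float = 0.1) -> float:  # assumed value
    """Face-weighted pixel loss plus a weighted perceptual term."""
    pixel = masked_pixel_loss(sr, hr, face_mask, face_weight)
    return pixel + perceptual_weight * perceptual_loss(sr, hr, features)
```

With this weighting, an error of the same magnitude costs more when it falls on a pixel marked as face than when it falls on the background, which is one simple way to bias a single model toward the face regions without ignoring the rest of the image.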
Abstract (in Chinese)
Abstract
Content
Chapter 1 Introduction
Chapter 2 Related Work
2.1 Single Image Super-Resolution
2.2 Face Super-Resolution
2.3 Facial Prior Knowledge
2.4 Perceptual Loss
2.5 Summary of related works
Chapter 3 Proposed Method
3.1 Overview
3.2 Feature Extraction Network
3.3 Face Segmentation Network
3.4 Label Correction with Ensemble Learning Method
3.5 Feature Refinement Network
3.6 Loss Function
Chapter 4 Experiments
4.1 Training and Testing Datasets
4.2 Experimental Setup
4.3 Objective scores results
4.4 Subjective images results
Chapter 5 Conclusion
Reference

[1] F. Schroff, D. Kalenichenko, and J. Philbin, "FaceNet: A unified embedding for face recognition and clustering," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), 2015, pp. 815–823.
[2] K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition," in Proc. Int. Conf. Learn. Representations (ICLR), 2015.
[3] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, "Gradient-based learning applied to document recognition," Proceedings of the IEEE, vol. 86, no. 11, pp. 2278–2324, 1998.
[4] Y. Chen, Y. Tai, X. Liu, C. Shen, and J. Yang, "FSRNet: End-to-end learning face super-resolution with facial priors," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), 2018.
[5] C. Tian, Y. Xu, W. Zuo, B. Zhang, L. Fei, and C.-W. Lin, "Coarse-to-fine CNN for image super-resolution," IEEE Transactions on Multimedia, 2020.
[6] O. Oktay, J. Schlemper, L. L. Folgoc, M. Lee, M. Heinrich, K. Misawa, K. Mori, S. McDonagh, N. Y. Hammerla, B. Kainz, et al., "Attention U-Net: Learning where to look for the pancreas," arXiv preprint arXiv:1804.03999, 2018.
[7] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, and I. Polosukhin, "Attention is all you need," in Advances in Neural Information Processing Systems, 2017, pp. 6000–6010.
[8] L. I. Rudin, S. Osher, and E. Fatemi, "Nonlinear total variation based noise removal algorithms," Physica D: Nonlinear Phenomena, vol. 60, no. 1–4, pp. 259–268, 1992.
[9] L. A. Gatys, A. S. Ecker, and M. Bethge, "Image style transfer using convolutional neural networks," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), 2016.
[10] C. Dong, C. C. Loy, K. He, and X. Tang, "Learning a deep convolutional network for image super-resolution," in Computer Vision – ECCV 2014. Springer, 2014, pp. 184–199.
[11] W. Shi et al., "Real-time single image and video super-resolution using an efficient sub-pixel convolutional neural network," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), 2016.
[12] J. Kim, J. K. Lee, and K. M. Lee, "Accurate image super-resolution using very deep convolutional networks," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), 2016.
[13] J. Kim, J. K. Lee, and K. M. Lee, "Deeply-recursive convolutional network for image super-resolution," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), 2016.
[14] Y. Tai, J. Yang, and X. Liu, "Image super-resolution via deep recursive residual network," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), 2017.
[15] Y. Zhang et al., "Image super-resolution using very deep residual channel attention networks," in Proc. Eur. Conf. Comput. Vis. (ECCV), 2018.
[16] T. Dai et al., "Second-order attention network for single image super-resolution," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), 2019.
[17] S. Zhu, S. Liu, C. Loy, and X. Tang, "Deep cascaded bi-network for face hallucination," in Proc. Eur. Conf. Comput. Vis. (ECCV), 2016.
[18] Y. Song, J. Zhang, S. He, L. Bao, and Q. Yang, "Learning to hallucinate face images via component generation and enhancement," in Proc. Int. Joint Conf. Artif. Intell. (IJCAI), 2017.
[19] C.-Y. Yang, S. Liu, and M.-H. Yang, "Structured face hallucination," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), 2013.
[20] C. Ledig, L. Theis, F. Huszar, J. Caballero, A. Cunningham, A. Acosta, A. Aitken, A. Tejani, J. Totz, Z. Wang, et al., "Photo-realistic single image super-resolution using a generative adversarial network," arXiv preprint arXiv:1609.04802, 2016.
[21] O. Ronneberger, P. Fischer, and T. Brox, "U-Net: Convolutional networks for biomedical image segmentation," in MICCAI. Springer, 2015, pp. 234–241.
[22] E. Agustsson and R. Timofte, "NTIRE 2017 challenge on single image super-resolution: Dataset and study," in CVPR Workshops, July 2017.
[23] V. Le, J. Brandt, Z. Lin, L. D. Bourdev, and T. S. Huang, "Interactive facial feature localization," in Proc. Eur. Conf. Comput. Vis. (ECCV), 2012, pp. 679–692.
[24] M. Bevilacqua, A. Roumy, C. Guillemot, and M. L. Alberi-Morel, "Low-complexity single-image super-resolution based on nonnegative neighbor embedding," in BMVC, 2012.
[25] J. Yang, J. Wright, T. S. Huang, and Y. Ma, "Image super-resolution via sparse representation," IEEE Transactions on Image Processing, vol. 19, no. 11, pp. 2861–2873, 2010.
[26] D. Martin, C. Fowlkes, D. Tal, and J. Malik, "A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics," in Proc. IEEE Int. Conf. Comput. Vis. (ICCV), July 2001.
[27] J.-B. Huang, A. Singh, and N. Ahuja, "Single image super-resolution from transformed self-exemplars," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), 2015.
[28] D. Kingma and J. Ba, "Adam: A method for stochastic optimization," in Proc. Int. Conf. Learn. Representations (ICLR), 2015.
 
 
 
 