
Detailed Record

Author (Chinese): 曾景暐
Author (English): Tseng, Ching-Wei
Title (Chinese): 泛用輕量型條件生成對抗網路於影像補全
Title (English): General Deep Image Completion with Lightweight Conditional Generative Adversarial Networks
Advisor (Chinese): 賴尚宏
Advisor (English): Lai, Shang-Hong
Committee Members (Chinese): 劉庭祿, 陳煥宗, 許秋婷
Committee Members (English): Chen, Hwann-Tzong; Hsu, Chiu-Ting
Degree: Master's
Institution: National Tsing Hua University
Department: Department of Computer Science
Student ID: 104062538
Publication Year (ROC calendar): 106 (2017)
Graduation Academic Year: 105 (2016-2017)
Language: English
Number of Pages: 72
Keywords (Chinese): 影像補全 (image completion), 條件式生成對抗網路 (conditional generative adversarial networks), 深度學習 (deep learning)
Keywords (English): Inpainting, GANs, DCNNs
Abstract (Chinese): In recent years, with the rise of deep generative adversarial networks, image restoration problems have seen better and more realistic results than with traditional methods. However, typical deep learning methods require an enormous number of training parameters and cannot be applied to completing multiple forms of image corruption and missing regions. Moreover, when a deep autoencoder is combined with an adversarial network, training is often unstable, or the model learns only to map every input image to one particular output. In this thesis, we propose a lightweight conditional generative adversarial network, combined with a more stable adversarial training scheme, to repair a wide variety of image corruptions and restore them to more realistic, complete images. We also propose a new training strategy that encourages the deep model to learn representative image features so that it can repair many different kinds of corruption. In our experiments, we verify that the proposed deep model requires the fewest training parameters among the deep learning methods compared. Both quantitatively and visually, our method outperforms traditional and deep learning methods across several types of datasets. On the application side, we also show that it can still produce complete results at different resolutions and on user-specified corrupted images.
Abstract (English): Recent image completion research using deep neural network approaches has shown remarkable progress through the use of generative adversarial networks (GANs). However, these approaches still suffer from large model sizes and a lack of generality across various types of corruption. In addition, conditional GANs often suffer from mode collapse and unstable training. In this thesis, we overcome these shortcomings of previous models by proposing a lightweight conditional GAN and integrating a stable adversarial training strategy. Moreover, we present a new training strategy that trains the model to complete different types of corrupted or missing regions in images. Experimental results demonstrate, qualitatively and quantitatively, that the proposed model provides significant improvement over state-of-the-art image completion methods on public datasets. In addition, we show that our model requires far fewer parameters to achieve superior results on different types of unseen corruption masks.
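As a concrete illustration of the setup the abstract describes (an autoencoder generator G that fills in corrupted pixels, a discriminator D that judges realism, and a reconstruction term combined with a stable adversarial term), the following is a minimal, hypothetical PyTorch sketch. The tiny networks, the random square mask, the LSGAN-style least-squares losses, and the L1 weight of 100 are all illustrative assumptions, not the thesis's actual architecture or objective, which are specified in Chapter 3.

# Hypothetical sketch of masked-image completion with a conditional GAN.
# Not the thesis's model: networks, mask, and loss weights are illustrative.
import torch
import torch.nn as nn

class Generator(nn.Module):
    """Tiny encoder-decoder standing in for the lightweight autoencoder G."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ELU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ELU(),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ELU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Tanh(),
        )
    def forward(self, x):
        return self.net(x)

class Discriminator(nn.Module):
    """Tiny conv net standing in for D; one realism score per image."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, 1),
        )
    def forward(self, x):
        return self.net(x)

def random_mask(batch, size=64, hole=24):
    """Zero out one random square region -- one of many possible corruptions."""
    mask = torch.ones(batch, 1, size, size)
    for i in range(batch):
        y, x = torch.randint(0, size - hole, (2,)).tolist()
        mask[i, :, y:y + hole, x:x + hole] = 0
    return mask

G, D = Generator(), Discriminator()
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
l1 = nn.L1Loss()

real = torch.rand(8, 3, 64, 64) * 2 - 1   # stand-in training batch in [-1, 1]
mask = random_mask(8)
corrupted = real * mask                   # simulate missing pixels
fake = G(corrupted)                       # G completes the corrupted input

# Discriminator update: least-squares (LSGAN-style) adversarial loss.
d_loss = ((D(real) - 1) ** 2).mean() + (D(fake.detach()) ** 2).mean()
opt_d.zero_grad(); d_loss.backward(); opt_d.step()

# Generator update: adversarial term plus weighted L1 reconstruction.
g_loss = ((D(fake) - 1) ** 2).mean() + 100.0 * l1(fake, real)
opt_g.zero_grad(); g_loss.backward(); opt_g.step()

In a full training loop these two updates would alternate over minibatches, with corruption masks of many shapes and sizes sampled per batch to match the generality the abstract claims.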
Contents
Chapter 1. Introduction 1
1.1 Motivation 1
1.2 Problem Description 2
1.3 Main Contributions 3
Chapter 2. Related Work 5
2.1 Conventional Image Completion 5
2.2 Deep Image Generation and Completion Model 6
2.3 Inspiration from Previous Deep Models 7
Chapter 3. Proposed Model 10
3.1 Network Architecture 11
3.1.1 Autoencoder G 12
3.1.2 Discriminator D 13
3.2 Objective Function 13
3.3 Training 16
3.4 Implementation Details 19
Chapter 4. Experimental Evaluation 21
4.1 Datasets 22
4.2 Baselines 23
4.3 Ablation Study on Loss Functions 25
4.4 Deep Model Comparisons 28
4.5 Results on Local Fragment Corruptions 29
4.5.1 Single vs. Integrated Model 29
4.5.2 Quantitative and Qualitative Results on CUB, Flowers, and MSCOCO 32
4.5.3 Higher-Resolution Evaluation 41
4.5.4 Results on User-Specified Masks 47
4.6 Results on Arbitrary Polygon Corruptions 51
4.6.1 Quantitative and Qualitative Results on CelebA 51
4.6.2 User Study Verification 57
4.6.3 Extension to Face Editing 60
Chapter 5. Demo System 62
Chapter 6. Conclusion 67
References 68
