
Detailed Record

Author (Chinese): 盧聖約
Author (English): Lu, Sheng-Yueh
Thesis Title (Chinese): 照片吸引力增強器:使用條件式生成對抗網路產生符合人類感官喜好的影像
Thesis Title (English): Appealing Photo Enhancer: Image Enhancement Aligned with Human Perceptual Preferences Using Conditional GANs
Advisor (Chinese): 蘇豐文
Advisor (English): Soo, Von-Wun
Committee Members (Chinese): 邱瀞德、沈之涯
Committee Members (English): Chiu, Ching-Te; Shen, Chih-Ya
Degree: Master's
Institution: National Tsing Hua University
Department: Department of Computer Science
Student ID: 104062542
Publication Year (ROC): 109 (2020)
Graduation Academic Year (ROC): 108
Language: English
Number of Pages: 23
Keywords (Chinese): 卷積、卷積網絡、資料擴充、特徵抽取、生成對抗網絡模型、生成對抗網絡訓練、生成對抗網絡、生成器、影像增強、影像品質、影像到影像翻譯、訓練、人工智慧、反向傳播、條件式生成對抗網絡、卷積神經網絡、增強影像、神經網絡、影像處理操作、輸入影像、學習框架、誤差函數最小化、機器學習、美學誤差、感知品質、監督式學習、無監督式學習
Keywords (English): CNN, convolution, convolutional network, data augmentation, feature extraction, GAN model, GAN training, GANs, generators, image enhancement, image quality, image-to-image translation, training, artificial intelligence, back-propagation, conditional GANs, convolutional neural networks, enhanced image, feedforward neural networks, generative adversarial network, image processing operators, input image, learning framework, loss function minimization, machine learning, aesthetic regularizer, perceptual quality, supervised learning, unsupervised learning
Abstract (Chinese):
我們提出了一種用於影像增強的影像轉換模型。現有的影像增強模型缺乏增強影像的能力,讓這些增強的影像在視覺上使大量的人滿意。在本論文中,我們提出了一種新穎的影像增強模型,稱為照片吸引力增強器(APE),該模型利用條件式生成對抗網絡(cGANs)將輸入的影像增強為一個在視覺上令人喜歡的影像。APE 經由一個影像評估模型指導,來增強出符合人類感官喜好的影像。首先,我們訓練了一個影像美學評估模型,該模型可以觀察眾多用戶的感官喜好。其次,我們訓練了基於 cGANs 模型的影像增強器,以向人類專家學習影像增強的對應關係。最後,評估模型被用作美學正規化約束,以指導 cGANs 生成器來產生符合人類感官喜好的影像。總而言之,定量實驗和視覺結果表明,APE 可以有效地增強影像。
Abstract (English):
We propose an image translation model for image enhancement. Existing image enhancement models lack the ability to produce enhanced images that are visually pleasing to a wide diversity of people. In this paper, we propose a novel image enhancement model, called the APE (Appealing Photo Enhancer), which employs conditional generative adversarial networks (cGANs) to enhance an input image into a visually pleasing one. The APE is guided by an image assessment model so that its enhancements align with human perceptual preferences. First, we trained an image aesthetic assessment model that captures the perceptual preferences of many users. Second, we trained an image enhancer based on the cGAN model to learn the enhancement mapping from human experts. Finally, the assessment model is used as an aesthetic regularizer that guides the cGAN generator toward images matching human perceptual preferences. Altogether, quantitative experiments and visual results show that the APE enhances images effectively.
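The abstract describes the assessment model acting as an aesthetic regularizer on the cGAN generator, alongside learning the enhancement mapping from expert-retouched targets. The sketch below (Python) is a minimal illustration of how such a composite generator objective could be assembled; it is not the thesis's actual formulation. The function name generator_loss, the weights lambda_rec and lambda_aes, and the specific adversarial and L1 terms are assumptions borrowed from common paired cGAN practice (pix2pix-style training); only the idea of adding a term from a frozen aesthetic scorer comes from the abstract.

import numpy as np

def generator_loss(d_fake_logits, enhanced, expert_target, aesthetic_score,
                   lambda_rec=100.0, lambda_aes=0.1):
    # Hypothetical composite objective for the cGAN generator (illustrative only).
    # Non-saturating adversarial term: softplus(-logits) = -log(sigmoid(logits)),
    # small when the discriminator judges the enhanced image to be real.
    adv = np.mean(np.log1p(np.exp(-np.asarray(d_fake_logits, dtype=float))))
    # L1 reconstruction toward the expert-retouched target (paired supervision).
    rec = np.mean(np.abs(np.asarray(enhanced, dtype=float) -
                         np.asarray(expert_target, dtype=float)))
    # Aesthetic regularizer: penalize outputs the frozen assessment model rates poorly
    # (aesthetic_score assumed scaled to [0, 1]).
    aes = 1.0 - float(aesthetic_score)
    return adv + lambda_rec * rec + lambda_aes * aes

For example, generator_loss(d_fake_logits=2.0, enhanced=out, expert_target=ref, aesthetic_score=0.8) returns a single scalar that a training loop would minimize with respect to the generator's parameters; the relative weights control how strongly the expert pairing and the aesthetic score constrain the adversarial objective.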
Table of Contents:
1 Introduction 1
2 Related Work 4
2.1 Image Assessment Models . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
2.2 Image Translation Models . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
3 The APE Model 6
3.1 Main Idea . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
3.2 Model Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
3.2.1 Assessment Model . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
3.2.2 Image Enhancer . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
4 Experiments 13
4.1 Results of the Assessment Model . . . . . . . . . . . . . . . . . . . . . . . 13
4.1.1 AVA: A Large-Scale Database for Aesthetic Visual Analysis [13] . . 13
4.1.2 Binary Classification . . . . . . . . . . . . . . . . . . . . . . . . . . 14
4.2 Visual Results of Enhanced Images . . . . . . . . . . . . . . . . . . . . . . 15
4.2.1 Aesthetic Properties . . . . . . . . . . . . . . . . . . . . . . . . . . 15
4.3 User study . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
4.4 Aesthetic Regularizer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
4.5 Limitations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
5 Conclusions 17
References 21
References:
[1] H. Talebi and P. Milanfar. “Learned perceptual image enhancement”. In: 2018 IEEE International Conference on Computational Photography (ICCP). 2018, pp. 1–13. doi: 10.1109/ICCPHOT.2018.8368474.
[2] Y. Chen et al. “Deep Photo Enhancer: Unpaired Learning for Image Enhancement from Photographs with GANs”. In: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2018, pp. 6306–6314. doi: 10.1109/CVPR.2018.00660.
[3] S. Wang et al. “Naturalness Preserved Enhancement Algorithm for Non-Uniform Illumination Images”. In: IEEE Transactions on Image Processing 22.9 (2013), pp. 3538–3548. issn: 1941-0042. doi: 10.1109/TIP.2013.2261309.
[4] V. Bychkovsky et al. “Learning photographic global tonal adjustment with a database of input/output image pairs”. In: CVPR 2011. 2011, pp. 97–104. doi: 10.1109/CVPR.2011.5995332.
[5] Ian Goodfellow et al. “Generative Adversarial Nets”. In: Advances in Neural Information Processing Systems 27. Ed. by Z. Ghahramani et al. Curran Associates, Inc., 2014, pp. 2672–2680. url: http://papers.nips.cc/paper/5423-generative-adversarial-nets.pdf.
[6] Olaf Ronneberger, Philipp Fischer, and Thomas Brox. “U-Net: Convolutional Networks for Biomedical Image Segmentation”. In: Medical Image Computing and Computer-Assisted Intervention – MICCAI 2015. Ed. by Nassir Navab et al. Cham: Springer International Publishing, 2015, pp. 234–241. isbn: 978-3-319-24574-4.
[7] K. He et al. “Deep Residual Learning for Image Recognition”. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2016, pp. 770–778. doi: 10.1109/CVPR.2016.90.
[8] Y. LeCun et al. “Gradient-Based Learning Applied to Document Recognition”. In: Intelligent Signal Processing. IEEE Press, 2001, pp. 306–351.
[9] L. Kang et al. “Convolutional Neural Networks for No-Reference Image Quality Assessment”. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition. 2014, pp. 1733–1740. doi: 10.1109/CVPR.2014.224.
[10] S. Bosse et al. “A deep neural network for image quality assessment”. In: 2016 IEEE International Conference on Image Processing (ICIP). 2016, pp. 3773–3777. doi: 10.1109/ICIP.2016.7533065.
[11] N. Murray, L. Marchesotti, and F. Perronnin. “AVA: A large-scale database for aesthetic visual analysis”. In: 2012 IEEE Conference on Computer Vision and Pattern Recognition. 2012, pp. 2408–2415. doi: 10.1109/CVPR.2012.6247954.
[12] Y. Kao, C. Wang, and K. Huang. “Visual aesthetic quality assessment with a regression model”. In: 2015 IEEE International Conference on Image Processing (ICIP). 2015, pp. 1583–1587. doi: 10.1109/ICIP.2015.7351067.
[13] B. Jin, M. V. O. Segovia, and S. Süsstrunk. “Image aesthetic predictors based on weighted CNNs”. In: 2016 IEEE International Conference on Image Processing (ICIP). 2016, pp. 2291–2295. doi: 10.1109/ICIP.2016.7532767.
[14] Hui Zeng, Lei Zhang, and Alan C. Bovik. A Probabilistic Quality Representation Approach to Deep Blind Image Quality Prediction. 2017. arXiv: 1708.08190 [cs.CV].
[15] H. Talebi and P. Milanfar. “NIMA: Neural Image Assessment”. In: IEEE Transactions on Image Processing 27.8 (2018), pp. 3998–4011. issn: 1941-0042. doi: 10.1109/TIP.2018.2831899.
[16] K. Lata, M. Dave, and K. N. Nishanth. “Image-to-Image Translation Using Generative Adversarial Network”. In: 2019 3rd International Conference on Electronics, Communication and Aerospace Technology (ICECA). 2019, pp. 186–189. doi: 10.1109/ICECA.2019.8822195.
[17] K. Schwarz, P. Wieschollek, and H. P. A. Lensch. “Will People Like Your Image? Learning the Aesthetic Space”. In: 2018 IEEE Winter Conference on Applications of Computer Vision (WACV). 2018, pp. 2048–2057. doi: 10.1109/WACV.2018.00226.
[18] Andrew Howard et al. Searching for MobileNetV3. 2019. arXiv: 1905.02244 [cs.CV].
[19] J. Deng et al. “ImageNet: A large-scale hierarchical image database”. In: 2009 IEEE Conference on Computer Vision and Pattern Recognition. 2009, pp. 248–255. doi: 10.1109/CVPR.2009.5206848.
[20] Mark Sandler et al. MobileNetV2: Inverted Residuals and Linear Bottlenecks. 2018. arXiv: 1801.04381 [cs.CV].
[21] K. He et al. “Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition”. In: IEEE Transactions on Pattern Analysis and Machine Intelligence 37.9 (2015), pp. 1904–1916. issn: 1939-3539. doi: 10.1109/TPAMI.2015.2389824.
[22] Sergey Ioffe and Christian Szegedy. “Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift”. In: Proceedings of the 32nd International Conference on Machine Learning. Ed. by Francis Bach and David Blei. Vol. 37. Proceedings of Machine Learning Research. Lille, France: PMLR, 2015, pp. 448–456. url: http://proceedings.mlr.press/v37/ioffe15.html.
[23] Prajit Ramachandran, Barret Zoph, and Quoc V. Le. Searching for Activation Functions. 2017. arXiv: 1710.05941 [cs.NE].
[24] Jimmy Lei Ba, Jamie Ryan Kiros, and Geoffrey E. Hinton. Layer Normalization. 2016. arXiv: 1607.06450 [stat.ML].
[25] Matthieu Courbariaux, Yoshua Bengio, and Jean-Pierre David. “BinaryConnect: Training Deep Neural Networks with binary weights during propagations”. In: Advances in Neural Information Processing Systems 28. Ed. by C. Cortes et al. Curran Associates, Inc., 2015, pp. 3123–3131. url: http://papers.nips.cc/paper/5647-binaryconnect-training-deep-neural-networks-with-binary-weights-during-propagations.pdf.
[26] Martín Abadi et al. TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems. 2016. arXiv: 1603.04467 [cs.DC].
[27] Christian Szegedy et al. Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning. 2017. url: https://www.aaai.org/ocs/index.php/AAAI/AAAI17/paper/view/14806.
[28] Shu Kong et al. “Photo Aesthetics Ranking Network with Attributes and Content Adaptation”. In: CoRR abs/1606.01621 (2016). arXiv: 1606.01621. url: http://arxiv.org/abs/1606.01621.
(Full text not authorized for public access.)