
Detailed Record

Author (Chinese): 盧聖約
Author (English): Lu, Sheng-Yueh
Thesis Title (Chinese): 照片吸引力增強器:使用條件式生成對抗網路產生符合人類感官喜好的影像
Thesis Title (English): Appealing Photo Enhancer: Image Enhancement Aligned with Human Perceptual Preferences Using Conditional GANs
Advisor (Chinese): 蘇豐文
Advisor (English): Soo, Von-Wun
Committee Members (Chinese): 邱瀞德、沈之涯
Committee Members (English): Chiu, Ching-Te; Shen, Chih-Ya
Degree: Master's
Institution: National Tsing Hua University
Department: Department of Computer Science
Student ID: 104062542
Publication Year (ROC): 109 (2020)
Graduation Academic Year (ROC): 108
Language: English
Number of Pages: 23
Keywords (Chinese): 卷積、卷積網絡、資料擴充、特徵抽取、生成對抗網絡模型、生成對抗網絡訓練、生成對抗網絡、生成器、影像增強、影像品質、影像到影像翻譯、訓練、人工智慧、反向傳播、條件式生成對抗網絡、卷積神經網絡、增強影像、神經網絡、影像處理操作、輸入影像、學習框架、誤差函數最小化、機器學習、美學誤差、感知品質、監督式學習、無監督式學習
Keywords (English): CNN, convolution, convolutional network, data augmentation, feature extraction, GAN model, GAN training, GANs, generators, image enhancement, image quality, image-to-image translation, training, artificial intelligence, back-propagation, conditional GANs, convolutional neural networks, enhanced image, feedforward neural networks, generative adversarial network, image processing operators, input image, learning framework, loss function minimization, machine learning, aesthetic regularizer, perceptual quality, supervised learning, unsupervised learning
Abstract (Chinese):
我們提出了一種用於影像增強的影像轉換模型。現有的影像增強模型缺乏增強影像的能力,讓這些增強的影像在視覺上使大量的人滿意。在本論文中,我們提出了一種新穎的影像增強模型,稱為照片吸引力增強器(APE),該模型利用條件式生成對抗網絡(cGANs)將輸入的影像增強為一個在視覺上令人喜歡的影像。APE 經由一個影像評估模型指導,來增強出符合人類感官喜好的影像。首先,我們訓練了一個影像美學評估模型,該模型可以觀察眾多用戶的感官喜好。其次,我們訓練了基於 cGANs 模型的影像增強器,以向人類專家學習影像增強的對應關係。最後,評估模型被用作美學正規化約束,以指導 cGANs 生成器來產生符合人類感官喜好的影像。總而言之,定量實驗和視覺結果表明,APE 可以有效地增強影像。
Abstract (English):
We propose an image translation model for image enhancement. Existing image enhancement models lack the ability to produce enhanced images that are visually pleasing to a wide diversity of people. In this paper, we propose a novel image enhancement model, called the APE (Appealing Photo Enhancer), which employs conditional generative adversarial networks (cGANs) to enhance an input image into a visually pleasing one. The APE is guided by an image assessment model so that its enhancements align with human perceptual preferences. First, we trained an image aesthetic assessment model that captures the perceptual preferences of many users. Second, we trained an image enhancer based on the cGAN model to learn the enhancement mapping from human experts. Finally, the assessment model is used as an aesthetic regularizer that guides the cGAN generator toward images matching human perceptual preferences. Altogether, quantitative experiments and visual results show that the APE enhances images effectively.
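The abstract describes the assessment model acting as an aesthetic regularizer on the cGAN generator, alongside learning the enhancement mapping from expert-retouched targets. The sketch below (Python) is a minimal illustration of how such a composite generator objective could be assembled; it is not the thesis's actual formulation. The function name generator_loss, the weights lambda_rec and lambda_aes, and the specific adversarial and L1 terms are assumptions borrowed from common paired cGAN practice (pix2pix-style training); only the idea of adding a term from a frozen aesthetic scorer comes from the abstract.

import numpy as np

def generator_loss(d_fake_logits, enhanced, expert_target, aesthetic_score,
                   lambda_rec=100.0, lambda_aes=0.1):
    # Hypothetical composite objective for the cGAN generator (illustrative only).
    # Non-saturating adversarial term: softplus(-logits) = -log(sigmoid(logits)),
    # small when the discriminator judges the enhanced image to be real.
    adv = np.mean(np.log1p(np.exp(-np.asarray(d_fake_logits, dtype=float))))
    # L1 reconstruction toward the expert-retouched target (paired supervision).
    rec = np.mean(np.abs(np.asarray(enhanced, dtype=float) -
                         np.asarray(expert_target, dtype=float)))
    # Aesthetic regularizer: penalize outputs the frozen assessment model rates poorly
    # (aesthetic_score assumed scaled to [0, 1]).
    aes = 1.0 - float(aesthetic_score)
    return adv + lambda_rec * rec + lambda_aes * aes

For example, generator_loss(d_fake_logits=2.0, enhanced=out, expert_target=ref, aesthetic_score=0.8) returns a single scalar that a training loop would minimize with respect to the generator's parameters; the relative weights control how strongly the expert pairing and the aesthetic score constrain the adversarial objective.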
Table of Contents:
1 Introduction 1
2 Related Work 4
2.1 Image Assessment Models . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
2.2 Image Translation Models . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
3 The APE Model 6
3.1 Main Idea . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
3.2 Model Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
3.2.1 Assessment Model . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
3.2.2 Image Enhancer . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
4 Experiments 13
4.1 Results of the Assessment Model . . . . . . . . . . . . . . . . . . . . . . . 13
4.1.1 AVA: A Large-Scale Database for Aesthetic Visual Analysis [13] . . 13
4.1.2 Binary Classification . . . . . . . . . . . . . . . . . . . . . . . . . . 14
4.2 Visual Results of Enhanced Images . . . . . . . . . . . . . . . . . . . . . . 15
4.2.1 Aesthetic Properties . . . . . . . . . . . . . . . . . . . . . . . . . . 15
4.3 User study . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
4.4 Aesthetic Regularizer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
4.5 Limitations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
5 Conclusions 17
References 21
References:
[1] H. Talebi and P. Milanfar. “Learned perceptual image enhancement”. In: 2018 IEEE International Conference on Computational Photography (ICCP). 2018, pp. 1–13. doi: 10.1109/ICCPHOT.2018.8368474.
[2] Y. Chen et al. “Deep Photo Enhancer: Unpaired Learning for Image Enhancement from Photographs with GANs”. In: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2018, pp. 6306–6314. doi: 10.1109/CVPR.2018.00660.
[3] S. Wang et al. “Naturalness Preserved Enhancement Algorithm for Non-Uniform Illumination Images”. In: IEEE Transactions on Image Processing 22.9 (2013), pp. 3538–3548. issn: 1941-0042. doi: 10.1109/TIP.2013.2261309.
[4] V. Bychkovsky et al. “Learning photographic global tonal adjustment with a database of input/output image pairs”. In: CVPR 2011. 2011, pp. 97–104. doi: 10.1109/CVPR.2011.5995332.
[5] Ian Goodfellow et al. “Generative Adversarial Nets”. In: Advances in Neural Information Processing Systems 27. Ed. by Z. Ghahramani et al. Curran Associates, Inc., 2014, pp. 2672–2680. url: http://papers.nips.cc/paper/5423-generative-adversarial-nets.pdf.
[6] Olaf Ronneberger, Philipp Fischer, and Thomas Brox. “U-Net: Convolutional Networks for Biomedical Image Segmentation”. In: Medical Image Computing and Computer-Assisted Intervention – MICCAI 2015. Ed. by Nassir Navab et al. Cham: Springer International Publishing, 2015, pp. 234–241. isbn: 978-3-319-24574-4.
[7] K. He et al. “Deep Residual Learning for Image Recognition”. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2016, pp. 770–778. doi: 10.1109/CVPR.2016.90.
[8] Y. LeCun et al. “Gradient-Based Learning Applied to Document Recognition”. In: Intelligent Signal Processing. IEEE Press, 2001, pp. 306–351.
[9] L. Kang et al. “Convolutional Neural Networks for No-Reference Image Quality Assessment”. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition. 2014, pp. 1733–1740. doi: 10.1109/CVPR.2014.224.
[10] S. Bosse et al. “A deep neural network for image quality assessment”. In: 2016 IEEE International Conference on Image Processing (ICIP). 2016, pp. 3773–3777. doi: 10.1109/ICIP.2016.7533065.
[11] N. Murray, L. Marchesotti, and F. Perronnin. “AVA: A large-scale database for aesthetic visual analysis”. In: 2012 IEEE Conference on Computer Vision and Pattern Recognition. 2012, pp. 2408–2415. doi: 10.1109/CVPR.2012.6247954.
[12] Y. Kao, C. Wang, and K. Huang. “Visual aesthetic quality assessment with a regression model”. In: 2015 IEEE International Conference on Image Processing (ICIP). 2015, pp. 1583–1587. doi: 10.1109/ICIP.2015.7351067.
[13] B. Jin, M. V. O. Segovia, and S. Süsstrunk. “Image aesthetic predictors based on weighted CNNs”. In: 2016 IEEE International Conference on Image Processing (ICIP). 2016, pp. 2291–2295. doi: 10.1109/ICIP.2016.7532767.
[14] Hui Zeng, Lei Zhang, and Alan C. Bovik. A Probabilistic Quality Representation Approach to Deep Blind Image Quality Prediction. 2017. arXiv: 1708.08190 [cs.CV].
[15] H. Talebi and P. Milanfar. “NIMA: Neural Image Assessment”. In: IEEE Transactions on Image Processing 27.8 (2018), pp. 3998–4011. issn: 1941-0042. doi: 10.1109/TIP.2018.2831899.
[16] K. Lata, M. Dave, and K. N. Nishanth. “Image-to-Image Translation Using Generative Adversarial Network”. In: 2019 3rd International Conference on Electronics, Communication and Aerospace Technology (ICECA). 2019, pp. 186–189. doi: 10.1109/ICECA.2019.8822195.
[17] K. Schwarz, P. Wieschollek, and H. P. A. Lensch. “Will People Like Your Image? Learning the Aesthetic Space”. In: 2018 IEEE Winter Conference on Applications of Computer Vision (WACV). 2018, pp. 2048–2057. doi: 10.1109/WACV.2018.00226.
[18] Andrew Howard et al. Searching for MobileNetV3. 2019. arXiv: 1905.02244 [cs.CV].
[19] J. Deng et al. “ImageNet: A large-scale hierarchical image database”. In: 2009 IEEE Conference on Computer Vision and Pattern Recognition. 2009, pp. 248–255. doi: 10.1109/CVPR.2009.5206848.
[20] Mark Sandler et al. MobileNetV2: Inverted Residuals and Linear Bottlenecks. 2018. arXiv: 1801.04381 [cs.CV].
[21] K. He et al. “Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition”. In: IEEE Transactions on Pattern Analysis and Machine Intelligence 37.9 (2015), pp. 1904–1916. issn: 1939-3539. doi: 10.1109/TPAMI.2015.2389824.
[22] Sergey Ioffe and Christian Szegedy. “Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift”. In: Proceedings of the 32nd International Conference on Machine Learning. Ed. by Francis Bach and David Blei. Vol. 37. Proceedings of Machine Learning Research. Lille, France: PMLR, 2015, pp. 448–456. url: http://proceedings.mlr.press/v37/ioffe15.html.
[23] Prajit Ramachandran, Barret Zoph, and Quoc V. Le. Searching for Activation Functions. 2017. arXiv: 1710.05941 [cs.NE].
[24] Jimmy Lei Ba, Jamie Ryan Kiros, and Geoffrey E. Hinton. Layer Normalization. 2016. arXiv: 1607.06450 [stat.ML].
[25] Matthieu Courbariaux, Yoshua Bengio, and Jean-Pierre David. “BinaryConnect: Training Deep Neural Networks with binary weights during propagations”. In: Advances in Neural Information Processing Systems 28. Ed. by C. Cortes et al. Curran Associates, Inc., 2015, pp. 3123–3131. url: http://papers.nips.cc/paper/5647-binaryconnect-training-deep-neural-networks-with-binary-weights-during-propagations.pdf.
[26] Martín Abadi et al. TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems. 2016. arXiv: 1603.04467 [cs.DC].
[27] Christian Szegedy et al. Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning. 2017. url: https://www.aaai.org/ocs/index.php/AAAI/AAAI17/paper/view/14806.
[28] Shu Kong et al. “Photo Aesthetics Ranking Network with Attributes and Content Adaptation”. In: CoRR abs/1606.01621 (2016). arXiv: 1606.01621. url: http://arxiv.org/abs/1606.01621.
(Full text not authorized for public access.)