
Detailed Record

Author (Chinese): 羅右鈞
Author (English): Lo, Yu-Chun
Title (Chinese): 利用多任務指針生成器網絡從圖表圖像中提取樣式
Title (English): Style Extraction from Chart Images with Multi-Task Pointer-Generator Networks
Advisor (Chinese): 陳煥宗
Advisor (English): Chen, Hwann-Tzong
Committee members (Chinese): 劉庭祿, 許秋婷
Committee members (English): Liu, Tyng-Luh; Hsu, Chiou-Ting
Degree: Master
Institution: National Tsing Hua University
Department: Department of Computer Science
Student ID: 105062509
Year of publication (ROC calendar): 108 (2019)
Graduation academic year: 107 (2018-2019)
Language: English
Number of pages: 38
Keywords (Chinese): 深度學習, 計算機視覺, 資料視覺化, 風格轉換
Keywords (English): Deep Learning, Computer Vision, Data Visualization, Style Transfer
Abstract (Chinese): In this thesis, we propose an end-to-end solution for extracting visual styles from chart images. Unlike previous methods in related fields, our approach requires no assumptions or heuristics tailored to specific chart types. Driven by our large-scale, style-enriched chart dataset, our model adapts well to charts in a wide variety of styles. In our experiments, we show that the model generalizes to several test sets with visual distribution shifts. Moreover, with our off-the-shelf style extraction method, we demonstrate an additional application that transfers styles from images onto charts.
Abstract (English): In this work, we propose an end-to-end solution for extracting styles from chart images. Unlike previous approaches in related domains, our method does not make assumptions or rely on heuristics for specific chart types. Driven by our large-scale style-enriched chart corpus, our model adapts well to charts with various styles. In our experiments, we show that our model generalizes to several test sets with visual distribution shifts. In addition, we present an add-on application for transferring styles from images to charts with our off-the-shelf style extraction method.
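The title names multi-task pointer-generator networks. As a rough illustration of the pointer-generator mechanism itself (See et al., 2017), and not of the author's actual model, the sketch below combines a generated vocabulary distribution with a copy distribution over the source tokens, weighted by a learned gate. The class name, tensor shapes, and the simplification that the gate depends only on the decoder state are assumptions made for this example.

import torch
import torch.nn as nn
import torch.nn.functional as F

class PointerGeneratorHead(nn.Module):
    """Hypothetical sketch: mix a generated vocabulary distribution with a
    copy distribution over source tokens, weighted by a learned gate p_gen."""
    def __init__(self, hidden_size, vocab_size):
        super().__init__()
        self.vocab_proj = nn.Linear(hidden_size, vocab_size)  # "generate" logits
        self.gate = nn.Linear(hidden_size, 1)                  # p_gen gate (simplified)

    def forward(self, decoder_state, attn_weights, src_token_ids):
        # decoder_state: (batch, hidden); attn_weights: (batch, src_len), rows sum to 1
        # src_token_ids: (batch, src_len) vocabulary indices of the input tokens
        p_vocab = F.softmax(self.vocab_proj(decoder_state), dim=-1)  # generate
        p_gen = torch.sigmoid(self.gate(decoder_state))              # (batch, 1)
        # Scatter attention mass onto the vocabulary ids of the source tokens (copy)
        p_copy = torch.zeros_like(p_vocab).scatter_add(1, src_token_ids, attn_weights)
        return p_gen * p_vocab + (1.0 - p_gen) * p_copy              # (batch, vocab)

# Toy usage with random tensors
head = PointerGeneratorHead(hidden_size=16, vocab_size=50)
state = torch.randn(2, 16)
attn = F.softmax(torch.randn(2, 5), dim=-1)
src_ids = torch.randint(0, 50, (2, 5))
dist = head(state, attn, src_ids)
print(dist.shape, dist.sum(dim=-1))  # torch.Size([2, 50]); each row sums to ~1

In a multi-task setting such as the one named in the title, presumably one such task-specific head sits on top of a shared encoder representation per style attribute, but that wiring is not shown in the record and is not reproduced here.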
List of Tables
List of Figures
Abstract (Chinese)
Abstract
1 Introduction
2 Related work
  2.1 Chart Redesign
  2.2 Chart VQA
  2.3 Neural Image Captioning
  2.4 Neural Style Transfer
3 Dataset
  3.1 Visual Styles
    3.1.1 Axes Style
    3.1.2 Background Color
    3.1.3 Foreground Color
    3.1.4 Has Border
    3.1.5 Legend Position
    3.1.6 Texture
  3.2 Text Labels
  3.3 Numerical Data
4 Our Approach
  4.1 Overview
  4.2 Model
    4.2.1 ImageEncoder
    4.2.2 ColorEmbedder
    4.2.3 Task-Shared Representation
    4.2.4 Attention
    4.2.5 Task-Specific Pointer-Generator
5 Experiments
  5.1 Settings
    5.1.1 Datasets
    5.1.2 Preprocessing
    5.1.3 Models
    5.1.4 Implementation Detail
    5.1.5 Evaluation
  5.2 Quantitative Results
    5.2.1 Multi-Head Attention Remedies Negative Transfer in Multi-task Model
    5.2.2 Comparison on Foreground Color Baseline
    5.2.3 Performance on Different Chart Types
  5.3 Qualitative Results
6 Conclusion and Future Work
7 Bibliography
 
 
 
 