
Detailed Record

Author (Chinese): 馬智豪
Author (English): Ma, Chih Hao
Title (Chinese): 利用深度卷積特徵及密集對齊之非參數場景分析
Title (English): Nonparametric Scene Parsing with Deep Convolutional Features and Dense Alignment
Advisor (Chinese): 許秋婷
Advisor (English): Hsu, Chiou Ting
Committee (Chinese): 簡仁宗、孫民
Committee (English): Chien, Jen Tzung; Sun, Min
Degree: Master's
Institution: National Tsing Hua University
Department: Department of Computer Science
Student ID: 102062515
Year of publication (ROC calendar): 104 (2015)
Graduation academic year: 103
Language: English
Number of pages: 34
Keywords (Chinese): 場景分析; 物體視窗; 深度卷積式網路; SIFT flow
Keywords (English): scene parsing; object window; deep convolutional network; SIFT flow
Abstract: This thesis addresses two key issues that determine the performance of nonparametric scene parsing: (1) the semantic quality of image retrieval, and (2) the accuracy of label transfer. First, because nonparametric methods annotate a query image by transferring labels from retrieved images, the retrieval step should return a set of images "semantically similar" to the query. Second, given the retrieval set, a good strategy is needed to transfer semantic labels with pixel-level accuracy. In this thesis, we improve scene parsing accuracy on both of these issues. We propose using state-of-the-art deep convolutional features as visual descriptors to improve the semantic quality of the retrieved images. In addition, we incorporate dense alignment into the Markov Random Field (MRF) inference framework to transfer labels with pixel-level accuracy. Next, we use the derived semantic labels as queries to expand the retrieval set and then conduct a second round of label transfer. Finally, we combine the label transfer cues of the two rounds in the MRF model to further improve the labeling results. Our experiments on the SIFT Flow and LMSun datasets show the improvement of the proposed approach over other nonparametric methods.
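The abstract above outlines a retrieve-then-transfer pipeline: find semantically similar images for a query, then transfer their labels pixel by pixel. Below is a minimal, hypothetical sketch of those two core steps — feature-based retrieval and per-pixel label voting. It is not the thesis implementation: the deep feature extraction, SIFT Flow dense alignment, MRF inference, and query expansion described in the thesis are all omitted, and all function names and data shapes here are illustrative assumptions.

```python
# Hypothetical sketch of nonparametric label transfer (not the thesis code):
# 1) retrieve the k most similar database images by feature similarity;
# 2) label each query pixel by majority vote over the retrieved label maps.
# Features are precomputed vectors; label maps are assumed already aligned
# to the query (the thesis uses SIFT Flow for this alignment step).
import numpy as np

def retrieve(query_feat, db_feats, k):
    """Return indices of the k most similar database images (cosine similarity)."""
    q = query_feat / np.linalg.norm(query_feat)
    d = db_feats / np.linalg.norm(db_feats, axis=1, keepdims=True)
    sims = d @ q                           # cosine similarity to each database image
    return np.argsort(-sims)[:k]           # indices of the k best matches

def transfer_labels(retrieved_label_maps, num_classes):
    """Per-pixel majority vote over k aligned label maps (each H x W)."""
    stack = np.stack(retrieved_label_maps)                     # (k, H, W)
    votes = np.zeros((num_classes,) + stack.shape[1:], dtype=int)
    for c in range(num_classes):
        votes[c] = (stack == c).sum(axis=0)                    # votes per class
    return votes.argmax(axis=0)                                # (H, W) label map

# Toy usage with random features and 4x4 label maps.
rng = np.random.default_rng(0)
db_feats = rng.normal(size=(10, 8))
query = db_feats[3] + 0.01 * rng.normal(size=8)   # near-duplicate of image 3
idx = retrieve(query, db_feats, k=3)
maps = [rng.integers(0, 4, size=(4, 4)) for _ in idx]
parsed = transfer_labels(maps, num_classes=4)
print(parsed.shape)  # (4, 4)
```

In the thesis, the vote would instead be an MRF data term combined with a smoothness term and, in the second round, with cues from the expanded retrieval set; this sketch only shows the simplest possible aggregation.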
Chinese Abstract 1
Abstract 3
1. Introduction 5
2. Related Work 7
2.1 Nonparametric Methods 7
2.1.1 Pixel-Based Methods 7
2.1.2 Superpixel-Based Methods 8
2.1.3 Window-Based Methods 9
2.2 Query Expansion 10
3. Background and Motivation 12
4. Proposed Method 14
4.1 Image Retrieval Using Deep Convolutional Features 14
4.2 Detection and Matching of Object Windows 14
4.3 Dense Alignment via SIFT Flow 15
4.4 MRF Labeling 17
4.5 Query Expansion 17
5. Experimental Results 22
5.1 Database and Setting 22
5.2 Evaluation of Different Components 23
5.2.1 Dense Alignment 23
5.2.2 Query Expansion 23
5.3 Evaluation of Different Retrieval Sizes 24
5.4 Comparison with State-of-the-Art Methods 24
6. Limitation and Discussion 31
7. Conclusions 32
8. References 33
[1] C. Farabet, C. Couprie, L. Najman, and Y. LeCun, “Learning Hierarchical Features for Scene Labeling,” IEEE Trans. PAMI, vol. 35, no. 8, Aug. 2013, pp.1915-1929.
[2] J. Tighe and S. Lazebnik, “Finding Things: Image Parsing with Regions and Per-Exemplar Detectors,” In Proc. CVPR, 2013.
[3] C. Liu, J. Yuen, and A. Torralba, “Nonparametric Scene Parsing via Label Transfer,” IEEE Trans. PAMI, vol. 33, no. 12, Dec. 2011, pp. 2368-2382.
[4] J. Tighe and S. Lazebnik, “SuperParsing: Scalable Nonparametric Image Parsing with Superpixels,” International Journal of Computer Vision, vol. 101, no. 2, Jan. 2013, pp. 329-349.
[5] J. Yang, B. Price, S. Cohen and M. Yang, “Context Driven Scene Parsing with Attention to Rare Classes,” In Proc. CVPR, 2014.
[6] F. Tung and J. J. Little, “CollageParsing: Nonparametric Scene Parsing by Adaptive Overlapping Windows,” In Proc. ECCV, 2014.
[7] B. Alexe, T. Deselaers, and V. Ferrari, “Measuring the Objectness of Image Windows,” IEEE Trans. PAMI, vol. 34, no. 11, Nov. 2012, pp. 2189-2202.
[8] C. Liu, J. Yuen, and A. Torralba, “SIFT Flow: dense correspondence across scenes and its applications,” IEEE Trans. PAMI, vol. 33, no. 5, May 2011, pp. 978-994.
[9] P. F. Felzenszwalb, and D. P. Huttenlocher, “Efficient Graph-Based Image Segmentation,” International Journal of Computer Vision, vol. 59, no. 2, Sep. 2004, pp. 167-181.
[10] G. Singh, and J. Kosecka, “Nonparametric Scene Parsing with Adaptive Feature Relevance and Semantic Context,” In Proc. CVPR, 2013.
[11] A. Oliva, and A. Torralba, “Modeling the Shape of the Scene: A Holistic Representation of the Spatial Envelope,” International Journal of Computer Vision, vol. 42, no. 3, Jan. 2001, pp. 145-175.
[12] D.G. Lowe, “Distinctive Image Features from Scale-Invariant Keypoints,” International Journal of Computer Vision, vol. 60, no. 2, Jan. 2004, pp. 91-110.
[13] N. Dalal and B. Triggs, “Histograms of Oriented Gradients for Human Detection,” In Proc. CVPR, 2005.
[14] Y. Jia, E. Shelhamer, J. Donahue, S. Karayev, J. Long, R. Girshick, S. Guadarrama, and T. Darrell, “Caffe: Convolutional Architecture for Fast Feature Embedding,” http://caffe.berkeleyvision.org/, 2014.
[15] J.R.R. Uijlings, K.E.A. van de Sande, T. Gevers and A.W.M. Smeulders, “Selective Search for Object Recognition,” International Journal of Computer Vision, vol. 104, no. 2, April 2013, pp. 154-171.
[16] Y. Boykov, O. Veksler, and R. Zabih, “Fast Approximate Energy Minimization via Graph Cuts,” IEEE Trans. PAMI, vol. 23, no. 11, Nov. 2001, pp. 1222-1239.
[17] J. Cai, Z. Zha, M. Wang, S. Zhang, and Q. Tian, “An Attribute-Assisted Reranking Model for Web Image Search,” IEEE Trans. TIP, vol. 24, no. 1, Jan. 2015, pp. 261-272.
[18] A. Krizhevsky, I. Sutskever, and G. E. Hinton, “ImageNet Classification with Deep Convolutional Neural Networks,” In Proc. NIPS, 2012.