Detailed Record

Author (Chinese): 孫槐駿
Author (English): SUN, HUAI-JUN
Title (Chinese): 通過AI模型對單張圖片生成的3D模型網格細化
Title (English): Mesh Refinement for Single Image Generated 3D Models Using the Hints from Different AI Models
Advisor (Chinese): 李哲榮
Advisor (English): Lee, Che-Rung
Committee Members (Chinese): 李潤容、朱宏國
Committee Members (English): Lee, Ruen-Rone; Chu, Hung-Kuo
Degree: Master's
Institution: National Tsing Hua University
Department: Computer Science
Student ID: 110062466
Publication Year (ROC calendar): 112 (2023)
Graduation Academic Year: 111
Language: English
Number of Pages: 42
Keywords (Chinese): 單視圖3D模型、特徵提取、深度神經網路
Keywords (English): Single-view 3D Model, Feature Extraction, Deep Neural Network
Access Statistics:
  • Recommendations: 0
  • Views: 41
  • Rating: *****
  • Downloads: 0
  • Favorites: 0
Abstract (Chinese):
Constructing a 3D model from a single image is a classical problem in computer vision. Advances in machine learning, such as diffusion models, have brought great progress toward solving this problem. However, these models usually suffer from resolution and accuracy issues. A recent work based on part decomposition can not only reconstruct a 3D model from a single image, but also understand the attributes of each component and the relations among the components of an object. However, current part decomposition methods cannot reconstruct the details of objects well.

In this thesis, we propose a mesh refinement method that smooths the surfaces of 3D objects built by the part decomposition method, so that they better approximate the objects in the original image. Our method leverages curvature information from reference 3D models obtained from other deep learning networks, because those networks usually handle curved surfaces better. Our method projects the reference models from different angles to obtain their 2D silhouettes, and uses this information to compute the 3D curvature of the surfaces. The generated models have a significant advantage: since only a few features need to be extracted, we gain more freedom in input and output without worrying about resolution, accuracy, or model degradation.

Experimental results show that the proposed method outperforms other methods on the MontageNet dataset. Compared with state-of-the-art models, it improves IoU by 6.28% and average Euclidean distance by 2.45%.
Abstract (English):
Constructing 3D models from single images is a classical problem in computer vision. Advances in machine learning, such as diffusion models, have made great progress toward solving this problem. However, these models usually have problems with resolution and accuracy, as well as model degradation. A recent work based on part segmentation can not only reconstruct 3D models from a single image, but also comprehend the attributes of each component and the relations among the components of an object. However, the current implementation of the part segmentation method cannot reconstruct the details of objects well.

In this thesis, we propose a mesh refinement method to smooth the surfaces of the 3D objects built by the part segmentation method, so that they better approximate the objects in the original image. Our method leverages curvature information from reference 3D models obtained from other deep learning networks, because those networks usually give better results for curved surfaces. Our method projects the reference models from different angles to obtain their 2D silhouettes, and uses this information to calculate the 3D curvature of the surfaces. The generated models have a significant advantage: since only a few features need to be extracted, we gain more freedom in input and output without worrying about resolution, accuracy, or model degradation.

Experimental results show that the proposed method is superior to other methods on the MontageNet dataset. Compared with state-of-the-art models, it improves IoU by 6.28% and average Euclidean distance by 2.45%.
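To make the projection step concrete, the following is a minimal, hypothetical sketch — not the thesis implementation, and all function names (rotation_y, silhouette, mask_iou) are illustrative — of how a model could be rotated through several viewing angles, projected orthographically into 2D silhouette masks, and compared with a simple per-view silhouette IoU. It assumes the model is given as a point set normalized to roughly [-1, 1]^3 and uses only NumPy:

```python
import numpy as np

def rotation_y(theta):
    """Rotation matrix about the y-axis by theta radians."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, 0.0, s],
                     [0.0, 1.0, 0.0],
                     [-s, 0.0, c]])

def silhouette(points, theta, resolution=128):
    """Rotate a point set about the y-axis, project it orthographically
    onto the xy-plane, and rasterize a binary silhouette mask.
    points: (N, 3) array, assumed normalized to roughly [-1, 1]^3."""
    rotated = points @ rotation_y(theta).T           # rotate the model
    xy = rotated[:, :2]                              # orthographic projection
    pix = np.clip(((xy + 1.0) * 0.5 * (resolution - 1)).astype(int),
                  0, resolution - 1)                 # map [-1, 1] to pixels
    mask = np.zeros((resolution, resolution), dtype=bool)
    mask[pix[:, 1], pix[:, 0]] = True
    return mask

def mask_iou(a, b):
    """Intersection over union of two boolean masks."""
    union = np.logical_or(a, b).sum()
    return np.logical_and(a, b).sum() / union if union else 0.0

# Compare a coarse model against a reference model from 8 viewing angles
# (random point sets stand in for real meshes here).
rng = np.random.default_rng(0)
coarse = rng.uniform(-1, 1, size=(4096, 3))
reference = rng.uniform(-1, 1, size=(4096, 3))
for theta in np.linspace(0.0, 2.0 * np.pi, 8, endpoint=False):
    iou = mask_iou(silhouette(coarse, theta), silhouette(reference, theta))
    print(f"view at {theta:.2f} rad: silhouette IoU = {iou:.3f}")
```

Note that the thesis reports volumetric IoU and average Euclidean distance against the ground truth; the per-view silhouette IoU above only illustrates the multi-angle projection idea.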
Table of Contents
Chinese Abstract
Abstract
List of Figures
List of Tables
1 Introduction
2 Related Work
  2.1 Traditional Computer Vision Methods
  2.2 Deep Learning Methods
    2.2.1 Voxel
    2.2.2 Point Cloud
    2.2.3 Mesh as an Output
    2.2.4 Implicit as an Output
3 Method
  3.1 Shape Generator
    3.1.1 Share With Thy Neighbors: Single-View Reconstruction by Cross-Instance Consistency
    3.1.2 AutoSDF: Shape Priors for 3D Completion, Reconstruction, and Generation
    3.1.3 Pre-train, Self-train, Distill: A Simple Recipe for Supersizing 3D Reconstruction (3DSS)
    3.1.4 Point·E: A System for Generating 3D Point Clouds from Complex Prompts
    3.1.5 Shap·E: Generating Conditional 3D Implicit Functions
  3.2 Feature Extractor
    3.2.1 OBJ File Format
    3.2.2 Read OBJ Files (see the sketch after this outline)
    3.2.3 Obtain the Curvature of the Corresponding Point
  3.3 Feature Applicator
    3.3.1 Adding New Points
    3.3.2 Adding New Surfaces
4 Experiments
  4.1 Dataset
  4.2 Qualitative Results
  4.3 Comparison to Other Models
5 Conclusion and Future Work
References
Appendix
  5.1 MontageNet Dataset
  5.2 Comparison of Multiple Division
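Sections 3.2.1–3.2.2 of the outline above deal with the OBJ file format and reading OBJ files. As a rough illustration of the format only — this is not the author's code — a bare-bones reader could look like the following. Wavefront OBJ stores vertices on lines beginning with `v` and faces on lines beginning with `f`, with 1-based vertex indices:

```python
def read_obj(path):
    """Parse vertex positions and faces from a Wavefront OBJ file.
    Returns (vertices, faces): a list of (x, y, z) floats and a list of
    tuples of 0-based vertex indices. Normals, texture coordinates, and
    other record types are ignored."""
    vertices, faces = [], []
    with open(path) as fh:
        for line in fh:
            parts = line.split()
            if not parts:
                continue
            if parts[0] == "v":            # geometric vertex: v x y z
                vertices.append(tuple(float(x) for x in parts[1:4]))
            elif parts[0] == "f":          # face: tokens are v, v/vt, or v/vt/vn
                faces.append(tuple(int(tok.split("/")[0]) - 1
                                   for tok in parts[1:]))
    return vertices, faces
```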
References

[1] Christopher B Choy et al. “3d-r2n2: A unified approach for single and multi-view 3d object reconstruction”. Computer Vision–ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, October 11–14, 2016, Proceedings, Part VIII. Springer, pp. 628–644. isbn: 3319464833.
[2] Alex Nichol et al. “Point-E: A System for Generating 3D Point Clouds from Complex Prompts”. arXiv preprint arXiv:2212.08751 (2022).
[3] Nanyang Wang et al. “Pixel2mesh: Generating 3d mesh models from single rgb images”. Proceedings of the European conference on computer vision (ECCV), pp. 52–67.
[4] Charles R Qi et al. “Pointnet: Deep learning on point sets for 3d classification and segmentation”. Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 652–660.
[5] Charles Ruizhongtai Qi et al. “Pointnet++: Deep hierarchical feature learning on point sets in a metric space”. Advances in neural information processing systems 30 (2017).
[6] Chun-Liang Li et al. “Point cloud gan”. arXiv preprint arXiv:1810.05795 (2018).
[7] Panos Achlioptas et al. “Learning representations and generative models for 3d point clouds”. International conference on machine learning. PMLR, pp. 40–49. issn: 2640-3498.
[8] Abhishek Kar et al. “Category-specific object reconstruction from a single image”. Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 1966–1974.
[9] Paritosh Mittal et al. “Autosdf: Shape priors for 3d completion, reconstruction and generation”. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 306–315.
[10] Richard Hartley and Andrew Zisserman. Multiple view geometry in computer vision. Cambridge university press, 2003. isbn: 0521540518.
[11] Yasutaka Furukawa and Carlos Hernández. “Multi-view stereo: A tutorial”. Foundations and Trends® in Computer Graphics and Vision 9.1-2 (2015), pp. 1–148. issn: 1572-2740.
[12] Yongbin Sun et al. “Pointgrow: Autoregressively learned point cloud generation with self-attention”. Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 61–70.
[13] Chen Kong, Chen-Hsuan Lin, and Simon Lucey. “Using locally corresponding cad models for dense 3d reconstructions from a single image”. Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 4857–4865.
[14] Yangyan Li et al. “Fpnn: Field probing neural networks for 3d data”. Advances in neural information processing systems 29 (2016).
[15] Jeong Joon Park et al. “Deepsdf: Learning continuous signed distance functions for shape representation”. Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp. 165–174.
[16] Lars Mescheder et al. “Occupancy networks: Learning 3d reconstruction in function space”. Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp. 4460–4470.
[17] Tom Monnier et al. “Share With Thy Neighbors: Single-View Reconstruction by Cross-Instance Consistency”. Computer Vision–ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part I. Springer, pp. 285–303.
[18] Kalyan Vasudev Alwala, Abhinav Gupta, and Shubham Tulsiani. “Pre-train, self-train, distill: A simple recipe for supersizing 3d reconstruction”. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3773–3782.
[19] Kaiming He et al. “Deep residual learning for image recognition”. Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 770–778.
[20] Eric R Chan et al. “pi-gan: Periodic implicit generative adversarial networks for 3d-aware image synthesis”. Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp. 5799–5809.
[21] Vincent Sitzmann et al. “Implicit neural representations with periodic activation functions”. Advances in Neural Information Processing Systems 33 (2020), pp. 7462–7473.
[22] Ethan Perez et al. “Film: Visual reasoning with a general conditioning layer”. Proceedings of the AAAI Conference on Artificial Intelligence. Vol. 32. issn: 2374-3468.
[23] Herbert Edelsbrunner and Ernst P Mücke. “Three-dimensional alpha shapes”. ACM Transactions On Graphics (TOG) 13.1 (1994), pp. 43–72. issn: 0730-0301.
[24] Leif E Peterson. “K-nearest neighbor”. Scholarpedia 4.2 (2009), p. 1883. issn: 1941-6016.
[25] Angel X Chang et al. “Shapenet: An information-rich 3d model repository”. arXiv preprint arXiv:1512.03012 (2015).
[26] Xingyuan Sun et al. “Pix3d: Dataset and methods for single-image 3d shape modeling”. Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 2974–2983.