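Author (Chinese): 蘇政瑋
Author (English): Su, Jheng-Wei
Thesis title (Chinese): 運用2D/3D模態之多元尺度室內格局重建
Thesis title (English): Exploiting 2D/3D Modalities for Indoor Layout Reconstruction across Varied Scales
Advisor (Chinese): 朱宏國
Advisor (English): Chu, Hung-Kuo
Oral defense committee (Chinese): 廖弘源
胡敏君
姚智原
葉家宏
賴尚宏
莊永裕
Oral defense committee (English): Liao, Hong-Yuan
Hu, Min-Chun
Yao, Chih-Yuan
Yeh, Chia-Hung
Lai, Shang-Hong
Chuang, Yung-Yu
Degree: Doctoral
Institution: National Tsing Hua University
Department: Department of Computer Science
Student ID: 107062554
Year of publication (ROC calendar): 113 (2024)
Academic year of graduation: 112
Language: English
Number of pages: 94
Keywords (translated from Chinese): Machine Learning; Scene Understanding; Layout Prediction
Keywords (English): Machine Learning; Scene Understanding; Layout Reconstruction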
Reconstructing indoor layouts from 2D images or unstructured 3D point cloud data has received significant attention in recent years and has enabled many applications in scene understanding, robotics, and virtual, augmented, and mixed reality (VR/AR/XR). In this thesis, we present a series of methods that reconstruct indoor layouts from different types of input data across varied scales. First, we refine three existing single-view layout estimation networks with better backbones and propose a single-view layout dataset to evaluate these three methods from different angles. We then find that reconstructing a layout from a single view has two major limitations: occlusion and insufficient resolution. Next, we turn to multi-view layout reconstruction to address these limitations. In this part, we propose a novel transformer architecture that registers camera poses and predicts a multi-view consistent layout simultaneously. However, even when the layout of every individual room can be reconstructed, the rooms must still be stitched together manually to produce a complete floorplan, because camera registration across different rooms is difficult. Finally, we take 3D point cloud data as input and propose a transformer-based network that reconstructs the floorplan with a novel box representation. Taken together, the proposed methods estimate indoor layouts at three scales: single view per room, multiple views per room, and the whole floorplan. A series of experiments shows that the proposed methods improve the overall quality and realism of the layout reconstructions.
Abstract (Chinese)
Abstract
Acknowledgements
Contents
List of Figures
List of Tables
1 Introduction
2 Related Work
2.1 Optimization-based Method
2.2 Learning-based Method
2.3 Depth-assisted Method
2.4 Panorama registration
2.5 Scene reconstruction using sparse panoramas
2.6 Floorplan reconstruction using 3D data
3 Datasets
3.1 PanoContext Dataset
3.2 Stanford2D-3D Dataset
3.3 MatterportLayout Dataset
3.4 ZillowIndoor Dataset
3.5 Gibsonlayout
4 Method - Single-view Layout Reconstruction
4.1 Motivation and Overview
4.2 General framework
4.3 Input and Pre-processing
4.4 Encoder
4.5 Layout Prediction
4.6 Loss Function
4.7 Structured Layout Fitting
4.8 Implementation Details
4.9 Data augmentation
4.10 Training Scheme and Parameters
4.11 Summarization of Modifications
5 Evaluation - Single-view Layout Reconstruction
5.1 Experiments and Discussions
5.2 Evaluation Setup
5.3 Performance on PanoContext and Stanford2D-3D
5.4 Performance on MatterportLayout
5.5 Discussions
5.6 Limitation and Future work
6 Method - Multi-view Layout Reconstruction
6.1 Motivation and Overview
6.2 Network architecture
6.3 Loss functions
6.4 Extracting boundary and correspondence information
6.5 Layout fusion
7 Evaluation - Multi-view Layout Reconstruction
7.1 Experimental Settings
7.2 Layout reconstruction performance
7.3 Panorama registration performance
7.4 Ablation Study
7.5 Limitation and Future work
8 Method - Floorplan Reconstruction
8.1 Motivation and Overview
8.2 Pre-processing
8.3 Floorplan representation
8.4 Network architecture
8.5 Post-processing
8.6 Loss functions
9 Evaluation - Floorplan Reconstruction
9.1 Results of Floorplan Reconstruction
9.1.1 Experimental settings
9.1.2 Floorplan reconstruction performance
9.1.3 Ablation Studies
9.2 Limitation and Future work
10 Conclusion
Reference
 
 
 
 