Author (Chinese): 樊慶玲
Author (English): Fan, Ching-Ling
Title (Chinese): 最佳化沈浸式影片串流至頭戴式顯示器
Title (English): Optimizing Immersive Video Streaming to Head-Mounted Virtual Reality
Advisor (Chinese): 徐正炘
Advisor (English): Hsu, Cheng-Hsin
Committee Members (Chinese): 許健平、李哲榮、黃俊穎、陳健、逄愛君
Committee Members (English): Sheu, Jang-Ping; Lee, Che-Rung; Huang, Chun-Ying; Chen, Chien; Pang, Ai-Chun
Degree: Ph.D.
University: National Tsing Hua University
Department: Department of Computer Science
Student ID: 103062555
Year of Publication (ROC): 109 (2020)
Graduation Academic Year: 109
Language: English
Number of Pages: 129
Keywords (Chinese): 虛擬實境、多媒體串流、使用者體驗、頭戴式顯示器、360度影片
Keywords (English): Virtual Reality; Multimedia Streaming; Quality of Experience; Head-Mounted Display; 360-Degree Video
Abstract (Chinese, translated): With rapid technological advances, viewers are no longer satisfied with merely watching Full High Definition or Ultra-High Definition streaming videos on flat displays, and have begun to pursue immersive viewing experiences. 360-degree videos, which deliver such immersive experiences, have therefore become a trend; for example, well-known streaming platforms such as YouTube and Facebook already support 360-degree video streaming. Watching 360-degree videos on a Head-Mounted Display (HMD) offers an even more immersive experience, because users can naturally change their viewing direction by turning their heads, as if they were physically inside the video's virtual environment. Streaming 360-degree videos to HMDs, however, is far from trivial. First, to present realistic imagery inside an HMD, 360-degree videos require extremely high resolutions and thus considerable file sizes, which overwhelm the available bandwidth and cause extra latency and unsatisfactory user experience. Second, because 360-degree videos must be projected onto a 2D plane before compression, the resulting distortion makes existing video quality metrics, such as Peak Signal-to-Noise Ratio (PSNR) and Structural SIMilarity Index (SSIM), inaccurate for measuring 360-degree viewing quality, let alone accounting for the complex human visual system and users' diverse viewing behaviors. These difficulties hinder the development of user-experience-driven optimization for 360-degree video streaming. To address the above challenges, this dissertation solves three core problems of streaming 360-degree videos to HMDs, one in each of three streaming stages: delivery, compression and packaging, and display and viewing. First, we design and develop a neural network, trained on sensor data and video content analysis, to predict users' future viewports; the proposed prediction network effectively reduces the bandwidth required for 360-degree video delivery while maintaining good video quality. Next, we leverage video models, viewing probabilities, and client bandwidth distributions to compute the optimal encoding ladder, which decides which video representations to store on a storage-limited server, thereby maximizing the viewing quality of heterogeneous clients. Finally, we design and conduct user studies to investigate and quantify the various factors affecting user experience, and build QoE models for watching 360-degree videos based on these factors. Solving these three core problems effectively optimizes 360-degree video streaming systems, and the developed techniques and accumulated experience will become the cornerstone of emerging applications such as Virtual Reality, Augmented Reality, Mixed Reality, and Extended Reality.
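The encoding-ladder stage described above decides which bitrate representations a storage-limited server should keep so that heterogeneous clients see the best average quality. The dissertation's own algorithms (PC-LBA, PC-GBA) work on real per-video rate-quality models; as a hedged toy sketch of the greedy idea only, the following assumes a logarithmic rate-quality curve, uses the bitrate itself as a storage-cost proxy, and invents the names `quality` and `greedy_ladder` for illustration — none of this is the thesis's actual formulation.

```python
import math

def quality(bitrate):
    """Toy rate-quality model with diminishing returns (bitrate in Mbps)."""
    return math.log2(1 + bitrate)

def greedy_ladder(candidates, client_bw, storage_budget):
    """Greedily build a ladder (set of stored bitrates) maximizing mean
    client quality; each client plays the highest stored bitrate that
    does not exceed its bandwidth."""
    def mean_quality(ladder):
        total = 0.0
        for bw in client_bw:
            playable = [b for b in ladder if b <= bw]
            total += quality(max(playable)) if playable else 0.0
        return total / len(client_bw)

    ladder, used = [], 0.0
    while True:
        base = mean_quality(ladder)
        best, best_gain = None, 0.0
        for b in candidates:
            if b in ladder or used + b > storage_budget:
                continue  # already stored, or would exceed the budget
            gain = (mean_quality(ladder + [b]) - base) / b  # gain per storage unit
            if gain > best_gain:
                best, best_gain = b, gain
        if best is None:  # no representation still improves quality within budget
            return sorted(ladder)
        ladder.append(best)
        used += best
```

For example, with candidate bitrates [1, 2, 4, 8, 16] Mbps, four clients with bandwidths [2, 2, 8, 16] Mbps, and a budget of 20, the sketch keeps the 1, 2, and 8 Mbps representations and skips 4 Mbps, whose marginal gain per stored unit is dominated.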
Abstract (English): Immersive videos, a.k.a. 360° videos, have become increasingly popular. 360° videos deliver a more immersive viewing experience to end users because of the freedom to change viewports. Streaming immersive videos to Head-Mounted Displays (HMDs) offers an even more immersive experience by allowing users to arbitrarily rotate their heads to change the viewports, as if they were physically in virtual worlds. However, streaming high-quality 360° videos to HMDs is quite challenging. First, 360° videos contain much more information than conventional videos, and thus are much larger in resolution and file size. This may introduce additional delay and degraded user experience due to insufficient network bandwidth. Second, existing quality metrics are less applicable to 360° videos because of the complex human visual system and diverse viewing behaviors. This inhibits the development of QoE-oriented optimization for 360° videos. To address these challenges, we study three core problems to optimize the (i) delivery, (ii) production, and (iii) consumption of immersive video content in the emerging streaming systems to HMDs. First, we design a neural network that leverages sensor and content features to predict the future viewports of HMD viewers watching immersive tiled videos. Our proposed prediction network effectively reduces the bandwidth consumption while offering comparable video quality. Second, we develop a divide-and-conquer approach to optimize the encoding ladder of immersive tiled videos, considering the video models, viewing probabilities, and client distribution. Our proposed algorithm aims to maximize the overall viewing quality of clients under the limits of server storage and heterogeneous client bandwidths. Last, we design and conduct a user study to investigate and quantify the impacts of various QoE factors. We then use these factors to build QoE models for immersive videos. The outcomes of these three studies result in better-optimized immersive video streaming systems to HMDs. Our developed technologies and accumulated experience will be the cornerstone of upcoming Virtual Reality (VR), Mixed Reality (MR), and Augmented Reality (AR) applications, collectively referred to as Extended Reality (XR).
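The delivery stage above hinges on predicting which tiles of a 360° video the viewer will look at next, so only those tiles need high-bitrate delivery. The thesis's fixation prediction networks combine sensor and content features in an LSTM; as a hedged illustration of the sensor-only baseline idea, the sketch below linearly extrapolates head orientation and maps the predicted viewing direction onto a tile grid of an equirectangular frame. The function names, the 12x6 grid, and the square 100° field of view are assumptions for illustration, not the thesis's design.

```python
def predict_viewport(yaw_hist, pitch_hist, horizon):
    """Linearly extrapolate head orientation (degrees) `horizon` samples ahead."""
    dy = yaw_hist[-1] - yaw_hist[-2]      # angular velocity from the last two samples
    dp = pitch_hist[-1] - pitch_hist[-2]
    yaw = (yaw_hist[-1] + dy * horizon) % 360.0           # longitude wraps around
    pitch = max(-90.0, min(90.0, pitch_hist[-1] + dp * horizon))  # latitude clamps
    return yaw, pitch

def viewport_tiles(yaw, pitch, fov=100.0, cols=12, rows=6):
    """Tiles of a cols x rows equirectangular grid whose centres fall inside a
    square `fov`-degree viewport centred at (yaw, pitch)."""
    half, tiles = fov / 2.0, set()
    for c in range(cols):
        tile_yaw = (c + 0.5) * 360.0 / cols             # tile-centre longitude
        d_yaw = abs(tile_yaw - yaw)
        d_yaw = min(d_yaw, 360.0 - d_yaw)               # shortest wrap-around distance
        for r in range(rows):
            tile_pitch = 90.0 - (r + 0.5) * 180.0 / rows  # tile-centre latitude
            if d_yaw <= half and abs(tile_pitch - pitch) <= half:
                tiles.add((r, c))
    return tiles
```

A head turning from yaw 10° to 20° between samples, extrapolated three samples ahead, yields a predicted viewport centred at yaw 50°; the client would then request only the handful of tiles overlapping that viewport at high quality. Real predictors replace the linear extrapolation with the learned network and soften the hard tile test into per-tile viewing probabilities.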
Acknowledgments
Acknowledgments (Chinese)
Abstract (Chinese)
Abstract
1 Introduction
1.1 Delivery Optimization: Fixation Prediction
1.2 Production Optimization: Optimal Laddering
1.3 Consumption Optimization: QoE Modeling
1.4 Contributions
1.5 Thesis Organization
2 Background and Related Work
2.1 Background
2.1.1 Off-the-Shelf Hardware
2.1.2 Existing Systems
2.1.3 General 360° Video Streaming Framework
2.1.4 Tiled 360° Videos
2.2 Delivery: Fixation Prediction
2.2.1 2D Image/Video Saliency
2.2.2 360° Image/Video Saliency
2.2.3 Fixation/Head Movement Prediction in HMDs
2.3 Production: Optimal Laddering
2.3.1 Viewport-Adaptive Tiled Streaming
2.3.2 Adaptive BitRate (ABR) Algorithms
2.3.3 Bitrate Allocation and Optimal Laddering Algorithms
2.4 Consumption: QoE Modeling
2.4.1 QoE Measurements
2.4.2 QoE Factors
2.4.3 QoE Modeling
3 Delivery Optimization: Fixation Prediction
3.1 Overview
3.1.1 360° Video Streaming Systems
3.1.2 Viewport and Modeling
3.2 Fixation Prediction Networks
3.2.1 Overview
3.2.2 Orientation-Based Network
3.2.3 Tile-Based Network
3.2.4 Future-Aware Network
3.3 Datasets and Network Implementations
3.3.1 Dataset
3.3.2 Network Implementations
3.4 Overlapping Virtual Viewports
3.4.1 Projection Models
3.4.2 Overlapping Virtual Viewport (OVV)
3.4.3 Validations with Real Computer Vision Algorithms
3.4.4 Fixation Prediction with OVV
3.4.5 Validation with Additional Videos/Viewers
3.5 Evaluations
3.5.1 Implementations
3.5.2 Setup
3.5.3 Results
3.5.4 A Small-Scale User Study
3.6 Conclusion
4 Production Optimization: Optimal Laddering
4.1 System Overview
4.2 Optimal Laddering Problem
4.2.1 Problem Statement
4.2.2 Video Models
4.2.3 Problem Formulation
4.3 Problem Decomposition
4.4 Per-Class Optimization
4.4.1 Per-Class Formulation
4.4.2 Lagrangian-Based Algorithm: PC-LBA
4.4.3 Greedy-Based Algorithm: PC-GBA
4.5 Global Optimization for the Optimal Ladders
4.6 Evaluations
4.6.1 Implementations
4.6.2 Setup
4.6.3 Per-Class Optimization Results
4.6.4 Optimal Laddering Results
4.6.5 Summary of the Key Findings
4.7 Discussions
4.7.1 Comparisons with the Optimal Solution
4.7.2 Fairness
4.8 Proofs of Lemmas
4.9 Conclusion
5 Consumption Optimization: QoE Modeling
5.1 QoE of 360° Tiled Videos Streamed to HMDs
5.1.1 QoE Features
5.1.2 QoE Factors
5.2 A User Study
5.2.1 Testbed
5.2.2 Dataset and Subjects
5.2.3 Procedure
5.2.4 Viewing Behaviors and Video Classification
5.2.5 The Overall QoE and QoE Features
5.2.6 Diverse QoE Models
5.3 MOS Modeling
5.3.1 Regressor Selection
5.3.2 Derived Model Performance
5.4 IS Modeling
5.5 Conclusions
6 Conclusions and Future Work
6.1 Conclusions
6.2 Future Work
6.2.1 Live 360° Video Streaming
6.2.2 6DoF Content Streaming
6.2.3 VR Gaming with Multiple Observers
6.2.4 Movie Creation for XR Content
Bibliography
[1] Is 360-degree and VR video the future of marketing?, July 2017. https://www.marketingtechnews.net/news/2017/jul/06/360-degree-and-vr-video-future-marketing/.
[2] M. Abdallah, C. Griwodz, K. Chen, G. Simon, P. Wang, and C. Hsu. Delay sensitive video computing in the cloud: A survey. ACM Transactions on Multimedia Computing, Communications, and Applications, 14(3s):54:1–54:29, 2018.
[3] R. Achanta, A. Shaji, K. Smith, A. Lucchi, P. Fua, and S. S¨usstrunk. SLIC superpixels compared to state-of-the-art superpixel methods. IEEE Transactions on Pattern Analysis and Machine Intelligence, 34(11):2274–2282, 2012.
[4] P. Alface, M. Aerts, D. Tytgat, S. Lievens, C. Stevens, N. Verzijp, and J.Macq. 16K cinematic VR streaming. In Proc. of ACM International Conference on Multimedia (MM’17), pages 1105–1112, Mountain View, CA, October 2017.
[5] P. Alface, J. Macq, and N. Verzijp. Interactive omnidirectional video delivery: A bandwidth-effective approach. Bell Labs Technical Journal, 16(4):135–147, March 2012.
[6] T. Alshawi, Z. Long, and G. AlRegib. Understanding spatial correlation in eye fixation maps for visual attention in videos. In Proc. of IEEE International Conference on Multimedia and Expo (ICME’16), pages 1–6, Seattle, WA, July 2016.
[7] R. Anderson, D. Gallup, J. Barron, J. Kontkanen, N. Snavely, C. Hernandez, S. Agarwal, and S. Seitz. Jump: Virtual reality video. ACM Transactions on Graphics, 35(6):198:1–198:13, November 2016.
[8] M. Anwar, J. Wang, W. Khan, A. Ullah, S. Ahmad, and Z. Fei. Subjective QoE of 360-degree virtual reality videos and machine learning predictions. IEEE Access, 8:148084–148099, August 2020.
[9] R. Aparicio-Pardo, K. Pires, A. Blanc, and G. Simon. Transcoding live adaptive video streams at a massive scale in the cloud. In Proc. of ACM International Conference on Multimedia Systems (MMSys’15), pages 49–60, Portland, OR, March 2015.
[10] S. Aroussi and A. Mellouk. Survey on machine learning-based QoE-QoS correlation models. In Proc. of IEEE International Conference on Computing, Management and Telecommunications (ComManTel’14), pages 200–204, Da Nang, Vietnam, April 2014.
[11] M. Assens, X. Giro-i-Nieto, K.McGuinness, and N. O’Connor. Saltinet: Scan-path prediction on 360-degree images using saliency volumes. In Proc. of IEEE International Conference on Computer Vision (ICCV’17), pages 2331–2338, Venice, Italy, October 2017.
[12] Y. Ban, L. Xie, Z. Xu, X. Zhang, Z. Guo, and Y. Wang. Cub360: Exploiting cross-users behaviors for viewport prediction in 360 video adaptive streaming. In Proc. of IEEE International Conference on Multimedia and Expo (ICME’18), pages 1–6, San Diego, CA, July 2018.
[13] Y. Bao, T. Zhang, A. Pande, H.Wu, and X. Liu. Motion-prediction-based multicast for 360-degree video transmissions. In Proc. of IEEE International Conference on Sensing, Communication, and Networking (SECON’17), pages 1–9, San Diego, CA, June 2017.
[14] D. Basak, S. Pal, and D. Patranabis. Support vector regression. Neural Information Processing-Letters and Reviews, 11(10):203–224, October 2007.
[15] A. Bentaleb, B. Taani, A. Begen, C. Timmerer, and R. Zimmermann. A survey on bitrate adaptation schemes for streaming media over HTTP. IEEE Communications Surveys & Tutorials, 21(1):562–585, 2018.
[16] Bert Hubert. tc, 2019. https://linux.die.net/man/8/tc.
[17] D. Bertsekas. Constrained Optimization and Lagrange Multiplier Methods. Academic Press, 2014.
[18] D. Bertsekas, R. Gallager, and P. Humblet. Data networks, volume 2. Prentice-Hall International New Jersey, 1992.
[19] M. Bessa, M. Melo, D. Narciso, L. Barbosa, and J. Vasconcelos-Raposo. Does 3D 360 video enhance user’s VR experience: An evaluation study. In Proc. of International Conference on Human Computer Interaction (Interaction’16), pages 16:1–16:4, Salamanca, Spain, September 2016.
[20] A. Borji, M. Cheng, H. Jiang, and J. Li. Salient object detection: A survey. arXiv preprint arXiv:1411.5878, 2014.
[21] L. Bottou. Large-scale machine learning with stochastic gradient descent. In Proc. of International Conference on Computational Statistics (COMPSTAT’10), pages 177–186. Paris, France, August 2010.
[22] K. Bouraqia, E. Sabir, M. Sadik, and L. Ladid. Quality of experience for streaming services: Measurements, challenges and insights. IEEE Access, 8:13341–13361, January 2020.
[23] S. Boyd and L. Vandenberghe. Convex Optimization. Cambridge University Press, 2004.
[24] G. Bradski. The OpenCV Library. Dr. Dobb’s Journal of Software Tools, 2000.
[25] L. Bush, U. Hess, and G. Wolford. Transformations for within-subject designs: a monte carlo investigation. Psychological bulletin, 113(3):566, 1993.
[26] E. Canessa and L. Tenze. FishEyA: Live broadcasting around 360 degrees. In Proc. of ACM Symposium on Virtual Reality Software and Technology (VRST’14), pages 227–228, Edinburgh, Scotland, November 2014.
[27] S. Chaabouni, J. Benois-Pineau, and C. Amar. Transfer learning with deep networks for saliency prediction in natural video. In Proc. of IEEE International Conference on Image Processing (ICIP’16), pages 1604–1608, Phoenix, Arizona, September 2016.
[28] J. Chakareski, R. Aksu, X. Corbillon, G. Simon, and V. Swaminathan. Viewport-driven rate-distortion optimized 360◦ video streaming. In Proc. of IEEE International Conference on Communications (ICC’18), pages 1–7, Kansas, MO, May 2018.
[29] S. Channappayya, A. Bovik, C. Caramanis, and R. Heath. SSIM-optimal linear image restoration. In Proc. of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP’08), pages 765–768, Las Vegas, NV, March 2008.
[30] X. Chen, A. Kasgari, and W. Saad. Deep learning for content-based personalized viewport prediction of 360 degree VR videos. IEEE Networking Letters, 2(2):81–84, June 2020.
[31] H. Cheng, C. Chao, J. Dong, H. Wen, T. Liu, and M. Sun. Cube padding for weakly-supervised saliency prediction in 360◦ videos. In Proc. of IEEE Conference on Computer Vision and Pattern Recognition (CVPR’18), pages 1420–1429, Salt Lake City, UT, June 2018.
[32] K. Choi and K. Jun. Real-Time panorama video system using networked multiple cameras. Journal of Systems Architecture, 64:110–121, March 2016.
[33] Cisco Inc. Cisco visual networking index: Forecast and trends, 2017—2022 white paper, 2017. https://www.cisco.com/c/en/us/solutions/collateral/service-provider/visual-networking-indexvni/complete-white-paper-c11-481360.html.
[34] Cisco Systems, Inc. The Zettabyte Era: Trends and Analysis, 2017. https://www.cisco.com/c/en/us/solutions/collateral/service-provider/visual-networking-index-vni/vnihyperconnectivity-wp.html.
[35] C. Concolato, J. L. Feuvre, F. Denoual, E. Nassor, N. Ouedraogo, and J. Taquet. Adaptive streaming of HEVC tiled videos usingMPEG-DASH. IEEE Transactions on Circuits and Systems for Video Technology, PP(99):1–1, March 2017.
[36] X. Corbillon, A. Devlic, G. Simon, and J. Chakareski. Optimal set of 360-degree videos for viewport-adaptive streaming. In Proc. of ACM International Conference on Multimedia (MM’17), pages 943–951, Mountain View, CA, October 2017.
[37] X. Corbillon, A. Devlic, G. Simon, and J. Chakareski. Optimal set of 360-Degree videos for Viewport-Adaptive streaming. In Proc. of ACM International Conference on Multimedia (MM’17), pages 943–951, Mountain View, CA, October 2017.
[38] X. Corbillon, G. Simon, A. Devlic, and J. Chakareski. Viewport-adaptive navigable 360-degree video delivery. In Proc. of IEEE International Conference on Communications (ICC’17), pages 1–7, Paris, France, May 2017.
[39] R. Corless, G. Gonnet, D. Hare, D. Jeffrey, and D. Knuth. On the LambertW function. Advances in Computational mathematics, 5(1):329–359, 1996.
[40] M. Cornia, L. Baraldi, G. Serra, and R. Cucchiara. A deep multi-level network for saliency prediction. In Proc. of International Conference on Pattern Recognition (ICPR’16), pages 3488–3493, Cancun, Mexico, December 2016.
[41] S. Croci, C. Ozcinar, E. Zerman, J. Cabrera, and A. Smolic. Voronoi-based objective quality metrics for omnidirectional video. In Proc. of International Conference on Quality of Multimedia Experience (QoMEX’19), pages 1–6, Berlin, Germany, June 2019.
[42] S. Croci, C. Ozcinar, E. Zerman, S. Knorr, J. Cabrera, and A. Smolic. Visual attention-aware quality estimation framework for omnidirectional video using spherical voronoi diagram. Springer Quality and User Experience, 5(1):1–17, April 2020.
[43] L. D’Acunto, J. van den Berg, E. Thomas, and O. Niamut. Using MPEG DASH SRD for zoomable and navigable video. In Proc. of ACM International Conference on Multimedia Systems (MMSys’16), pages 34:1–34:4, Klagenfurt, Austria, May 2016.
[44] DASH Industry Forum. Guidelines for implementation: Dash-if interoperability points. DASH Industry Forum, November 2018.
[45] DeNA Co., Ltd. et al. H2O the optimized HTTP/1.x, HTTP/2 server, 2019. https://h2o.examp1e.net/.
[46] J. Deng, W. Dong, R. Socher, L. Li, K. Li, and L. Fei-Fei. ImageNet: A Large-Scale Hierarchical Image Database. In Proc. of IEEE Conference on Computer Vision and Pattern Recognition (CVPR’09), pages 248–255, Miami, Florida, June 2009.
[47] F. Duanmu, E. Kurdoglu, S. Hosseini, Y. Liu, and Y. Wang. Prioritized buffer control in two-tier 360 video streaming. In Proc. of ACM SIGCOMM Workshop on Virtual Reality and Augmented Reality Network (VR/AR Network’17), pages 13–18, Los Angeles, CA, August 2017.
[48] D. Egan, S. Brennan, J. Barrett, Y. Qiao, C. Timmerer, and N. Murray. An evaluation of heart rate and electrodermal activity as an objective QoE evaluation method for immersive virtual reality environments. In Proc. of International Conference on Quality of Multimedia Experience (QoMEX’16), pages 1–6, Lisbon, Portugal, June 2016.
[49] T. El-Ganainy and M. Hefeeda. Streaming virtual reality content. arXiv preprint arXiv:1612.08350, December 2016.
[50] C. Fan, J. Lee, W. Lo, C. Huang, K. Chen, and C. Hsu. Fixation prediction for 360◦ video streaming in head-mounted virtual reality. In Proc. of ACM SIGMM Workshop on Network and Operating Systems Support for Digital Audio and Video (NOSSDAV’17), pages 67–72, Taipei, Taiwan, June 2017.
[51] C. Fan,W. Lo, Y. Pai, and C. Hsu. A survey on 360◦ video streaming: Acquisition, transmission, and display. ACM Computing Surveys, 52(4):71:1–71:30, August 2019.
[52] C. Fan, S. Yen, C. Huang, and C. Hsu. Optimizing fixation prediction using recurrent neural networks for 360◦ video streaming in head-mounted virtual reality. IEEE Transactions on Multimedia, 22(3):744–759, March 2019.
[53] C. Fan, S. Yen, C. Huang, and C. Hsu. On the optimal encoding ladder of tiled 360◦ videos for head-mounted virtual reality. IEEE Transactions on Circuits and Systems for Video Technology, pages 1–14, 2020. Accepted to Appear.
[54] P. Felzenszwalb and D. Huttenlocher. Efficient graph-based image segmentation. International Journal of Computer Vision, 59(2):167–181, September 2004.
[55] X. Feng, Y. Liu, and S. Wei. LiveDeep: Online viewport prediction for live virtual reality streaming using lifelong deep learning. In IEEE Conference on Virtual Reality and 3D User Interfaces (VR’20), pages 800–808, Atlanta, Georgia, March 2020.
[56] X. Feng, V. Swaminathan, and S. Wei. Viewport prediction for live 360-degree mobile video streaming using user-content hybrid motion tracking. Proc. of ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, 3(2):1–22, June 2019.
[57] A. Fernandes and S. Feiner. Combating VR sickness through subtle dynamic field-of-view modification. In Proc. of IEEE Symposium on 3D User Interfaces (3DUI’16), pages 201–210, Greenville, SC, March 2016.
[58] A. Ferworn, B. Waismark, and M. Scanlan. CAT 360: Canine augmented technology 360-Degree video system. In Proc. of IEEE International Symposium on Safety, Security, and Rescue Robotics (SSRR’15), pages 1–4, West Lafayette, IN, October 2015.
[59] D. Freedman. Statistical models: theory and practice. Cambridge University Press, 2009.
[60] J. Friedman. Greedy function approximation: a gradient boosting machine. JSTOR Annals of Statistics, 29(5):1189–1232, October 2001.
[61] C. Fu, L. Wan, T. Wong, and C. Leung. The rhombic dodecahedron map: An efficient scheme for encoding panoramic video. IEEE Transactions on Multimedia, 11(4):634–644, June 2009.
[62] V. Gaddam, H. Ngo, R. Langseth, C. Griwodz, D. Johansen, and P. Halvorsen. Tiling of panorama video for interactive virtual cameras: Overheads and potential bandwidth requirement reduction. In Proc. of Picture Coding Symposium (PCS’15), pages 204–209, Cairns, Australia, May 2015.
[63] L. Gaemperle, K. Seyid, V. Popovic, and Y. Leblebici. An immersive telepresence system using a real-time omnidirectional camera and a virtual reality head-mounted display. In Proc. of IEEE International Symposium on Multimedia (ISM’14), pages 175–178, Taichung, Taiwan, December 2014.
[64] C. Galleguillos and S. Belongie. Context based object categorization: A critical survey. Computer Vision and Image Understanding, 114(6):712–722, June 2010.
[65] Google Inc. ExoPlayer : An extensible media player for android, 2017. https://github.com/google/ExoPlayer.
[66] M. Graf, C. Timmerer, and C. Mueller. Towards bandwidth efficient adaptive streaming of omnidirectional video over HTTP. In Proc. of ACM International Conference on Multimedia Systems (MMSys’17), pages 261–271, Taipei, Taiwan, June 2017.
[67] M. Grant and S. Boyd. CVX: Matlab software for disciplined convex programming, version 2.1, March 2019. http://cvxr.com/cvx.
[68] J. Hakkinen, T. Vuori, andM. Paakka. Postural stability and sickness symptoms after HMD use. In IEEE International Conference on Systems, Man and Cybernetics, volume 1, pages 147–152, October 2002.
[69] E. Hjelm°as and B. Low. Face detection: A survey. Computer vision and image understanding, 83(3):236–274, September 2001.
[70] S. Hochreiter and J. Schmidhuber. Long short-term memory. Neural computation, 9(8):1735–1780, November 1997.
[71] X. Hou, S. Dey, J. Zhang, and M. Budagavi. Predictive adaptive streaming to enable mobile 360-degree and VR experiences. IEEE Transactions on Multimedia, pages 1–16, 2020.
[72] X. Hou, Y. Lu, and S. Dey. Wireless VR/AR with edge/cloud computing. In Proc. of International Conference on Computer Communication and Networks (ICCCN’17), pages 1–8, Vancouver, Canada, July 2017.
[73] X. Hou, J. Zhang, M. Budagavi, and S. Dey. Head and body motion prediction to enable mobile VR experiences with low latency. In IEEE Global Communications Conference (GLOBECOM’19), pages 1–7, Waikoloa, HI, December 2019.
[74] C. Hsu, A. Chen, C. Hsu, C. Huang, C. Lei, and K. Chen. Is foveated rendering perceivable in virtual reality: Exploring the efficiency and consistency of quality assessment methods. In Proc. of ACM International Conference on Multimedia (MM’17), pages 55–63, Mountain View, CA, October 2017.
[75] HTC Corporation. VIVE: Discover virtual reality beyond imagination, 2017. https://www.vive.com/us/.
[76] HTC Corporation. Eye tracking SDK (SRanipal), 2020. https://developer.vive.com/resources/knowledgebase/vivesranipal-sdk/.
[77] HTC Corporation. VIVE PRO EYE, 2020. https://enterprise.vive.com/us/product/vive-pro-eye/.
[78] C. Huang, K. Chen, D. Chen, H. Hsu, and C. Hsu. GamingAnywhere: The first open source cloud gaming system. ACM Transactions on Multimedia Computing, Communications, and Applications, 10:1–10:24(1), Jan 2014.
[79] M. Huang, Q. Shen, Z. Ma, A. C. Bovik, P. Gupta, R. Zhou, and X. Cao. Modeling the perceptual quality of immersive images rendered on head mounted displays: Resolution and compression. IEEE Transactions on Image Processing, 27(12):6039–6050, December 2018.
[80] F. Huber and S. Satish. Adaptive code offloading for mobile cloud applications: Exploiting fuzzy sets and evidence-based learning. In Proc. of ACM Workshop on Mobile Cloud Computing and Services (MCS’13), pages 9–16, Taipei, Taiwan, June 2013.
[81] IBM Corp. IBM ILOG CPLEX optimizer, 2018. http://www-01.ibm.com/software/integration/optimization/cplex-optimizer/.
[82] M. Inoue, H. Kimata, K. Fukazawa, and N. Matsuura. Interactive panoramic video streaming system over restricted bandwidth network. In Proc. of ACM International Conference on Multimedia (MM’10), pages 1191–1194, Firenze, Italy, October 2010.
[83] Information technology – dynamic adaptive streaming over HTTP (DASH) – part 1: Media presentation description and segment formats, March 2012.
[84] Algorithm descriptions of projection format conversion and video quality metrics in 360Lib. Standard, International Telecommunication Union, 2017.
[85] ITU. JVET - joint video experts team, 2019. https://www.itu.int/en/ITU-T/studygroups/2017-2020/16/Pages/video/jvet.aspx.
[86] Final report from the video quality experts group on the validation of objective models of video quality assessment. Technical report, VQEG, 2000.
[87] ITU-T Study Group 9. Subjective video quality assessment methods for multimedia applications. ITU Series P: Audiovisual quality in multimedia services, 1999.
[88] ITU Telecommunication Standardization Sector. Subjective video quality assessment methods for multimedia applications. ITU-T Recommendation, P.910, April 2008.
[89] ITU Telecommunication Standardization Sector. Methods for the subjective assessment of video quality, audio quality and audiovisual quality of internet video and distribution quality television in any environment. ITU-T Recommendation, P.913, March 2016.
[90] ITU Telecommunication Standardization Sector. Series p: Terminals and subjective and objective assessment methods. ITU-T Recommendation, P.800.2, July 2016.
[91] R. Jain, D. Chiu, and W. Hawe. A quantitative measure of fairness and discrimination. Eastern Research Laboratory, Digital Equipment Corporation, pages 1–37, September 1984.
[92] N. Jiang, V. Swaminathan, and S. Wei. Power evaluation of 360 VR video streaming on head mounted display devices. In Proc. of ACM SIGMM Workshop on Network and Operating Systems Support for Digital Audio and Video (NOSSDAV’17), pages 55–60, Taipei, Taiwan, June 2017.
[93] B. John, P. Raiturkar, O. Le Meur, and E. Jain. A benchmark of four methods for generating 360◦ saliency maps from eye tracking data. World Scientific International Journal of Semantic Computing, 13(03):329–341, September 2019.
[94] R. Ju, J. He, F. Sun, J. Li, F. Li, J. Zhu, and L. Han. Ultra wide view based panoramic VR streaming. In Proc. of ACM SIGCOMM Workshop on Virtual Reality and Augmented Reality Network (VR/AR Network’17), pages 19–23, Los Angeles, CA, August 2017.
[95] P. Juluri. AStream, 2019. https://github.com/pari685/AStream.
[96] P. Juluri, V. Tamarapalli, and D. Medhi. Measurement of quality of experience of video-on-demand services: A survey. IEEE Communications Surveys & Tutorials, 18(1):401–418, February 2016.
[97] S. Jumisko-Pyykk¨o and T. Vainio. Framing the context of use for mobile HCI. IGI Global International Journal of Mobile Human Computer Interaction, 2(4):1–28, January 2010.
[98] Y. Kavak, E. Erdem, and A. Erdem. A comparative study for feature integration strategies in dynamic saliency estimation. Signal Processing: Image Communication, 51:13–25, February 2017.
[99] R. Kennedy, N. Lane, K. Berbaum, andM. Lilienthal. Simulator sickness questionnaire: An enhanced method for quantifying simulator sickness. The International Journal of Aviation Psychology, 3(3):203–220, November 1993.
[100] I. Ketyk´o, K. De Moor, T. De Pessemier, A. Verdejo, K. Vanhecke, W. Joseph, L. Martens, and L. De Marez. Qoe measurement of mobile YouTube video streaming. In Proc. of Workshop on Mobile Video Delivery, pages 27–32, Firenze, Italy, October 2010.
[101] H. Kim, H. Lim, S. Lee, and Y. Ro. VRSA net: VR sickness assessment considering exceptional motion for 360 VR video. IEEE Transactions on Image Processing, 28(4):1646–1660, April 2019.
[102] H. Kim, J. Yang, M. Choi, J. Lee, S. Yoon, Y. Kim, and W. Park. Immersive 360◦ VR tiled streaming system for Esports service. In Proc. of ACM International Conference on Multimedia Systems (MMSys’18), pages 541–544, Amsterdam, The Netherlands, June 2018.
[103] J. Kim, W. Kim, S. Ahn, J. Kim, and S. Lee. Virtual reality sickness predictor: Analysis of visual-vestibular conflict and VR contents. In Proc. of IEEE International Conference on Quality of Multimedia Experience (QoMEX’18), pages 1–6, Sardinia, Italy, May 2018.
[104] H. Kimata, M. Isogai, H. Noto, M. Inoue, K. Fukazawa, and N. Matsuura. Interactive panorama video distribution system. In Proc. of Technical Symposium at ITU Telecom World (ITUWT’11), pages 45–50, Geneva, Switzerland, October 2011.
[105] H. Kimata, D. Ochi, A. Kameda, H. Noto, K. Fukazawa, and A. Kojima. Mobile and multi-device interactive panorama video distribution system. In Proc. of IEEE Global Conference on Consumer Electronics (GCCE’12), pages 574–578, Tokyo, Japan, October 2012.
[106] K. Kitae, L. Jared, K. Seokwon, and H. Seon. Prediction based sub-task offloading in mobile edge computing. In Proc of IEEE International Conference on Information Networking (ICOIN’19), pages 448–452, January 2019.
[107] J. Kopf. 360◦ video stabilization. ACM Transactions on Graphics, 35(6):195:1–195:9, November 2016.
[108] J. Kua, G. Armitage, and P. Branch. A survey of rate adaptation techniques for dynamic adaptive streaming over HTTP. IEEE Communications Surveys & Tutorials, 19(3):1842–1866, 2017.
[109] E. Kuzyakov and D. Pio. Next-generation video encoding techniques for 360 video and VR, 2016. https://engineering.fb.com/virtualreality/next-generation-video-encoding-techniques-for-360-video-and-vr/.
[110] A. Langley, A. Riddoch, A. Wilk, A. Vicente, C. Krasic, D. Zhang, F. Yang, F. Kouranov, I. Swett, J. Iyengar, J. Bailey, J. Dorfman, J. Roskind, J. Kulik, P. Westin, R. Tenneti, R. Shade, R. Hamilton, V. Vasiliev, W. Chang, and Z. Shi. The QUIC transport protocol: Design and internet-scale deployment. In Proc. of the ACM International Conference on Special Interest Group on Data Communication (SIGCOMM’17), pages 183–196, Los Angeles, CA, August 2017.
[111] J. Le Feuvre and C. Concolato. Tiled-based adaptive streaming using MPEG-DASH. In Proc. of ACM International Conference on Multimedia Systems (MMSys’16), pages 41:1–41:3, Klagenfurt, Austria, May 2016.
[112] T. Lee, J. Yoon, and I. Lee. Motion sickness prediction in stereoscopic videos using 3D convolutional neural networks. IEEE Transactions on Visualization and Computer Graphics, 25(5):1919–1927, May 2019.
[113] B. Li, H. Li, L. Li, and J. Zhang. λ domain rate control algorithm for high efficiency video coding. IEEE Transactions on Image Processing, 23(9):3841–3854, September 2014.
[114] B. Li, D. Zhang, H. Li, and J. Xu. QP determination by lambda value. In 9th Meeting of the JCT-VC, no. JCTVC-I0426, May 2012.
[115] G. Li and Y. Yu. Visual saliency based on multi-scale deep features. In Proc. of IEEE Conference on Computer Vision and Pattern Recognition (CVPR’15), pages 5455–5463, Boston, MA, June 2015.
[116] Z. Li, M. Drew, and J. Liu. Lossy Compression Algorithms. Springer, 2004.
[117] A. Liaw and M. Wiener. Classification and regression by randomForest. R News, 2(3):18–22, December 2002.
[118] S. Lim, J. Seok, J. Seo, and T. Kim. Tiled panoramic video transmission system based on MPEG-DASH. In Proc. of International Conference on Information and Communication Technology Convergence (ICTC’15), pages 719–721, Jeju, Korea, October 2015.
[119] K. Lin, S. Liu, L. Cheong, and B. Zeng. Seamless video stitching from hand-held camera inputs. Computer Graphics Forum, 35(2):479–487, May 2016.
[120] T. Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ramanan, P. Dollar, and C. Zitnick. Microsoft COCO: Common objects in context. In Proc. of European Conference on Computer Vision (ECCV’14), pages 740–755, Zurich, Switzerland, September 2014.
[121] F. Liu and M. Gleicher. Region enhanced scale-invariant saliency detection. In Proc. of IEEE International Conference on Multimedia and Expo (ICME’06), pages 1477–1480, Toronto, Canada, July 2006.
[122] T. Liu, Z. Yuan, J. Sun, J. Wang, N. Zheng, X. Tang, and H. Shum. Learning to detect a salient object. IEEE Transactions on Pattern Analysis and Machine Intelligence, 33(2):353–367, February 2011.
[123] W. Lo, C. Fan, J. Lee, C. Huang, K. Chen, and C.-H. Hsu. 360◦ video viewing dataset in head-mounted virtual reality. In Proc. of ACM International Conference on Multimedia Systems (MMSys’17), pages 211–216, Taipei, Taiwan, June 2017.
[124] W. Lo, C. Fan, S. Yen, and C. Hsu. Performance measurements of 360◦ video streaming to head-mounted displays over live 4G cellular networks. In Proc. of Asia-Pacific Network Operations and Management Symposium (APNOMS’17), pages 205–210, Seoul, Korea, September 2017.
[125] K. Lu, A. Ortega, D. Mukherjee, and Y. Chen. Perceptually inspired weighted MSE optimization using irregularity-aware graph Fourier transform. arXiv preprint arXiv:2002.08558, 2020.
[126] B. Lucas and T. Kanade. An iterative image registration technique with an application to stereo vision. In Proc. of International Joint Conference on Artificial Intelligence, pages 674–679, Vancouver, Canada, August 1981.
[127] Y. Ma and H. Zhang. Contrast-based image attention analysis by using fuzzy growing. In Proc. of ACM International Conference on Multimedia (MM’03), pages 374–381, Berkeley, CA, November 2003.
[128] Magnifyre. 360-degree video case study, 2017. https://www.magnifyre.com/360-degree-video-case-study/.
[129] A. Makhorin. GLPK (GNU linear programming kit), 2019. https://www.gnu.org/software/glpk/.
[130] MarketWatch. Global head mounted display (HMD) market - global countries data, insights, market size & growth, forecast to 2026, 2020. https://www.marketwatch.com/press-release/global-headmounted-display-hmd-market---global-countries-datainsights-market-size-growth-forecast-to-2026-2020-02-18.
[131] MATLAB - MathWorks, 2020. https://www.mathworks.com/products/matlab.html.
[132] A. Mavlankar and B. Girod. Video streaming with interactive pan/tilt/zoom. In M. Mrak, M. Grgic, and M. Kunt, editors, High-Quality Visual Experience, chapter 19, pages 431–455. Springer, June 2010.
[133] T. Mikolov, M. Karafiat, L. Burget, J. Cernocky, and S. Khudanpur. Recurrent neural network based language model. In Proc. of the Annual Conference of the International Speech Communication Association (Interspeech’10), pages 1045–1048, Makuhari, Japan, September 2010.
[134] K. Misra, A. Segall, M. Horowitz, S. Xu, A. Fuldseth, and M. Zhou. An overview of tiles in HEVC. IEEE Journal of Selected Topics in Signal Processing, 7(6):969–977, December 2013.
[135] S. Möller, M. Wältermann, and M. Garcia. Features of quality of experience. In S. Möller and A. Raake, editors, Quality of Experience, chapter 5, pages 73–84. Springer US, 2014.
[136] R. Monroy, S. Lutz, T. Chalasani, and A. Smolic. SalNet360: Saliency maps for omni-directional images with CNN. Signal Processing: Image Communication, 69:26–34, November 2018.
[137] MPlayer. MPlayer: The movie player, 2017. http://www.mplayerhq.hu.
[138] B. Nardi. Context and Consciousness: Activity Theory and Human-Computer Interaction. The MIT Press, 1996.
[139] A. Nasrabadi, A. Mahzari, J. Beshay, and R. Prakash. Adaptive 360-degree video streaming using scalable video coding. In Proc. of ACM International Conference on Multimedia (MM’17), pages 1689–1697, Mountain View, CA, October 2017.
[140] A. Nasrabadi, A. Samiei, and R. Prakash. Viewport prediction for 360◦ videos: a clustering approach. In Proc. of the Workshop on Network and Operating Systems Support for Digital Audio and Video (NOSSDAV’20), pages 34–39, Istanbul, Turkey, June 2020.
[141] Netflix Inc. NFLX dataset, 2016. https://drive.google.com/drive/u/0/folders/0B3YWNICYMBIweGdJbERlUG9zc0k.
[142] Netflix Inc. VMAF - video multi-method assessment fusion, 2019. https://github.com/Netflix/vmaf.
[143] Netflix Technology Blog. Per-title encode optimization, 2015. https://medium.com/netflix-techblog/per-title-encodeoptimization-7e99442b62a2.
[144] K. Ngo, R. Guntur, and W. Ooi. Adaptive encoding of zoomable video streams based on user access pattern. In Proc. of ACM Conference on Multimedia Systems (MMSys’11), pages 211–222, San Jose, CA, February 2011.
[145] A. Nguyen, Z. Yan, and K. Nahrstedt. Your attention is unique: Detecting 360-degree video saliency in head-mounted display for head movement prediction. In Proc. of ACM International Conference on Multimedia (MM’18), pages 1190–1198, Seoul, Korea, October 2018.
[146] D. Nguyen, T. Huyen, and T. Thang. An evaluation of tile selection methods for viewport adaptive streaming of 360-degree video. ACM Transactions on Multimedia Computing Communications and Applications, 16(1):1–24, 2020.
[147] D. Nguyen, H. T. Tran, A. Pham, and T. Thang. A new adaptation approach for viewport-adaptive 360-degree video streaming. In Proc. of IEEE International Symposium on Multimedia (ISM’17), pages 38–44, Taichung, Taiwan, December 2017.
[148] T. Nguyen, M. Xu, G. Gao, M. Kankanhalli, Q. Tian, and S. Yan. Static saliency vs. dynamic saliency: a comparative study. In Proc. of ACM International Conference on Multimedia (MM’13), pages 987–996, Barcelona, Spain, October 2013.
[149] O. Niamut, A. Kochale, J. Hidalgo, R. Kaiser, J. Spille, J. Macq, G. Kienast, O. Schreer, and B. Shirley. Towards a format-agnostic approach for production, delivery and rendering of immersive media. In Proc. of ACM International Conference on Multimedia Systems (MMSys’13), pages 249–260, Oslo, Norway, February 2013.
[150] nmsl-nthu. QoE-modeling-for-360-degree-videos-dataset, 2020. https://github.com/nmsl-nthu/QoE-Modeling-for-360-Degree-Videos-Dataset.
[151] NS-3 network simulator, 2018. http://www.nsnam.org/.
[152] An MPEG/DASH client-server module for simulating rate adaptation mechanisms over HTTP/TCP, 2018. https://github.com/djvergad/dash.
[153] D. Ochi, Y. Kunita, K. Fujii, A. Kojima, S. Iwaki, and J. Hirose. HMD viewing spherical video streaming system. In Proc. of the ACM International Conference on Multimedia (MM’14), pages 763–764, Orlando, FL, November 2014.
[154] D. Ochi, Y. Kunita, A. Kameda, A. Kojima, and S. Iwaki. Live streaming system for omnidirectional video. In Proc. of IEEE Virtual Reality (VR’15), pages 349–350, Arles, France, March 2015.
[155] Oculus VR, LLC. Oculus rift, 2017. https://www.oculus.com/.
[156] J. Ohm and G. Sullivan. High efficiency video coding: the next frontier in video compression [standards in a nutshell]. IEEE Signal Processing Magazine, 30(1):152–158, January 2013.
[157] Opensignal. Global state of mobile networks (February 2017), 2017. https://opensignal.com/reports/2017/02/global-stateof-the-mobile-network.
[158] M. Orduna, C. Díaz, L. Muñoz, P. Pérez, I. Benito, and N. García. Video multi-method assessment fusion (VMAF) on 360VR contents. IEEE Transactions on Consumer Electronics, 66(1):22–31, February 2019.
[159] C. Ozcinar, J. Cabrera, and A. Smolic. Visual attention-aware omnidirectional video streaming using optimal tiles for virtual reality. IEEE Journal on Emerging and Selected Topics in Circuits and Systems, 9(1):217–230, March 2019.
[160] C. Ozcinar, A. De Abreu, S. Knorr, and A. Smolic. Estimation of optimal encoding ladders for tiled 360 VR video in adaptive streaming systems. In Proc. of IEEE International Symposium on Multimedia (ISM’17), pages 45–52, Taichung, Taiwan, December 2017.
[161] C. Ozcinar, A. De Abreu, and A. Smolic. Viewport-aware adaptive 360◦ video streaming using tiles for virtual reality. In Proc. of IEEE International Conference on Image Processing (ICIP’17), pages 2174–2178, Beijing, China, November 2017.
[162] N. Padmanaban, T. Ruban, V. Sitzmann, A. Norcia, and G. Wetzstein. Towards a machine-learning approach for sickness prediction in 360 stereoscopic videos. IEEE Transactions on Visualization and Computer Graphics, 24(4):1594–1603, April 2018.
[163] F. Pedregosa, G. Varoquaux, A. Gramfort, B. Thirion, O. Grisel, M. Blondel, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay. Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12:2825–2830, October 2011.
[164] S. Petrangeli, V. Swaminathan, M. Hosseini, and F. De Turck. An HTTP/2-based adaptive streaming framework for 360 virtual reality videos. In Proc. of ACM International Conference on Multimedia (MM’17), pages 306–314, Mountain View, CA, October 2017.
[165] S. Petrangeli, J. van der Hooft, T. Wauters, R. Huysegems, P. Alface, T. Bostoen, and F. De Turck. Live streaming of 4k ultra-high definition video over the internet. In Proc. of ACM International Conference on Multimedia Systems (MMSys’16), pages 27:1–27:4, Klagenfurt, Austria, May 2016.
[166] B. Petry and J. Huber. Towards effective interaction with omnidirectional videos using immersive virtual reality headsets. In Proc. of ACM Augmented Human International Conference (AH’15), pages 217–218, Singapore, Singapore, March 2015.
[167] F. Qian, B. Han, Q. Xiao, and V. Gopalakrishnan. Flare: Practical viewport adaptive 360-degree video streaming for mobile devices. In Proc. of International Conference on Mobile Computing and Networking (MobiCom’18), pages 99–114, New Delhi, India, October 2018.
[168] F. Qian, L. Ji, B. Han, and V. Gopalakrishnan. Optimizing 360 video delivery over cellular networks. In Proc. of Workshop on All Things Cellular Operations, Applications and Challenges (ATC’16), pages 1–6, New York, NY, October 2016.
[169] A. Raake, A. Singla, R. Rao, W. Robitza, and F. Hofmeyer. SiSiMo: Towards simulator sickness modeling for 360◦ videos viewed with an HMD. In Proc. of IEEE Conference on Virtual Reality and 3D User Interfaces Abstracts and Workshops (VRW’20), pages 583–584, Atlanta, GA, 2020.
[170] M. Rahman, A. El Saddik, and W. Gueaieb. Augmenting context awareness by combining body sensor networks and social networks. IEEE Transactions on Instrumentation and Measurement, 60(2):345–353, February 2010.
[171] Y. Rai, J. Gutiérrez, and P. Le Callet. A dataset of head and eye movements for 360 degree images. In Proc. of ACM International Conference on Multimedia Systems (MMSys’17), pages 205–210, Taipei, Taiwan, June 2017.
[172] Y. Rai, P. Le Callet, and P. Guillotel. Which saliency weighting for omni directional image quality assessment? In Proc. of IEEE International Conference on Quality of Multimedia Experience (QoMEX’17), pages 1–6, Erfurt, Germany, May 2017.
[173] J. Redmon and A. Farhadi. YOLO9000: Better, faster, stronger. arXiv preprint arXiv:1612.08242, 2016.
[174] G. Regal, R. Schatz, J. Schrammel, and S. Suette. VRate: a Unity3D asset for integrating subjective assessment questionnaires in virtual environments. In Proc. of IEEE International Conference on Quality of Multimedia Experience (QoMEX’18), pages 1–3, Sardinia, Italy, May 2018.
[175] U. Reiter, K. Brunnström, K. De Moor, M. Larabi, M. Pereira, A. Pinheiro, J. You, and A. Zgank. Factors influencing quality of experience. In S. Möller and A. Raake, editors, Quality of Experience, chapter 4, pages 55–72. Springer US, 2014.
[176] Research and Markets. 360-degree camera market: Global industry trends, share, size, growth, opportunity and forecast 2020-2025, 2020. https://www.researchandmarkets.com/reports/5009145/360-degreecamera-market-global-industry-trends?utm_source=dynamic&utm_medium=BW&utm_code=sfklg3&utm_campaign=1375885+-+Global+360-Degree+Camera+Market+Report%2C+2020-2025%3A+Trends%2C+Share%2C+Size%2C+Growth%2C+Opportunities%2C+Competition&utm_exec=joca220bwd.
[177] Y. Reznik, K. Lillevold, A. Jagannath, J. Greer, and J. Corley. Optimal design of encoding profiles for ABR streaming. In Proc. of ACM Workshop on Packet Video (PV’18), pages 43–47, Amsterdam, The Netherlands, June 2018.
[178] Samsung Electronics. The GearVR framework, 2017. https://github.com/Samsung/GearVRf.
[179] R. Schafer, P. Kauff, R. Skupin, Y. Sanchez, and C. Weissig. Interactive streaming of panoramas and VR worlds. SMPTE Motion Imaging Journal, 126(1):35–42, January 2017.
[180] R. Schatz, G. Regal, S. Schwarz, S. Suette, and M. Kempf. Assessing the QoE impact of 3D rendering style in the context of VR-based training. In Proc. of IEEE International Conference on Quality of Multimedia Experience (QoMEX’18), pages 1–6, Sardinia, Italy, June 2018.
[181] ITU-T Telecommunication Standardization Sector. Mean opinion score (MOS) terminology. ITU-T Recommendation P.800.1, July 2016.
[182] R. Shiffler. Maximum z scores and outliers. The American Statistician, 42(1):79–80, 1988.
[183] O. Shouno. Photo-realistic video prediction on natural videos of largely changing frames. arXiv preprint arXiv:2003.08635, 2020.
[184] A. Singla, S. Fremerey, W. Robitza, P. Lebreton, and A. Raake. Comparison of subjective quality evaluation for HEVC encoded omnidirectional videos at different bit-rates for UHD and FHD resolution. In Proc. of ACM Multimedia Thematic Workshops (Thematic Workshops’17), pages 511–519, Mountain View, CA, October 2017.
[185] A. Singla, S. Fremerey, W. Robitza, and A. Raake. Measuring and comparing QoE and simulator sickness of omnidirectional videos in different head mounted displays. In Proc. of International Conference on Quality of Multimedia Experience (QoMEX’17), pages 1–6, Erfurt, Germany, May 2017.
[186] A. Singla, S. Goring, A. Raake, B. Meixner, R. Koenen, and T. Buchholz. Subjective quality evaluation of tile-based streaming for omnidirectional videos. In Proc. of ACM Conference on Multimedia Systems (MMSys’19), Amherst, MA, February 2019.
[187] R. Skupin, Y. Sanchez, Y. Wang, M. Hannuksela, J. Boyce, and M. Wien. Standardization status of 360 degree video coding and delivery. In Proc. of IEEE International Conference on Visual Communications and Image Processing (VCIP’17), pages 1–4, Taichung, Taiwan, December 2017.
[188] I. Sodagar. The MPEG-DASH standard for multimedia streaming over the Internet. IEEE Multimedia, 18(4):62–67, April 2011.
[189] W. Song, Y. Xiao, D. Tjondronegoro, and A. Liotta. QoE modelling for VP9 and H.265 videos on mobile devices. In Proc. of ACM International Conference on Multimedia (MM’15), pages 501–510, Brisbane, Australia, October 2015.
[190] C. Spearman. The proof and measurement of association between two things. American Journal of Psychology, 15(1):72–101, 1904.
[191] G. Sullivan, J. Ohm, W. Han, and T. Wiegand. Overview of the high efficiency video coding (HEVC) standard. IEEE Transactions on Circuits and Systems for Video Technology, 22(12):1649–1668, December 2012.
[192] G. Sullivan and T. Wiegand. Rate-distortion optimization for video compression. IEEE Signal Processing Magazine, 15(6):74–90, 1998.
[193] K. Tcha-Tokey, E. Loup-Escande, O. Christmann, and S. Richir. A questionnaire to measure the user experience in immersive virtual environments. In Proc. of ACM Virtual Reality International Conference (VRIC’16), pages 1–5, Laval, France, March 2016.
[194] Telecom ParisTech. MP4Box, 2017. https://gpac.wp.imt.fr/mp4box/.
[195] Telecom ParisTech. MP4Client, 2017. https://gpac.wp.imt.fr/player.
[196] Trejkaz. Equi-angular cubemap skybox for Unity, 2020. https://github.com/trejkaz/EACSkyboxShader.
[197] I. Tucker. Perceptual video quality dimensions. Master’s thesis, Technische Universität Berlin, Berlin, Germany, 2011.
[198] Unity, 2017. https://unity3d.com/.
[199] Unity Technologies. SteamVR plugin, 2020. https://assetstore.unity.com/packages/tools/integration/steamvr-plugin-32647.
[200] E. Upenik, M. Rerabek, and T. Ebrahimi. Testbed for subjective evaluation of omnidirectional visual content. In Proc. of Picture Coding Symposium (PCS’16), pages 1–5, Nuremberg, Germany, December 2016.
[201] E. Upenik, M. Rerabek, and T. Ebrahimi. On the performance of objective metrics for omnidirectional visual content. In Proc. of International Conference on Quality of Multimedia Experience (QoMEX’17), pages 1–6, Erfurt, Germany, May 2017.
[202] M. Varela, L. Skorin-Kapov, and T. Ebrahimi. Quality of service versus quality of experience. In S. Möller and A. Raake, editors, Quality of Experience, chapter 6, pages 85–96. Springer US, 2014.
[203] J. Vielhaben, H. Camalan, W. Samek, and M. Wenzel. Viewport forecasting in 360◦ virtual reality videos with machine learning. In IEEE International Conference on Artificial Intelligence and Virtual Reality (AIVR’19), pages 74–747, San Diego, CA, December 2019.
[204] M. Viitanen, A. Koivula, A. Lemmetti, A. Yla-Outinen, J. Vanne, and T. Hamalainen. Kvazaar: Open-source HEVC/H.265 encoder. In Proc. of ACM International Conference on Multimedia (MM’16), pages 1179–1182, Amsterdam, The Netherlands, October 2016.
[205] S. Vlahovic, M. Suznjevic, and L. Skorin-Kapov. Subjective assessment of different locomotion techniques in virtual reality environments. In Proc. of IEEE International Conference on Quality of Multimedia Experience (QoMEX’18), pages 1–3, Sardinia, Italy, June 2018.
[206] D. Wagner, A. Mulloni, T. Langlotz, and D. Schmalstieg. Real-time panoramic mapping and tracking on mobile phones. In Proc. of IEEE Conference on Virtual Reality (VR’10), pages 211–218, Waltham, MA, March 2010.
[207] H. Wang, M. C. Chan, and W. Ooi. Wireless multicast for zoomable video streaming. ACM Transactions on Multimedia Computing, Communications, and Applications, 12(1):5:1–5:23, August 2015.
[208] H. Wang, V. Nguyen, W. Ooi, and M. Chan. Mixing tile resolutions in tiled video: A perceptual quality assessment. In Proc. of ACM SIGMM Workshop on Network and Operating Systems Support for Digital Audio and Video (NOSSDAV’14), pages 25:25–25:30, Singapore, Singapore, March 2014.
[209] K. Wang, L. Lin, J. Lu, C. Li, and K. Shi. PISA: Pixelwise image saliency by aggregating complementary appearance contrast measures with edge-preserving coherence. IEEE Transactions on Image Processing, 24(10):3019–3033, October 2015.
[210] M. Wang, K. Ngan, and H. Li. An efficient frame content based intra frame rate control for high efficiency video coding. IEEE Signal Processing Letters, 22(7):896–900, July 2015.
[211] WebVR. WebVR: Bringing virtual reality to the web, 2017. https://webvr.info/.
[212] M. Weier, T. Roth, E. Kruijff, A. Hinkenjann, A. Perard-Gayot, P. Slusallek, and Y. Li. Foveated real-time ray tracing for head-mounted displays. In Computer Graphics Forum, volume 35, pages 289–298. Wiley Online Library, 2016.
[213] F. Weymouth. Visual sensory units and the minimal angle of resolution. Elsevier American Journal of Ophthalmology, 46(1):102–113, July 1958.
[214] C. Wu, R. Zhang, Z. Wang, and L. Sun. A spherical convolution approach for learning long term viewport prediction in 360 immersive video. In Proc. of the AAAI Conference on Artificial Intelligence (AAAI-20), volume 34, pages 14003–14040, June 2020.
[215] Xavier Corbillon. Optimal set of 360-degree videos for viewport-adaptive streaming, 2019. https://github.com/xmar/optimal-setrepresentation-viewport-adaptive-streaming.
[216] L. Xie, Z. Xu, Y. Ban, X. Zhang, and Z. Guo. 360ProbDASH: Improving QoE of 360 video streaming using Tile-Based HTTP adaptive streaming. In Proc. of ACM International Conference on Multimedia (MM’17), pages 315–323, Mountain View, CA, October 2017.
[217] L. Xie, X. Zhang, and Z. Guo. CLS: A cross-user learning based system for improving QoE in 360-degree video adaptive streaming. In Proc. of ACM International Conference on Multimedia (MM’18), pages 564–572, Seoul, Korea, October 2018.
[218] S. Xie, Y. Xu, Q. Qian, Q. Shen, Z. Ma, and W. Zhang. Modeling the perceptual impact of viewport adaptation for immersive video. In Proc. of IEEE International Symposium on Circuits and Systems (ISCAS’18), pages 1–5, Florence, Italy, May 2018.
[219] X. Xie and X. Zhang. POI360: Panoramic mobile video telephony over LTE cellular networks. In Proc. of International Conference on Emerging Networking EXperiments and Technologies (CoNEXT’17), pages 336–349, Incheon, Korea, December 2017.
[220] M. Xu, Y. Song, J. Wang, M. Qiao, L. Huo, and Z. Wang. Predicting head movement in panoramic video: A deep reinforcement learning approach. IEEE Transactions on Pattern Analysis and Machine Intelligence, 14(8):1–14, November 2018.
[221] T. Xu, B. Han, and F. Qian. Analyzing viewport prediction under different VR interactions. In Proc. of International Conference on Emerging Networking Experiments And Technologies (CoNEXT’19), pages 165–171, Orlando, FL, December 2019.
[222] Y. Xu, Y. Dong, J. Wu, Z. Sun, Z. Shi, J. Yu, and S. Gao. Gaze prediction in dynamic 360 immersive videos. In Proc. of IEEE Conference on Computer Vision and Pattern Recognition (CVPR’18), pages 5333–5342, Salt Lake City, Utah, June 2018.
[223] Z. Xu, X. Zhang, K. Zhang, and Z. Guo. Probabilistic viewport adaptive streaming for 360-degree videos. In 2018 IEEE International Symposium on Circuits and Systems (ISCAS’18), pages 1–5, Florence, Italy, May 2018.
[224] M. Yahia, Y. Le Louedec, G. Simon, and L. Nuaymi. HTTP/2-based streaming solutions for tiled omnidirectional videos. In Proc. of IEEE International Symposium on Multimedia (ISM’18), pages 89–96, Taichung, Taiwan, December 2018.
[225] S. Yao, C. Fan, and C. Hsu. Towards quality-of-experience models for watching 360◦ videos in head-mounted virtual reality. In Proc. of International Conference on Quality of Multimedia Experience (QoMEX’19), pages 1–6, Berlin, Germany, June 2019.
[226] S. Yen, C. Fan, and C. Hsu. Streaming 360◦ videos to head-mounted virtual reality using DASH over QUIC transport protocol. In Proc. of ACM Workshop on Packet Video (PV’19), pages 7–12, Amherst, MA, June 2019.
[227] M. Yu, H. Lakshman, and B. Girod. A framework to evaluate omnidirectional video coding schemes. In IEEE International Symposium onMixed and Augmented Reality (ISMAR’15), pages 31–36, Fukuoka, Japan, September 2015.
[228] M. Yu, H. Lakshman, and B. Girod. A framework to evaluate omnidirectional video coding schemes. In Proc. of IEEE International Symposium on Mixed and Augmented Reality (ISMAR’15), Fukuoka, Japan, September 2015.
[229] A. Zare, A. Aminlou, M. Hannuksela, and M. Gabbouj. HEVC-compliant tile-based streaming of panoramic video for virtual reality applications. In Proc. of ACM International Conference on Multimedia (MM’16), pages 601–605, Amsterdam, The Netherlands, October 2016.
[230] Z. Zhang, Y. Xu, J. Yu, and S. Gao. Saliency detection in 360 videos. In Proc. of European Conference on Computer Vision (ECCV’18), pages 488–503, Munich, Germany, September 2018.
[231] H. Zhou, X. Xie, J. Lai, Z. Chen, and L. Yang. Interactive two-stream decoder for accurate and fast saliency detection. In Proc. of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR’20), pages 9141–9150, June 2020.
(Full text will be open to external access after 2025-09-29.)