
Detailed Record

Author (Chinese): 郭士鈞
Author (English): Kuo, Shih-Chun
Title (Chinese): 基於LSTM神經網路之車輛軌跡預測
Title (English): LSTM-Based Vehicle Trajectory Prediction
Advisor (Chinese): 林嘉文
Advisor (English): Lin, Chia-Wen
Committee Members (Chinese): 彭文孝, 康立威, 蔡文錦
Committee Members (English): Peng, Wen-Hsiao; Kang, Li-Wei; Tsai, Wen-Jing
Degree: Master's
Institution: National Tsing Hua University
Department: Institute of Communications Engineering
Student ID: 104064506
Publication Year (ROC): 107 (2018)
Graduation Academic Year: 107
Language: English
Number of Pages: 49
Keywords (Chinese): 軌跡預測, 社交網路, 注視模擬
Keywords (English): Trajectory prediction, Social network, Attention model
Usage statistics:
  • Recommendations: 0
  • Views: 431
  • Rating: *****
  • Downloads: 147
  • Bookmarks: 0
Predicting the future trajectories of objects is a key technical component of self-driving cars and navigation systems. To navigate safely, efficiently, and without collisions, a self-driving car must anticipate what is about to happen in a changing environment and predict the future positions of surrounding objects in advance. There have already been significant advances in autonomous driving, such as Google's self-driving cars and Tesla's Autopilot.
In previous work on object trajectory prediction, interactions between objects were modeled by considering inter-object distances and learning functions based on the Long Short-Term Memory (LSTM) network. However, a prediction should not depend only on the distances to surrounding objects; it is also related to the target's own motion inertia and to the relative importance of each neighboring object to the target. A forecasting system should attend to the entire observed trajectory, learn how strongly each past time step influences the predicted positions, and identify which surrounding objects matter most to the target, thereby improving prediction performance. In addition, incorporating more object information, such as heading, speed, and object class, further improves the predictions.
In this thesis, our main goal is to improve object trajectory prediction in dashcam videos. First, we build a temporal attention model that focuses on the motion characteristics of the observed trajectory by computing the importance of every past time step to each future position. Second, we build a spatial attention model that captures the relative importance of the surrounding objects to the target, thereby reducing prediction errors and error propagation. Finally, combining the direction and speed of the input trajectory with object-class information provides richer object cues and reduces mispredictions. We evaluate our method on the KITTI tracking dataset of real dashcam videos and on the New York Grand Central pedestrian trajectory dataset; the results show that our method compares favorably with previous methods.
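The abstract describes an LSTM encoder-decoder in which a temporal attention module weights every observed time step of a trajectory when producing each future position. As a rough, self-contained illustration of that idea only (not the author's actual architecture), the PyTorch sketch below implements a single-object encoder-decoder with additive temporal attention; the layer sizes, the two-layer scoring network, the displacement-based output, and the omission of the spatial (neighbor) attention, heading/speed features, and object-class inputs are all simplifying assumptions.

```python
# Minimal illustrative sketch of temporal attention for trajectory prediction.
# This is NOT the thesis' exact model; sizes and design choices are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class AttentiveTrajectoryPredictor(nn.Module):
    def __init__(self, input_dim=2, hidden_dim=64, pred_len=10):
        super().__init__()
        self.pred_len = pred_len
        self.encoder = nn.LSTM(input_dim, hidden_dim, batch_first=True)
        self.decoder_cell = nn.LSTMCell(input_dim + hidden_dim, hidden_dim)
        self.score = nn.Linear(2 * hidden_dim, 1)    # additive attention energy
        self.out = nn.Linear(hidden_dim, input_dim)  # next (x, y) displacement

    def forward(self, obs):                      # obs: (B, T_obs, 2) observed positions
        enc_out, (h, c) = self.encoder(obs)      # enc_out: (B, T_obs, H)
        h, c = h.squeeze(0), c.squeeze(0)        # (B, H)
        last_pos = obs[:, -1, :]                 # start decoding from the last observation
        preds = []
        for _ in range(self.pred_len):
            # Temporal attention: weight every observed time step by its relevance
            # to the decoder state that is about to emit the next position.
            query = h.unsqueeze(1).expand_as(enc_out)                      # (B, T_obs, H)
            energy = self.score(torch.cat([enc_out, query], dim=-1)).squeeze(-1)
            alpha = F.softmax(energy, dim=1)                               # (B, T_obs)
            context = (alpha.unsqueeze(-1) * enc_out).sum(dim=1)           # (B, H)
            h, c = self.decoder_cell(torch.cat([last_pos, context], dim=-1), (h, c))
            last_pos = last_pos + self.out(h)    # predict a displacement, accumulate
            preds.append(last_pos)
        return torch.stack(preds, dim=1)         # (B, pred_len, 2)
```

For example, calling the model on a batch of eight tracks with twenty observed frames, model(torch.randn(8, 20, 2)), returns a tensor of shape (8, 10, 2), i.e. ten predicted (x, y) positions per track. The spatial attention and object-class cues described in the abstract would require additional inputs (neighbor trajectories and class labels) and are deliberately left out of this sketch.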
Abstract (Chinese) i
Abstract (English) ii
Contents iv
Chapter 1 Introduction 6
Chapter 2 Related Work 8
2.1 The classical methods 8
2.2 The inverse reinforcement learning methods 10
2.3 Markov Decision Processes 13
2.4 Deep learning driven prediction 15
2.5 LSTM based Encoder-Decoder prediction 19
2.6 Object interaction simulation 21
2.7 The attention model 23
Chapter 3 Proposed method 25
3.1 Overview 25
3.2 Problem Definition 27
3.3 Social LSTM with trajectory attention model 27
3.3.1 Architecture 31
Chapter 4 Experiments and Discussion 38
4.1 Dataset 38
4.1.1 Evaluation and Performance 39
Chapter 5 Conclusion 46
References 47
 
 
 
 