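Author (Chinese): 舒 邁
Author (English): Sabir, Umair
Thesis title (Chinese): 偵測不尋常軌跡方法的比較
Thesis title (English): Comparison of detection methods of abnormal trajectories
Advisor (Chinese): 李哲榮
Advisor (English): Lee, Che-Rung
Oral defense committee (Chinese): 黃俊穎; 徐正圻
Oral defense committee (English): Huang, Chun-Ying; Hsu, Cheng-Hsin
Degree: Master's
Institution: National Tsing Hua University (國立清華大學)
Department: Department of Computer Science (資訊工程學系所)
Student ID: 105062422
Year of publication (ROC calendar): 107 (2018)
Academic year of graduation (ROC calendar): 106
Language: English
Number of pages: 55
Keywords (Chinese): abnormal trajectories; algorithms; similarity
Keywords (English): abnormal detection; trajectories; clustering; similarity measure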
With the steady advance of technology, cities are becoming smarter through Internet of Things technology. Smart transportation in particular plays an important role in reducing traffic flow and predicting the routes that vehicles commonly take. The spatial data generated by vehicles with built-in GPS can be used to uncover previously unknown information, and such data are used in many research fields. One of these is finding abnormal paths. This thesis studies how to find abnormal paths and distinguish them from normal ones. In the first stage, the GPS data are imported, cleaned, and pre-processed. In the second stage, different outlier detection methods are applied to find the best similarity measure and clustering method. In the third stage, the clustering methods are examined and analyzed against all of the similarity measures and compared using stability as the criterion. In the fourth stage, the results of the best method are visualized on a digital map to show the abnormal trajectories. Our evaluation covers different similarity measures, namely Frechet distance, Dynamic Time Warping distance, Longest Common Subsequence distance, Hausdorff distance, and Edit distance. The clustering methods are K Means, CLARA (Clustering Large Applications), PAM (Partitioning Around Medoids), Hierarchical clustering, Model Based Clustering, and FANNY (Fuzzy Analysis Clustering).
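Two of the similarity measures named above, the Hausdorff distance and the discrete Frechet distance, can be sketched in a few lines. This is only an illustration under stated assumptions: trajectories are taken to be short lists of planar (x, y) points, the function names and toy routes are hypothetical, and the thesis itself reports using R packages, not this Python code.

```python
import math

def euclid(p, q):
    """Straight-line distance between two (x, y) points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def hausdorff(a, b):
    """Symmetric Hausdorff distance between two point sequences."""
    def directed(x, y):
        # Farthest that any point of x lies from its nearest point in y.
        return max(min(euclid(p, q) for q in y) for p in x)
    return max(directed(a, b), directed(b, a))

def discrete_frechet(a, b):
    """Discrete Frechet distance computed by dynamic programming."""
    n, m = len(a), len(b)
    # ca[i][j] is the coupling distance of the prefixes a[:i+1] and b[:j+1].
    ca = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = euclid(a[i], b[j])
            if i == 0 and j == 0:
                ca[i][j] = d
            elif i == 0:
                ca[i][j] = max(ca[0][j - 1], d)
            elif j == 0:
                ca[i][j] = max(ca[i - 1][0], d)
            else:
                best_prev = min(ca[i - 1][j], ca[i - 1][j - 1], ca[i][j - 1])
                ca[i][j] = max(best_prev, d)
    return ca[n - 1][m - 1]

# Hypothetical toy routes (not data from the thesis).
route = [(0, 0), (1, 0), (2, 0), (3, 0)]
parallel = [(0, 1), (1, 1), (2, 1), (3, 1)]   # same shape, shifted by 1
detour = [(0, 0), (1, 3), (2, 3), (3, 0)]     # same endpoints, large detour
reverse = route[::-1]                         # same points, opposite direction

print(hausdorff(route, parallel), discrete_frechet(route, parallel))  # 1.0 1.0
print(hausdorff(route, detour), discrete_frechet(route, detour))      # 3.0 3.0
print(hausdorff(route, reverse), discrete_frechet(route, reverse))    # 0.0 3.0
```

The last line shows the main difference between the two measures: the Hausdorff distance treats a trajectory as an unordered point set, so a route travelled in reverse looks identical, while the Frechet distance respects the order of the points.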
Smart transportation is of great importance for reducing traffic and predicting the paths that vehicles take most frequently. One of its key technologies is trajectory analysis, which can be used to study ride sharing, traffic flow, abnormal detection, and more. In this thesis, we provide an empirical survey of trajectory analysis for abnormal detection. The analysis has four stages. In the first stage, the GPS data are imported, cleaned, and pre-processed. In the second stage, different outlier detection methods are applied to find the best similarity measure and clustering method. In the third stage, the clustering methods are validated and analyzed with every similarity measure and compared according to internal and stability measure criteria. In the fourth stage, the output of the best method is visualized on a digital map to show the abnormal behavior. We implemented five similarity measures (Frechet distance, Dynamic Time Warping distance, Longest Common Subsequence distance, Hausdorff distance, and Edit distance) and six clustering methods: K Means, CLARA (Clustering Large Applications), PAM (Partitioning Around Medoids), Hierarchical clustering, Model Based Clustering, and FANNY (Fuzzy Analysis Clustering). We evaluated these methods on the HsinChu bus trajectories, the GeoLife dataset, the Rio De Janeiro bus dataset, and the Android app Go Track dataset. Experimental results show that Frechet distance and Hausdorff distance with Hierarchical clustering provide the best results in terms of the internal and stability measure criteria. The higher values of Silhouette width and Dunn Index indicate that Frechet and Hausdorff distance, when used with Hierarchical clustering, give tightly bound clusters and are therefore able to find the abnormalities in the trajectory data.
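The two internal validation measures mentioned in the abstract, Silhouette width and Dunn Index, can both be computed from a pairwise distance matrix and a cluster assignment. The following is a minimal sketch, not the thesis's implementation: the function names are hypothetical, the example uses four made-up points on a line in place of a trajectory distance matrix, and at least two clusters are assumed.

```python
import math

def average_silhouette_width(dist, labels):
    """Mean silhouette over all items; dist is a symmetric distance matrix."""
    clusters = {}
    for i, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(i)
    total = 0.0
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # by convention a singleton contributes s(i) = 0
        # a: mean distance to the other members of the item's own cluster
        a = sum(dist[i][j] for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(dist[i][j] for j in members) / len(members)
            for other, members in clusters.items() if other != lab
        )
        denom = max(a, b)
        total += (b - a) / denom if denom > 0 else 0.0
    return total / len(labels)

def dunn_index(dist, labels):
    """Smallest between-cluster distance over largest within-cluster distance."""
    n = len(labels)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    min_between = min(dist[i][j] for i, j in pairs if labels[i] != labels[j])
    max_within = max(
        (dist[i][j] for i, j in pairs if labels[i] == labels[j]), default=0.0
    )
    return min_between / max_within if max_within > 0 else math.inf

# Hypothetical example: four items at positions 0, 1, 10, 11 on a line.
positions = [0, 1, 10, 11]
dist = [[abs(p - q) for q in positions] for p in positions]

good = [0, 0, 1, 1]   # groups the two nearby pairs together
bad = [0, 1, 0, 1]    # mixes near and far items in each cluster

print(average_silhouette_width(dist, good), dunn_index(dist, good))
print(average_silhouette_width(dist, bad), dunn_index(dist, bad))
```

Both measures are higher for the `good` assignment than for the `bad` one, which is the sense in which higher Silhouette width and Dunn Index values point to compact, well-separated clusters.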
Contents
Chinese Abstract
Abstract
Acknowledgements
Contents
List of Figures
List of Algorithms
1 Introduction
2 Methods and Algorithms
2.1 Data Pre-processing
2.1.1 Map matching
2.1.2 Path Compression
2.2 Similarity Measures
2.2.1 Dynamic Time Warping (DTW)
2.2.2 Frechet Distance
2.2.3 Longest Common Sub-Sequence
2.2.4 Edit distance
2.2.5 Hausdorff distance
2.3 Clustering Techniques
2.3.1 K-Means clustering
2.3.2 Partitioning Around Medoids clustering
2.3.3 Clustering Large Applications
2.3.4 Hierarchical clustering
2.3.5 Fuzzy clustering
2.3.6 Model Based clustering
3 Cluster Analysis and Validation
3.1 Optimal number of clusters
3.1.1 Elbow method
3.1.2 Average Silhouette method
3.2 Comparison of Clusters
3.2.1 Internal measures
3.2.2 Stability measures
3.3 Cluster Validation Methods
3.3.1 Silhouette Width
3.3.2 Dunn Index
3.4 Compute p-value for Hierarchical clustering
4 Implementation and Experiments
4.1 Importing and cleaning data
4.2 Preprocessing of data
4.3 Conversion to spatial lines
4.4 Data matrix using similarity measures
4.5 Cluster analysis and validation using 5 similarity measures and 6 clustering techniques
4.5.1 DTW distance and 6 clustering techniques
4.5.2 Frechet distance and 6 clustering techniques
4.5.3 Hausdorff distance and 6 clustering techniques
4.5.4 Edit distance and 6 clustering techniques
4.5.5 LCSS distance and 6 clustering techniques
4.6 Finding the best clustering technique and similarity measure
4.7 Visualization of clustering using Frechet distance on two dimensional plane
4.8 Visualization of the clusters using Frechet distance on the digital map of city
4.9 Visualization of clustering using Hausdorff distance on two dimensional plane
4.10 Visualization of the clusters using Hausdorff distance on the digital map of city
4.11 Complete table of observations
4.12 p-value calculation of hierarchical clusters
4.13 List of the R packages used in coding
4.14 Implementation on Go Track dataset
4.15 Implementation on Geolife dataset
4.16 Implementation on Rio De Janeiro dataset
5 Conclusion