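Author (Chinese): 謝祥彥
Author (English): Hsieh, Hsiang Yen
Thesis title (Chinese): 透過分類法與社會網路分析研究惡意電話之行為
Thesis title (English): Spam Calls Analysis Using Classification and Social Network Analysis
Advisor (Chinese): 王俊程
Advisor (English): Wang, Jyun Cheng
Committee members (Chinese): 王貞雅; 江成欣
Committee members (English): Wang, Chen Ya; Chiang, Cheng Hsin
Degree: Master's
Institution: National Tsing Hua University
Department: Institute of Service Science
Student ID: 103078517
Year of publication (ROC calendar): 105
Graduation academic year: 104
Language: English
Number of pages: 57
Keywords (Chinese): 惡意電話; 詐騙電話; 騷擾電話; 行銷電話; 分類樹; 邏輯回歸分析; 社會網路分析
Keywords (English): spam call; fraud call; harassed call; marketing call; classification tree; logistic regression; social network analysis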
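Spam calls are endemic in the real world. According to a survey, people in Taiwan waste an average of 150,000 hours per month on spam calls, which mainly comprise fraud calls, harassment calls, and marketing calls, and the losses caused by fraud calls alone exceed 3.7 billion NTD per year. However, past research has focused only on detecting spam calls, without examining whether a given call is serious fraud or ordinary marketing.
In this thesis, we analyze spam call data from a well-known call-identification app, including the category, duration, and date of each call. After preprocessing, the data are aggregated into a form suitable for analysis. We then use oversampling to remove the class imbalance and apply multiple logistic regression analysis to handle the multi-class classification, obtaining a model that can classify the three types of spam calls. In addition, we use social network analysis to find the overlap among different categories of spam calls, which further helps us determine whether a call is spam.
Beyond giving a deeper understanding of spam call behavior, the analysis also reveals similarities within the different categories. Compared with previous approaches, our new detection method can further divide spam calls into three categories.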
Spam calls are everywhere. According to one survey, people in Taiwan waste almost 150,000 hours on spam calls per month. Spam calls include fraud, harassment, and marketing calls. Moreover, fraud calls cause losses of 3.7 billion NTD every year. Although many studies address spam call detection, few of them try to classify spam calls by category.
In this research, we obtain a large dataset of spam call logs that include the category, duration, and date of each call. First, we preprocess and aggregate the data, then use oversampling to overcome the problem of imbalanced data. Next, we fit multiple logistic regression models to handle the multi-class classification and build models that can classify spam calls into three categories. We also use social network analysis to explore the relationships among calls within some subgroups.
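The oversampling step mentioned above can be illustrated with a minimal sketch. The abstract does not say which oversampling variant the thesis uses, so plain random oversampling (duplicating minority-class records with replacement) is assumed here. The function name `random_oversample`, the field names `category` and `duration`, and the toy records are all hypothetical.

```python
import random
from collections import Counter, defaultdict


def random_oversample(records, label_key="category", seed=0):
    """Duplicate minority-class records at random until every class
    has as many records as the largest class.

    Assumption: simple random oversampling; the thesis may use a
    different variant. `records` must be a non-empty list of dicts.
    """
    rng = random.Random(seed)

    # Group the records by their class label.
    by_label = defaultdict(list)
    for rec in records:
        by_label[rec[label_key]].append(rec)

    # Every class is grown to the size of the largest class.
    target = max(len(group) for group in by_label.values())

    balanced = []
    for group in by_label.values():
        balanced.extend(group)
        # Sample with replacement to fill the gap up to the target size.
        balanced.extend(rng.choices(group, k=target - len(group)))

    rng.shuffle(balanced)
    return balanced


# Hypothetical toy call log: 6 marketing, 3 harassment, 1 fraud record.
calls = (
    [{"category": "marketing", "duration": 40 + i} for i in range(6)]
    + [{"category": "harassment", "duration": 5 + i} for i in range(3)]
    + [{"category": "fraud", "duration": 120}]
)

balanced = random_oversample(calls)
counts = Counter(rec["category"] for rec in balanced)
print(counts)
```

In practice, oversampling is normally applied to the training split only, so that duplicated records do not leak into the evaluation data.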
In conclusion, different kinds of spam calls behave in distinctly different ways, and it is possible to identify them by using classification and social network analysis. However, spammers' behavior may change over time, so a one-time analysis is not sufficient. It is necessary to train new models routinely to keep up with the changing behavior.
Abstract
Chinese Abstract
Acknowledgements
List of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Research background and motivation
1.2 Application about phone identification
1.3 Research objective
Chapter 2 Related Work
2.1 Spam on the Internet
2.2 Social Network Analysis
2.3 The related research
Chapter 3 Methodology
3.1 Dataset description
3.2 Software introduction
3.3 Data analysis process
Chapter 4 Classification
4.1 Data visualization
4.2 Data preprocess
4.3 Classification tree
4.4 Logistic regression
4.5 Logistic Regression with Oversampling
Chapter 5 Social Network Analysis
5.1 Social network in most active spam call
5.2 Social network in top users (receive more than 200)
5.3 Convert 2-mode social network into 1-mode
5.4 Social network in top users (receive more than 50)
Chapter 6 Findings and Discussion
6.1 Spammer and user behavior
6.2 Finding of our classification models
6.3 Non-specific category scoring
6.4 Social network analysis
Chapter 7 Conclusion
7.1 Contribution
7.2 Limitation and future work
Chapter 8 Reference
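
Section 5.3 of the outline above converts a 2-mode social network into a 1-mode network. As a rough illustration only, the sketch below projects a 2-mode network of (spam number, receiver) edges onto the spam numbers, weighting each link by the number of receivers the two numbers share. Which node set the thesis projects onto and how it weights the links are not stated in this record, so both choices are assumptions. The function name `project_to_one_mode` and the toy edges are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def project_to_one_mode(edges):
    """Project a 2-mode (spam_number, receiver) edge list onto spam numbers.

    Two spam numbers are linked when they called at least one common
    receiver; the link weight is the number of shared receivers.
    Assumption: projection onto spam numbers with shared-receiver weights.
    """
    # Collect the set of spam numbers that reached each receiver.
    numbers_by_receiver = defaultdict(set)
    for spam_number, receiver in edges:
        numbers_by_receiver[receiver].add(spam_number)

    # Each receiver contributes one unit of weight to every pair of
    # spam numbers that called it.
    weights = defaultdict(int)
    for numbers in numbers_by_receiver.values():
        for a, b in combinations(sorted(numbers), 2):
            weights[(a, b)] += 1
    return dict(weights)


# Hypothetical toy data: three spam numbers (S1..S3), three receivers (u1..u3).
edges = [
    ("S1", "u1"), ("S2", "u1"),
    ("S1", "u2"), ("S2", "u2"), ("S3", "u2"),
    ("S3", "u3"),
]

projected = project_to_one_mode(edges)
print(projected)
```

Swapping the two elements of each edge tuple before calling the function gives the projection onto receivers instead.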
