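Author (Chinese): 葉柏辰
Author (English): Yeh, Po-Chen
Thesis title (Chinese): 物聯網於車險的應用:以機器學習的方法解決數據不平衡與嚴重度分析
Thesis title (English): Application of the Internet of Things to Auto Insurance: Solving Imbalanced Data and Severity Analysis with Machine Learning
Advisor (Chinese): 韓傳祥
Advisor (English): Han, Chuan-Hsiang
Committee members (Chinese): 黃能富, 丁台怡
Committee members (English): Huang, Nen-Fu; Ding, Tai-Yi
Degree: Master's
Institution: National Tsing Hua University (國立清華大學)
Department: Department of Quantitative Finance (計量財務金融學系)
Student ID: 106071505
Publication year (ROC calendar): 109 (2020)
Academic year of graduation: 108
Language: Chinese
Number of pages: 31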
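Keywords (translated from Chinese): Internet of Things (IoT), insurance technology, imbalanced dataset, ensemble learning, random forest, XGBoost, neural network
Keywords (English): IoT, Insurtech, Imbalanced data, Ensemble learning, Random forest, XGBoost, Neural Network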
The era of insurtech has arrived. Dynamic, real-time information collected through the Internet of Things (IoT) is overturning the traditional insurance model. As lifestyles change, consumer demand for IoT insurance and micro-insurance is growing, and property insurers are using IoT devices to capture driver behavior data, which makes data analysis techniques increasingly important. Analyzing data with machine learning requires complete training data and a relatively even data distribution. In car accident datasets, however, fatal accidents usually account for a very small share of all accidents, which creates an imbalanced-data problem. In addition, auto insurance claims typically depend on the insurer's own assessment, which makes the claims process lengthy.
This thesis uses machine learning methods to address the imbalanced-data problem and performs a severity analysis on car crash data to identify the characteristics of accidents and classify accident severity levels. This can shorten claim handling time and save insurers labor and time costs. In addition, the important features selected by the severity analysis model are applied to accident early warning, with the aim of reducing both the likelihood of accidents and the insurer's claim rate, and demonstrating the value of IoT insurance.
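As a rough illustration of the imbalanced-data problem described in the abstract, the following minimal sketch shows random over-sampling, one of the resampling approaches named in the thesis outline. It is not the author's implementation: the function name `random_oversample`, the toy features (`speed_kmh`, `hour_of_day`), and the 95-to-5 class ratio are all assumptions made up for this example.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of the smaller classes until every
    class has as many rows as the largest class."""
    rng = random.Random(seed)

    # Group the rows by their class label.
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)

    # Every class is grown to the size of the largest one.
    target = max(len(members) for members in by_class.values())

    out_rows, out_labels = [], []
    for label, members in by_class.items():
        # Draw duplicates with replacement from the class's own rows.
        extra = [rng.choice(members) for _ in range(target - len(members))]
        for row in members + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Hypothetical toy data: 95 non-fatal (0) and 5 fatal (1) crash records,
# each with two made-up features (speed_kmh, hour_of_day).
data_rng = random.Random(42)
rows = [(data_rng.uniform(20, 120), data_rng.randrange(24)) for _ in range(100)]
labels = [0] * 95 + [1] * 5

balanced_rows, balanced_labels = random_oversample(rows, labels)
print(Counter(labels))           # class counts before resampling
print(Counter(balanced_labels))  # class counts after resampling
```

Resampling of this kind is normally applied to the training split only, so that duplicated minority rows do not leak into the data used for evaluation.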
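Abstract (Chinese)
Abstract (English)
Acknowledgments
Table of Contents
List of Figures
List of Tables
Chapter 1. Introduction
Chapter 2. Literature Review
Chapter 3. Methodology
3.1 Dataset
3.2 Data Processing
3.2.1 Under-sampling (reducing the majority class)
3.2.2 Over-sampling (augmenting the minority class)
3.2.3 SMOTE
3.3 Machine Learning Algorithms
3.3.1 General Algorithms
3.3.2 Ensemble Learning Algorithms
3.4 Deep Learning Algorithms
3.4.1 Neural Networks
3.4.2 Deep Neural Networks (DNN, multi-layer neural networks)
3.5 Evaluation Metrics
Chapter 4. Results and Discussion
4.1 Data Preprocessing
4.1.1 Handling Missing Values
4.1.1.1 Removing Missing Values
4.1.1.2 Imputing Missing Values
4.1.2 Encoding Categorical Variables (One-Hot Encoding)
4.1.3 Stratified K-Fold Cross-Validation
4.2 Comparison of Results across Machine Learning Methods
4.3 Discussion
4.3.1 Feature Importance
4.3.2 Ablation Study
4.3.3 Severity Analysis Model (Multi-class)
4.3.4 Applications to Insurance
4.3.4.1 Claims Model
4.3.4.2 Accident Early Warning
Chapter 5. Conclusion and Future Work
References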


 
 
 
 