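Author (Chinese): 曾淯湞
Author (English): Tseng, Yu-Chen
Thesis title (Chinese): 集成學習與深度學習在系外行星研究的應用
Thesis title (English): The Applications of Ensemble Learning and Deep Learning on the Exoplanet Researches
Advisor (Chinese): 葉麗琴
Advisor (English): Yeh, Li-Chin
Committee members (Chinese): 江瑛貴; 陳賢修
Committee members (English): Jiang, Ing-Guey; Chen, Shyan-Shiou
Degree: Master's
University: National Tsing Hua University
Department: Institute of Computational and Modeling Science
Student ID: 111026507
Year of publication (ROC calendar): 113 (2024)
Academic year of graduation (ROC calendar): 112
Language: English
Number of pages: 86
Keywords (translated from Chinese): Ensemble Learning; Deep Learning; Transit Method; Exoplanet; Convolutional Neural Network; Artificial Intelligence; Feature Engineering; Machine Learning
Keywords (English): Ensemble Learning; Deep Learning; TESS; Exoplanet; Convolutional Neural Network; Artificial Intelligence; TSfresh; Machine Learning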
This study combines ensemble learning and deep learning with the transit method to search TESS [6] data for candidate exoplanets with transit periods of one to two days. Before model training, we applied TSfresh feature engineering, and we compared the predictions of Random Forest (RF), AdaBoost (Adab), XGBoost (XGB), LightGBM (LGBM), and convolutional neural network (CNN) models. To build the training data, we processed the original light curves into noise and added transit signals generated with the Mandel & Agol [5] transit model. We chose appropriate sample sizes and trained the models with cross-validation. We then used the better-performing models to make predictions and search for possible candidates. In the end, we found four candidate exoplanets.
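The signal-injection and folding steps mentioned in the abstract can be illustrated with a minimal sketch. This is not the thesis code: it assumes a box-shaped dip as a simplified stand-in for the Mandel & Agol model, Gaussian noise in place of a processed TESS light curve, and hypothetical helper names (`inject_box_transit`, `fold`) and parameter values chosen only for illustration.

```python
import random

def inject_box_transit(times, flux, period, epoch, duration, depth):
    """Return a copy of flux with a box-shaped dip during each transit.

    Simplified stand-in for a physical transit model (assumption).
    """
    out = []
    for t, f in zip(times, flux):
        # Time offset (days) from the nearest transit centre.
        dt = (t - epoch + 0.5 * period) % period - 0.5 * period
        out.append(f - depth if abs(dt) <= 0.5 * duration else f)
    return out

def fold(times, flux, period, epoch):
    """Phase-fold a light curve and return (phase, flux) sorted by phase."""
    pairs = [(((t - epoch) / period + 0.5) % 1.0 - 0.5, f)
             for t, f in zip(times, flux)]
    pairs.sort()
    return [p for p, _ in pairs], [f for _, f in pairs]

# Illustrative values only (assumptions, not taken from the thesis).
rng = random.Random(0)
cadence = 2.0 / 1440.0                      # 2 minutes, in days
times = [i * cadence for i in range(19440)]  # about 27 days of data
noise = [1.0 + rng.gauss(0.0, 0.001) for _ in times]

period, epoch, duration, depth = 1.5, 0.3, 0.1, 0.005
flux = inject_box_transit(times, noise, period, epoch, duration, depth)
phase, folded = fold(times, flux, period, epoch)

# Compare mean flux inside and outside the transit window in phase space.
half_width = 0.5 * duration / period
in_tr = [f for p, f in zip(phase, folded) if abs(p) <= half_width]
out_tr = [f for p, f in zip(phase, folded) if abs(p) > half_width]
mean_in = sum(in_tr) / len(in_tr)
mean_out = sum(out_tr) / len(out_tr)
print("recovered depth:", mean_out - mean_in)
```

Folding at the injected period stacks all transits at phase zero, so the difference between the out-of-transit and in-transit means should approximate the injected depth. In a pipeline like the one described, folded curves of this kind would be the input to feature extraction and classification.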
Abstract
Abstract (Chinese)
Acknowledgements
Chapter 1: Introduction
Chapter 2: Dataset Introduction
2.1 Data Sources
2.2 Data Preparation
2.3 Data Preprocessing
2.3.1 Clustering and Standardization
2.3.2 Removing Outliers and Processing Scattered Data Points
2.3.3 Data Folding
2.3.4 Interpolation for Imputation

Chapter 3: Feature Engineering
3.1 Purpose and Effects of Feature Engineering
3.2 Using TSfresh for Feature Extraction
3.3 Specific Feature Engineering Steps and Methods

Chapter 4: Model Selection
4.1 Development of Artificial Intelligence
4.2 Introduction and Development of Machine Learning
4.3 Decision Tree
4.4 Ensemble Learning Methods
4.4.1 Random Forest
4.4.2 AdaBoost
4.4.3 XGBoost (Extreme Gradient Boosting)
4.4.4 LightGBM
4.5 Development of Deep Learning
4.6 Convolutional Neural Network

Chapter 5: Model Selection
5.1 Training, Validation, and Testing Sets
5.2 K-Fold Cross-Validation
5.3 Model Evaluation (Random Forest, AdaBoost, XGBoost, LightGBM)
5.4 Model Evaluation (Convolutional Neural Networks)

Chapter 6: Results and Discussions
6.1 RF, XGB, and LGBM Models
6.2 Adab and CNN Models

Chapter 7: Conclusions
References
Python Code
[1] Christ, M., et al. (2024). tsfresh documentation (Version 0.20.2.post0.dev4+g3da2360) [Software documentation]. Blue Yonder GmbH. Retrieved from https://tsfresh.readthedocs.io/en/latest/

[2] Huang, C.-S. (2018). 機器學習 Ensemble Learning之Bagging, Boosting與AdaBoost [Machine learning: Bagging, Boosting, and AdaBoost in ensemble learning]. Medium. Retrieved from https://chih-sheng-huang821.medium.com/機器學習-ensemble-learning之bagging-boosting與adaboost-af031229ebc3

[3] Chiu, L. (2018). 使用 TensorFlow 了解 Dropout [Understanding Dropout with TensorFlow]. Medium. Retrieved from https://medium.com/手寫筆記/使用-tensorflow-了解-dropout-bf64a6785431

[4] Malik, A., Moster, B. P., & Obermeier, C. (2021). Exoplanet Detection using Machine Learning. arXiv preprint arXiv:2011.14135. Retrieved from https://arxiv.org/abs/2011.14135

[5] Mandel, K., & Agol, E. (2002). Analytic Light Curves for Planetary Transit Searches. The Astrophysical Journal, 580(2), L171.

[6] TESS - FFI/TP/LC Bulk Downloads. Retrieved from https://archive.stsci.edu/tess/bulk_downloads/bulk_downloads_ffi-tp-lc-dv.html

[7] Yeh, L.-C., & Jiang, I.-G. (2020). Searching for Possible Exoplanet Transits from BRITE Data through a Machine Learning Technique. Publications of the Astronomical Society of the Pacific, 133(1019), 014401.
 
 
 
 