
Detailed Record

Author (Chinese): 姚承宏
Author (English): Yao, Cheng-Hong
Title (Chinese): 基於自動編碼器的模型架構進行多維時間序列的特徵萃取和預測
Title (English): Embedding and Forecast for Multivariate Time Series via Autoencoder-based Models
Advisor (Chinese): 徐南蓉
Advisor (English): Hsu, Nan-Jung
Committee Members (Chinese): 黃信誠, 陳春樹
Committee Members (English): Huang, Hsin-Cheng; Chen, Chun-Shu
Degree: Master's
University: National Tsing Hua University
Department: Institute of Statistics
Student ID: 110024502
Year of Publication (ROC): 112 (2023)
Graduation Academic Year: 112
Language: English
Pages: 43
Keywords (Chinese): 自編碼器, 嵌入, 標準化相互資訊, 一步預測
Keywords (English): autoencoder; embedding; normalized mutual information; one-step-ahead forecast
Abstract (translated from Chinese): Time series data analysis is widely applied in many fields, such as finance, economics, and the natural and environmental sciences, and many well-established models exist for high-dimensional time series. As data volume and variable dimension grow, the rapidly developing deep learning techniques of recent years have become a common modeling approach for high-dimensional time series data. This thesis investigates how, within the autoencoder class of deep learning architectures, to construct a low-dimensional, nonlinear embedding that preserves as much of the complex information in the original high-dimensional series as possible while simultaneously producing one-step-ahead forecasts. We propose autoencoder-based models with different outputs and design a variety of experiments to compare their performance on the two tasks of constructing embeddings and one-step-ahead forecasting. For evaluation, the first task uses normalized mutual information and scatter plots to assess the association between the learned embedding and the original data, while the second task uses the prediction mean squared error and time series plots to assess forecast accuracy. Simulation experiments and real data are used to examine how the models differ on these two tasks. The results show that the autoencoder-based models with different outputs perform stably across settings and on real data, whereas the normalized mutual information values are lower than expected, which may be related to the normalization version we chose.
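The normalized mutual information (NMI) used to assess the embedding has several common normalizations (dividing the mutual information by the square root, minimum, or maximum of the two entropies), and the choice affects the resulting values, which may explain scores lower than expected. As a rough sketch, not the thesis's exact estimator (the thesis follows Ross's continuous-data estimator), NMI for quantile-binned series with the square-root normalization can be computed as follows; the bin count and simulated series are illustrative assumptions:

```python
import numpy as np

def quantile_bins(a, n_bins=10):
    """Discretize a continuous series into quantile-based bins 0..n_bins-1."""
    edges = np.quantile(a, np.linspace(0, 1, n_bins + 1)[1:-1])
    return np.digitize(a, edges)

def nmi(labels_a, labels_b):
    """NMI with the sqrt(H_a * H_b) normalization; other versions divide by
    min or max of the entropies and give different values."""
    n_a, n_b = labels_a.max() + 1, labels_b.max() + 1
    # Joint distribution from the contingency table of the two label vectors.
    joint = np.bincount(labels_a * n_b + labels_b, minlength=n_a * n_b)
    p_ab = joint.reshape(n_a, n_b) / labels_a.size
    p_a, p_b = p_ab.sum(axis=1), p_ab.sum(axis=0)
    nz = p_ab > 0
    mi = np.sum(p_ab[nz] * np.log(p_ab[nz] / np.outer(p_a, p_b)[nz]))
    h_a = -np.sum(p_a[p_a > 0] * np.log(p_a[p_a > 0]))
    h_b = -np.sum(p_b[p_b > 0] * np.log(p_b[p_b > 0]))
    return mi / np.sqrt(h_a * h_b)

rng = np.random.default_rng(0)
x = rng.normal(size=2000)              # one coordinate of the original series
z = x + 0.2 * rng.normal(size=2000)    # a strongly related embedding coordinate
score = nmi(quantile_bins(x), quantile_bins(z))
```

Even for these strongly dependent series, binning and the square-root normalization keep the score well below 1, consistent with the observation above that NMI values can be lower than intuition suggests.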
Abstract (English): Multivariate time series arise in many fields such as finance, industry, and economics. In recent years, deep learning has emerged as a powerful tool for analyzing complex time series data. This study considers autoencoder-based models that learn nonlinear embeddings of multivariate time series and forecast the series by compressing the data into a lower-dimensional latent space. The idea is similar to latent factor modeling but allows flexible nonlinear relationships between the time series and its embedding via neural networks. During model training, this study conducts several experiments to explore the performance tradeoff between two tasks: constructing a low-dimensional embedding versus one-step-ahead forecasting. To assess performance, this thesis uses a normalized mutual information measure for the embedding task and the prediction mean squared error for the forecasting task. Finally, the proposed method is illustrated through simulation and empirical studies. The results indicate that the autoencoder-based models demonstrate stable performance across different settings. Additionally, the normalized mutual information criterion yielded values lower than anticipated.
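As a minimal illustration of the idea (not the thesis's exact architecture, which uses trained neural networks), an autoencoder with an added forecast head maps a p-dimensional series to a k-dimensional nonlinear embedding, reconstructs the observation, and predicts the next time step. The dimensions, random weights, and tanh activation below are illustrative assumptions; in practice the weights are learned by minimizing a combined reconstruction and forecast loss:

```python
import numpy as np

rng = np.random.default_rng(1)
p, k, T = 20, 3, 100                 # observed dim, embedding dim, series length

X = rng.normal(size=(T, p))          # multivariate time series, T x p

# Randomly initialized weights standing in for trained network parameters.
W_enc = rng.normal(scale=0.1, size=(p, k))   # encoder
W_dec = rng.normal(scale=0.1, size=(k, p))   # decoder (reconstruction head)
W_for = rng.normal(scale=0.1, size=(k, p))   # one-step-ahead forecast head

Z = np.tanh(X @ W_enc)               # nonlinear low-dimensional embedding, T x k
X_rec = Z @ W_dec                    # reconstruction of the current observation
X_hat = Z[:-1] @ W_for               # forecast of X[1:] from the embedding at t-1

# Prediction mean squared error, the forecasting criterion used in the thesis.
pmse = np.mean((X[1:] - X_hat) ** 2)
```

Training would backpropagate through a weighted sum of the reconstruction error and `pmse`; varying the output heads and loss weights is what distinguishes the autoencoder-based models compared in the thesis.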
Contents
Abstract (Chinese) i
Abstract ii
List of Figures v
List of Tables vi
1 Introduction 1
1.1 Motivation 2
1.2 Thesis Structure 3
2 Autoencoder-based Models 4
2.1 Notations 4
2.2 Structure of Autoencoder-based Models and Embedding 4
2.3 Constructing Low-dimensional Embedding and One-step-ahead Forecasting 6
2.4 Training Process 8
3 Performance Evaluation 11
3.1 Performance Measure on Embedding 11
3.2 Forecast 13
4 Simulation Study 14
4.1 Simulation Goal 14
4.2 Reference Models 14
4.3 Simulation 1 16
4.3.1 Data 16
4.3.2 Training Process 18
4.3.3 Testing Performance 18
4.3.4 NMI and Embedding 22
4.3.5 Performance for 100 Simulation Replicates 25
4.4 Simulation 2 27
4.5 Simulation 3 29
4.5.1 Data 29
4.5.2 Testing Performance 30
4.5.3 NMI and Embedding 31
4.6 Summary of Simulation 34
5 Application 35
5.1 Dataset 35
5.2 Reference Model 36
5.3 Training Process 36
5.4 Testing Performance 37
5.5 Constructing Embedding 37
6 Discussion 39
References 41
Appendix: Derivation of Coefficient Random Setting 43