Author (Chinese): 張文瑋
Author (English): Chang, Wen-Wei
Title (Chinese): 以深度強化學習進行台指期交易策略
Title (English): Futures Trading Strategies with Deep Reinforcement Learning
Advisor (Chinese): 孫宏民
Advisor (English): Sun, Hung-Min
Committee members (Chinese): 許富皓, 曾新穆
Committee members (English): Hsu, Fu-Hau
Degree: Master
Institution: National Tsing Hua University
Department: Department of Computer Science
Student ID: 107062589
Publication year (ROC calendar): 109 (2020)
Graduation academic year: 108
Language: English
Pages: 45
Keywords (Chinese): 強化學習 (reinforcement learning); 自動交易 (automatic trading)
Keywords (English): Deep Reinforcement Learning; Automatic Trading
When investing in commodities, subjective investors often consult specific information to support their trading decisions, such as technical indicators, institutional investors' open interest, and the trends of related commodities or indices; all of these shape an investor's decision. However, very few investors are consistently profitable in the market. Generally speaking, what lets them outperform most investors is a set of rules of thumb: when certain market conditions arise, they make the corresponding decisions and win the game. Because financial markets are so complex, it is difficult for humans to observe and learn these rules directly, so we apply deep learning to learn trading skills. An agent's decision-making process can be represented as a Markov Decision Process (MDP), and reinforcement learning is a family of machine learning algorithms built on the MDP framework. This study uses deep reinforcement learning (DRL) to conduct algorithmic trading: the model observes important market information and learns the most profitable strategy. We take Taiwan Index Futures as the research object, gather as much public data as is available, and aim to actually deploy the strategy in the market.
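The trading MDP described above can be sketched as a minimal environment. The feature layout, the binary position action, and all names here are illustrative assumptions for exposition, not the thesis's exact design (which is detailed in Chapter 4 of the table of contents):

```python
import numpy as np

class TradingMDP:
    """Minimal sketch of a trading MDP: states are windows of recent
    features, actions are positions, rewards are mark-to-market P&L.
    Illustrative only; not the thesis's exact environment."""

    def __init__(self, prices, features, window=10):
        self.prices = prices      # per-day settlement prices
        self.features = features  # per-day feature vectors (indicators, open interest, ...)
        self.window = window
        self.t = window

    def reset(self):
        self.t = self.window
        return self._state()

    def _state(self):
        # State: the last `window` days of features, flattened into one vector.
        return self.features[self.t - self.window:self.t].ravel()

    def step(self, action):
        # action: 0 = stay flat, 1 = hold one long contract (binary, as in the abstract).
        # Reward: mark-to-market profit over one step while holding the position.
        pnl = action * (self.prices[self.t] - self.prices[self.t - 1])
        self.t += 1
        done = self.t >= len(self.prices)
        return (None if done else self._state()), pnl, done
```

An agent interacts with this environment in the usual `reset`/`step` loop, accumulating the per-step P&L as its return.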
We made several adjustments when training the deep reinforcement learning model to make it better suited to trading financial instruments. 1. To reduce the complexity of the MDP, we split the strategy into a long side and a short side, so that action selection becomes a binary (0, 1) classification problem, and we train the two models separately. 2. We use action augmentation to make it easier for training to find the optimal solution. Finally, we compare the performance of the different algorithms, value-based and policy-based.
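The action-augmentation idea for the binary (0, 1) action space can be illustrated in isolation: because a trade's reward is computable from prices for any action, each observed transition can be replayed once per action, densifying the training signal. The function name and transition layout below are assumptions for illustration, not the thesis's implementation:

```python
def augment(transition, price_now, price_next):
    """Action-augmentation sketch: replay one observed transition once per
    possible action, recomputing the counterfactual mark-to-market reward.
    The recorded action and reward are ignored since every action is replayed."""
    s, _a, _r, s_next = transition
    out = []
    for alt in (0, 1):                           # binary action space: flat or long
        alt_r = alt * (price_next - price_now)   # counterfactual one-step P&L
        out.append((s, alt, alt_r, s_next))
    return out
```

Feeding both replayed transitions into the replay buffer lets a value-based learner see the outcome of every action at every state, rather than only the action the behavior policy happened to take.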
Abstract
摘要 (Chinese Abstract)
Acknowledgements
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Motivation
1.2 Contribution
1.3 Organization
Chapter 2 Background
2.1 Taiwan Index Futures
2.1.1 Definition
2.1.2 Institutional investors
2.1.3 VIX
2.2 Deep Reinforcement Learning
2.2.1 Markov Decision Process
2.2.2 Introduction to Reinforcement Learning
2.3 Deep Q-Network
2.3.1 Q-Learning
2.3.2 DQN
2.3.3 Double DQN
2.3.4 Dueling DQN
2.3.5 Noisy DQN
2.3.6 Categorical DQN
2.4 Proximal Policy Optimization (PPO)
Chapter 3 Related Works
3.1 Prediction-based approach
3.2 Strategy-based approach
Chapter 4 Methodology
4.1 Design Trading MDP
4.1.1 Assumption
4.1.2 Data Preparation and Feature Extraction
4.1.3 State space
4.1.4 Action space
4.1.5 Reward function
4.1.6 Trading MDP
4.2 Action augmentation
4.3 Model architecture
Chapter 5 Implementation
5.1 Tools
5.2 Trading environment
5.3 Dataset
5.4 Model training
Chapter 6 Evaluation
6.1 Evaluation Method
6.2 Performance
6.2.1 Hyperparameter
6.2.2 Performance analysis
Chapter 7 Conclusion
7.1 Conclusion
7.2 Future Work
Bibliography