
Detailed Record

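Author (Chinese): 楊庭瑄
Author (English): Yang, Ting-Hsuan
Thesis title (Chinese): 運用文本探勘技術於交易投資策略:以LDA模型辨別主題
Thesis title (English): Applying Techniques of Text Mining on Trading Investment Strategy: an LDA Approach to Distinguish the Topics
Advisor (Chinese): 張焯然
Advisor (English): Chang, Jow-Ran
Oral defense committee (Chinese): 劉鋼; 蔡璧徽
Oral defense committee (English): Liu, Kang; Tsai, Pi-Hui
Degree: Master's
University: National Tsing Hua University
Department: Department of Quantitative Finance
Student ID: 104071506
Publication year (ROC calendar): 106 (2017 CE)
Academic year of graduation: 105
Language: English
Number of pages: 61
Keywords (translated from Chinese): text mining; LDA model; Federal Reserve meeting minutes; S&P 500 index; trading strategy
Keywords (English): text mining; LDA model; minutes of FOMC; S&P 500; trading strategy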
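Sentiment analysis has been a widely discussed topic in text mining in recent years. Its applications are diverse: it can be used to detect network security threats, to predict presidential elections, and even to build recommendation systems for shopping websites. This study applies sentiment analysis to trading strategy by analyzing the sentiment of the Federal Reserve's meeting minutes in order to predict stock returns. It first uses the LDA (Latent Dirichlet Allocation) topic model to explore the latent topics in the documents. The aim is to identify the paragraphs in Federal Reserve texts that are less relevant to economic and financial issues and, after removing them, to capture investor sentiment toward the stock market more precisely. Based on these findings, the study formulates a profitable trading investment strategy.
This study is inspired by Tetlock (2007) and Tetlock, Saar-Tsechansky, and Macskassy (2008). It first uses the LDA model to identify words in the documents that are unrelated to economic and financial issues and removes some of the paragraphs containing those words. It then produces trading recommendations from the sentiment index constructed for each document. Finally, after examining the performance of this trading investment strategy, it makes appropriate adjustments to improve it.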
Sentiment analysis has been heatedly discussed in recent years and is used in a wide range of fields. For example, it can be applied to the detection of network security threats, the prediction of presidential elections, recommendation systems on shopping websites, and so on. This thesis applies sentiment analysis to a trading investment strategy, using the articles of the Federal Reserve as the input for sentiment analysis in order to predict stock returns. Moreover, the thesis uses the latent Dirichlet allocation (LDA) topic model to investigate the latent topics in the articles of the Federal Reserve, with the goal of distinguishing the topics that influence stock returns the most. Finally, my research aims to frame a profitable trading investment strategy based on the research results.
The thesis is inspired by the research of Tetlock (2007) and Tetlock, Saar-Tsechansky, and Macskassy (2008). First, I will use the latent Dirichlet allocation topic model to classify words according to different topics. Second, I will eliminate the paragraphs that are irrelevant to finance in order to assess the financial sentiment more exactly and apply it to the investment trading strategy. Last, I will add derivatives to the investment trading strategy to hedge the losses from wrong sentiment predictions, and then I will examine the performance of the investment trading strategy after the modification.
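The steps described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the word sets, the `max_share` cutoff, the index formula (positive minus negative counts over their sum), and the `threshold` for the trading signal are all assumptions chosen for illustration. The off-topic word set stands in for the output of an LDA model, which is not fitted here.

```python
import re

# Hypothetical stand-ins for sentiment word lists (the thesis uses the
# Loughran and McDonald and the Bing Liu lists; these few words are invented).
POSITIVE = {"gain", "improve", "strong", "growth"}
NEGATIVE = {"loss", "decline", "weak", "risk"}

# Hypothetical words assumed to come from topics unrelated to finance.
# In the thesis these would be identified by the LDA topic model.
OFF_TOPIC = {"vote", "attendance", "secretary"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def drop_off_topic(paragraphs, off_topic, max_share=0.2):
    """Keep paragraphs whose share of off-topic tokens is at most max_share."""
    kept = []
    for paragraph in paragraphs:
        tokens = tokenize(paragraph)
        if not tokens:
            continue
        share = sum(t in off_topic for t in tokens) / len(tokens)
        if share <= max_share:
            kept.append(paragraph)
    return kept


def sentiment_index(paragraphs):
    """Assumed index: (positive - negative) / (positive + negative)."""
    tokens = [t for p in paragraphs for t in tokenize(p)]
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total


def signal(index, threshold=0.1):
    """Map the index to a trading suggestion (threshold is an assumption)."""
    if index > threshold:
        return "long"
    if index < -threshold:
        return "short"
    return "hold"


minutes = [
    "The secretary recorded the vote and attendance.",
    "Strong growth continued although some risk remains.",
]
kept = drop_off_topic(minutes, OFF_TOPIC)
print(len(kept), signal(sentiment_index(kept)))
```

In this toy input the procedural paragraph is dropped because three of its seven tokens are off-topic, and the remaining paragraph has two positive words and one negative word, which gives a positive index.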
Table of Contents
Chapter One Introduction
Chapter Two Literature Review
Chapter Three Methodology
  3.1 Latent Dirichlet Allocation Topic Model
    3.1.1 Latent Dirichlet Allocation Representation
    3.1.2 Parameter Estimation by Gibbs Sampling
  3.2 The Sentiment Index and the Trading Strategy
  3.3 The Modified Investment Trading Strategy
Chapter Four Empirical Study
  4.1 Performance of the Original Strategy
    4.1.1 Loughran and McDonald Word List
    4.1.2 Bing Liu Word List
    4.1.3 Both Loughran and McDonald and Bing Liu
    4.1.4 Add Up Two Sentiment Indexes
    4.1.5 Winning Rate of the Original Strategy
  4.2 Performance of the Strategy After LDA Model
    4.2.1 Loughran and McDonald Word List
    4.2.2 Bing Liu Word List
    4.2.3 Both Loughran and McDonald and Bing Liu
    4.2.4 Add Up Two Sentiment Indexes
    4.2.5 Winning Rate After LDA Model
  4.3 Performance of Portfolio
    4.3.1 Loughran and McDonald Word List
    4.3.2 Bing Liu Word List
    4.3.3 Both Loughran and McDonald and Bing Liu
    4.3.4 Add Up Two Sentiment Indexes
    4.3.5 Winning Rate of Portfolio
  4.4 Return Rate
Chapter Five Conclusion
References
References
Black, F., & Scholes, M. (1973). The pricing of options and corporate liabilities. Journal of Political Economy, 81(3), 637-654.
Cutler, D. M., Poterba, J. M., & Summers, L. H. (1988). What moves stock prices? NBER Working Paper (w2538).
Hu, M., & Liu, B. (2004). Mining and summarizing customer reviews. Paper presented at the Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining.
Huang, X., Teoh, S. H., & Zhang, Y. (2013). Tone management. The Accounting Review, 89(3), 1083-1113.
Lavrenko, V., Schmill, M., Lawrie, D., Ogilvie, P., Jensen, D., & Allan, J. (2000). Mining of concurrent text and time series. Paper presented at the KDD-2000 Workshop on Text Mining.
Loughran, T., & McDonald, B. (2009a). Plain English, readability, and 10-K filings. Retrieved from
Loughran, T., & McDonald, B. (2009b). When is a Liability not a Liability? Journal of Finance, forthcoming.
Loughran, T., & McDonald, B. (2011). When is a liability not a liability? Textual analysis, dictionaries, and 10-Ks. The Journal of Finance, 66(1), 35-65.
Loughran, T., & McDonald, B. (2014a). Measuring readability in financial disclosures. The Journal of Finance, 69(4), 1643-1671.
Loughran, T., & McDonald, B. (2014b). Regulation and financial disclosure: The impact of plain English. Journal of Regulatory Economics, 45(1), 94-113.
Loughran, T., & McDonald, B. (2015). Textual analysis in accounting and finance: A survey. University of Notre Dame Working Paper.
Loughran, T., McDonald, B., & Yun, H. (2009). A wolf in sheep’s clothing: The use of ethics-related terms in 10-K reports. Journal of Business Ethics, 89(1), 39-49.
Tetlock, P. C. (2007). Giving content to investor sentiment: The role of media in the stock market. The Journal of Finance, 62(3), 1139-1168.
Tetlock, P. C., Saar-Tsechansky, M., & Macskassy, S. (2008). More than words: Quantifying language to measure firms' fundamentals. The Journal of Finance, 63(3), 1437-1467.
Lu, Y. Y. (2014). The Information Content of Risk Factor Disclosures in Annual Reports, National Taiwan University, Taipei City.
Lin, I. H. (2013). Creating and Verifying Sentiment Dictionary of Finance and Economics via Financial News, National Taiwan University, Taipei City.
Huang, C. C. (2012). Text mining of corporate annual report and its information content in predicting financial distress, Feng Chia University, Taichung City.
Hsieh, S. W. (2010). Using Text Mining Technique for Financial Statement Disclosures, National Chung Cheng University, Chiayi County.
 
 
 
 