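Author (Chinese): 王韋智
Author (English): Wang, Wei-Chih
Thesis title (Chinese): 以多語系自然語言理解與機器學習為基之智慧型專利摘要系統
Thesis title (English): Intelligent patent summarization system incorporating multiple natural language understanding and machine learning capability
Advisor (Chinese): 張瑞芬
Advisor (English): Trappey, Amy-J.C.
Oral defense committee (Chinese): 樊晉元
吳政隆
張力元
Degree: Master's
University: National Tsing Hua University
Department: Department of Industrial Engineering and Engineering Management
Student ID: 105034549
Year of publication (ROC calendar): 107
Graduation academic year: 106
Language: English
Number of pages: 84
Keywords (translated from Chinese): Artificial intelligence; Recurrent neural network; Multilingual intellectual property documents; Smart machinery; Word embedding
Keywords (English): Artificial intelligence; Natural language processing; Recurrent Neural Network; Intellectual property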
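With the rise of the Fourth Industrial Revolution and artificial intelligence in recent years, massive amounts of intellectual property (IP) information are continually generated and updated as R&D technologies advance, which has created a problem of information overload. How to enable companies and researchers to grasp key IP technologies and information quickly and efficiently has therefore become an important research topic. In the past, companies and researchers facing large volumes of IP documents had to retrieve many related prior cases, patents in dispute, and prior art, and then read and interpret the text and analyze relevant precedents before obtaining useful reference intelligence. This work often consumes a great deal of time and labor, and if retrieval precision and analysis results are poor, it becomes difficult to mount a timely and reasonable IP defense or assertion of rights. This project therefore applies AI algorithms, including sequence-to-sequence models with an attention mechanism, deep neural networks, recurrent neural networks, and word embedding, to compile intelligent technical summaries of multilingual IP documents (Chinese and English patent technical documents). The goal is to save the considerable labor and resources otherwise spent on manually consolidating meaning, translating, and writing summaries, and to give researchers an opportunity to grasp key IP technical information quickly. In addition, this study uses multilingual technical patent documents in the smart machinery domain as its dataset for automatic technical compilation, applying intelligent text mining and semantic recognition methods to automatically read groups of IP documents, consolidate key words and sentences, and compile technical report summaries. Finally, the study uses ROUGE precision and recall to measure the quality of the generated summaries and to verify how consistently they cover the key information.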
With growing awareness of advanced hardware, software, and integration technologies for industrial applications such as intelligent manufacturing (Industry 4.0), demand is increasing for quickly securing intellectual property (IP) to commercially protect competitive products and innovations. To allow machines to learn from the rapidly growing volume of IP documents, such as patent documents, Natural Language Processing (NLP) and Deep Learning (DL) algorithms should be deployed for context e-discovery. Explaining related patent documents in a short summary remains a significant challenge. In this research, we develop an intelligent patent summarization system based on artificial intelligence (AI) approaches that include Recurrent Neural Networks (RNN), word embedding, and attention mechanisms. The aim of the system is to automatically summarize multilingual patents in Chinese and English. The AI-based summarization solution is used to capture key technical keywords and popular terminology. ROUGE precision and recall ratios are used to evaluate the accuracy and consistency of the summarization output.
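The abstract names ROUGE precision and recall as the evaluation measures without giving the computation. The following is a minimal sketch of ROUGE-N under stated assumptions: tokens are split on whitespace (Chinese text would first need word segmentation), the function names `ngrams` and `rouge_n` are hypothetical, and the example sentences are invented for illustration rather than taken from the thesis data. It is not the thesis's own evaluation code.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """Compute ROUGE-N precision and recall for whitespace-tokenized text."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    # Clipped overlap: an n-gram counts only as often as it occurs in both texts.
    overlap = sum((cand & ref).values())
    cand_total = sum(cand.values())
    ref_total = sum(ref.values())
    # Precision: share of generated n-grams that also occur in the reference.
    precision = overlap / cand_total if cand_total else 0.0
    # Recall: share of reference n-grams that the generated summary covers.
    recall = overlap / ref_total if ref_total else 0.0
    return precision, recall


# Hypothetical sentences for illustration only.
reference = "the spindle controller adjusts speed using sensor feedback"
candidate = "the controller adjusts spindle speed"
precision, recall = rouge_n(candidate, reference, n=1)
print(f"ROUGE-1 precision={precision:.3f} recall={recall:.3f}")
```

Precision penalizes summaries that add material absent from the reference, while recall penalizes summaries that omit reference content, which is why the two are reported together.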
Table of Contents
Chinese Abstract
Abstract
List of Figures
List of Tables
1. Introduction
1.1 Research Background and Motivation
1.2 Research Scope and Purpose
1.3 Research Framework and Procedure
2. Literature review
2.1 Artificial Intelligence
2.2 Natural Language Processing
2.3 Automatic summarization techniques
3. Methodology
3.1 Raw data pre-processing
3.2 Text pre-processing
3.3 Smart summarization model training
3.3.1 Sequence to sequence with attention (S2SWA)
3.3.2 Smart summary generation
3.4 Evaluation
4. Case study
4.1 Smart machinery ontology construction and patent search conduction
4.1.1 Patent search strategy
4.2 Each domain of smart machinery patents initial case study
4.2.1 Initial case study
4.2.2 The each domain of specific patents collected from recommendation system
4.3 Smart machinery patent technical summary compilation
4.3.1 Control intelligence patents technical summary compilation
4.3.2 Sensor intelligence patents technical summary compilation
4.3.3 Intelligent decision making patents technical summary compilation
4.4 Comprehensive comparison
5. Conclusions
References
Appendix A – Six Case examples
Appendix B – Six Case result

 
 
 
 