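Author (Chinese): 陳冠元
Author (English): Chen, Guan-Yuan
Thesis title (Chinese): 利用深度遷移學習處理跨語言文本分類問題
Thesis title (English): Deep Transfer Learning for Cross-Lingual Text Classification Problems
Advisor (Chinese): 蘇豐文
Advisor (English): Soo, Von-Wun
Committee members (Chinese): 陳煥宗; 陳宜欣
Committee members (English): Chen, Hwann-Tzong; Chen, Yi-Shin
Degree: Master's
Institution: National Tsing Hua University
Department: Institute of Information Systems and Applications
Student ID: 105065530
Year of publication (ROC calendar): 107 (2018)
Academic year of graduation: 106
Language: English
Number of pages: 38
Keywords (Chinese): 深度學習; 遷移學習; 域自適應學習; 跨語言; 自然語言處理
Keywords (English): Deep Learning; Transfer Learning; Domain Adaptation; Cross-Lingual; Natural Language Processing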
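Abstract (translated from the Chinese): Text classification models based on deep learning, when trained on large amounts of data, perform markedly better than traditional machine learning approaches. However, existing (readily available) text classification datasets mainly cover specific languages, English in particular. For other languages, especially low-resource languages, data is hard to collect, and large-scale cross-lingual text data may also be hard to obtain through machine translation or similar means (for some low-resource languages no reliable machine translation system exists today). As a result, training is relatively difficult for many important text classification problems. To address this, in this thesis we combine two important deep transfer learning methods (model parameter sharing and deep domain adaptation) and train the model jointly on text data from English and from languages with scarcer data. The results show that our deep transfer learning model, trained together with English data, clearly improves classification performance over state-of-the-art deep text classification models trained only on monolingual data, on several important text classification problems in Chinese and Vietnamese (sentiment analysis, subjectivity detection, and question type classification).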
Abstract (English): Recently, data-driven machine learning approaches have succeeded on many text classification tasks for resource-abundant languages. However, many languages still lack sufficient labeled data for the same tasks. For these low-resource languages, high-quality parallel corpora may be costly to obtain, and automated machine translation may be unreliable or unavailable. In this work, we propose an effective transfer learning method for scenarios where large-scale cross-lingual data is not available. It combines two transfer learning schemes, parameter sharing (parameter-based) and domain adaptation (feature-based), which are jointly trained on high-resource and low-resource languages. We conducted cross-lingual transfer learning experiments on sentiment, subjectivity, and question-type classification, from English to Chinese and from English to Vietnamese. The experiments show that the proposed approach significantly outperformed state-of-the-art models trained only on monolingual data on the corresponding benchmarks.
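The combination described in the abstract, a classifier shared across both languages plus a feature-level domain adaptation term, can be illustrated with a minimal sketch. It assumes the adaptation term is a maximum mean discrepancy (MMD) penalty with a Gaussian kernel, in the spirit of the kernel two-sample test cited as [19]; the abstract itself does not name the discrepancy measure. All function names, the weight `lam`, and the kernel width `gamma` are hypothetical and not taken from the thesis.

```python
import math


def rbf_kernel(x, y, gamma=1.0):
    """Gaussian (RBF) kernel between two equal-length feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def mean_kernel(xs, ys, gamma=1.0):
    """Average kernel value over all pairs drawn from xs and ys."""
    total = sum(rbf_kernel(x, y, gamma) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mmd_squared(source_feats, target_feats, gamma=1.0):
    """Biased estimate of the squared MMD between two sets of features."""
    return (mean_kernel(source_feats, source_feats, gamma)
            + mean_kernel(target_feats, target_feats, gamma)
            - 2.0 * mean_kernel(source_feats, target_feats, gamma))


def mean_cross_entropy(probs, labels):
    """Mean negative log-likelihood of the gold labels."""
    return -sum(math.log(p[y]) for p, y in zip(probs, labels)) / len(labels)


def joint_loss(src_probs, src_labels, tgt_probs, tgt_labels,
               src_feats, tgt_feats, lam=0.5, gamma=1.0):
    """Classification loss on both languages plus a weighted MMD penalty.

    The probabilities are assumed to come from one shared classifier
    (parameter sharing); the features are its hidden representations for
    the high-resource (source) and low-resource (target) batches.
    """
    classification = (mean_cross_entropy(src_probs, src_labels)
                      + mean_cross_entropy(tgt_probs, tgt_labels))
    return classification + lam * mmd_squared(src_feats, tgt_feats, gamma)
```

In this sketch the penalty vanishes when the two feature sets coincide and grows as the source and target representations drift apart, so minimizing the joint loss pushes the shared encoder toward language-invariant features while still fitting the labels in both languages.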
Contents
Introduction
Related Work
2.1 The Text Classification and Cross-Lingual Text Classification
2.1.1 Neural Text Classification
2.1.2 Cross-Lingual Text Classification
2.2 Transfer Learning Based Methods
2.2.1 Parameter Transfer
2.2.2 Feature Representation Transfer
2.3 Transfer Learning and Cross-Lingual NLP tasks
Methodology
3.1 Bilingual Embeddings
3.2 Parameter Sharing
3.3 Domain Adaptation and The Objective Function
Experiments
4.1 Training and Evaluation Setup
4.2 Experimental Results
Discussion and Conclusion
[1] Richard Socher, Alex Perelygin, Jean Wu, Jason Chuang, Christopher D. Manning, Andrew Y. Ng, and Christopher Potts. Recursive deep models for semantic compositionality over a sentiment treebank. In Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, pages 1631–1642, Stroudsburg, PA, October 2013. Association for Computational Linguistics.

[2] Yoon Kim. Convolutional neural networks for sentence classification. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 1746–1751. Association for Computational Linguistics, 2014.

[3] Xiang Zhang, Junbo Zhao, and Yann LeCun. Character-level convolutional networks for text classification. In Proceedings of the 28th International Conference on Neural Information Processing Systems - Volume 1, NIPS’15, pages 649–657, Cambridge, MA, USA, 2015. MIT Press.

[4] Siwei Lai, Liheng Xu, Kang Liu, and Jun Zhao. Recurrent convolutional neural networks for text classification. In Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, AAAI’15, pages 2267–2273. AAAI Press, 2015.

[5] Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg Corrado, and Jeffrey Dean. Distributed representations of words and phrases and their compositionality. In Proceedings of the 26th International Conference on Neural Information Processing Systems - Volume 2, NIPS’13, pages 3111–3119, USA, 2013. Curran Associates Inc.

[6] Piotr Bojanowski, Edouard Grave, Armand Joulin, and Tomas Mikolov. Enriching word vectors with subword information. Transactions of the Association for Computational Linguistics, 5:135–146, 2017.

[7] Armand Joulin, Edouard Grave, Piotr Bojanowski, and Tomas Mikolov. Bag of tricks for efficient text classification. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers, pages 427–431. Association for Computational Linguistics, April 2017.

[8] Armand Joulin, Edouard Grave, Piotr Bojanowski, Matthijs Douze, Hervé Jégou, and Tomas Mikolov. Fasttext.zip: Compressing text classification models. arXiv preprint arXiv:1612.03651, 2016.

[9] Yann LeCun, Léon Bottou, Yoshua Bengio, and Patrick Haffner. Gradient-based learning applied to document recognition. In Proceedings of the IEEE, pages 2278–2324, 1998.

[10] Sepp Hochreiter and Jürgen Schmidhuber. Long short-term memory. Neural Comput., 9(8):1735–1780, November 1997.

[11] Xiaojun Wan. Using bilingual knowledge and ensemble techniques for unsupervised Chinese sentiment analysis. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, EMNLP ’08, pages 553–561, Stroudsburg, PA, USA, 2008. Association for Computational Linguistics.

[12] Xiaojun Wan. Co-training for cross-lingual sentiment classification. In Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 1, ACL ’09, pages 235–243, Stroudsburg, PA, USA, 2009. Association for Computational Linguistics.

[13] Xinjie Zhou, Xiaojun Wan, and Jianguo Xiao. Cross-lingual sentiment classification with bilingual document representation learning. In ACL, 2016.

[14] Hongjie Shi, Takashi Ushio, Mitsuru Endo, Katsuyoshi Yamagami, and Noriaki Horii. A multichannel convolutional neural network for cross-language dialog state tracking. 2016 IEEE Spoken Language Technology Workshop (SLT), pages 559–564, 2016.

[15] Goran Glavas, Marc Franco-Salvador, Simone Paolo Ponzetto, and Paolo Rosso. A resource-light method for cross-lingual semantic textual similarity. Knowl.-Based Syst., 143:1–9, 2018.

[16] Sinno Jialin Pan and Qiang Yang. A survey on transfer learning. IEEE Trans. On Knowl. and Data Eng., 22(10):1345–1359, October 2010.

[17] Samuel L. Smith, David H. P. Turban, Steven Hamblin, and Nils Y. Hammerla. Offline bilingual word vectors, orthogonal transformations and the inverted softmax. CoRR, abs/1702.03859, 2017.

[18] Alexis Conneau, Guillaume Lample, Marc’Aurelio Ranzato, Ludovic Denoyer, and Hervé Jégou. Word translation without parallel data. arXiv preprint arXiv:1710.04087, 2017.

[19] Arthur Gretton, Karsten M. Borgwardt, Malte J. Rasch, Bernhard Schölkopf, and Alexander Smola. A kernel two-sample test. J. Mach. Learn. Res., 13(1):723–773, March 2012.

[20] Arthur Gretton, Dino Sejdinovic, Heiko Strathmann, Sivaraman Balakrishnan, Massimiliano Pontil, Kenji Fukumizu, and Bharath K. Sriperumbudur. Optimal kernel choice for large-scale two-sample tests. In F. Pereira, C. J. C. Burges, L. Bottou, and K. Q. Weinberger, editors, Advances in Neural Information Processing Systems 25, pages 1205–1213. Curran Associates, Inc., 2012.

[21] Quoc Le and Tomas Mikolov. Distributed representations of sentences and documents. In Proceedings of the 31st International Conference on Machine Learning - Volume 32, ICML’14, pages II–1188–II–1196. JMLR.org, 2014.

[22] Saif Mohammad, Mohammad Salameh, and Svetlana Kiritchenko. Sentiment lexicons for Arabic social media. In Nicoletta Calzolari (Conference Chair), Khalid Choukri, Thierry Declerck, Sara Goggi, Marko Grobelnik, Bente Maegaard, Joseph Mariani, Helene Mazo, Asuncion Moreno, Jan Odijk, and Stelios Piperidis, editors, Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016), Paris, France, May 2016. European Language Resources Association (ELRA).

[23] Xinjie Zhou, Xiaojun Wan, and Jianguo Xiao. Attention-based LSTM network for cross-lingual sentiment classification. In EMNLP, 2016.

[24] John Blitzer, Ryan McDonald, and Fernando Pereira. Domain adaptation with structural correspondence learning. In Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing, EMNLP ’06, pages 120–128, Stroudsburg, PA, USA, 2006. Association for Computational Linguistics.

[25] Zhongtang Zhao, Yiqiang Chen, Junfa Liu, Zhiqi Shen, and Mingjie Liu. Cross-people mobile-phone based activity recognition. In Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence - Volume Three, IJCAI’11, pages 2545–2550. AAAI Press, 2011.

[26] Jason Yosinski, Jeff Clune, Yoshua Bengio, and Hod Lipson. How transferable are features in deep neural networks? In Proceedings of the 27th International Conference on Neural Information Processing Systems - Volume 2, NIPS’14, pages 3320–3328, Cambridge, MA, USA, 2014. MIT Press.

[27] Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton. ImageNet classification with deep convolutional neural networks. In F. Pereira, C. J. C. Burges, L. Bottou, and K. Q. Weinberger, editors, Advances in Neural Information Processing Systems 25, pages 1097–1105. Curran Associates, Inc., 2012.

[28] Sinno Jialin Pan, Ivor W. Tsang, James T. Kwok, and Qiang Yang. Domain adaptation via transfer component analysis. In Proceedings of the 21st International Joint Conference on Artificial Intelligence, IJCAI’09, pages 1187–1192, San Francisco, CA, USA, 2009. Morgan Kaufmann Publishers Inc.

[29] Basura Fernando, Amaury Habrard, Marc Sebban, and Tinne Tuytelaars. Unsupervised visual domain adaptation using subspace alignment. 2013 IEEE International Conference on Computer Vision, pages 2960–2967, 2013.

[30] Baochen Sun, Jiashi Feng, and Kate Saenko. Correlation alignment for unsupervised domain adaptation. CoRR, abs/1612.01939, 2016.

[31] Eric Tzeng, Judy Hoffman, Ning Zhang, Kate Saenko, and Trevor Darrell. Deep domain confusion: Maximizing for domain invariance. CoRR, abs/1412.3474, 2014.

[32] Mingsheng Long, Yue Cao, Jianmin Wang, and Michael Jordan. Learning transferable features with deep adaptation networks. In Francis Bach and David Blei, editors, Proceedings of the 32nd International Conference on Machine Learning, volume 37 of Proceedings of Machine Learning Research, pages 97–105, Lille, France, 07–09 Jul 2015. PMLR.

[33] Eric Tzeng, Judy Hoffman, Ning Zhang, Kate Saenko, and Trevor Darrell. Deep domain confusion: Maximizing for domain invariance. CoRR, abs/1412.3474, 2014.

[34] Mingsheng Long, Yue Cao, Jianmin Wang, and Michael I. Jordan. Learning transferable features with deep adaptation networks. In Proceedings of the 32nd International Conference on Machine Learning - Volume 37, ICML’15, pages 97–105. JMLR.org, 2015.

[35] Barret Zoph, Deniz Yuret, Jonathan May, and Kevin Knight. Transfer learning for low-resource neural machine translation. In EMNLP, 2016.

[36] Dingquan Wang, Nanyun Peng, and Kevin Duh. A multi-task learning approach to adapting bilingual word embeddings for cross-lingual named entity recognition. In Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 2: Short Papers), pages 383–388. Asian Federation of Natural Language Processing, 2017.

[37] Shyam Upadhyay, Manaal Faruqui, Gokhan Tur, Dilek Hakkani-Tur, and Larry Heck. (almost) zero-shot cross-lingual spoken language understanding. In Proceedings of the IEEE ICASSP, 2018.

[38] Bo Pang and Lillian Lee. A sentimental education: Sentiment analysis using subjectivity. In Proceedings of ACL, pages 271–278, 2004.

[39] Xin Li and Dan Roth. Learning question classifiers. In Proceedings of the 19th International Conference on Computational Linguistics - Volume 1, COLING ’02, pages 1–7, Stroudsburg, PA, USA, 2002. Association for Computational Linguistics.
 
 
 
 