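Author (Chinese): 彭桂香
Author (English): Peng, Kuei-Hsiang
Thesis title (Chinese): 從臉書中文使用者之動態貼文預測其人格特質
Thesis title (English): Predicting Personality Traits of Chinese Users Based on Facebook Wall Posts
Advisor: 張正尚
Oral defense committee: 李端興, 林華君, 黃之浩
Degree: Master's
University: National Tsing Hua University
Department: Institute of Communications Engineering
Student ID: 101064530
Year of publication (ROC calendar): 103 (2014)
Academic year of graduation: 102
Language: English
Number of pages: 32
Keywords (Chinese): 人格特質, 文本分類, 文本探勘, 中文文本探勘, 機器學習
Keywords (English): personality, text classification, text mining, Chinese text mining, machine learning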
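Personality traits are among the important factors that shape human behavior, so automatic personality prediction is a research topic with considerable potential. Many researchers have taken up the problem in recent years. However, most work deals with English text, and personality prediction from Chinese text has rarely been studied. Chinese and English differ greatly in how words are joined: English words are separated by spaces, whereas Chinese words are not. This makes word segmentation harder for Chinese than for English, and Chinese text correspondingly harder to analyze.
In this thesis, we therefore attempt to classify a person's personality traits from Chinese text. First, we collected the wall posts and personality trait scores of 222 Facebook users who write in Chinese. Next, we applied Jieba Chinese text segmentation (結巴中文分詞) for word segmentation and used a support vector machine as the learning algorithm for classifying personality traits.
The experimental results show that Chinese word segmentation substantially improves both precision and recall, and that considering text features together with the number of Facebook friends achieves the best performance. In addition, we found that extraverts, compared with introverts, tend to publish longer or more posts and frequently use common words. This suggests that extraverts are more inclined to share their moods and the small details of their daily lives with others on Facebook.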
Automatically recognizing personality is a promising research topic because personality helps explain a person's behavior. Many studies have addressed it in recent years, but very few focus on predicting personality from Chinese text. Chinese text differs from English text, in which words are separated by spaces: a Chinese sentence is a sequence of characters with no spaces between them, yet the meaningful unit is the word, not the character. Because word boundaries are not marked, Chinese text is more difficult to analyze.
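The segmentation problem can be illustrated with a small sketch. It is not the Jieba algorithm used in the thesis; it is a toy greedy longest-match segmenter, and the sentence, the vocabulary, and the function name `forward_max_match` are all made up for illustration.

```python
def forward_max_match(text, vocab, max_len=4):
    """Greedy longest-match segmentation against a known vocabulary (toy example)."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until it is in the vocabulary;
        # a single character is always accepted as a last resort.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in vocab:
                tokens.append(piece)
                i += size
                break
    return tokens


english = "I like machine learning"
chinese = "我喜歡機器學習"  # the same sentence, written without spaces
vocab = {"喜歡", "機器", "學習", "機器學習"}  # hypothetical mini-dictionary

english_tokens = english.split()   # whitespace already marks word boundaries
chinese_naive = chinese.split()    # whitespace splitting finds no boundaries
chinese_tokens = forward_max_match(chinese, vocab)

print(english_tokens)
print(chinese_naive)
print(chinese_tokens)
```

Whitespace splitting recovers the English words but leaves the Chinese sentence as one unbroken token, which is why a dedicated segmenter is needed before any word-level features can be built.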
In this thesis, we attempt to classify personality traits from Chinese text. We collected a dataset of wall posts and personality scores from 222 Facebook users whose main written language is Chinese. We then used Jieba Chinese text segmentation to segment the posts into words, and a support vector machine (SVM) as the learning algorithm for personality classification.
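A pipeline of this general shape might look like the sketch below. It is only an illustration under several assumptions: the scikit-learn classes, the TF-IDF weighting, the linear kernel, the four toy posts, and their labels are chosen here and are not taken from the thesis. If the `jieba` package is not installed, the sketch falls back to a character-level tokenizer so that it still runs.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

try:
    import jieba  # third-party segmenter; assumed installed via `pip install jieba`

    def tokenize(text):
        # Segment the post into words and drop whitespace-only tokens.
        return [tok for tok in jieba.lcut(text) if tok.strip()]
except ImportError:
    def tokenize(text):
        # Fallback for illustration only: one token per non-space character.
        return [ch for ch in text if not ch.isspace()]

# Hypothetical toy data: 1 = extravert, 0 = introvert (not from the thesis dataset).
posts = [
    "今天和朋友去吃飯好開心",
    "週末跟大家一起出去玩",
    "一個人在家看書",
    "安靜地寫程式",
]
labels = [1, 1, 0, 0]

# Bag-of-words features with TF-IDF weights, followed by a linear SVM.
model = make_pipeline(TfidfVectorizer(analyzer=tokenize), LinearSVC())
model.fit(posts, labels)

predictions = model.predict(posts)
print(predictions)
```

Passing the tokenizer as `analyzer` makes the vectorizer use the segmented words directly instead of its default regular-expression tokenizer, which is designed for space-delimited text.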
Experimental results show that text segmentation substantially improves precision and recall, and that combining text features with friend features yields the best performance. Moreover, we find that extraverts seem to write more sentences and use more common words than introverts do. This suggests that extraverts are more willing than introverts to share their moods and lives with others.
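For reference, precision and recall, along with the accuracy, negative predictive value, and true negative rate listed among the thesis's evaluation metrics, can all be computed from the four confusion-matrix counts. The sketch below uses the standard definitions; the label vectors are invented for illustration and do not reproduce any result from the thesis.

```python
def binary_metrics(y_true, y_pred):
    """Compute standard binary classification metrics from 0/1 label lists."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    def ratio(num, den):
        # Return 0.0 when the denominator is empty instead of raising.
        return num / den if den else 0.0

    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
        "negative_predictive_value": ratio(tn, tn + fn),
        "true_negative_rate": ratio(tn, tn + fp),
    }


# Hypothetical labels: 1 = extravert, 0 = introvert.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 1]

metrics = binary_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Reporting the negative-class metrics alongside precision and recall shows how well the classifier handles both classes, which accuracy alone can hide when the classes are unbalanced.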
List of Figures
List of Tables
1 Introduction
2 Background
2.1 Big Five Model
2.2 Related work
3 Methods
3.1 Text feature extraction algorithms
3.1.1 Bag-of-words model
3.1.2 Chinese text segmentation
3.1.3 Weighted schemes
3.2 Feature selection algorithms
3.2.1 Chi-squared test
3.2.2 Recursive Feature Elimination
3.3 Classification algorithms
3.3.1 Support vector machines
4 Experiments and results
4.1 Data collection
4.2 Statistical characteristics of the dataset
4.3 Evaluation Metrics
4.3.1 Accuracy
4.3.2 Precision and recall
4.3.3 Negative predictive value and True Negative Rate
4.4 Experiments and results
4.4.1 Experimental setup
4.4.2 Tokenizing with text segmentation
4.4.3 Classifying using document-term matrix in TF or TF-IDF weighted scheme
4.4.4 Classifying using different feature selection approaches
4.4.5 Classifying using both text and friend features
4.4.6 Comparing the selected features of these experiments
5 Conclusion