Espinoza Barahona, Jeydels Alexander
A Proposed Process for Exploring Employees Relationship via Social Network and Sentiment Analysis.
Sun, Hung-Min
Employees relationship, Social Network Analysis, Sentiment Analysis
This study applies social network analysis techniques to construct a social network from an email dataset, analyzes and visualizes the network's properties, and uses sentiment analysis as an additional source of information for studying employee relationships in a company. The study is based primarily on the Enron email dataset and describes the methodology used to transform the data into a format suitable for detecting patterns and extracting useful information. The social graph is built from the From and To fields of the messages, together with the distribution of emails sent by each entity. The resulting graph contains around 371 nodes and 67 thousand edges, and the messages are classified as 84% neutral, 11% positive, and approximately 5% negative. The study concludes that combining social network analysis with emotion detection makes it possible to identify the positive or negative areas a company must work on in order to promote a healthy organizational culture and to uncover possible organizational issues in a timely manner.
Abstract
Acknowledgements
List of Figures
List of Tables
1 Introduction
2 Related Work
3 Methodology
3.1 The Enron Dataset
3.2 Data transformation
3.3 Extracting the email addresses
3.4 Preprocessing the email addresses
3.5 Preprocessing the messages
3.6 Feature selection
3.7 Sentiment analysis
3.8 Social Network Analysis on the Enron emails
4 Results and Discussion
4.1 The documents in the Enron Corpus
4.2 Enron Emails Sentiment Classification
4.3 Social Network Analysis
4.3.1 Degree Centrality
4.3.2 Betweenness Centrality
4.3.3 Closeness Centrality
4.4 Sentiment diffusion in social network
4.4.1 Positive sentiment subgraph
4.4.2 Negative sentiment subgraph
4.4.3 Neutral sentiment subgraph
5 Conclusions
References
