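Author (Chinese): 傑爾
Author (English): Taylor, Jherez
Thesis title (Chinese): 於社群媒體偵測語境仇恨言語之密碼詞彙
Thesis title (English): Detecting contextual hate speech code words within social media
Advisor (Chinese): 陳宜欣
Advisor (English): CHEN, YI-SHIN
Committee members (Chinese): 蘇豐文; 陳朝欽
Committee members (English): SOO, VON-WUN; CHEN, CHAUR-CHIN
Degree: Master's
Institution: National Tsing Hua University
Department: Institute of Information Systems and Applications
Student ID: 104065424
Year of publication (ROC calendar): 106 (2017 CE)
Academic year of graduation (ROC calendar): 105
Language: English
Number of pages: 47
Keywords (Chinese): 仇恨言論 (hate speech)
Keywords (English): Hate speech; NLP; Twitter; PageRank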
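Abstract (translated from the Chinese): Compared with face-to-face interaction, hate speech has grown rapidly on social media in recent years. Previous studies have mostly detected hate speech with word blacklists or n-gram methods. However, social media users keep inventing new terms, using code words to allude to or stand in for the targets of their attacks, so word blacklists and n-gram methods cannot work effectively. This study develops a graph-based method that combines conventional word window contexts with syntactic dependency contexts to uncover the hate speech hidden in hate speech code words. It uses the patterns of word usage in different contexts to identify hate speech code words, which expands the hate speech lexicon and improves the accuracy of classification results.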
Abstract (English): Hate speech is relatively rare in face-to-face interactions, but social media platforms have recently seen an increase in its occurrence. Most detection methods rely on word blacklists and other text-level features such as n-grams. While this approach is effective for flagging hate speech content, the discourse is not limited to a specific vocabulary, because users are constantly adopting new terms. In this work we develop a graph-based approach that combines conventional word window contexts with syntactic dependency contexts in order to learn the hidden meaning of hate speech code words, that is, words whose association with hate speech is relatively unknown. Our proposal exploits the different types of context in which words are used, with the goal of identifying new code words, thereby expanding the hate speech lexicon and improving the accuracy of future classification systems.
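Illustrative sketch (not taken from the thesis): the abstract describes a graph-based approach over two kinds of context, and the keywords mention PageRank. The Python below shows one way such a pipeline could be wired together, under loudly stated assumptions. The neighbour lists are made-up placeholders standing in for nearest neighbours from a word-window embedding and a dependency-based embedding, and the function names `build_graph` and `pagerank` are hypothetical. The ranking step is a plain power-iteration PageRank, not necessarily the variant used in the thesis.

```python
from collections import defaultdict


def build_graph(neighbour_sets):
    """Merge several {word: [similar words]} maps into one directed graph.

    An edge word -> neighbour means "neighbour appears among the words
    most similar to word" in at least one context type.
    """
    graph = defaultdict(set)
    for neighbours in neighbour_sets:
        for word, similar in neighbours.items():
            graph[word].update(similar)
            for other in similar:
                graph[other]  # touch the key so every node is present
    return dict(graph)


def pagerank(graph, damping=0.85, iterations=50):
    """Plain power-iteration PageRank over a {node: set(successors)} graph."""
    nodes = sorted(graph)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing edges is spread uniformly.
        dangling = sum(rank[v] for v in nodes if not graph[v])
        new_rank = {v: (1 - damping) / n + damping * dangling / n for v in nodes}
        for v in nodes:
            successors = graph[v]
            if successors:
                share = damping * rank[v] / len(successors)
                for w in successors:
                    new_rank[w] += share
        rank = new_rank
    return rank


# Placeholder neighbour lists (assumptions, not data from the thesis).
window_neighbours = {
    "seed_term": ["candidate_a", "candidate_b"],
    "candidate_a": ["candidate_b"],
}
dependency_neighbours = {
    "seed_term": ["candidate_b", "candidate_c"],
    "candidate_c": ["candidate_b"],
}

graph = build_graph([window_neighbours, dependency_neighbours])
ranks = pagerank(graph)
ranking = sorted(ranks, key=ranks.get, reverse=True)
print(ranking)
```

In this toy graph the word that is reachable from the seed term under both context types collects the most rank, which is the intuition behind using graph centrality to surface candidate code words. A real system would replace the placeholder dictionaries with neighbours queried from trained embeddings.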
Contents
1 Introduction
2 Related Work
3 Hate Speech and Context
  3.1 Neural Embeddings and Context
  3.2 Embedding Types
4 Methodology
  4.1 Data Collection and Embedding Creation
  4.2 Contextual Graph Expansion
  4.3 Contextual Codeword Search
5 Experiment Results
  5.1 Training Data
  5.2 Experimental Setup
  5.3 Annotation Experiment
6 Conclusion and Future Work
(The full text is not authorized for public access.)