

Author (English): Chu, Ning Min
Thesis title (English): Multi-document Context Relationship Analysis - A Case Study of Product Related Documents
Advisor (English): Hou, Jiang Liang
Keywords (English): Document Context Relationship; Classification; Reading Recommendation
When a reader searches the Internet for documents by keyword, the ranking of the retrieved documents is determined by their correlation with the specified keywords and by their click rates. The context relationship among the related documents plays no part in the ranking. As a result, readers spend more time working out the document contents, or have difficulty understanding the documents at all. To address these problems, this research analyzes a large number of documents and generalizes the relationship between document characteristics and document categories. Based on the analysis results, it develops a model for analyzing the context relationship of multiple documents. With the proposed model, the characteristics and categories of documents are identified using determinant vectors. The documents are then sorted, and their context relationship is displayed visually for reading. Overall, the research helps readers obtain a reasonable, visualized ranking of documents and read them in an appropriate sequence.

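Abstract
Table of Contents
List of Figures
List of Tables
Chapter 1. Research Background
1.1 Research Motivation and Objectives
1.2 Research Procedure
1.3 Research Positioning
Chapter 2. Literature Review
2.1 Document Characteristic Extraction
2.1.1 Extracting Document Characteristics by Qualitative Properties
2.1.2 Extracting Document Characteristics by Quantitative Properties
2.1.3 Extracting Document Characteristics by Qualitative and Quantitative Properties
2.2 Document Classification
2.2.1 Determining Document Categories with Supervised Methods
2.2.2 Determining Document Categories with Semi-Supervised Methods
2.2.3 Determining Document Categories with Unsupervised Methods
2.3 Document Ranking
2.3.1 Document Ranking Models Based on Search Term Characteristics
2.3.2 Document Ranking Models Based on Document Characteristics
2.3.3 Document Ranking Models Based on Information Seeker Characteristics
2.4 Summary
Chapter 3. A Content-Based Model for Multi-Document Context Relationship Analysis
3.1 Analysis of Existing Document Contents
3.1.1 Clarifying Document Feature Points and Document Categories
3.1.2 Relationship Analysis between Feature Points and Document Categories
3.2 Document Characteristic Extraction
3.3 Document Category Determination
3.4 Document Context Ranking
3.5 Summary
Chapter 4. System Planning and Architecture
4.1 Core System Architecture
4.2 System Functional Architecture
4.3 Data Model Definition
4.4 System Function Operation Flow
4.4.1 System Function Operating Procedure
4.4.2 System Data Transfer Flow
4.5 System Development Tools
Chapter 5. System Performance Verification and Analysis
5.1 Overview of System Operation
5.2 Description of the Verification Method
5.3 Analysis of Verification Results
Chapter 6. Conclusions and Future Work
6.1 Thesis Summary
6.2 Future Work
References
Appendix A. Preliminary Work for the Analysis of Existing Document Contents
Appendix B. Description of System Functions
Appendix C. Performance Verification Results of the Model and System for Each Cycle of the Second Stage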

