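An Multimodal Active Learning Approach in the Identification of Samples-to-Labels toward Developing Automatic Oral Presentation Assessment in the Pre-service Principals Certification Program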


Author (Chinese):	孫泓敬
Author (English):	Sun, Hung Ching
Thesis title (Chinese):	基於多模態主動式學習法進行樣本與標記之間的關係分析於候用校長評鑑之自動化評分系統建置
Thesis title (English):	An Multimodal Active Learning Approach in the Identification of Samples-to-Labels toward Developing Automatic Oral Presentation Assessment in the Pre-service Principals Certification Program
Advisor (Chinese):	李祈均
Advisor (English):	Lee, Chi Chun
Oral defense committee (Chinese):	曹昱、謝名娟、蔡明學
Degree:	Master's
University:	National Tsing Hua University
Department:	Department of Electrical Engineering
Student ID:	103061558
Year of publication (ROC):	105 (2016)
Graduation academic year:	105
Language:	Chinese
Number of pages:	54
Keywords (Chinese):	主動式學習、資料選取、資料探勘、機器學習、未標記資料
Keywords (English):	active learning, data selection, data mining, machine learning, unlabeled data

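Active learning is receiving growing attention in machine learning because it can optimize the training process and improve the results [17]. The core idea is that a learning algorithm can choose the most decisive data points during learning instead of using all of the data. Selecting the data points that are most representative for the model then benefits learning and yields better results. In other words, by observing the known labeled data, the learner actively selects unlabeled data, and thereby reaches higher accuracy with less data than supervised learning that uses all of the data or a random sample.

For any supervised learning system, performing well requires a large amount of labeled training data. These labeled data, however, may include instances that harm the learning system and thereby reduce learning effectiveness and accuracy. In this thesis, we apply the concept of active learning to the training process in order to distinguish data that help the system from data that hurt it, and we test the practical effect of active learning during training [12].

In the future, we hope to realize a system framework that quantifies a set of parameters for deciding whether the learning system considers a given data instance worth labeling, and to study when active learning should stop in order to obtain the best result, so as to refine the database and reduce the cost of labeling data.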

Active learning is becoming increasingly important in machine learning because it can optimize the learning process [17]. The main idea is that if a learning algorithm can choose the decisive data points from which it learns, instead of using all of them, it will perform better with less training. In other words, we actively select unlabeled data instances by observing the known labeled instances, aiming for higher accuracy with fewer data instances than training the supervised learning system on the whole dataset or on randomly chosen data [12].
For any supervised learning system to perform well, it has to be trained on many labeled instances. Among these labeled instances, however, there may be some worthless ones that adversely affect the learning system and raise the training cost. We therefore use the active learning concept during training to discriminate whether a data instance is good for the learning system or not. In this work, we examine whether using active learning to select the training data works.
In the future, we hope to realize a framework that can quantify a parameter to determine which data instances deserve to be labeled by observing the existing dataset. This would refine the dataset and improve system quality.
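
The selection step described in the abstract can be illustrated with a minimal sketch of uncertainty sampling, one of the strategies listed in the thesis outline. This is an illustration under stated assumptions, not the thesis's implementation: it assumes a binary classifier that outputs a probability for each unlabeled sample, whereas the thesis trains SVR on continuous scores. The function names `binary_entropy` and `select_most_uncertain` and the example probabilities are hypothetical.

```python
import math


def binary_entropy(p):
    """Shannon entropy in bits of a Bernoulli prediction with P(positive) = p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0  # a fully confident prediction carries no uncertainty
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def select_most_uncertain(pool_probs, k):
    """Return the indices of the k unlabeled samples with the highest entropy.

    pool_probs holds the model's predicted P(positive) for each sample in the
    unlabeled pool (an assumed interface, not taken from the thesis).
    """
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: binary_entropy(pool_probs[i]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical model outputs for five unlabeled samples.
pool_probs = [0.95, 0.52, 0.10, 0.40, 0.99]
# Samples whose predictions lie closest to 0.5 are queried for labels first.
query_indices = select_most_uncertain(pool_probs, k=2)
print(query_indices)
```

In a full active learning loop, the queried samples would be labeled, added to the training set, and the model retrained before the next round of selection; when to stop that loop is one of the open questions the abstract raises.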

Chapter 1 Introduction
Chapter 2 Data Collection
2.1 Principal Oral Presentation Pre-service Training Program Database
2.2 Database Labeling and Rank Normalization
Chapter 3 Experimental Methodology
3.1 Dense Unit-level Feature Extraction
3.1.1 Dense Unit-level Acoustic Feature Extraction
3.1.2 Dense Unit-level Visual Feature Extraction
3.2 Session-level Feature Encoding
3.2.1 K-means Bag-of-Words Encoding
3.2.2 Fisher-vector Encoding
3.3 Active Learning
3.3.1 Sampling by Uncertainty
3.3.2 Sampling by Uncertainty and Density
3.4 Training SVR on Continuous Scores
Chapter 4 Experimental Design
4.1 Experiment 1: Analysis Based on the Best Correlation Results
4.1.1 Experimental Design
4.1.2 Experimental Results
4.1.3 Experimental Analysis
4.2 Experiment 2: Comparison of the Baseline and the Best Results
4.2.1 Experimental Design
4.2.2 Experimental Results
4.2.3 Experimental Analysis
4.3 Experiment 3: Analysis of the Effect of Active Learning on Individual Features
Chapter 5 Conclusion
References

[1] L. Baraldi, F. Paci, G. Serra, L. Benini, and R. Cucchiara. Gesture recognition in ego-centric videos using dense trajectories and hand segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pages 688-693, 2014.
[2] A. L. Berger, V. J. D. Pietra, and S. A. D. Pietra. A maximum entropy approach to natural language processing. Computational linguistics, 22(1):39-71, 1996.
[3] Shan-Wen Hsiao, Hung-Ching Sun, Ming-Chuan Hsieh, Ming-Hsueh Tsai, Hsin-Chih Lin, Chi-Chun Lee: A multimodal approach for automatic assessment of school principals' oral presentation during pre-service training program. INTERSPEECH 2015: 2529-2533.
[4] F. Eyben, M. Wöllmer, and B. Schuller. Opensmile: the munich versatile and fast open-source audio feature extractor. In Proceedings of the 18th ACM international conference on Multimedia, pages 1459-1462. ACM, 2010.
[5] P. S. Keung. Continuing professional development of principals in hong kong. Frontiers of Education in China, 2(4):605-619, 2007.
[6] H. D. Kim, C. Zhai, and J. Han. Aggregation of multiple judgments for evaluating ordered lists. In Advances in information retrieval, pages 166-178. Springer, 2010.
[7] C.-C. Lee, E. Mower, C. Busso, S. Lee, and S. Narayanan. Emotion recognition using a hierarchical binary decision tree approach. Speech Communication, 53(9):1162-1171, 2011.
[8] C. M. Lee and S. S. Narayanan. Toward detecting emotions in spoken dialogs. Speech and Audio Processing, IEEE Transactions on, 13(2):293-303, 2005.
[9] H. Gunes, M. Piccardi, and M. Pantic, From the lab to the real world: Affect recognition using multiple cues and modalities. InTech Education and Publishing, 2008.
[10] I. Muslea, S. Minton, and C. A. Knoblock. Selective sampling with redundant views. In AAAI/IAAI, pages 621-626, 2000.
[11] S. Narayanan and P. G. Georgiou. Behavioral signal processing: Deriving human behavioral informatics from speech and language. Proceedings of the IEEE, 101(5):1203-1233, 2013.
[12] M. Prince. Does active learning work? a review of the research. Journal of engineering education, 93(3):223-231, 2004.
[13] D. S. Cheng, H. Salamin, P. Salvagnini, M. Cristani, A. Vinciarelli, and V. Murino, “Predicting online lecture ratings based on gesturing and vocal behavior,” Journal on Multimodal User Interfaces, vol. 8, no. 2, pp. 151–160, 2014.
[14] P. Salvagnini, H. Salamin, M. Cristani, A. Vinciarelli, and V. Murino. Learning how to teach from "videolectures": automatic prediction of lecture ratings based on teacher's nonverbal behavior. In Cognitive Infocommunications (CogInfoCom), 2012 IEEE 3rd International Conference on, pages 415-419. IEEE, 2012.
[15] G. Schohn and D. Cohn. Less is more: Active learning with support vector machines. In ICML, pages 839-846. Citeseer, 2000.
[16] B. Schuller, S. Steidl, A. Batliner, F. Burkhardt, L. Devillers, C. Müller, and S. Narayanan. Paralinguistics in speech and language-state-of-the-art and the challenge. Computer Speech & Language, 27(1):4-39, 2013.
[17] B. Settles. Active learning literature survey. University of Wisconsin, Madison, 52(55-66):11, 2010.
[18] A. Tamrakar, S. Ali, Q. Yu, J. Liu, O. Javed, A. Divakaran, H. Cheng, and H. Sawhney. Evaluation of low-level features and their combinations for complex event detection in open source videos. In Computer Vision and Pattern Recognition (CVPR), 2012 IEEE Conference on, pages 3681-3688. IEEE, 2012.
[19] C. Busso, Z. Deng, S. Yildirim, M. Bulut, C. M. Lee, A. Kazemzadeh, S. Lee, U. Neumann, and S. Narayanan, “Analysis of emotion recognition using facial expressions, speech and multimodal information,” in Proceedings of the 6th international conference on Multimodal interfaces. ACM, 2004, pp. 205–211.
[20] E. Sariyanidi, H. Gunes, and A. Cavallaro, “Automatic analysis of facial affect: A survey of registration, representation, and recognition,” Pattern Analysis and Machine Intelligence, IEEE Transactions on, vol. 37, no. 6, pp. 1113–1133, 2015.
[21] H. Wang, A. Kläser, C. Schmid, and C.-L. Liu. Action recognition by dense trajectories. In Computer Vision and Pattern Recognition (CVPR), 2011 IEEE Conference on, pages 3169-3176. IEEE, 2011.
[22] J. Zhu, H. Wang, T. Yao, and B. K. Tsou. Active learning with sampling by uncertainty and density for word sense disambiguation and text classification. In Proceedings of the 22nd International Conference on Computational Linguistics-Volume 1, pages 1137-1144. Association for Computational Linguistics, 2008.
[23] X. Zhu. Semi-supervised learning literature survey. 2005.
[24] X. Zhu and Z. Ghahramani. Learning from labeled and unlabeled data with label propagation. Technical report, Citeseer, 2002.
[25] F. Perronnin, J. Sánchez, and T. Mensink, “Improving the fisher kernel for large-scale image classification,” in Computer Vision–ECCV 2010. Springer, 2010, pp. 143–156.
[26] C.-C. Chang and C.-J. Lin, “Libsvm: A library for support vector machines,” ACM Transactions on Intelligent Systems and Technology (TIST), vol. 2, no. 3, p. 27, 2011.
[27] K. Chatfield, V. S. Lempitsky, A. Vedaldi, and A. Zisserman, “The devil is in the details: an evaluation of recent feature encoding methods.” in BMVC, vol. 2, no. 4, 2011, pp. 8–19.
[28] L. Cosmides, “Invariances in the acoustic expression of emotion during speech.” Journal of Experimental Psychology: Human Perception and Performance, vol. 9, no. 6, p. 864, 1983.
[29] M. Grimm, K. Kroschel, E. Mower, and S. Narayanan, “Primitives-based evaluation and estimation of emotions in speech,” Speech Communication, vol. 49, no. 10, pp. 787–800, 2007.
[30] D. Sztahó, G. Kiss, and K. Vicsi, “Estimating the severity of Parkinson’s disease from speech using linear regression and database partitioning,” in Sixteenth Annual Conference of the International Speech Communication Association, 2015.
[31] J. Kim, M. Nasir, R. Gupta, M. V. Segbroeck, D. Bone, M. Black, Z. I. Skordilis, Z. Yang, P. Georgiou, and S. Narayanan, “Automatic estimation of Parkinson’s disease severity from diverse speech tasks,” in Sixteenth Annual Conference of the International Speech Communication Association, 2015.
[32] T. F. Quatieri and N. Malyska, “Vocal-source biomarkers for depression: A link to psychomotor activity.” in Interspeech, 2012, pp. 1059–1062.
[33] Kapur, Jagat Narain. Maximum-entropy models in science and engineering. John Wiley & Sons, 1989.
[34] Jaynes, Edwin Thompson. "Clearing up mysteries—the original goal." Maximum Entropy and Bayesian Methods. Springer Netherlands, 1989. 1-27.
[35] Rousseeuw, Peter J., and Annick M. Leroy. Robust regression and outlier detection. Vol. 589. John Wiley & Sons, 2005.
[36] S. Watson, T. Miller, L. Johnston, and V. Rutledge, “Professional development school graduate performance: Perceptions of school principals,” The Teacher Educator, vol. 42, no. 2, pp. 77–86, 2006.
[37] D. L. Keith, “Principal desirability for professional development,” Academy of Educational Leadership Journal, vol. 15, no. 2, p. 95, 2011.
[38] M. El Ayadi, M. S. Kamel, and F. Karray, “Survey on speech emotion recognition: Features, classification schemes, and databases,” Pattern Recognition, vol. 44, no. 3, pp. 572–587, 2011.
[39] M. Karg, A.-A. Samadani, R. Gorbet, K. Kühnlenz, J. Hoey, and D. Kulić, “Body movements for affective expression: a survey of automatic recognition and generation,” Affective Computing, IEEE Transactions on, vol. 4, no. 4, pp. 341–359, 2013.
[40] E. Crane and M. Gross, “Motion capture and emotion: Affect detection in whole body movement,” in Affective computing and intelligent interaction. Springer, 2007, pp. 95–101.
