|
參考文獻 [1] S. Narayanan and P. G. Georgiou, "Behavioral signal processing: Deriving human behavioral informatics from speech and language," Proceedings of the IEEE, vol. 101, no. 5, pp. 1203-1233, 2013. [2] A. Tsanas, M. A. Little, P. E. McSharry, J. Spielman, and L. O. Ramig, "Novel speech signal processing algorithms for high-accuracy classification of Parkinson's disease," IEEE Transactions on Biomedical Engineering, vol. 59, no. 5, pp. 1264-1271, 2012. [3] X. Zhu, H.-I. Suk, and D. Shen, "A novel matrix-similarity based loss function for joint regression and classification in AD diagnosis," NeuroImage, vol. 100, pp. 91-105, 2014. [4] J. Gibson et al., "A Deep Learning Approach to Modeling Empathy in Addiction Counseling," Commitment, vol. 111, p. 21, 2016. [5] B. Xiao, C. Huang, Z. E. Imel, D. C. Atkins, P. Georgiou, and S. S. Narayanan, "A technology prototype system for rating therapist empathy from audio recordings in addiction counseling," PeerJ Computer Science, vol. 2, p. e59, 2016. [6] B. Xiao, Z. E. Imel, P. G. Georgiou, D. C. Atkins, and S. S. Narayanan, "" Rate My Therapist": Automated Detection of Empathy in Drug and Alcohol Counseling via Speech and Language Processing," PloS one, vol. 10, no. 12, p. e0143055, 2015. [7] P. M. Faye et al., "Newborn infant pain assessment using heart rate variability analysis," The Clinical journal of pain, vol. 26, no. 9, pp. 777-782, 2010. [8] F.-S. Tsai, Y.-L. Hsu, W.-C. Chen, Y.-M. Weng, C.-J. Ng, and C.-C. Lee, "Toward Development and Evaluation of Pain Level-Rating Scale for Emergency Triage based on Vocal Characteristics and Facial Expressions," Interspeech 2016, pp. 92-96, 2016. [9] M. P. Black et al., "Toward automating a human behavioral coding system for married couples’ interactions using speech acoustic features," Speech Communication, vol. 55, no. 1, pp. 1-21, 2013. [10] M. F. Jung, "Coupling Interactions and Performance: Predicting Team Performance from Thin Slices of Conflict," ACM Transactions on Computer-Human Interaction (TOCHI), vol. 23, no. 3, p. 18, 2016. [11] C.-C. Lee et al., "Computing vocal entrainment: A signal-derived PCA-based quantification scheme with application to affect analysis in married couple interactions," Computer Speech & Language, vol. 28, no. 2, pp. 518-539, 2014. [12] D. Bone, M. S. Goodwin, M. P. Black, C.-C. Lee, K. Audhkhasi, and S. Narayanan, "Applying machine learning to facilitate autism diagnostics: pitfalls and promises," Journal of autism and developmental disorders, vol. 45, no. 5, pp. 1121-1136, 2015. [13] D. Bone et al., "The psychologist as an interlocutor in autism spectrum disorder assessment: Insights from a study of spontaneous prosody," Journal of Speech, Language, and Hearing Research, vol. 57, no. 4, pp. 1162-1177, 2014. [14] D. Wall, J. Kosmicki, T. Deluca, E. Harstad, and V. Fusaro, "Use of machine learning to shorten observation-based screening and diagnosis of autism," Translational psychiatry, vol. 2, no. 4, p. e100, 2012. [15] A. Metallinou, Z. Yang, C.-c. Lee, C. Busso, S. Carnicke, and S. Narayanan, "The USC CreativeIT database of multimodal dyadic interactions: from speech and full body motion capture to continuous emotional annotations," Language resources and evaluation, vol. 50, no. 3, pp. 497-521, 2016. [16] Z. Yang, A. Metallinou, E. Erzin, and S. Narayanan, "Analysis of interaction attitudes using data-driven hand gesture phrases," in Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on, 2014, pp. 699-703: IEEE. [17] A. F. A. Khan, O. Mourad, A. M. K. B. Mannan, H. B. A. M. Dahan, and M. A. Abushariah, "Automatic Arabic pronunciation scoring for computer aided language learning," in Communications, Signal Processing, and their Applications (ICCSPA), 2013 1st International Conference on, 2013, pp. 1-6: IEEE. [18] S. E. Petersen and M. Ostendorf, "A machine learning approach to reading level assessment," Computer speech & language, vol. 23, no. 1, pp. 89-106, 2009. [19] S. M. Witt and S. J. Young, "Phone-level pronunciation scoring and assessment for interactive language learning," Speech communication, vol. 30, no. 2, pp. 95-108, 2000. [20] S.-W. Hsiao, H.-C. Sun, M.-C. Hsieh, M.-H. Tsai, H.-C. Lin, and C.-C. Lee, "A Multimodal Approach for Automatic Assessment of School Principals' Oral Presentation During Pre-Service Training Program," in Sixteenth Annual Conference of the International Speech Communication Association, 2015. [21] W.-Y. Huang, S.-W. Hsiao, H.-C. Sun, M.-C. Hsieh, M.-H. Tsai, and C.-C. Lee, "Enhancement of Automatic Oral Presentation Assessment System Using Latent N-Grams Word Representation and Part-of-Speech Information," Interspeech 2016, pp. 1432-1436, 2016. [22] Y. Cheong Cheng, Y.-Q. Mao, W. Yan, and L. Catherine Ehrich, "Principal preparation and training: a look at China and its issues," International Journal of Educational Management, vol. 23, no. 1, pp. 51-64, 2009. [23] D. L. Keith, "Principal desirabilitiy for professional development," Academy of Educational Leadership Journal, vol. 15, no. 2, p. 95, 2011. [24] P. S. Keung, "Continuing professional development of principals in Hong Kong," Frontiers of Education in China, vol. 2, no. 4, pp. 605-619, 2007. [25] P. S. Salazar, "The professional development needs of rural high school principals: A seven-state study," The Rural Educator, vol. 28, no. 3, 2007. [26] S. Watson, T. Miller, L. Johnston, and V. Rutledge, "Professional development school graduate performance: Perceptions of school principals," The Teacher Educator, vol. 42, no. 2, pp. 77-86, 2006. [27] B. R. Baucom and E. Iturralde, "A behaviorist manifesto for the 21 st century," in Signal & Information Processing Association Annual Summit and Conference (APSIPA ASC), 2012 Asia-Pacific, 2012, pp. 1-4: IEEE. [28] G. Margolin et al., "The nuts and bolts of behavioral observation of marital and family interaction," Clinical child and family psychology review, vol. 1, no. 4, pp. 195-213, 1998. [29] J. Burstein, J. Tetreault, and N. Madnani, "The e-rater automated essay scoring system," Handbook of automated essay evaluation: Current applications and new directions, pp. 55-67, 2013. [30] D. S. McNamara, S. A. Crossley, R. D. Roscoe, L. K. Allen, and J. Dai, "A hierarchical classification approach to automated essay scoring," Assessing Writing, vol. 23, pp. 35-59, 2015. [31] L. Streeter, J. Bernstein, P. Foltz, and D. DeLand, "Pearson’s automated scoring of writing, speaking, and mathematics," ed: Pearson White Paper. Iowa City, IA: Pearson. Retrieved from http://www.pearsonassessments.com/hai/images/tmrs/PearsonsAutomatedScoringofWritingSpeakingandMathematics.pdf, 2011. [32] M. Chatterjee, S. Park, L.-P. Morency, and S. Scherer, "Combining two perspectives on classifying multimodal data for recognizing speaker traits," in Proceedings of the 2015 ACM on International Conference on Multimodal Interaction, 2015, pp. 7-14: ACM. [33] D. Higgins, X. Xi, K. Zechner, and D. Williamson, "A three-stage approach to the automated scoring of spontaneous spoken responses," Computer Speech & Language, vol. 25, no. 2, pp. 282-306, 2011. [34] I. Naim, M. I. Tanveer, D. Gildea, and M. E. Hoque, "Automated prediction and analysis of job interview performance: The role of what you say and how you say it," in Automatic Face and Gesture Recognition (FG), 2015 11th IEEE International Conference and Workshops on, 2015, vol. 1, pp. 1-6: IEEE. [35] I. Naim, M. I. Tanveer, D. Gildea, and E. Hoque, "Automated analysis and prediction of job interview performance," IEEE Transactions on Affective Computing, 2016. [36] L. S. Nguyen, D. Frauendorfer, M. S. Mast, and D. Gatica-Perez, "Hire me: Computational Inference of Hirability in Employment Interviews Based on Nonverbal Behavior," IEEE Transactions on Multimedia, vol. 16, no. 4, pp. 1018-1031, 2014. [37] L. S. Nguyen and D. Gatica-Perez, "I would hire you in a minute: Thin slices of nonverbal behavior in job interviews," in Proceedings of the 2015 ACM on International Conference on Multimodal Interaction, 2015, pp. 51-58: ACM. [38] D. A. Silverstein and T. Zhang, "System and method of providing evaluation feedback to a speaker while giving a real-time oral presentation," ed: Google Patents, 2006. [39] O. Kang, "Impact of rater characteristics and prosodic features of speaker accentedness on ratings of international teaching assistants' oral performance," Language Assessment Quarterly, vol. 9, no. 3, pp. 249-269, 2012. [40] L. Chen, C. W. Leong, G. Feng, and C. M. Lee, "Using multimodal cues to analyze MLA'14 oral presentation quality corpus: Presentation delivery and slides quality," in Proceedings of the 2014 ACM workshop on Multimodal Learning Analytics Workshop and Grand Challenge, 2014, pp. 45-52: ACM. [41] M. Gentilucci and M. C. Corballis, "From manual gesture to speech: a gradual transition," Neuroscience & Biobehavioral Reviews, vol. 30, no. 7, pp. 949-960, 2006. [42] D. McNeill, How language began: Gesture and speech in human evolution. Cambridge University Press, 2012. [43] S. Scherer, G. Layher, J. Kane, H. Neumann, and N. Campbell, "An audiovisual political speech analysis incorporating eye-tracking and perception data," in LREC, 2012, pp. 1114-1120. [44] A. Rosenberg and J. Hirschberg, "Acoustic/prosodic and lexical correlates of charismatic speech," in INTERSPEECH, 2005, pp. 513-516. [45] M. Barthet, G. Fazekas, and M. Sandler, "Multidisciplinary perspectives on music emotion recognition: Implications for content and context-based models," Proc. CMMR, pp. 492-507, 2012. [46] M. P. Black, P. G. Georgiou, A. Katsamanis, B. R. Baucom, and S. Narayanan, "“You made me do it”: Classification of Blame in Married Couples' Interactions by Fusing Automatically Derived Speech and Language Information," in Twelfth Annual Conference of the International Speech Communication Association, 2011. [47] A. Kazemzadeh, S. Lee, and S. Narayanan, "Fuzzy logic models for the meaning of emotion words," IEEE Computational intelligence magazine, vol. 8, no. 2, pp. 34-49, 2013. [48] H. D. Kim, C. Zhai, and J. Han, "Aggregation of multiple judgments for evaluating ordered lists," in European Conference on Information Retrieval, 2010, pp. 166-178: Springer. [49] J. San Pedro and S. Siersdorfer, "Ranking and classifying attractiveness of photos in folksonomies," in Proceedings of the 18th international conference on World wide web, 2009, pp. 771-780: ACM. [50] J. Tang, H.-f. Leung, Q. Luo, D. Chen, and J. Gong, "Towards Ontology Learning from Folksonomies," in IJCAI, 2009, vol. 9, pp. 2089-2094. [51] T. Mikolov, K. Chen, G. Corrado, and J. Dean, "Efficient estimation of word representations in vector space," arXiv preprint arXiv:1301.3781, 2013. [52] Y. Bengio, R. Ducharme, P. Vincent, and C. Jauvin, "A neural probabilistic language model," Journal of machine learning research, vol. 3, no. Feb, pp. 1137-1155, 2003. [53] T. Mikolov, I. Sutskever, K. Chen, G. S. Corrado, and J. Dean, "Distributed representations of words and phrases and their compositionality," in Advances in neural information processing systems, 2013, pp. 3111-3119. [54] T. Mikolov, W.-t. Yih, and G. Zweig, "Linguistic Regularities in Continuous Space Word Representations," in Hlt-naacl, 2013, vol. 13, pp. 746-751. [55] F. Morin and Y. Bengio, "Hierarchical Probabilistic Neural Network Language Model," in Aistats, 2005, vol. 5, pp. 246-252: Citeseer. [56] R. Johnson and T. Zhang, "Effective use of word order for text categorization with convolutional neural networks," arXiv preprint arXiv:1412.1058, 2014. [57] R. Johnson and T. Zhang, "Semi-supervised convolutional neural networks for text categorization via region embedding," in Advances in neural information processing systems, 2015, pp. 919-927. [58] Y. Kim, "Convolutional neural networks for sentence classification," arXiv preprint arXiv:1408.5882, 2014. [59] Q. V. Le and T. Mikolov, "Distributed Representations of Sentences and Documents," in ICML, 2014, vol. 14, pp. 1188-1196. [60] Z. Zhang, P. Luo, C. C. Loy, and X. Tang, "Facial landmark detection by deep multi-task learning," in European Conference on Computer Vision, 2014, pp. 94-108: Springer. [61] B. Jie, D. Zhang, B. Cheng, and D. Shen, "Manifold regularized multitask feature learning for multimodality disease classification," Human brain mapping, vol. 36, no. 2, pp. 489-507, 2015. [62] Y. Luo, D. Tao, B. Geng, C. Xu, and S. J. Maybank, "Manifold regularized multitask learning for semi-supervised multilabel image classification," IEEE Transactions on Image Processing, vol. 22, no. 2, pp. 523-536, 2013. [63] M.-T. Luong, Q. V. Le, I. Sutskever, O. Vinyals, and L. Kaiser, "Multi-task sequence to sequence learning," arXiv preprint arXiv:1511.06114, 2015. [64] A. Argyriou, T. Evgeniou, and M. Pontil, "Convex multi-task feature learning," Machine Learning, vol. 73, no. 3, pp. 243-272, 2008. [65] J. Liu, S. Ji, and J. Ye, "Multi-task feature learning via efficient l 2, 1-norm minimization," in Proceedings of the twenty-fifth conference on uncertainty in artificial intelligence, 2009, pp. 339-348: AUAI Press. [66] G. Obozinski, B. Taskar, and M. Jordan, "Multi-task feature selection," Statistics Department, UC Berkeley, Tech. Rep, vol. 2, 2006. [67] I. Bíró, J. Szabó, and A. A. Benczúr, "Latent dirichlet allocation in web spam filtering," in Proceedings of the 4th international workshop on Adversarial information retrieval on the web, 2008, pp. 29-32: ACM. [68] M. Lienou, H. Maitre, and M. Datcu, "Semantic annotation of satellite images using latent Dirichlet allocation," IEEE Geoscience and Remote Sensing Letters, vol. 7, no. 1, pp. 28-32, 2010. [69] J. D. Mcauliffe and D. M. Blei, "Supervised topic models," in Advances in neural information processing systems, 2008, pp. 121-128. [70] R. Das, M. Zaheer, and C. Dyer, "Gaussian LDA for Topic Models with Word Embeddings," in ACL (1), 2015, pp. 795-804. [71] D. Q. Nguyen, R. Billingsley, L. Du, and M. Johnson, "Improving topic models with latent feature word representations," Transactions of the Association for Computational Linguistics, vol. 3, pp. 299-313, 2015. [72] C. E. Moody, "Mixing Dirichlet Topic Models and Word Embeddings to Make lda2vec," arXiv preprint arXiv:1605.02019, 2016. [73] L. Niu, X. Dai, J. Zhang, and J. Chen, "Topic2Vec: learning distributed representations of topics," in Asian Language Processing (IALP), 2015 International Conference on, 2015, pp. 193-196: IEEE. [74] Y. Liu, Z. Liu, T.-S. Chua, and M. Sun, "Topical Word Embeddings," in AAAI, 2015, pp. 2418-2424. [75] J. W. Pennebaker, C. K. Chung, M. Ireland, A. Gonzales, and R. J. Booth, "The Development and Psychometric Properties of LIWC2007." [76] Y. R. Tausczik and J. W. Pennebaker, "The psychological meaning of words: LIWC and computerized text analysis methods," Journal of language and social psychology, vol. 29, no. 1, pp. 24-54, 2010. [77] M. del Pilar Salas-Zárate, E. López-López, R. Valencia-García, N. Aussenac-Gilles, Á. Almela, and G. Alor-Hernández, "A study on LIWC categories for opinion mining in Spanish reviews," Journal of Information Science, vol. 40, no. 6, pp. 749-760, 2014. [78] C.-L. Huang et al., "The development of the Chinese linguistic inquiry and word count dictionary," Chinese Journal of Psychology, vol. 54, no. 2, pp. 185-201, 2012. [79] H. Jegou, F. Perronnin, M. Douze, J. Sánchez, P. Perez, and C. Schmid, "Aggregating local image descriptors into compact codes," IEEE transactions on pattern analysis and machine intelligence, vol. 34, no. 9, pp. 1704-1716, 2012. [80] F. Eyben, M. Wöllmer, and B. Schuller, "Opensmile: the munich versatile and fast open-source audio feature extractor," in Proceedings of the 18th ACM international conference on Multimedia, 2010, pp. 1459-1462: ACM. [81] H. Wang and C. Schmid, "Action recognition with improved trajectories," in Proceedings of the IEEE International Conference on Computer Vision, 2013, pp. 3551-3558. [82] F. Perronnin, J. Sánchez, and T. Mensink, "Improving the fisher kernel for large-scale image classification," Computer Vision–ECCV 2010, pp. 143-156, 2010. [83] Y. Sun, Y. Chen, X. Wang, and X. Tang, "Deep learning face representation by joint identification-verification," in Advances in neural information processing systems, 2014, pp. 1988-1996. [84] Y. Kim, H. Lee, and E. M. Provost, "Deep learning for robust feature generation in audiovisual emotion recognition," in Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on, 2013, pp. 3687-3691: IEEE. [85] H. Wang, H. Huang, and C. Ding, "Multi-label feature transform for image classifications," Computer Vision–ECCV 2010, pp. 793-806, 2010. [86] P. Bojanowski, E. Grave, A. Joulin, and T. Mikolov, "Enriching word vectors with subword information," arXiv preprint arXiv:1607.04606, 2016. [87] Y. Pan, T. Mei, T. Yao, H. Li, and Y. Rui, "Jointly modeling embedding and translation to bridge video and language," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 4594-4602. [88] K. Cho et al., "Learning phrase representations using RNN encoder-decoder for statistical machine translation," arXiv preprint arXiv:1406.1078, 2014. [89] S. Arora, Y. Liang, and T. Ma, "A simple but tough-to-beat baseline for sentence embeddings," 2016. [90] R. Kiros et al., "Skip-thought vectors," in Advances in neural information processing systems, 2015, pp. 3294-3302. |