|
[1] R. Cowie, and R. R. Cornelius, “Describing the emotional states that are expressed in speech,” Speech Communication, vol. 40, no. 1-2, pp. 5-32, 2003. [2] L. F. Barrett, “Discrete Emotions or Dimensions? The Role of Valence Focus and Arousal Focus,” Cognition and Emotion, vol. 12, no 4, pp. 579-599, 1998. [3] B. Schuller, M. Valster, F. Eyben, R. Cowie, and M. Pantic, “AVEC 2012: the continuous audio/visual emotion challenge,” In Proc. of the 14th ACM International Conference on Multimodal interaction, pp. 449-456, 2012. [4] M. Valstar, B. Schuller, K. Smith, T. Almaev, F. Eyben, J. Krajewski, R. Cowie, and M. Pantic, “AVEC 2014: 3D Dimensional Affect and Depression Recognition Challenge,” In Proc. of the 4th International Workshop on Audio/Visual Emotion Challenge, pp. 3-10, 2014. [5] P. Ekman, and W. Friesen, “Emotion in the Human Face,” Prentice Hall, New Jersey, 1975. [6] P. Ekman, and W. Friesen, “Facial Action Coding System: A Technique for the Measurement of Facial Movement,” Consulting Psychologists Press, Palo Alto, 1978. [7] P. Lucey, J. F. Cohn, T. Kanade, J. Saragih, Z. Ambadar, and I. Matthews, “The Extended Cohn-Kanade Dataset (CK+): A complete dataset for action unit and emotion-specified expression,” In Proc. of the 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, pp. 94-101, 2010. [8] A. C. Cruz, B. Bhanu, and N. S. Thakoor, “Vision and Attention Theory Based Sampling for Continuous Facial Emotion Recognition,” IEEE Transactions on Affective Computing, vol.5, no.4, pp.418-431, 2014. [9] J. Nicolle, V. Rapp, K. Bailly, L. Prevost, and M. Chetouani, “Robust Continuous Prediction of Human Emotions Using Multiscale Dynamic Cues,” In Proc. of the 14th ACM International Conference on Multimodal interaction, pp. 501-508, 2012. [10] D. Ozkan, S. Scherer, and L.-P. Morency, “Step-wise emotion recognition using concatenated-HMM,” In Proc. of the 14th ACM International Conference on Multimodal interaction, pp. 477-484, 2012. [11] C. Soladié, H. Salam, C. Pelachaud, N. Stoiber, and R. Séguier, “A Multimodal Fuzzy Inference System Using a Continuous Facial Expression Representation for Emotion Detection,” In Proc. of the 14th ACM International Conference on Multimodal interaction, pp. 493-500, 2012. [12] A. Savran, H. Cao, A. Nenkova, and R. Verma, “Temporal Bayesian Fusion for Affect Sensing: Combining Video, Audio, and Lexical Modalities,” IEEE Transactions on Cybernetics, vol. 45, no. 9, pp. 1927-1941, 2015. [13] M. Kächele, M. Schels, and F. Schwenker, “Inferring Depression and Affect from Application Dependent Meta Knowledge,” In Proc. of the 4th International Workshop on Audio/Visual Emotion Challenge, pp. 41-48, 2014. [14] H. Chen, J. Li, F. Zhang, Y. Li, and H. Wang, “3D model-based continuous emotion recognition,” In Proc. of the 2015 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, pp. 1836-1845, 2015. [15] T. Baltrušaitis, N. Banda, and P. Robinson, “Dimensional affect recognition using Continuous Conditional Random Fields,” In Proc. of the 2013 10th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition (FG), pp. 1-8, 2013. [16] S. Kaltwang, S. Todorovic, and M. Pantic, “Doubly Sparse Relevance Vector Machine for Continuous Facial Behavior Estimation,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 38, no. 9, pp. 1748-1761, 2016. [17] I. J. Goodfellow, D. Erhan, P. L. Carrier, A. Courville, M. Mirza, B. Hamner, W. Cukierski, Y. Tang, D. Thaler, and D.-H. Lee, et al, “Challenges in representation learning: a report on three machine learning contests,” In Proc. of the 2013 ICML Workshop on Representation Learning, 2013. [18] Z. Liu, P. Luo, X. Wang, and X. Tang, “Deep Learning Face Attributes in the Wild,” In Proc. of the 2015 IEEE International Conference on Computer Vision, pp. 3730-3738, 2015. [19] K. Simonyan, and A. Zisserman, “Very deep convolutional networks for large-scale image recognition,” arXiv:1409.1556, 2014. [20] O.M. Parkhi, A. Vedaldi, and A. Zisserman, “Deep Face Recognition,” In Proc. of the 26th British Machine Vision Conference, pp. 41.1-41.12, 2015. [21] Y. Tang, “Deep learning using linear support vector machines,” In Proc. of the 2013 ICML Workshop on Representation Learning, 2013. [22] https://github.com/senecaur/caffe-rta [23] Y. Jia, E. Shelhamer, J. Donahue, S. Karayev, J. Long, R. Girshick, S. Guadarrama, and T. Darrell, “Caffe: Convolutional Architecture for Fast Feature Embedding,” In Proc. of the 14th ACM International Conference on Multimedia, pp. 675-678, 2014. [24] Asthana, S. Zafeiriou, S. Cheng, and M. Pantic, “Incremental Face Alignment in the Wild,” In Proc. of the IEEE Conference on Computer Vision and Pattern Recognition, pp.1859-1866, 2014. [25] F. Eyben, M. Wöllmer, and B. Schuller, “openSMILE – The Munich Versatile and Fast Open-Source Audio Feature Extractor”, In Proc. of the 18th ACM Multimedia, pp. 1459-1462, 2010. [26] G. Mckeown, M. F. Valstar, R. Cowie, M. Pantic, and M. Schroeder, “The SEMAINE database: Annotated multimodal records of emotionally coloured conversations between a person and a limited agent,” IEEE Transactions on Affective Computing, vol. 3, no.1, pp. 5-17, 2012. [27] A. Karpathy, G. Toderici, S. Shetty, T. Leung, R. Sukthankar, and L. Fei-Fei, “Large-Scale Video Classification with Convolutional Neural Networks,” In Proc. of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1725-1732, 2014. [28] S. Chen, and Q. Jin, “Multi-modal Dimensional Emotion Recognition using Recurrent Neural Networks,” In Proc. of the 5th International Workshop on Audio/Visual Emotion Challenge, pp. 49-56, 2015. [29] J. Domke, “Learnin Graphical Model Parameters with Approximate Marginal Inference,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 10, pp. 2454-2467, 2013. [30] P. Fewzee, and F. Karray, “Continuous Emotion Recognition: Another Look at the Regression Problem,” In Proc. of the 2013 Humaine Association Conference on Affective Computing and Intelligent Interaction, pp. 197-202, 2013. [31] G. Andrew, R. Arora, J. Bilmes, and K. Livescu, “Deep canonical correlation analysis,” In Proc. of the 2013 International Conference on Machine Learning, pp. 1247-1255, 2013. [32] M. Nicolaou, S. Zafeiriou, and M. Pantic, “Correlated-Spaces Regression for Learning Continuous Emotion Dimensions,” In Proc. of the 21th ACM International Conference on Multimedia, pp. 773-776, 2013. [33] H. Meng, N. Bianchi-Berthouze, Y. Deng, J. Cheng, and J. P. Cosmas, “Time-Delay Neural Network for Continuous Emotional Dimension Prediction From Facial Expression Sequences,” IEEE Transactions on Cybernetics, vol. 46, no. 4, pp. 916-929, 2016. [34] L. Chao, J. Tao, M. Yang, Y. Li, and Z. Wen, “Multi-scale Temporal Modeling for Dimensional Emotion Recognition in Video,” In Proc. of the 4th International Workshop on Audio/Visual Emotion Challenge, pp. 11-18, 2014. [35] R. Gupta, N. Malandrakis, B. Xiao, T. Guha, M. V. Segbroeck, M. Black, A. Potamianos, and S. Narayanan, “Multimodal Prediction of Affective Dimensions and Depression in Human-Computer Interactions,” In Proc. of the 4th International Workshop on Audio/Visual Emotion Challenge, pp. 33-40, 2014. [36] S. Mariooryad, and C. Busso, “Correcting Time-Continuous Emotional Labels by Modeling the Reaction Lag of Evaluators,” IEEE Transactions on Affective Computing, vol. 6, no. 2, pp. 97-108, 2015. [37] Y. Song, L.-P. Morency, and R. Davis, “Learning a sparse codebook of facial and body microexpressions for emotion recognition,” In Proc. of the 15th ACM on International conference on multimodal interaction, pp. 237-244, 2013.
|