[1] M. Ayadi, M. S. Kamel and F. Karray, "Survey on speech emotion recognition: Features, classification schemes, and databases," Pattern Recognition, vol. 44, no. 3, pp. 572-587, March 2011.
[2] S. Ramakrishnan and I. M. M. E. Emary, "Speech emotion recognition approaches in human computer interaction," Telecommunication Systems, vol. 52, pp. 1467-1478, September 2011.
[3] A. B. Nassif, I. Shahin, I. Attili, M. Azzeh and K. Shaalan, "Speech recognition using deep neural networks: A systematic review," IEEE Access, vol. 7, pp. 19143-19165, 2019.
[4] K. Oh, D. Lee, B. Ko and H. Choi, "A Chatbot for Psychiatric Counseling in Mental Healthcare Service Based on Emotional Dialogue Analysis and Sentence Generation," in IEEE International Conference on Mobile Data Management (MDM), 2017.
[5] P. Valentina and R. M. Hannah, "Alexa, she's not human but… Unveiling the drivers of consumers' trust in voice-based artificial intelligence," Psychology & Marketing, January 2021.
[6] B. G. C. Dellaert, S. B. Shu, T. A. Arentze, T. Baker, K. Diehl, B. Donkers, N. J. Fast, G. Häubl, H. Johnson, U. R. Karmarkar, H. Oppewal, B. H. Schmitt, J. Schroeder, S. A. Spiller and Steff, "Consumer decisions with artificially intelligent voice assistants," Marketing Letters, pp. 335-347, August 2020.
[7] A. B. Ingale and D. S. Chaudhari, "Speech Emotion Recognition," International Journal of Soft Computing and Engineering (IJSCE), pp. 235-238, March 2012.
[8] M. Swain, A. Routray and P. Kabisatpathy, "Databases, features and classifiers for speech emotion recognition: a review," International Journal of Speech Technology, pp. 93-120, January 2018.
[9] L. Muda, M. Begam and I. Elamvazuthi, "Voice Recognition Algorithms using Mel Frequency Cepstral Coefficient (MFCC) and Dynamic Time Warping (DTW) Techniques," Journal of Computing, vol. 2, no. 3, March 2010.
[10] A. Baevski, H. Zhou, A. Mohamed and M. Auli, "wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations," Advances in Neural Information Processing Systems 33, pp. 12449-12460, 2020.
[11] Y.-L. Lin and G. Wei, "Speech emotion recognition based on HMM and SVM," International Conference on Machine Learning and Cybernetics, pp. 4898-4901, 2005.
[12] L. Tarantino, P. N. Garner and A. Lazaridis, "Self-Attention for Speech Emotion Recognition," Interspeech, pp. 2578-2582, 2019.
[13] J. Wang, M. Xue, R. Culhane, E. Diao, J. Ding and V. Tarokh, "Speech emotion recognition with dual-sequence LSTM architecture," IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 6474-6478, 2020.
[14] S. Latif, R. Rana, S. Khalifa, R. Jurdak and J. Epps, "Direct Modelling of Speech Emotion from Raw Speech," Interspeech 2019, pp. 3920-3924, 2019.
[15] J.-L. Li, T.-Y. Huang, C.-M. Chang and C.-C. Lee, "A waveform-feature dual branch acoustic embedding network for emotion recognition," Frontiers in Computer Science, vol. 2, p. 13, 2020.
[16] D. Wu, T. D. Parsons, E. Mower and S. Narayanan, "Speech emotion estimation in 3D space," IEEE International Conference on Multimedia and Expo, pp. 737-742, 2010.
[17] S. Gielen, E. Douglas-Cowie and R. Cowie, "Acoustic correlates of emotion dimensions in view of speech synthesis," Seventh European Conference on Speech Communication and Technology, 2001.
[18] R. Kehrein, "The prosody of authentic emotions," Speech Prosody 2002, International Conference, 2002.
[19] T. W. Smith, The Book of Human Emotions: An Encyclopedia of Feeling from Anger to Wanderlust, Profile Books, 2015.
[20] F. Zhuang, Z. Qi, K. Duan, D. Xi, Y. Zhu and H. Zhu, "A Comprehensive Survey on Transfer Learning," Proceedings of the IEEE, vol. 109, no. 1, pp. 43-76, 2020.
[21] J.-L. Li and C.-C. Lee, "An Enroll-to-Verify Approach for Cross-Task Unseen Emotion Class Recognition," IEEE Transactions on Affective Computing, 2022.
[22] R. Xia and Y. Liu, "A multi-task learning framework for emotion recognition using 2D continuous space," IEEE Transactions on Affective Computing, vol. 8, no. 1, pp. 3-14, 2015.
[23] R. Cai, K. Guo, B. Xu, X. Yang and Z. Zhang, "Meta Multi-task Learning for Speech Emotion Recognition," Interspeech 2020, October 2020.
[24] G. Vrbančič and V. Podgorelec, "Transfer Learning With Adaptive Fine-Tuning," IEEE Access, vol. 8, pp. 196197-196211, 2020.
[25] S. Masoudnia and R. Ebrahimpour, "Mixture of experts: a literature survey," Artificial Intelligence Review, vol. 42, pp. 275-293, 2014.
[26] J. M. Joyce, "Kullback-Leibler Divergence," International Encyclopedia of Statistical Science, pp. 720-722, January 2014.
[27] C. Zhang and Y. Ma, Ensemble Machine Learning: Methods and Applications, Springer, 2012.
[28] C. Busso, M. Bulut, C.-C. Lee, A. Kazemzadeh, E. Mower, S. Kim, J. N. Chang, S. Lee and S. S. Narayanan, "IEMOCAP: Interactive emotional dyadic motion capture database," Language Resources and Evaluation, vol. 42, no. 4, pp. 335-359, December 2008.
[29] R. Lotfian and C. Busso, "Building naturalistic emotionally balanced speech corpus by retrieving emotional speech from existing podcast recordings," IEEE Transactions on Affective Computing, vol. 10, no. 4, pp. 471-483, October-December 2019.