
Detailed Record

Author (Chinese): 李政霖
Author (English): Li, Jeng-Lin
Thesis Title (Chinese): 學習深度特徵及標記空間以強化辨識任務
Thesis Title (English): Learning Deep Feature and Label Space to Enhance Discriminative Recognition Tasks
Advisor (Chinese): 李祈均
Advisor (English): Lee, Chi-Chun
Committee Members (Chinese): 張正尚、林嘉文、曹昱、賴穎暉
Committee Members (English): Chang, Cheng-Shang; Lin, Chia-Wen; Tsao, Yu; Lai, Ying-Hui
Degree: Doctor of Philosophy
University: National Tsing Hua University
Department: Department of Electrical Engineering
Student ID: 105061526
Year of Publication (ROC calendar): 111 (2022)
Graduation Academic Year: 111
Language: English
Number of Pages: 124
Keywords (Chinese): 人本辨識任務、任務分類方法、惡性血液疾病、流式細胞儀、情緒辨識
Keywords (English): Human Centered Recognition; Task Categorization Scheme; Hematologic Malignancy; Flow Cytometry; Emotion Recognition
Statistics:
  • Recommendations: 0
  • Views: 316
  • Rating: *****
  • Downloads: 0
  • Bookmarks: 0
Human-centered applications built on artificial intelligence have advanced rapidly in recent years, and the physiological and psychological aspects of human life drive numerous recognition tasks across application domains. The recognition power that deep learning lends to algorithmic modeling stems mainly from more robust learning of the feature and label spaces. This dissertation therefore proposes a scheme that categorizes tasks by domain knowledge and modeling methodology, so as to pin down the characteristics of each recognition task and develop the corresponding deep learning techniques. Specifically, crossing the physiological and psychological aspects with feature- and label-space learning yields task objectives in four distinct categories, and the deep learning frameworks we develop serve as problem-driven solutions for the corresponding domains. Clinical applications in hematologic malignancy and emotion recognition, together with their feature- and label-space challenges and technical developments, serve as case studies for the four categories to validate this task categorization scheme.
On the physiological side, our tasks target hematologic malignancies in the clinical domain. Flow cytometry is the principal clinical test for examining blood diseases and supporting diagnostic and prognostic decisions; each specimen yields a large data matrix of multi-antibody measurements over tens of thousands of cells drawn from bone marrow. Clinical practice relies on trained specialists who manually plot the data dimensions pair by pair and inspect them repeatedly before interpretation, causing serious problems of workforce shortage, delayed treatment, and inter-reader variability. This dissertation proposes deep learning frameworks covering both the feature space and the label space. The first problem, in the label space, arises when prognostic testing must be applied across disease types, some of which have little data; we propose a knowledge transfer and complementary learning method in which a model pretrained on a data-rich blood disease helps the model for a new disease learn. On a clinical database, transferring knowledge from acute myeloid leukemia to acute lymphoblastic leukemia yields higher accuracy than a model built from the new data alone. The second problem, in the feature space, is that a single flow cytometry specimen contains a massive number of cells: existing methods all require downsampling and cannot efficiently learn representations that express the distributional characteristics of such a large data matrix. We therefore propose a chunking-for-pooling method that partitions the cells into chunks for computation, avoiding the prediction variability and parallel-computation bottlenecks that downsampling causes in prior methods, and we adopt an end-to-end supervised learning framework to optimize the discriminability of the flow cytometry representation. Validated on two databases of newly diagnosed hematologic malignancies, the approach achieves high accuracy on the disease-type classification task and outperforms existing flow cytometry modeling methods.
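As a rough illustration of the chunking-for-pooling idea just described, the minimal PyTorch sketch below partitions a specimen's full cell matrix into fixed-size chunks, embeds every cell, and pools chunk-level embeddings into one sample-level representation trained end to end. The encoder depth, chunk size, mean pooling, and all names are illustrative assumptions, not the dissertation's exact architecture (see Chapter 4).

```python
import torch
import torch.nn as nn

class ChunkPoolClassifier(nn.Module):
    """Hypothetical chunking-for-pooling model for one flow cytometry specimen."""
    def __init__(self, n_markers=10, embed_dim=64, n_classes=4, chunk_size=1024):
        super().__init__()
        self.chunk_size = chunk_size
        # Shared per-cell encoder applied to every event (cell) in a chunk.
        self.cell_encoder = nn.Sequential(
            nn.Linear(n_markers, embed_dim), nn.ReLU(),
            nn.Linear(embed_dim, embed_dim), nn.ReLU(),
        )
        self.classifier = nn.Linear(embed_dim, n_classes)

    def forward(self, sample):
        # sample: (n_cells, n_markers) matrix; chunk it instead of downsampling,
        # so every cell contributes to the sample-level representation.
        chunk_embeddings = []
        for chunk in torch.split(sample, self.chunk_size, dim=0):
            cell_embeddings = self.cell_encoder(chunk)             # (chunk, embed_dim)
            chunk_embeddings.append(cell_embeddings.mean(dim=0))   # pool cells -> chunk vector
        sample_embedding = torch.stack(chunk_embeddings).mean(dim=0)  # pool chunks
        return self.classifier(sample_embedding)

model = ChunkPoolClassifier()
cells = torch.randn(50_000, 10)   # e.g., 50k cells x 10 antibody markers
logits = model(cells)             # trained end to end with a supervised loss
```

Because chunks are processed independently before pooling, the per-chunk computation also parallelizes naturally, which matches the abstract's motivation for avoiding downsampling.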
On the psychological side, emotion recognition is the most representative task. Frequent changes of context create different emotion recognition tasks and thus a label-space challenge: unseen emotion classes leave existing recognition models unable to predict, and current transfer learning requires recollecting data and retraining the model, which is cumbersome in use. This dissertation proposes an enroll-to-verify approach in which cross-task recognition only requires enrolling a few utterances of the new emotion classes, after which predictions are obtained with a simple similarity computation. Experiments on two speech emotion databases show that this prediction scheme reaches the same level of accuracy as recollecting data. Beyond the label-space challenge, feature-space learning for emotion involves individual differences in emotion expression that arise from human cognitive processes; prior studies have ignored the impact of these individual differences on the signal data, creating a bottleneck for accuracy gains. This dissertation therefore proposes a personalized feature-space learning algorithm that incorporates individual differences into feature-space learning during modeling, achieving further improved recognition rates on a large emotion database.
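The enroll-to-verify prediction step described above reduces to a prototype-similarity comparison. The minimal NumPy sketch below enrolls a few embeddings per new emotion class and classifies a test embedding without any retraining; the function names and the choice of mean prototypes with cosine similarity are illustrative assumptions, while the dissertation's encoder and loss design appear in Chapter 5.

```python
import numpy as np

def enroll(embeddings_by_class):
    # Average the few enrolled utterance embeddings of each new emotion class
    # into one prototype vector per class.
    return {label: np.mean(vectors, axis=0)
            for label, vectors in embeddings_by_class.items()}

def verify(test_embedding, prototypes):
    # Predict the class whose prototype is most cosine-similar to the test
    # embedding; no data recollection or model retraining is needed.
    def cosine(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))
    return max(prototypes, key=lambda label: cosine(test_embedding, prototypes[label]))

rng = np.random.default_rng(0)
prototypes = enroll({"frustrated": [rng.normal(size=128) for _ in range(3)],
                     "excited":    [rng.normal(size=128) for _ in range(3)]})
print(verify(rng.normal(size=128), prototypes))
```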
Across the physiological and psychological recognition tasks above, the case studies show that categorizing tasks along the domain axis (physiological or psychological) and the technical axis (feature or label space), as this dissertation proposes, exposes how recognition tasks differ by category and guides the development of matching techniques. The main contribution of this dissertation is this task categorization scheme, which helps the techniques developed for recognition tasks meet real domain needs. Among the four validation cases, for the clinical application to hematologic malignancies we address residual-cancer-cell detection for prognosis in the label space and disease classification at initial diagnosis in feature learning, directly enabling models for diseases that previously had little data; the feature-learning technique delivers more discriminative representations, stable low-variance results, and lower computational demand, all of which strengthen the regulatory and practical case for clinical deployment. For emotion recognition, the cross-task solution in the label space generalizes current emotion recognition applications and provides a fast way to handle predictions for new tasks, while our multimodal emotion recognition method that models individual differences makes emotion recognition systems more personalized. Because all of these techniques arise within the task categorization framework, they not only enable automation but, through the accompanying analyses, help domain experts and users obtain new findings in a systematic, data-driven manner.
Human-centered applications have advanced rapidly with the rise of deep learning technologies. Physiology and psychology are two aspects of human life that drive recognition tasks for these applications, and deep learning algorithms gain strong discriminative power by learning robust feature representations and label spaces. In this dissertation, we propose a task categorization scheme that associates domain-driven, human-centered aspects with model learning strategies. Specifically, we explore recognition tasks under this categorization scheme to develop suitable deep learning frameworks that enhance recognition performance. We use exemplary recognition tasks with validation experiments to demonstrate how the categorization scheme facilitates problem formulation and deep learning solution design.
For the physiological aspect, we focus on the domain of hematologic malignancy, where recognition tasks remain challenging in clinical scenarios. Flow cytometry data are widely used in this domain for diagnosis and prognosis decision-making, but interpreting the large data volumes depends on an inefficient manual gating process. Although previously proposed automatic algorithms have partially alleviated the issue, malignancy types with insufficient data can hardly attain ideal model accuracy. We propose a knowledge transfer framework that facilitates cross-disease model learning for a hematologic malignancy type with fewer samples by using a model pretrained on another type. Feature representation learning of flow cytometry data also suffers from model variability and the lack of discriminative sample-level representations; we therefore propose a chunking-for-pooling approach that avoids traditional downsampling and introduce a deep supervised chunk-embedding network to resolve these issues. Our validation experiments are conducted on clinically accessed databases, reflecting the real-world applicability of the proposed approaches.
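The knowledge transfer framework of Chapter 3 builds on teacher-student distillation. As a hedged sketch only, the generic distillation loss below (in the style of Hinton et al.'s knowledge distillation; the temperature, mixing weight, and function name are illustrative, and the knowledge-reserved and complementary-transfer components of Chapter 3 are not reproduced here) shows how soft targets from a model pretrained on a data-rich malignancy can regularize a model for a data-scarce one:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    # Soft targets from the teacher (pretrained on the data-rich disease)
    # guide the student trained on the data-scarce disease.
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=-1),
                    F.softmax(teacher_logits / T, dim=-1),
                    reduction="batchmean") * (T * T)
    hard = F.cross_entropy(student_logits, labels)  # ground-truth supervision
    return alpha * soft + (1 - alpha) * hard

student = torch.randn(8, 3, requires_grad=True)  # toy logits, batch of 8
teacher = torch.randn(8, 3)
labels = torch.randint(0, 3, (8,))
loss = distillation_loss(student, teacher, labels)
```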
For the psychological aspect, emotion recognition is the typical task for identifying a human's internal affective states. However, current approaches can hardly handle frequent changes to a new task with unseen emotion classes, and current transfer learning approaches, which require data recollection and model retraining, are inefficient for this cross-task label space problem. We propose an enroll-to-verify approach that performs classification flexibly through a simple embedding-distance comparison. Feature representation in the emotion domain is further complicated by heterogeneous personalized factors, and current emotion recognition studies still lack systematic approaches to modeling individual differences. We therefore propose a personalized space learning framework that aligns speakers in a deep multimodal network for emotion recognition. We evaluate the two frameworks on large-scale public emotion datasets. The proposed enroll-to-verify approach attains comparable performance with only very few collected samples and without retraining, while the personalized framework further improves emotion detection over state-of-the-art multimodal frameworks that do not consider individual differences. By means of deep feature and label space learning techniques, the proposed approaches handle various aspects of emotion recognition challenges.
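One simple way to fold individual differences into representation learning, shown purely as an assumption-laden sketch (the dissertation's personalized space learning in Chapter 6 aligns speakers by its own mechanism, not reproduced here), is to condition the fused multimodal feature on a learned per-speaker embedding:

```python
import torch
import torch.nn as nn

class PersonalizedEmotionNet(nn.Module):
    """Hypothetical speaker-conditioned head; all dimensions are illustrative."""
    def __init__(self, feat_dim=128, n_speakers=100, spk_dim=16, n_classes=6):
        super().__init__()
        # One learned embedding per speaker captures individual differences.
        self.speaker_table = nn.Embedding(n_speakers, spk_dim)
        self.head = nn.Sequential(
            nn.Linear(feat_dim + spk_dim, 64), nn.ReLU(),
            nn.Linear(64, n_classes),
        )

    def forward(self, fused_features, speaker_ids):
        # fused_features: (batch, feat_dim) multimodal vectors;
        # speaker_ids: (batch,) integer speaker indices.
        speaker_vectors = self.speaker_table(speaker_ids)
        return self.head(torch.cat([fused_features, speaker_vectors], dim=-1))

net = PersonalizedEmotionNet()
logits = net(torch.randn(4, 128), torch.tensor([0, 1, 2, 3]))
```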
The major contribution of this dissertation is a task categorization approach that identifies domain properties (physiology and psychology) and modeling techniques (feature and label space). Across validation experiments in the four task categories, the recognition tasks for hematologic malignancy and emotion originate from domain needs and are addressed by solutions consolidated under the categorization approach. We solve the residual disease detection and disease classification problems in the label space and in feature representation, respectively; these solutions directly facilitate modeling a disease with fewer data and provide strong discriminative capability for robust and efficient representation, enhancing real-world applicability in terms of practical and regulatory requirements. For the emotion recognition tasks, the cross-task recognition approach generalizes to quickly varying emotion scenarios, and modeling individual differences in feature representation learning strengthens personalization. All of these solutions are developed under the task categorization approach, which not only provides automatic prediction but also helps experts and users obtain insights through a novel, systematic scheme.
摘要 (Abstract in Chinese)
Abstract
Acknowledgements
Contents
List of Figures
List of Tables
1 Introduction
1.1 Background and Motivation
1.1.1 Challenges: Physiology
1.1.2 Challenges: Psychology
1.2 Research Goal and Contribution
1.3 Dissertation Organization
2 Tasks and Resources
2.1 Physiology: Hematologic Malignancy
2.1.1 UPMC dataset
2.1.2 hema.to dataset
2.1.3 NTUH dataset
2.2 Psychology: Emotion
2.2.1 CMU-MOSEI dataset
2.2.2 IEMOCAP dataset
2.2.3 MELD dataset
3 Physiology: A Knowledge-Reserved Distillation with Complementary Transfer for Automated FC-based Classification Across Hematological Malignancies
3.1 Introduction
3.2 Research Methodology
3.2.1 Flow Cytometry Phenotype Embedding
3.2.2 Knowledge-Reserved Network
3.2.3 Complementary Transfer Learning
3.3 Experiments
3.3.1 Experimental Setup
3.3.2 Results
4 Physiology: A Chunking-for-Pooling Cytometric Feature Space Learning Approach for Automatic Hematologic Malignancy Classification
4.1 Introduction
4.2 Research Methodology
4.2.1 Deep Chunking-for-Pooling Framework
4.2.2 Architecture Details
4.3 Experiments and Results
4.3.1 Experimental Setup
4.3.2 Classification Results
4.3.3 Framework Analysis
5 Psychology: An Enroll-to-Verify Approach for Cross-Task Emotion Label Space Learning
5.1 Introduction
5.2 Related Work
5.2.1 Speech Emotion Recognition
5.2.2 Speaker Verification
5.3 Research Methodology
5.3.1 Acoustic Features
5.3.2 Pretrained Emotion Prototype Encoder
5.3.3 Encoder Loss Design
5.3.4 Enrollment and Verification Procedure
5.4 Experimental Setup
5.4.1 Compared Loss Functions
5.4.2 Compared Features and Encoder Architectures
5.4.3 Network Hyperparameters
5.5 Experimental Results
5.5.1 Exp I: Comparison of Encoder Losses
5.5.2 Exp II: Comparison of Features and Architectures
5.5.3 Exp III: Different Emotion Verification Tasks
5.6 Analyses
5.6.1 Embedding Distance Visualization
5.6.2 Pretrained Emotion Classes for Prototype Encoder
5.6.3 Effects on the Number of Enrolled Samples
5.6.4 Effects on Label Confidence of Enrolled Samples
6 Psychology: Modeling Individual Differences in Multimodal Representation for Emotion Recognition
6.1 Introduction
6.2 Research Methodology
6.2.1 Multimodal Features
6.2.2 Personalized Space Learning
6.3 Experiments
6.3.1 Experimental Setup
6.3.2 Results
6.3.3 Analysis
7 Conclusion
7.1 Summary
7.1.1 Physiology
7.1.2 Psychology
7.2 Future Work
7.2.1 Physiology
7.2.2 Psychology
References

Related Theses

1. An automated scoring system for couple-interaction behavioral codes in marital therapy, built on stacked sparse autoencoders over speech features
2. Stroke prediction from National Health Insurance data, with Hadoop as a fast feature-extraction tool
3. A new framework for all-time emotion recognition models built on human thin-slice emotion perception
4. Applying multi-task learning and multimodal fusion to an automatic scoring system for principal candidates' speeches
5. Multimodal active learning for analyzing sample-label relations in an automated scoring system for principal-candidate evaluation
6. Improving speech emotion recognition by incorporating fMRI BOLD signals
7. Improving speech emotion recognition with multi-level convolutional neural network features of fMRI
8. A behavior-measurement-based assessment system for children with autism using an embodied conversational interface
9. A multimodal continuous emotion recognition system and its application to global affect recognition
10. Integrating multi-level text representations and speech-attribute embeddings for robust automated scoring of principal candidates' speeches
11. Joint factor analysis of temporal effects in brain MR neuroimaging to improve emotion recognition
12. An LSTM-based assessment system for identifying children with autism from Autism Diagnostic Observation Schedule interviews
13. Automated pain-level detection for emergency patients using a multimodal model mixing CNN and LSTM audio-visual features
14. Improving automated behavior scoring for marital therapy with a bidirectional LSTM over multiple time-granularity text modalities
15. Improving emotion recognition on a Chinese theater performance corpus using interaction features from performance transcripts