Author: Liu, Guei-Ming
Thesis title: Multimodal machine learning classifier based on emotional transformation feature applied to deception detection in videos
Advisor: Huang, Chih-Hao
Oral defense committee: Lin, Chia-Wen; Chung, Wei-Ho
Keywords: Multimodal video analysis; Machine learning; Deception detection; Emotion recognition
Deception detection has long been an active topic in law, business, psychology, and many other fields. With recent advances in machine learning and computer vision, part of the research focus has shifted from traditional deception detection toward automated deception detection in videos. Applying new technology to this long-standing problem is promising, but many issues remain open, and one of the main challenges is the shortage of data. To date, only one multimodal benchmark dataset for deception detection has been published; it contains 121 video clips (61 deceptive and 60 truthful). With so little training data, most deception detection models built on this dataset, especially those based on deep neural networks, overfit and therefore generalize poorly. To address these problems, we propose a novel emotional transformation feature (ETF) for deception detection under limited data. Analysis and comparison against state-of-the-art multimodal methods show that the proposed method reaches an accuracy of 91.67% and an AUC of 0.92.
1 Introduction
1.1 Motivation and Objectives
1.2 Thesis Organization
2 Related Work
2.1 Visual Modality
2.2 Acoustic Modality
2.3 Linguistic Modality
3 System Architecture
3.1 Multimodal Deception Detection Framework
3.2 Emotional Transformation Feature
3.3 Refining the Emotional Transformation Feature with Audio Information
3.4 Classifiers
3.4.1 Decision Tree
3.4.2 Random Forest
3.4.3 k-Nearest Neighbors
3.4.4 Support Vector Machine
4 Experiments and Results
4.1 Dataset
4.2 Experimental Method and Evaluation Metrics
4.3 Parameter Settings and Experimental Results
5 Conclusion and Future Work
References
