Author (Chinese): 陳金博
Author (English): CHEN, CHIN-PO
Thesis title (Chinese): 使用語音技術量化研究自閉症光譜
Thesis title (English): Using speech-based technology to characterize autistic traits of autism spectrum disorder
Advisor (Chinese): 李祈均
Advisor (English): Lee, Chi-Chun
Committee members (Chinese): 劉奕汶、曹昱、黃元豪、高淑芬
Committee members (English): Liu, Yi-Wen; Tsao, Yu; Huang, Yuan-Hao; Gau, Shur-Fen
Degree: Doctorate (Ph.D.)
University: National Tsing Hua University
Department: Department of Electrical Engineering
Student ID: 104061563
Year of publication (ROC calendar): 113 (2024)
Academic year of graduation: 112
Language: Chinese
Number of pages: 106
Keywords (Chinese): 自閉症、語音訊號處理、深度學習
Keywords (English): autism spectrum disorder, speech signal processing, deep learning
Abstract (Chinese):
In communication, people deploy social skills to make interaction smoother, exhibiting distinctive patterns such as accommodation. When such patterns are captured with sensors, similar phenomena can be observed from a signal perspective, sometimes at a finer granularity than people normally notice. Autism spectrum disorder is commonly described as a disorder of communication and social interaction, and people on the spectrum are regarded as having communication deficits. To date, the condition has been characterized mainly from a clinical perspective; partly because finer-grained ways of defining autism are lacking, its definition remains incomplete, with high heterogeneity and diagnostic criteria that change over time. This study therefore takes a signal perspective and generates speech analytics to characterize autistic traits. We first use non-verbal analytics to examine subtle differences among three subtypes of autism; we then quantify speech and language to examine relative strengths and weaknesses across several sub-dimensions of communication ability in autism; finally, we examine properties of the vowel space to distinguish autistic from typically developing individuals and severe from mild cases of autism. All three sub-studies yield insights that offer an alternative view of the distinctive behavioral traits of the autism spectrum.
The main contributions of this dissertation are twofold: the development of speech algorithms and the insights obtained from speech analytics. In each subtopic we develop speech algorithms to generate speech analytics. Beyond improving the understanding of autistic traits, these analytics are numerical and can be fed into downstream machine learning classifiers, which makes automatic assessment in clinical settings possible. These analytics therefore have the potential to be integrated into edge devices as efficient and reliable assessment tools. In addition, disease progression tracking and ASD risk screening in daily life are potential applications that are still lacking. As for the insights obtained from speech analytics, they provide a new perspective for discussing autistic traits; for example, we found a distinctive turn-taking structure in high-functioning autism, so how such turn-taking interaction unfolds and what causes this phenomenon deserve further investigation. With advanced tools and multiple angles for characterizing autistic traits, we can study ASD in greater depth and specify concrete behavioral manifestations for better understanding.


Abstract (English):
In face-to-face communication, people have developed rich communication skills to convey ideas and intentions for smoother interactions. In this setting, specific patterns such as accommodation in speech tone or syntax can be observed, and they can be characterized by recording and analyzing them. Moreover, from a signal perspective, analytics that characterize human behavior can reveal regularities finer than what people usually notice; prosodic accommodation, for example, manifests in the duration of each utterance, voice quality, and voice intensity.
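As a loose illustration only (not the dissertation's actual pipeline), the following Python sketch computes a few such non-verbal descriptors, utterance duration, inter-turn gap, and a crude adjacent-turn similarity, from a hypothetical diarization of a dyadic conversation; the speaker labels and timestamps are made up for the example.

```python
# Illustrative sketch, not the dissertation's method: simple non-verbal
# analytics from a toy diarization of a dyadic conversation.
import numpy as np

# (speaker, start_sec, end_sec) for each utterance, in temporal order;
# these values are hypothetical placeholders.
segments = [
    ("child", 0.0, 1.8), ("examiner", 2.1, 4.0),
    ("child", 4.6, 5.1), ("examiner", 5.4, 8.0),
    ("child", 8.9, 10.2), ("examiner", 10.5, 12.0),
]

# Duration of each utterance and the silent gap before the next turn.
durations = np.array([end - start for _, start, end in segments])
gaps = np.array([segments[i + 1][1] - segments[i][2]
                 for i in range(len(segments) - 1)])

# Crude accommodation-style proxy: how similar are adjacent turn durations?
adjacent_corr = np.corrcoef(durations[:-1], durations[1:])[0, 1]

print("mean utterance duration (s): %.2f" % durations.mean())
print("mean inter-turn gap (s): %.2f" % gaps.mean())
print("adjacent-turn duration correlation: %.2f" % adjacent_corr)
```

Session-level statistics of quantities like these are one plausible form such analytics could take; voice quality and intensity would additionally require acoustic analysis of the waveform itself.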
Autism spectrum disorder (ASD) is a neurodevelopmental disorder characterized by deficits in communication and social reciprocity and by stereotyped behavior. The current definition of ASD remains vague, and diagnostic criteria change as versions of the Diagnostic and Statistical Manual of Mental Disorders (DSM) are updated. Understanding the progression of autism is challenging because of its high heterogeneity. To date, researchers have studied ASD mainly from a clinical angle, and study from a signal-level point of view still falls short. With advanced technology that allows us to process and analyze signals from speech-language sensors such as microphones, it is worthwhile to study autistic traits with computational approaches.
Therefore, this study aims to analyze traits of ASD through signal processing, utilizing speech analytics. In the first study, we explore subtle differences among three subtypes of autism through non-verbal analytics. We then develop verbal speech and language analytics to characterize dialogue and relate them to several psychological constructs concerning communication deficits. Lastly, we investigate differences between autistic and typically developing people, as well as between severe and mild cases of autism, by observing characteristics of the vowel space.
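One widely used measure of "characteristics of the vowel space" is the vowel space area (VSA) spanned by the corner vowels /a/, /i/, /u/ in the F1-F2 plane. The sketch below computes a triangular VSA with the shoelace formula; the formant values are hypothetical placeholders, and the vowel-space features developed in Chapter 5 are not necessarily limited to this measure.

```python
# Illustrative sketch: triangular vowel space area (VSA) from corner-vowel
# formants in the F1-F2 plane. Formant values are hypothetical placeholders.

# Mean (F1, F2) in Hz for the corner vowels of one hypothetical speaker.
formants = {
    "a": (850.0, 1300.0),
    "i": (300.0, 2300.0),
    "u": (350.0, 800.0),
}

def triangle_vsa(f_a, f_i, f_u):
    """Area of the /a/-/i/-/u/ triangle in the F1-F2 plane (Hz^2), via the shoelace formula."""
    (x1, y1), (x2, y2), (x3, y3) = f_a, f_i, f_u
    return 0.5 * abs(x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2))

print("VSA (Hz^2): %.0f" % triangle_vsa(formants["a"], formants["i"], formants["u"]))
```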
This dissertation makes two major contributions: speech algorithm development and insights from speech analytics. In each subtopic, we develop speech algorithms to generate speech analytics. Besides supporting a better understanding of autistic traits, these analytics are numerical values and can be fed into downstream machine learning classifiers. This property enables automatic assessment in clinical scenarios; hence, the analytics have the potential to be integrated into edge devices to serve as efficient and reliable assessment tools. In addition, disease progression tracking and ASD risk screening in daily life are potential applications that are still lacking. As for the insights from speech analytics, they provide a new point of view for discussing autistic traits. For example, we found a distinctive turn-taking structure in high-functioning autism; how the turn-taking interaction unfolds and what causes this phenomenon are worthy of future investigation. With advanced tools and multiple angles for characterizing autistic traits, we can push deeper into the study of ASD and specify concrete behavioral manifestations for better understanding.
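To make the downstream-classifier point above concrete, the sketch below treats a session-level vector of speech analytics as an ordinary numeric feature vector and cross-validates a standard classifier; the features, labels, and model choice are placeholders rather than the dissertation's actual setup.

```python
# Illustrative sketch: feeding numeric speech analytics into a downstream
# classifier. Features and labels are random placeholders, not real data.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 12))    # 60 sessions x 12 hypothetical analytics
y = rng.integers(0, 2, size=60)  # placeholder ASD (1) / TD (0) labels

clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
scores = cross_val_score(clf, X, y, cv=5, scoring="accuracy")
print("5-fold accuracy: %.2f +/- %.2f" % (scores.mean(), scores.std()))
```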

Table of Contents
Abstract (Chinese)
Abstract (English)
Acknowledgements
1 Introduction
  1.1 Background and Motivation
    1.1.1 Research goal and contribution
  1.2 Dissertation organization
2 Research Methodology
  2.1 Resources
    2.1.1 The Autism Diagnostic Observation Schedule (ADOS)
    2.1.2 Audio-Video Data Collection
  2.2 The behavior characterizing scaffold
3 Toward Differential Diagnosis of Autism Spectrum Disorder using Multimodal Behavior Descriptors and Executive Functions
  3.1 Introduction
  3.2 Clinical labels
    3.2.1 Clinical measurement of behavior ratings – ADOS
    3.2.2 Clinical measurement of executive function – CANTAB
  3.3 Research Methodology
    3.3.1 Audio-Video Low-Level Descriptors (LLDs)
    3.3.2 Segment-Level Features
    3.3.3 Session-Level Features
  3.4 Experimental Setup and Results
    3.4.1 Data cohort
    3.4.2 Experiment I Results and Discussions
    3.4.3 Experiment II Results and Discussions
    3.4.4 Experiment III Results
  3.5 Discussion
  3.6 Conclusions
  3.7 Appendix
4 Learning Converse-level Multimodal Embedding to Assess Social Deficit Severity for Autism Spectrum Disorder
  4.1 Introduction
  4.2 Data cohort
  4.3 Research Methodology
    4.3.1 Converse-level Unit Definition
    4.3.2 Converse-level Embedding (Lex)
    4.3.3 Converse-level Embedding (Acous)
    4.3.4 GRU with Attentive DNN Fusion Network
  4.4 Experimental Setup and Results
    4.4.1 Results
    4.4.2 Analysis and Discussion
    4.4.3 Analysis of Turn Length
    4.4.4 Analysis of dialogue perplexity
    4.4.5 Conclusions
5 Using Measures of Vowel Space for Autistic Traits Characterization
  5.1 Introduction
  5.2 Related Works
    5.2.1 Speech production-related communication impairment
    5.2.2 Interaction-oriented social reciprocity deficits
  5.3 Methods
    5.3.1 Preprocessing
    5.3.2 Utterance-level VSC features
    5.3.3 Conversation-level VSC features
  5.4 Experiments
    5.4.1 Data cohort
    5.4.2 Definition of experimental parameters
    5.4.3 Model explanation through SHAPley analysis
    5.4.4 Additional features for this study
    5.4.5 Experiment 1: classification of ASD/TD
    5.4.6 Analysis of the classification tasks
    5.4.7 Experiment 2: regression of communication deficit score
    5.4.8 Analysis of the regression tasks
  5.5 Discussion
  5.6 Conclusion
  5.7 Appendix
6 Limitation and Conclusion
  6.1 Limitations about this thesis
  6.2 Conclusion
References
 
 
 
 