Detailed Record
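Author (Chinese): 賴韻婷
Author (English): Lai, Yun-Ting
Thesis title (Chinese): 針對女性頭聲、混偏頭聲、混偏胸聲、胸聲區音聲的分析與分類
Thesis title (English): Analysis and Classification of Female Voices in Head, Headmix, Chestmix, and Chest Register
Advisors (Chinese): 劉奕汶、蘇郁惠
Advisors (English): Liu, Yi-Wen; Su, Yu-Huei
Oral defense committee (Chinese): 蔡振家、冀泰石
Oral defense committee (English): Tsai, Chen-Gia; Chi, Tai-Shih
Degree: Master's
University: National Tsing Hua University
Department: Department of Music
Student ID: 106591506
Year of publication (ROC calendar): 109 (2020)
Academic year of graduation (ROC calendar): 108
Language: English
Number of pages: 35
Keywords (Chinese): 女性音聲、聲區、頻譜分析、頭聲區、混偏頭聲區、混偏胸聲區、胸聲區、支援向量機、多層感知器、分類
Keywords (English): Female Voices; Vocal Registers; Spectral Analysis; Head Register; Headmix Register; Chestmix Register; Chest Register; Support Vector Machines; Multi-layer Perceptron; Classification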
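For beginning singers, identifying and controlling vocal registers while singing is difficult. This thesis therefore studies the voices of female singers in the head, headmix, chestmix, and chest registers, in the hope that a robust, high-accuracy register classification model can help singers learn to sing. The study first discusses the definitions of the four registers in detail. Next, three female singers were recorded performing in each register, and two judges labelled the data. For every time frame of these audio files, the mel-cepstral coefficients, the band aperiodic spectral envelope, and the fundamental frequency were extracted with the WORLD speech synthesis tool. Two machine learning algorithms (support vector machine and multi-layer perceptron) were then used for classification, giving mean accuracies of 70% and 68%, respectively.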
For beginner vocalists, recognizing and controlling vocal registers while singing is a difficult task. This study therefore analyzes female singers' voices in the head, headmix, chestmix, and chest registers, with the aim of establishing a robust classifier with high classification accuracy to help vocalists learn to sing. The definitions of the four target registers are discussed first. Voice data in these registers were then recorded from three female singers and labelled by two judges. For each time frame of the audio files, mel-cepstral coefficients (MCC), band aperiodic spectral envelope (BAP), and fundamental frequency (F0) were extracted using the WORLD vocoder. Two machine learning techniques (Support Vector Machines and Multi-layer Perceptron) were then adopted for classification. The mean accuracies of these classifiers were 70% and 68%, respectively.
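The classification stage described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation. It assumes scikit-learn, uses randomly generated stand-in features in place of real WORLD output, and picks hypothetical feature sizes (`N_MCC`, `N_BAP`), hyperparameters, and a 5-fold split that are not taken from the record. The helper names `make_synthetic_frames` and `mean_cv_accuracy` are likewise invented for illustration.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

REGISTERS = ["head", "headmix", "chestmix", "chest"]
N_MCC, N_BAP = 24, 1  # hypothetical feature sizes, not taken from the thesis


def make_synthetic_frames(frames_per_class=100, seed=0):
    """Return random stand-in frame features X and register labels y."""
    rng = np.random.default_rng(seed)
    n_features = N_MCC + N_BAP + 1  # MCC + BAP + F0 per time frame
    X_parts, y_parts = [], []
    for label in range(len(REGISTERS)):
        # Give each register its own mean so the toy problem is learnable.
        center = rng.normal(0.0, 1.0, size=n_features)
        noise = rng.normal(0.0, 1.0, size=(frames_per_class, n_features))
        X_parts.append(center + noise)
        y_parts.append(np.full(frames_per_class, label))
    return np.vstack(X_parts), np.concatenate(y_parts)


def mean_cv_accuracy(model, X, y, n_splits=5):
    """Standardize features, then average accuracy over stratified folds."""
    cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=0)
    pipeline = make_pipeline(StandardScaler(), model)
    scores = cross_val_score(pipeline, X, y, cv=cv, scoring="accuracy")
    return scores.mean()


X, y = make_synthetic_frames()
svm_acc = mean_cv_accuracy(SVC(kernel="rbf", C=1.0), X, y)
mlp_acc = mean_cv_accuracy(
    MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0), X, y
)
print(f"SVM mean accuracy: {svm_acc:.2f}")
print(f"MLP mean accuracy: {mlp_acc:.2f}")
```

Because the features here are synthetic and well separated, the printed accuracies say nothing about the 70% and 68% reported in the abstract. With real recordings, frames from one singer or one recording would probably need to be kept in the same fold to avoid overly optimistic scores.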
Abstract (Chinese)
Abstract
Acknowledgements
1 Introduction
1.1 Motivation and Purpose
1.2 Related Works
1.3 Task Description
1.4 Organization of Thesis
2 Voice Production and Vocal Registers
2.1 Singing Voice Production
2.2 Definition and Analysis of Vocal Registers
3 Dataset
3.1 Recording Task
3.2 Measurement
3.3 Statistical Distribution of the Dataset
4 Methods
4.1 WORLD Vocoder
4.2 Features
4.3 Supervised Learning Algorithms
4.3.1 Support Vector Machines
4.3.2 Multi-layer Perceptron
5 Experiment and Results
5.1 Process Overview
5.2 Data Preprocessing
5.2.1 Feature Extraction and Cleaning
5.2.2 Normalization
5.3 K-Fold Cross Validation
5.4 Classification Results
5.5 Regression Results
6 Discussion
6.1 Dataset
6.2 Selected Features
6.3 Classification and Regression Results
7 Conclusion and Future Works
Bibliography
Appendix
A.1 Suggestions From The Oral Defense Committees