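Author (Chinese): 吳柏辰
Author (English): Wu, Po-Chen
Thesis title (Chinese): 利用圓形與垂直麥克風陣列之聲場定位和分離
Thesis title (English): Acoustic source localization and signal separation using a circular microphone array with a vertical extension
Advisor (Chinese): 白明憲
Advisor (English): Bai, Ming-Sian
Oral defense committee (Chinese): 劉奕汶; 陳榮順
Oral defense committee (English): Liu, Yi-Wen; Chen, Rong-Shun
Degree: Master's
Institution: National Tsing Hua University
Department: Department of Power Mechanical Engineering
Student ID: 104033542
Year of publication (ROC calendar): 106
Academic year of graduation: 105
Language: English
Number of pages: 55
Keywords (Chinese): 麥克風陣列; 音源定位; 音源分離
Keywords (English): microphone array; source localization; source separation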
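This thesis addresses two important problems in sound field analysis: acoustic source localization and the separation of mixed signals in a three-dimensional sound field. A conventional circular microphone array is insensitive to elevation and cannot localize sources in elevation, whereas a conventional vertical linear microphone array can localize only in elevation and is insensitive to azimuth. This thesis proposes a hybrid microphone array in which a logarithmically spaced linear array is placed vertically at the center of a circular microphone array to perform localization and separation in the three-dimensional sound field. The algorithms assume plane-wave propagation, and localization and separation are solved with either a one-stage or a two-stage method. In the one-stage method, localization and separation are carried out simultaneously. The system is underdetermined: virtual sources are distributed throughout the three-dimensional sound field, and their number exceeds the number of microphones. This compressive sensing (CS) problem can be solved efficiently by convex optimization (CVX) and by the focal underdetermined system solver (FOCUSS). In the two-stage method, the localization stage uses the minimum power distortionless response (MPDR) and multiple signal classification (MUSIC) methods. In the separation stage, the system is overdetermined, and Tikhonov regularization (TIKR) is used to recover the original source signals. In hardware, the circular microphone array is fabricated by 3-D printing and carries 24 micro-electro-mechanical-system (MEMS) microphones equally spaced around a ring, while the logarithmically spaced linear array is made of acrylic and carries 8 MEMS microphones. The separated signals are assessed by both objective and subjective criteria. The objective assessment uses perceptual evaluation of speech quality (PESQ), segmental signal-to-noise ratio (segSNR), and Google speech recognition; a subjective listening test serves as an additional reference.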
This thesis addresses two problems in sound field analysis: acoustic source localization and the separation of mixed signals in a three-dimensional (3-D) sound field. A conventional uniform circular microphone array (UCA) cannot resolve signals in the zenith angle, whereas a symmetric vertical logarithmic-spacing linear array (LLA) cannot resolve signals in azimuth. This thesis proposes a hybrid microphone array (HMA), which consists of an unbaffled circular microphone array (CMA) with a vertical LLA extension at its center, to analyze the 3-D sound field. Two methods, a one-stage method and a two-stage method, are developed to locate and separate acoustic signals on the basis of plane-wave decomposition. In the one-stage method, localization and separation are accomplished in a single shot. The system is underdetermined because the virtual sources outnumber the microphones. This problem is solved with compressive sensing (CS) techniques such as convex optimization (CVX) and the focal underdetermined system solver (FOCUSS). In the two-stage method, localization is first carried out by the minimum power distortionless response (MPDR) or multiple signal classification (MUSIC) algorithm, followed by separation using Tikhonov regularization (TIKR) on the basis of the estimated source bearings. The hardware comprises a three-dimensionally printed CMA with 24 micro-electro-mechanical-system (MEMS) microphones and an LLA with 8 MEMS microphones. To evaluate the difference between the separated signals and the original sources, an objective test and a subjective listening test are applied. The objective test uses perceptual evaluation of speech quality (PESQ), segmental signal-to-noise ratio (segSNR), and Google® voice recognition.
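The two-stage method described in the abstract can be illustrated with a minimal narrowband sketch: MUSIC estimates the source bearings from the spatial correlation matrix, and Tikhonov regularization then inverts the overdetermined plane-wave model at those bearings. This is not the thesis implementation. The array radius, the vertical microphone heights, the 2 kHz analysis frequency, the 10-degree search grid, the noise level, and the regularization parameter `beta` are all assumptions chosen for illustration, and every function name is hypothetical.

```python
import numpy as np

C = 343.0      # speed of sound in m/s
FREQ = 2000.0  # analysis frequency in Hz (assumed for this sketch)

def hybrid_array(radius=0.1, n_circ=24, z_half=np.geomspace(0.02, 0.16, 4)):
    """Microphone positions (M x 3): a ring in the xy-plane plus a symmetric,
    log-spaced vertical line on the z-axis. All dimensions are assumed values."""
    ang = 2 * np.pi * np.arange(n_circ) / n_circ
    ring = np.stack([radius * np.cos(ang), radius * np.sin(ang), np.zeros(n_circ)], axis=1)
    z = np.concatenate([-z_half[::-1], z_half])
    line = np.stack([np.zeros_like(z), np.zeros_like(z), z], axis=1)
    return np.vstack([ring, line])

def steering(pos, theta, phi, freq=FREQ):
    """Plane-wave steering vectors (M x K) for zenith angles theta and azimuths phi, in radians."""
    theta, phi = np.atleast_1d(theta), np.atleast_1d(phi)
    u = np.stack([np.sin(theta) * np.cos(phi), np.sin(theta) * np.sin(phi), np.cos(theta)])
    k = 2 * np.pi * freq / C
    return np.exp(1j * k * (pos @ u))

def music_doa(X, pos, n_src, theta_grid, phi_grid):
    """Stage 1: MUSIC pseudo-spectrum on a (theta, phi) grid; returns the n_src largest grid points."""
    R = X @ X.conj().T / X.shape[1]            # spatial correlation matrix
    _, V = np.linalg.eigh(R)                   # eigenvalues in ascending order
    En = V[:, : X.shape[0] - n_src]            # noise subspace
    T, P = np.meshgrid(theta_grid, phi_grid, indexing="ij")
    A = steering(pos, T.ravel(), P.ravel())
    spec = 1.0 / np.sum(np.abs(En.conj().T @ A) ** 2, axis=0)
    top = np.argsort(spec)[-n_src:]            # naive peak pick, adequate only for this toy case
    est = sorted(zip(T.ravel()[top], P.ravel()[top]))
    return est, spec.reshape(T.shape)

def tikhonov_separate(X, A, beta=1e-3):
    """Stage 2: regularized least-squares source estimate (A^H A + beta I)^-1 A^H X."""
    G = A.conj().T @ A + beta * np.eye(A.shape[1])
    return np.linalg.solve(G, A.conj().T @ X)

# Simulated example: two on-grid sources, 200 snapshots, low sensor noise.
rng = np.random.default_rng(0)
pos = hybrid_array()
true_dirs = [(np.deg2rad(60), np.deg2rad(40)), (np.deg2rad(120), np.deg2rad(200))]
A_true = steering(pos, [t for t, _ in true_dirs], [p for _, p in true_dirs])
n_snap = 200
S = rng.standard_normal((2, n_snap)) + 1j * rng.standard_normal((2, n_snap))
N = 0.01 * (rng.standard_normal((pos.shape[0], n_snap))
            + 1j * rng.standard_normal((pos.shape[0], n_snap)))
X = A_true @ S + N

theta_grid = np.deg2rad(np.arange(10, 171, 10))
phi_grid = np.deg2rad(np.arange(0, 360, 10))
est, spec = music_doa(X, pos, 2, theta_grid, phi_grid)
A_est = steering(pos, [t for t, _ in est], [p for _, p in est])
S_hat = tikhonov_separate(X, A_est)
rel_err = np.linalg.norm(S_hat - S) / np.linalg.norm(S)
print("estimated (zenith, azimuth) in degrees:", np.rad2deg(np.array(est)))
print("relative separation error:", rel_err)
```

Taking the largest grid values as peaks works here only because the simulated sources lie exactly on the grid and are well separated. Measured data would need a proper local-maximum search, wideband processing across frequency bins, and a principled choice of the regularization parameter.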
ABSTRACT (CHINESE)
ABSTRACT
LIST OF FIGURES
LIST OF TABLES
Chapter 1 INTRODUCTION
Chapter 2 ARRAY SIGNAL PROCESSING
2.1 Farfield array model
2.2 Spatial correlation matrix
2.3 The manifold matrix of the hybrid array
Chapter 3 ACOUSTIC SOURCE LOCALIZATION AND SEPARATION
3.1 One-stage methods
3.1.1 Convex optimization
3.1.2 Focal underdetermined system solver
3.2 Two-stage methods in DOA estimation
3.2.1 MPDR algorithm
3.2.2 MUSIC algorithm
3.3 Two-stage methods in separation
3.3.1 TIKR method
3.4 Choice of regularization parameter
Chapter 4 NUMERICAL SIMULATION AND EXPERIMENTAL VALIDATION
4.1 Objective test and subjective test
4.2 Numerical simulation
4.2.1 One-stage method
4.2.2 Two-stage method
4.3 Experiment
Chapter 5 CONCLUSIONS
REFERENCES
(The full text is not authorized for public access.)