
Detailed Record

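Author (Chinese): 游鈞楷
Author (English): You, Jun-Kai
Thesis title (Chinese): 多對麥克風於殘響房間之聲源定位
Thesis title (English): Source Localization in a Reverberant Room via Multiple Pairs of Microphones
Advisor (Chinese): 劉奕汶
Advisor (English): Liu, Yi-Wen
Oral defense committee (Chinese): 廖弘源; 洪樂文
Oral defense committee (English): Liao, Hong-Yuan; Hong, Yao-Win
Degree: Master's
University: National Tsing Hua University
Department: Department of Electrical Engineering
Student ID: 104061529
Year of publication (ROC calendar): 107 (2018)
Graduation academic year (ROC calendar): 106
Language: English
Number of pages: 45
Keywords (Chinese): 聲源定位; 抵達時間差; 殘響; 多通道音頻訊號處理; 最大似然估計
Keywords (English): sound source localization; time difference of arrival; reverberation; multichannel audio processing; maximum likelihood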
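Abstract (translated from Chinese): This thesis improves on a previously proposed probabilistic model of the time difference of arrival (TDOA) so that it gives more accurate results under room reverberation, and develops a localization method for arbitrarily many pairs of microphones. To handle room reverberation, we treat the reverberation as another type of noise and add it back into the model; the modified model gives more robust TDOA estimates. The probabilistic model can be used directly to localize a source with two pairs of microphones, but the localization accuracy drops considerably when the source lies near the line connecting the microphone pairs, so we develop a localization method that extends to arbitrarily many pairs of microphones. In terms of the maximum localization error, using three pairs of microphones reduces the error by more than 20% compared with using two pairs.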
Abstract (English): A probabilistic time difference of arrival (TDOA) estimation method is extended so that multiple pairs of microphones can collaboratively perform source localization. To deal with reverberation, we treat the reverberant signal as a separate type of noise; the resulting TDOA estimates outperform existing approaches such as an interaural phase difference (IPD) method [22], GCC-PHAT [12], and the adopted method based on probabilistic modeling [21]. The task of sound source localization using arbitrarily many pairs of microphones is then formulated as a maximum likelihood problem, and the x-y coordinates of the sound source are estimated by gradient descent minimization of the cost function. Simulation in a reverberant room shows that, when two pairs of microphones are used for source localization, large errors tend to occur when the source and the microphones are on the same line. Nevertheless, by adding a third pair of microphones, the proposed algorithm reduces the maximum mean square localization error by more than 20%. The relationship between the parameter γ and the reverberation time RT_60 is also tested. The simulation shows that, regardless of RT_60, γ = 0.5 performs better than γ = 0.1 and γ = 0.3 in our system. However, γ = 0.7 gives a lower position error when RT_60 is 0.45 s, and γ = 0.9 gives a lower position error when RT_60 is 0.6 s.
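The localization step described in the abstract (estimating the x-y position from the TDOAs of several microphone pairs by gradient descent on a cost function) can be illustrated with a minimal sketch. This is not the thesis's probabilistic model: it assumes independent Gaussian TDOA errors of equal variance, in which case the maximum likelihood cost reduces to a sum of squared range-difference residuals. The room geometry, the speed of sound, the step size, the iteration count, and all function names are hypothetical choices made only for illustration.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def range_differences(src, pairs):
    """Distance from src to the first mic minus distance to the second, per pair.

    pairs has shape (P, 2, 2): P pairs, 2 microphones each, (x, y) in metres.
    """
    d_first = np.linalg.norm(src - pairs[:, 0, :], axis=1)
    d_second = np.linalg.norm(src - pairs[:, 1, :], axis=1)
    return d_first - d_second


def tdoa(src, pairs):
    """Ideal (noise-free, anechoic) TDOA in seconds for each microphone pair."""
    return range_differences(src, pairs) / SPEED_OF_SOUND


def localize(tau_measured, pairs, init, step=0.5, iters=2000):
    """Fixed-step gradient descent on 0.5 * sum of squared range-difference residuals."""
    p = np.asarray(init, dtype=float).copy()
    target = SPEED_OF_SOUND * np.asarray(tau_measured)  # measured range differences
    for _ in range(iters):
        v_first = p - pairs[:, 0, :]
        v_second = p - pairs[:, 1, :]
        d_first = np.linalg.norm(v_first, axis=1)
        d_second = np.linalg.norm(v_second, axis=1)
        residual = (d_first - d_second) - target
        # Jacobian row per pair: difference of the unit vectors from each mic to p.
        jac = v_first / d_first[:, None] - v_second / d_second[:, None]
        p -= step * (jac.T @ residual)
    return p


# Hypothetical geometry: three pairs with 1 m spacing on the walls of a 4 m x 3 m room.
pairs = np.array([
    [[0.0, 0.0], [1.0, 0.0]],
    [[4.0, 0.0], [4.0, 1.0]],
    [[0.0, 3.0], [1.0, 3.0]],
])
true_src = np.array([2.0, 1.5])
tau = tdoa(true_src, pairs)  # noise-free TDOAs, used here in place of real estimates
est = localize(tau, pairs, init=[2.0, 2.0])
print(est)
```

In this sketch the TDOAs are exact, so the estimate should settle at the true source position; with TDOAs estimated from reverberant recordings the residuals would not vanish and the result would depend on the noise model and on the array geometry.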
Abstract (Chinese)
Abstract
Contents
List of Figures
List of Tables
Introduction
Research motivation
Literature review
Thesis organization
Probabilistic modeling review [21]
The modified system model
3.1 Two pairs of microphones for source localization
3.2 Dealing with reverberation
3.3 Extending to arbitrarily many pairs of microphones
3.4 Simulating reverberant rooms
3.5 Reverberation time
Results and discussion
4.1 TDOA estimation under reverberation
4.2 Parameter γ vs. RT_60
4.3 Localization results
Conclusion

Appendix
Cramér-Rao Lower Bound (CRLB) and Fisher Information
1.0 Cramér-Rao Lower Bound (CRLB) Theorem
1.1 Finding MVUE when only |X[k]| is unknown
1.1 Finding MVUE when |X[k]| and σ^2 are unknown
Taylor series expansion and unbiased estimator
2.1 Finding r̂ by Taylor series expansion
2.2 r̂ is an unbiased estimator
References
[1] B.-R. Chen, H.-Y. Lee, and Y.-W. Liu, “Unmixing convolutive mixtures by exploiting amplitude co-modulation: methods and evaluation on Mandarin speech recordings,” in Proc. Interspeech, pp. 1934-1937, 2017.
[2] T. Ohata, K. Nakamura, T. Mizumoto, T. Taiki, and K. Nakadai, “Improvement in outdoor sound source detection using a quadrotor-embedded microphone array,” in IEEE/RSJ Intelligent Robots and Systems, pp. 1902-1907, 2014.
[3] L. Rui and K. C. Ho, “Efficient closed-form estimators for multistatic sonar localization,” IEEE Transactions on Aerospace and Electronic Systems, vol. 51, no. 1, pp. 600-614, 2015.
[4] M. Rapp, M. Hahn, M. Thom, J. Dickmann, and K. Dietmayer, “Semi-Markov Process Based Localization Using Radar in Dynamic Environments,” in IEEE Intelligent Transportation Systems (ITSC), pp. 423-429, 2015.
[5] R. Mazraani, M. Saez, L. Govoni, and D. Knobloch, “Experimental results of a combined TDOA/TOF technique for UWB based localization systems,” in IEEE Communications Workshops (ICC Workshops), pp. 1043-1048, 2017.
[6] A. Marti, M. Cobos, and J.J. Lopez, “Real time speaker localization and detection system for camera steering in multiparticipant videoconferencing environments,” in Proc. ICASSP, pp. 2592-2595, 2011.
[7] L. Rayleigh, “XII. On our perception of sound direction,” The London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science, vol. 13, no. 74, pp. 214-232, 1907.
[8] P. Georgiou, C. Kyriakakis, and P. Tsakalides, “Robust time delay estimation for sound source localization in noisy environments,” in IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), 1997.
[9] D. Li and S. Levinson, “Adaptive sound source localization by two microphones,” in Proc. ICASSP, p. 143, 2001.
[10] F. Gong, W. Qing, and X. Zhang, “A new distance based algorithm for TDOA localization in cellular networks,” in IEEE Computer Science and Information Technology (ICCSIT), vol. 7, pp. 505-505, 2010.
[11] F. Meyer, A. Tesei, and M. Z. Win, “Localization of multiple sources using time-difference of arrival measurements,” in Proc. ICASSP, pp. 3151-3155, 2017.
[12] C. Knapp and G. Carter, “The generalized correlation method for estimation of time delay,” IEEE Trans. on Acoust., Speech, and Signal Process., vol. 24, no. 4, pp. 320–327, Aug. 1976.
[13] D. Ying and Y. Yan, “Robust and Fast Localization of Single Speech Source Using a Planar Array,” IEEE Signal Processing Letters, vol. 20, no. 9, pp. 909-912, 2013.
[14] T.M. Sreejith, P.K. Joshin, S. Harshavardhan, and T.V. Sreenivas, “TDE sign based homing algorithm for sound source tracking using a Y-shaped microphone array,” in European Signal Processing Conference (EUSIPCO), pp. 1202-1206, 2015.
[15] Z. Qin, J. Wang, and S. Wei, “A study of 3D sensor array geometry for TDOA based localization,” in IEEE Radar, pp. 1-5, 2016.
[16] W.G. Gardner and K.D. Martin, “HRTF Measurements of a KEMAR,” The Journal of the Acoustical Society of America, vol. 97, no. 6, pp. 3907-3908, 1995.
[17] D.N. Zotkin, R. Duraiswami, and N.A. Gumerov, “Regularized HRTF fitting using spherical harmonics,” in European Signal Processing Conference (EUSIPCO), pp. 1202-1206, 2015.
[18] A. Griffin and A. Mouchtaris, “Localizing multiple audio sources from DOA estimates in a wireless acoustic sensor network,” in IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 1-4, 2013.
[19] Z. Zohny and J. Chambers, “Modelling interaural level and phase cues with Student’s t-distribution for robust clustering in MESSL,” in Int. Conf. on Digital Signal Processing, pp. 59-65, 2014.
[20] H.-K. Hao, H.-M. Liang, and Y.-W. Liu, “Particle methods for real-time sound source localization based on the Multiple Signal Classification algorithm,” in IEEE Intelligent Green Building and Smart Grid (IGBSG), pp. 1–5, 2014.
[21] C.-W. Li and Y.-W. Liu, “Posterior probabilistic modeling for inter-channel phase and time difference estimation in audio signals,” in Proc. ICASSP, pp. 291-295, 2016.
[22] F. Fujii, N. Hogaki, and Y. Watanabe, “A simple and robust binaural sound source localization system using interaural time difference as a cue,” in IEEE Int. Conf. on Mechatronics and Automation, pp. 1095–1101, 2013.
[23] J. DiBiase, H. Silverman, and M. Brandstein, “Robust localization in reverberant rooms,” Microphone Arrays, pp. 157-180, 2001.
[24] H. Do, H.F. Silverman, and Y. Yu, “A real-time SRP-PHAT source location implementation using stochastic region contraction (SRC) on a large-aperture microphone array,” in Proc. ICASSP, vol. 1, pp. 121-124, 2007.
[25] H. Wang and P. Chu, “Voice source localization for automatic camera pointing system in videoconferencing,” in Proc. ICASSP, vol. 1, pp. 187-190, 1997.
[26] Y. Rui and D. Florencio, “Time delay estimation in the presence of correlated noise and reverberation,” in Proc. ICASSP, vol. 2, pp. 133-136, 2004.
[27] L. Sun and Q. Cheng, “Indoor Multiple sound source tracking using refined TDOA measurements,” in Conf. Information Sciences and Systems (CISS), pp. 1-5, 2015.
[28] Y. Li and H. Chen, “Reverberation robust feature extraction for sound source localization using a small-sized microphone array,” IEEE Sensors Journal, vol. 17, no. 19, pp. 6331-6339, 2017.
[29] X. Li, L. Girin, R. Horaud, and S. Gannot, “Multiple-speaker localization based on direct-path features and likelihood maximization with spatial sparsity regularization,” IEEE/ACM Trans. on Audio, Speech, and Language Processing, vol. 25, no. 10, pp. 1997-2012, 2017.
[30] D.J. Torrieri, “Statistical theory of passive location systems,” IEEE Transactions on Aerospace and Electronic Systems, vol. 20, no. 2, pp. 183–197, 1984.
[31] T. Nishiura, T. Yamada, S. Nakamura, and K. Shikano, “Localization of multiple sound sources based on a CSP analysis with a microphone array,” in Proc. ICASSP, vol. 2, pp. 1053-1056, 2000.
[32] T. Nishiura, S. Nakamura, and K. Shikano, “Talker localization in a real acoustic environment based on DOA estimation and statistical sound source identification,” in Proc. ICASSP, vol. 1, pp. 893-896, 2002.
[33] A. Magassouba, N. Bertin, and F. Chaumette, “First applications of sound-based control on a mobile robot equipped with two microphones,” in IEEE International Conference on Robotics and Automation (ICRA), pp. 2557-2562, 2016.
[34] S. M. Kay, Fundamentals of Statistical Signal Processing, Volume I: Estimation Theory, Prentice Hall, 1993.
[35] R. Schmidt, “Multiple emitter location and signal parameter estimation,” IEEE Trans. on Antennas and Propagation, vol. 34, no. 3, pp. 276-280, 1986.
[36] R. Roy and T. Kailath, “ESPRIT-estimation of signal parameters via rotational invariance techniques,” IEEE Trans. on Acoustics, Speech, and Signal Processing, vol. 37, no. 7, pp. 984-995, 1989.
[37] O. A. Oumar, M. F. Siyau, and T. P. Sattar, “Comparison between MUSIC and ESPRIT direction of arrival estimation algorithms for wireless communication systems,” in IEEE Future Generation Communication Technology (FGCT), pp. 99-103, 2012.
[38] A. Pierce, Acoustics: An Introduction to Its Physical Principles and Applications, NY: McGraw-Hill, 1991.
[39] N. Ma, T. May, and G. Brown, “Exploiting deep neural networks and head movements for robust binaural localisation of multiple sources in reverberant environments,” IEEE Trans. Audio, Speech, Lang. Process., vol. 25, no. 12, pp. 2444-2453, 2017.
[40] J. Woodruff and D. L. Wang, “Binaural localization of multiple sources in reverberant and noisy environments,” IEEE Trans. Audio, Speech, Lang. Process., vol. 20, no. 5, pp. 1503-1512, 2012.
[41] T. May, S. van de Par, and A. Kohlrausch, “A probabilistic model for robust localization based on a binaural auditory front-end,” IEEE Trans. Audio, Speech, Lang. Process., vol. 19, no. 1, pp. 1-13, 2011.
[42] T. May, N. Ma, and G. J. Brown, “Robust localisation of multiple speakers exploiting head movements and multi-conditional training of binaural cues,” in Proc. ICASSP, pp. 2679–2683, 2015.
[43] T. May, S. van de Par, and A. Kohlrausch, “Binaural localization and detection of speakers in complex acoustic scenes,” in The Technology of Binaural Listening, Springer, pp. 397–425, 2013.
[44] N. Ma, T. May, H. Wierstorf, and G. J. Brown, “A machine hearing system exploiting head movements for binaural sound localisation in reverberant conditions,” in Proc. ICASSP, pp. 2699–2703, 2015.
[45] ISO 3382-2:2008, “Acoustics -- Measurement of room acoustic parameters -- Part 2: Reverberation time in ordinary rooms,” 2008.
[46] J. B. Allen and D. A. Berkley, “Image method for efficiently simulating small-room acoustics,” The Journal of the Acoustical Society of America, vol. 65, no. 4, pp. 943-950, 1979.
 
 
 
 