A Two-Stage Query by Singing/Humming System on GPU - National Tsing Hua University Theses and Dissertations Full-Text Image System

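Author (Chinese):	高瑋澤
Author (English):	Kao, Wei-Tsa
Thesis title (Chinese):	以GPU為運算核心的二階段哼唱選歌系統
Thesis title (English):	A Two-Stage Query by Singing/Humming System on GPU
Advisors (Chinese):	張智星、張俊盛
Oral defense committee (Chinese):	張智星、張俊盛、呂仁園、王新民
Degree:	Master's
University:	National Tsing Hua University
Department:	Institute of Information Systems and Applications
Student ID:	100065504
Year of publication (ROC calendar):	102 (2013)
Academic year of graduation:	101
Language:	English
Number of pages:	51
Keywords (Chinese):	音樂檢索、哼唱選歌、線性伸縮、動態時間校正、GPU
Keywords (English):	Music Retrieval, Query by Singing and Humming, Linear Scaling, Dynamic Time Warping, GPU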

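This study proposes a two-stage query by singing/humming system implemented on a GPU architecture. Query by Singing and Humming (QBSH) is a method of searching for songs by voice: the system takes a fragment sung or hummed by the user and retrieves the ten most similar songs from the database.

To speed up matching, we first apply linear scaling to pick out the more likely candidate songs from a database of 8,431 pop songs, and then compare those candidates using dynamic time warping to obtain better performance. After parameter tuning and improvements to the combination method, the system runs 7 times faster than using dynamic time warping alone on the GPU, and the recognition rate reaches 77.65%.

Keywords: Music Retrieval, Query by Singing and Humming, Linear Scaling, Dynamic Time Warping, GPU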

This research proposes using a GPU (graphics processing unit) to implement a two-stage comparison method for a QBSH (query by singing/humming) system. The system takes a user’s singing or humming and retrieves the top-10 most likely candidates from a database of 8,431 songs.

To speed up the comparison, we apply linear scaling in the first stage to select candidate songs from the database. In the second stage, these candidates are re-ranked by dynamic time warping to achieve better recognition accuracy. With optimal parameter settings and an improved combination method, the system achieves a speedup factor of 7 (compared with running dynamic time warping alone on the GPU) and an accuracy of 77.65%.

Keywords: Music Retrieval, Query by Singing and Humming, Linear Scaling, Dynamic Time Warping, GPU
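
The two-stage idea in the abstract (a cheap linear-scaling pass that shortlists candidates, followed by dynamic time warping on the shortlist only) can be illustrated with a small CPU-only sketch. This is not the thesis's GPU implementation: the function names, the scaling factors, the mean-subtraction used for key transposition, matching from the start of each song, and the textbook DTW step pattern are all assumptions made for illustration.

```python
import math

def zero_mean(seq):
    """Subtract the mean pitch so that key transposition is ignored (assumed normalization)."""
    m = sum(seq) / len(seq)
    return [x - m for x in seq]

def resample(seq, n):
    """Linearly interpolate a pitch sequence to n points (n >= 2)."""
    out = []
    for i in range(n):
        pos = i * (len(seq) - 1) / (n - 1)
        lo = int(math.floor(pos))
        hi = min(lo + 1, len(seq) - 1)
        frac = pos - lo
        out.append(seq[lo] * (1 - frac) + seq[hi] * frac)
    return out

def linear_scaling_distance(query, song, factors=(0.5, 0.75, 1.0, 1.25, 1.5)):
    """Stage 1: stretch or compress the query by each factor and compare it
    with the song prefix of the same length. The factor set is hypothetical."""
    best = float("inf")
    for f in factors:
        n = int(round(len(query) * f))
        if n < 2 or n > len(song):
            continue
        q = zero_mean(resample(query, n))
        s = zero_mean(song[:n])
        dist = sum(abs(a - b) for a, b in zip(q, s)) / n
        best = min(best, dist)
    return best

def dtw_distance(query, song):
    """Stage 2: textbook DTW with steps (i-1, j), (i, j-1), (i-1, j-1)."""
    q, s = zero_mean(query), zero_mean(song)
    inf = float("inf")
    prev = [inf] * (len(s) + 1)
    prev[0] = 0.0
    for i in range(1, len(q) + 1):
        cur = [inf] * (len(s) + 1)
        for j in range(1, len(s) + 1):
            cost = abs(q[i - 1] - s[j - 1])
            cur[j] = cost + min(prev[j], cur[j - 1], prev[j - 1])
        prev = cur
    return prev[len(s)]

def two_stage_search(query, database, n_candidates=100, top_k=10):
    """Shortlist by linear scaling, then re-rank the shortlist by DTW.
    `database` maps a song name to its pitch sequence (hypothetical layout)."""
    shortlist = sorted(
        database, key=lambda name: linear_scaling_distance(query, database[name])
    )[:n_candidates]
    reranked = sorted(shortlist, key=lambda name: dtw_distance(query, database[name]))
    return reranked[:top_k]
```

Under these assumptions, the speed benefit comes from running the quadratic-time DTW on only `n_candidates` songs instead of the whole database, while the linear-time first stage touches every song. For example, `two_stage_search(query, database, n_candidates=2)` on a three-song toy database runs DTW on just two songs.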

Abstract
Acknowledgement
List of Graphs
List of Tables
List of Equations
Chapter 1 Introduction
1.1 Objective of Research
1.2 Related Research
1.3 Chapter Summary
Chapter 2 MIRACLE System
2.1 System Flow
2.1.1 Recording from Web System and Performing Pitch Tracking
2.1.2 Loading Database and Preprocessing
2.1.3 Performing Melody Recognition and Post-process the Result
2.2 CUDA (Compute Unified Device Architecture)
2.2.1 Register
2.2.2 Shared memory
2.2.3 Device memory (off-chip)
2.3 Linear Scaling
2.4 Dynamic Time Warping
Chapter 3 Methods and Implementations
3.1 Two-stage Recognition and Candidate Song Selection
3.2 Combination of Melody Recognition
3.3 Anchor Note Matching
3.4 GPU Parallel Implementation
3.4.1 Linear Scaling on GPU
3.4.2 Dynamic Time Warping on GPU
Chapter 4 Experimental Results and Analysis
4.1 Experimental Dataset and Database
4.2 Experimental Results
4.2.1 Parameter Tuning
4.2.2 Method Results
4.3 Comparison of Borda Count Method and Reciprocal Method
4.4 Error Analysis
Chapter 5 Conclusion and Future Work
5.1 Conclusion
5.2 Future Work
References

[1] J.-S. R. Jang, J.-C. Chen, and M.-Y. Kao, “MIRACLE: A Music Information Retrieval System with Clustered Computing Engines,” ISMIR 2001.
[2] C.-C. Wang, C.-H. Chen, C.-Y. Kuo, L.-T. Chiu, and J.-S. R. Jang, “Accelerating Query by Singing/Humming on GPU: Optimization for Web Deployment,” ICASSP 2012.
[3] G. Poli, A. L. M. Levada, J. F. Mari, and J. H. Saito, “Voice Command Recognition with Dynamic Time Warping (DTW) using Graphics Processing Units (GPU) with Compute Unified Device Architecture (CUDA),” in Proceedings of the 19th International Symposium on Computer Architecture and High Performance Computing, SBAC-PAD 2007, Brazil, pp. 19–25, 2007.
[4] Jun Li, Shuangping Chen, and Yanhui Li, “The Fast Evaluation of Hidden Markov Models on GPU,” in IEEE International Conference on Intelligent Computing and Intelligent Systems, Shanghai, vol. 4, pp. 426–430, Nov. 2009.
[5] P. Ferraro, P. Hanna, L. Imbert, and T. Izart, “Accelerating Query-by-Humming on GPU,” in Proceedings of the 10th International Conference on Music Information Retrieval, ISMIR 2009, pp. 279–284, 2009.
[6] Chin-Yang Kuo, “Accelerating Query By Singing/Humming on GPU,” National Tsing Hua Univ., 2013.
[7] Tzu-Chiao Lin, “Research and Implementation of Query by Singing/Humming for Embedded Karaoke Systems,” National Tsing Hua Univ., 2009.
[8] Yi-Fan Fang, “Improvement and Implementation of Query by Singing/Humming Systems,” National Tsing Hua Univ., 2010.
[9] Tin Kam Ho, J. Hull, and Sargur N. Srihari, “Decision Combination in Multiple Classifier Systems,” in IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), Jan. 1994.
[10] Ming-Xian Zou, “Query By Singing/Humming Using Combination of Classifiers,” National Tsing Hua Univ., 2008.
[11] CUDA – Wikipedia, http://en.wikipedia.org/wiki/CUDA
[12] NVIDIA, “CUDA C Best Practices Guide,” http://docs.nvidia.com/cuda/cuda-c-best-practices-guide/
[13] NVIDIA, “NVIDIA CUDA C Programming Guide Version 4.2,” http://docs.nvidia.com/cuda/cuda-c-programming-guide/
[14] CVG @ ETHZ - GP – GPU: General Purpose Programming on the Graphics Processing Unit, http://www.cvg.ethz.ch/teaching/2011spring/gpgpu/2012
[15] 林俊淵, 周嘉奕, 林郁翔, 李昇達, 陳昱蓉, 黃宣穎, 李天齡, “CUDA輕鬆上手。新世代GPU應用技術,” 松崗資訊股份有限公司, 2011.
[16] NVIDIA_GPU_Computing_Webinars_CUDA_Memory_Optimization.pdf, 2012.
[17] J.-S. Roger Jang and Ming-Yang Kao, “A Query-by-Singing System based on Dynamic Programming,” International Workshop on Intelligent Systems Resolutions (the 8th Bellman Continuum), pp. 85–89, Hsinchu, Taiwan, Dec. 2000.
[18] Borda Count – Wikipedia, http://en.wikipedia.org/wiki/Borda_count
[19] Downhill Simplex search: J. C. Lagarias, J. A. Reeds, M. H. Wright, and P. E. Wright, “Convergence Properties of the Nelder-Mead Simplex Method in Low Dimensions,” SIAM Journal on Optimization, vol. 9, no. 1, pp. 112–147, 1998.
