[1] Y. Bando and M. Tanaka, "A chord recognition method of guitar sound using its constituent tone information," IEEJ Transactions on Electrical and Electronic Engineering, vol. 17, no. 1, pp. 103–109, 2022.
[2] K. Lee, "Automatic chord recognition using an HMM with supervised learning," in ISMIR, 2006.
[3] J. Jiang, K. Chen, W. Li, and G. Xia, "Large-vocabulary chord transcription via chord structure decomposition," in ISMIR, pp. 644–651, 2019.
[4] C. Gamer, "Some combinational resources of equal-tempered systems," Journal of Music Theory, vol. 11, no. 1, pp. 32–59, 1967.
[5] H. Boatwright, "Harmonic materials of modern music: Resources of the tempered scale," Journal of the American Musicological Society, vol. 17, no. 3, pp. 408–413, 1964.
[6] J. Jiang, K. Chen, W. Li, and G. Xia, "MIREX 2018 submission: A structural chord representation for automatic large-vocabulary chord transcription," in Proceedings of the Music Information Retrieval Evaluation eXchange, 2018.
[7] Y. Wu, T. Carsault, and K. Yoshii, "Automatic chord estimation based on a frame-wise convolutional recurrent neural network with non-aligned annotations," in 2019 27th European Signal Processing Conference (EUSIPCO), pp. 1–5, IEEE, 2019.
[8] T. Fujishima, "Real-time chord recognition of musical sound: A system using Common Lisp Music," in Proceedings of the International Computer Music Conference, 1999.
[9] L. E. Baum and T. Petrie, "Statistical inference for probabilistic functions of finite state Markov chains," The Annals of Mathematical Statistics, vol. 37, no. 6, pp. 1554–1563, 1966.
[10] A. Sheh and D. P. Ellis, "Chord segmentation and recognition using EM-trained hidden Markov models," in ISMIR, pp. 183–189, 2003.
[11] G. D. Forney, "The Viterbi algorithm," Proceedings of the IEEE, vol. 61, no. 3, pp. 268–278, 1973.
[12] E. J. Humphrey and J. P. Bello, "Rethinking automatic chord recognition with convolutional neural networks," in 2012 11th International Conference on Machine Learning and Applications, pp. 357–362, IEEE, 2012.
[13] S. Sigtia, N. Boulanger-Lewandowski, and S. Dixon, "Audio chord recognition with a hybrid recurrent neural network," in ISMIR, pp. 127–133, 2015.
[14] B. McFee and J. P. Bello, "Structured training for large-vocabulary chord recognition," in ISMIR, pp. 188–194, 2017.
[15] C. Schörkhuber and A. Klapuri, "Constant-Q transform toolbox for music processing," in 7th Sound and Music Computing Conference, Barcelona, Spain, pp. 3–64, 2010.
[16] Y. Wu and W. Li, "Automatic audio chord recognition with MIDI-trained deep feature and BLSTM-CRF sequence decoding model," IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 27, no. 2, pp. 355–366, 2018.
[17] R. M. Bittner, B. McFee, J. Salamon, P. Li, and J. P. Bello, "Deep salience representations for F0 estimation in polyphonic music," in ISMIR, pp. 63–70, 2017.
[18] G. Byambatsogt, L. Choimaa, and G. Koutaki, "Guitar chord sensing and recognition using multi-task learning and physical data augmentation with robotics," Sensors, vol. 20, no. 21, p. 6077, 2020.
[19] R. Caruana, Multitask Learning. Springer, 1998.
[20] D. Kim, Y. Lee, and H. Ko, "Multi-task learning for animal species and group category classification," in Proceedings of the 2019 7th International Conference on Information Technology: IoT and Smart City, pp. 435–438, 2019.
[21] S. Ruder, "An overview of multi-task learning in deep neural networks," 2017. arXiv:1706.05098.
[22] R. Caruana, "Multitask learning: A knowledge-based source of inductive bias," in Proceedings of the Tenth International Conference on Machine Learning, pp. 41–48, Citeseer, 1993.
[23] L. Duong, T. Cohn, S. Bird, and P. Cook, "Low resource dependency parsing: Cross-lingual parameter sharing in a neural network parser," in Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing, pp. 845–850, 2015.
[24] L. C. Lu, "An interactive call and response blue guitar jamming system based on Markov chain and music theory," Master's thesis, National Tsing Hua University, 2022.
[25] T.-Y. Lin, P. Goyal, R. Girshick, K. He, and P. Dollár, "Focal loss for dense object detection," in Proceedings of the IEEE International Conference on Computer Vision, pp. 2980–2988, 2017.
[26] S. Panchapagesan, M. Sun, A. Khare, S. Matsoukas, A. Mandal, B. Hoffmeister, and S. Vitaladevuni, "Multi-task learning and weighted cross-entropy for DNN-based keyword spotting," in Interspeech, pp. 760–764, 2016.
[27] D. Griffin and J. Lim, "Signal estimation from modified short-time Fourier transform," IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 32, no. 2, pp. 236–243, 1984.
[28] B. McFee, E. J. Humphrey, and J. P. Bello, "A software framework for musical data augmentation," in ISMIR, pp. 248–254, Citeseer, 2015.
[29] N. Takahashi, M. Gygli, B. Pfister, and L. Van Gool, "Deep convolutional neural networks and data augmentation for acoustic event detection," 2016. arXiv:1604.07160.
[30] I. J. Good, "Rational decisions," Journal of the Royal Statistical Society: Series B (Methodological), vol. 14, no. 1, pp. 107–114, 1952.
[31] P. Cerda, G. Varoquaux, and B. Kégl, "Similarity encoding for learning with dirty categorical variables," Machine Learning, vol. 107, no. 8–10, pp. 1477–1494, 2018.
[32] A. L. Maas, A. Y. Hannun, and A. Y. Ng, "Rectifier nonlinearities improve neural network acoustic models," in Proceedings of the International Conference on Machine Learning, vol. 30, p. 3, 2013.
[33] A. F. Agarap, "Deep learning using rectified linear units (ReLU)," 2018. arXiv:1803.08375.
[34] N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, "Dropout: A simple way to prevent neural networks from overfitting," Journal of Machine Learning Research, vol. 15, no. 56, pp. 1929–1958, 2014.
[35] L. Liebel and M. Körner, "Auxiliary tasks in multi-task learning," 2018. arXiv:1805.06334.
[36] A. Kendall, Y. Gal, and R. Cipolla, "Multi-task learning using uncertainty to weigh losses for scene geometry and semantics," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7482–7491, 2018.
[37] O. Abdel-Hamid, A.-r. Mohamed, H. Jiang, L. Deng, G. Penn, and D. Yu, "Convolutional neural networks for speech recognition," IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 22, no. 10, pp. 1533–1545, 2014.
[38] K. He, X. Zhang, S. Ren, and J. Sun, "Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification," in Proceedings of the IEEE International Conference on Computer Vision, pp. 1026–1034, 2015.
[39] X. Glorot and Y. Bengio, "Understanding the difficulty of training deep feedforward neural networks," in Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, pp. 249–256, PMLR, May 2010.
[40] E. Cano, D. FitzGerald, A. Liutkus, M. D. Plumbley, and F.-R. Stöter, "Musical source separation: An introduction," IEEE Signal Processing Magazine, vol. 36, no. 1, pp. 31–40, 2018.
[41] D. P. Kingma and J. Ba, "Adam: A method for stochastic optimization," 2014. arXiv:1412.6980.
[42] R. Kohavi, "A study of cross-validation and bootstrap for accuracy estimation and model selection," in IJCAI, pp. 1137–1145, Montreal, Canada, 1995.
[43] F. Korzeniowski, D. R. Sears, and G. Widmer, "A large-scale study of language models for chord prediction," in 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 91–95, IEEE, 2018.
[44] T. Carsault, J. Nika, P. Esling, and G. Assayag, "Combining real-time extraction and prediction of musical chord progressions for creative applications," Electronics, vol. 10, no. 21, p. 2634, 2021.