|
[1] A. E. Rosenberg, “Automatic speaker verification: A review,” Proceedings of the IEEE, vol. 64, no. 4, pp. 475–487, 1976.
[2] S. Furui, “Cepstral analysis technique for automatic speaker verification,” IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 29, no. 2, pp. 254–272, 1981.
[3] J. P. Campbell, “Speaker recognition: a tutorial,” Proceedings of the IEEE, vol. 85, no. 9, pp. 1437–1462, 1997.
[4] F. Bimbot, J.-F. Bonastre, C. Fredouille, G. Gravier, I. Magrin-Chagnolleau, S. Meignier, T. Merlin, J. Ortega-García, D. Petrovska-Delacrétaz, and D. A. Reynolds, “A tutorial on text-independent speaker verification,” EURASIP Journal on Advances in Signal Processing, vol. 2004, no. 4, p. 101962, 2004.
[5] W. M. Campbell, J. P. Campbell, T. P. Gleason, D. A. Reynolds, and W. Shen, “Speaker verification using support vector machines and high-level features,” IEEE Transactions on Audio, Speech, and Language Processing, vol. 15, no. 7, pp. 2085–2094, 2007.
[6] N. W. Evans, T. Kinnunen, and J. Yamagishi, “Spoofing and countermeasures for automatic speaker verification,” in Interspeech, pp. 925–929, 2013.
[7] Z. Wu, T. Kinnunen, N. Evans, J. Yamagishi, C. Hanilçi, M. Sahidullah, and A. Sizov, “ASVspoof 2015: the first automatic speaker verification spoofing and countermeasures challenge,” in Sixteenth Annual Conference of the International Speech Communication Association, 2015.
[8] M. Todisco, H. Delgado, and N. Evans, “Constant Q cepstral coefficients: A spoofing countermeasure for automatic speaker verification,” Computer Speech & Language, vol. 45, pp. 516–535, 2017.
[9] M. Singh and D. Pati, “Countermeasures to replay attacks: A review,” IETE Technical Review, pp. 1–16, 2019.
[10] H. Tak and H. Patil, “Novel linear frequency residual cepstral features for replay attack detection,” in Proc. Interspeech 2018, pp. 726–730, 2018.
[11] M. Todisco, H. Delgado, and N. Evans, “A new feature for automatic speaker verification anti-spoofing: Constant Q cepstral coefficients,” in Odyssey 2016, pp. 283–290, 2016.
[12] S. Jelil, R. K. Das, S. M. Prasanna, and R. Sinha, “Spoof detection using source, instantaneous frequency and cepstral features,” in Interspeech, pp. 22–26, 2017.
[13] Z. Chen, Z. Xie, W. Zhang, and X. Xu, “ResNet and model fusion for automatic spoofing detection,” in Interspeech, pp. 102–106, 2017.
[14] N. Huang, Z. Shen, S. Long, M. Wu, H. Shih, Q. Zheng, N.C. Yen, C.C. Tung, and H. Liu, “The empirical mode decomposition and the Hilbert spectrum for nonlinear and nonstationary time series analysis,” Proceedings of the Royal Society of London. Series A: Mathematical, Physical and Engineering Sciences, vol. 454, pp. 903–995, Mar. 1998.
[15] S. Davis and P. Mermelstein, “Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences,” IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 28, no. 4, pp. 357–366, 1980.
[16] M. Bezoui, A. Elmoutaouakkil, and A. Benihssane, “Feature extraction of some Quranic recitation using mel-frequency cepstral coeficients (MFCC),” in 2016 5th International Conference on Multimedia Computing and Systems (ICMCS), pp. 127–131, 2016.
[17] A. B. Kandali, A. Routray, and T. K. Basu, “Emotion recognition from Assamese speeches using MFCC features and GMM classifier,” in TENCON 2008 - 2008 IEEE Region 10 Conference, pp. 1–5, 2008.
[18] K. S. R. Murty and B. Yegnanarayana, “Combining evidence from residual phase and MFCC features for speaker recognition,” IEEE Signal Processing Letters, vol. 13, no. 1, pp. 52–55, 2006.
[19] M. Witkowski, S. Kacprzak, P. Żelasko, K. Kowalczyk, and J. Gałka, “Audio replay attack detection using high-frequency features,” in Interspeech, pp. 27–31, Aug. 2017.
[20] T. Kinnunen, M. Sahidullah, H. Delgado, M. Todisco, N. Evans, J. Yamagishi, and K. A. Lee, “The ASVspoof 2017 challenge: Assessing the limits of replay spoofing attack detection,” 2017.
[21] H. Delgado, M. Todisco, M. Sahidullah, N. Evans, T. Kinnunen, K. A. Lee, and J. Yamagishi, “ASVspoof 2017 version 2.0: meta-data analysis and baseline enhancements,” in Odyssey 2018 - The Speaker and Language Recognition Workshop, 2018.
[22] M. Todisco, X. Wang, V. Vestman, M. Sahidullah, H. Delgado, A. Nautsch, J. Yamagishi, N. Evans, T. Kinnunen, and K. A. Lee, “ASVspoof 2019: Future horizons in spoofed and fake audio detection,” arXiv preprint arXiv:1904.05441, 2019.
[23] P. Tapkir and H. Patil, “Novel empirical mode decomposition cepstral features for replay spoof detection,” in Proc. Interspeech 2018, pp. 721–725, 2018.
[24] S. Mankad and S. Garg, “On the performance of empirical mode decomposition-based replay spoofing detection in speaker verification systems,” Progress in Artificial Intelligence, vol. 9, Aug. 2020.
[25] A. Paszke, S. Gross, S. Chintala, G. Chanan, E. Yang, Z. DeVito, Z. Lin, A. Desmaison, L. Antiga, and A. Lerer, “Automatic differentiation in PyTorch,” 2017.
[26] D. Cai, Z. Ni, W. Liu, W. Cai, G. Li, and M. Li, “End-to-end deep learning framework for speech paralinguistics detection based on perception aware spectrum,” in Interspeech, pp. 3452–3456, 2017.
[27] L. Huang, Y. Gan, and H. Ye, “Audio-replay attacks spoofing detection for automatic speaker verification system,” in 2019 IEEE International Conference on Artificial Intelligence and Computer Applications (ICAICA), pp. 392–396, 2019.
[28] H. Dinkel, Y. Qian, and K. Yu, “Investigating raw wave deep neural networks for end-to-end speaker spoofing detection,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 26, no. 11, pp. 2002–2014, 2018.
[29] A. Martin, G. Doddington, T. Kamm, M. Ordowski, and M. Przybocki, “The DET curve in assessment of detection task performance,” tech. rep., National Institute of Standards and Technology, Gaithersburg, MD, 1997.
[30] B. Bakar and C. Hanilçi, “An experimental study on audio replay attack detection using deep neural networks,” in 2018 IEEE Spoken Language Technology Workshop (SLT), pp. 132–138, 2018.
|