|
1. Vinoo Alluri and Petri Toiviainen. Exploring perceptual and acoustical correlates of polyphonic timbre. Music Perception: An Interdisciplinary Journal, 27(3):223–242, 2010. 2. Jean-Julien Aucouturier and Emmanuel Bigand. Seven problems that keep mir from attracting the interest of cognition and neuroscience. Journal of Intelligent Information Systems, 41(3):483–497, 2013. 3. O. B. Bohan. Singing style transfer, 2017. http://madebyoll.in/posts/singing_style_transfer/. 4. Anne Caclin, Stephen McAdams, Bennett K Smith, and Suzanne Winsberg. Acoustic correlates of timbre space dimensions: A confirmatory study using synthetic tones. The Journal of the Acoustical Society of America, 118(1):471–482, 2005. 5. Marcelo Freitas Caetano and Xavier Rodet. Sound morphing by feature interpolation. In Proc. IEEE ICASSP, pages 22–27, 2011. 6. Yu-Sheng Chen, Yu-Ching Wang, Man-Hsin Kao, and Yung-Yu Chuang. Deep photo enhancer: Unpaired learning for image enhancement from photographs with gans. In CVPR, pages 6306–6314, 2018. 7. Shuqi Dai and Gus Xia. Music style transfer issues: A position paper. In the 6th International Workshop on Musical Metacreation (MUME), 2018. 8. Chris Donahue, Julian McAuley, and Miller Puckette. Synthesizing audio with generative adversarial networks. arXiv preprint arXiv:1802.04208, 2018. 9. Jonathan Driedger, Thomas Pr¨atzlich, and Meinard M¨uller. Let it bee-towards nmf-inspired audio mosaicing. In ISMIR, pages 350–356, 2015. 10. Leon A. Gatys, Alexander S. Ecker, and Matthias Bethge. Image style transfer using convolutional neural networks. In IEEE CVPR, pages 2414–2423, 2016. 11. Ian J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David WardeFarley, Sherjil Ozair, Aaron C. Courville, and Yoshua Bengio. Generative adversarial nets. In NIPS, pages 2672–2680, 2014. 12. John M Grey. Multidimensional perceptual scaling of musical timbres. The Journal of the Acoustical Society of America, 61(5):1270–1277, 1977. 13. JunYoung Gwak, Christopher B. Choy, Animesh Garg, Manmohan Chandraker, and Silvio Savarese. Weakly supervised generative adversarial networks for 3d reconstruction. CoRR, abs/1705.10904, 2017. 14. Albert Haque, Michelle Guo, and Prateek Verma. Conditional end-to-end audio transforms. arXiv preprint arXiv:1804.00047, 2018. 15. Ehsan Hosseini-Asl, Yingbo Zhou, Caiming Xiong, and Richard Socher. A multi-discriminator cyclegan for unsupervised non-parallel speech domain adaptation. arXiv preprint arXiv:1804.00522, 2018. 16. Xun Huang, Ming-Yu Liu, Serge Belongie, and Jan Kautz. Multimodal unsupervised image-to-image translation. In ECCV, 2018. 17. Alexia Jolicoeur-Martineau. The relativistic discriminator: a key element missing from standard GAN. CoRR, abs/1807.00734, 2018. 18. Kazuhiro Kobayashi, Tomoki Toda, Graham Neubig, Sakriani Sakti, and Satoshi Nakamura. Statistical singing voice conversion with direct waveform modification based on the spectrum differential. In INTERSPEECH, 2014. 19. Gustav Larsson, Michael Maire, and Gregory Shakhnarovich. Learning representations for automatic colorization. In Proc. ECCV, Part IV, pages 577–593,2016. 20. Olivier Lartillot, Petri Toiviainen, and Tuomas Eerola. A matlab toolbox for music information retrieval. In Data analysis, machine learning and applications, pages 261–268. Springer, 2008. 21. Yijun Li, Sifei Liu, Jimei Yang, and Ming-Hsuan Yang. Generative face completion. In CVPR, pages 5892–5900, 2017. 22. Ming-Yu Liu, Thomas Breuel, and Jan Kautz. Unsupervised image-to-image translation networks. CoRR, abs/1703.00848, 2017. 23. Xudong Mao, Qing Li, Haoran Xie, Raymond Y. K. Lau, Zhen Wang, and Stephen Paul Smolley. Least squares generative adversarial networks. In ICCV,pages 2813–2821, 2017. 24. Noam Mor, Lior Wolf, Adam Polyak, and Yaniv Taigman. A universal music translation network. arXiv preprint arXiv:1805.07848, 2018. 25. Geoffroy Peeters, Bruno L Giordano, Patrick Susini, Nicolas Misdariis, and Stephen McAdams. The timbre toolbox: Extracting audio descriptors from musical signals. The Journal of the Acoustical Society of America, 130(5):2902–2916, 2011. 26. Kai Siedenburg, Ichiro Fujinaga, and Stephen McAdams. A comparison of approaches to timbre descriptors in music information retrieval and music psychology. Journal of New Music Research, 45(1):27–41, 2016. 27. Stanley S Stevens. On the psychophysical law. Psychological review, 64(3):153,1957. 28. Shih-Yang Su, Cheng-Kai Chiu, Li Su, and Yi-Hsuan Yang. Automatic conversion of pop music into chiptunes for 8-bit pixel art. In Proc. IEEE ICASSP,pages 411–415. IEEE, 2017. 29. D. Ulyanov and V. Lebedev. Singing style transfer, 2016. https://dmitryulyanov.github.io/ audio-texture-synthesis-and-style-transfer/. 30. Vesa V¨alim¨aki, Sira Gonz´alez, Ossi Kimmelma, and Jukka Parviainen. Digital audio antiquing-signal processing methods for imitating the sound quality of historical recordings. Journal of the Audio Engineering Society, 56(3):115–139,2008. 31. A¨aron Van Den Oord, Sander Dieleman, Heiga Zen, Karen Simonyan, Oriol Vinyals, Alex Graves, Nal Kalchbrenner, Andrew W Senior, and Koray Kavukcuoglu. Wavenet: A generative model for raw audio. In SSW, page 125, 2016. 32. Prateek Verma and Julius O. Smith. Neural style transfer for audio spectograms. CoRR, abs/1801.01589, 2018. 33. Cheng-Wei Wu, Jen-Yu Liu, Yi-Hsuan Yang, and Jyh-Shing R Jang. Singing style transfer using cycle-consistent boundary equilibrium generative adversarial networks. arXiv preprint arXiv:1807.02254, 2018. 34. Lantao Yu, Weinan Zhang, Jun Wang, and Yong Yu. Seqgan: Sequence generative adversarial nets with policy gradient. In AAAI, pages 2852–2858, 2017. 35. Richard Zhang, Phillip Isola, and Alexei A. Efros. Colorful image colorization. In Proc. ECCV, Part III, 2016. 36. Jun-Yan Zhu, Taesung Park, Phillip Isola, and Alexei A. Efros. Unpaired image-to-image translation using cycle-consistent adversarial networks. CoRR, abs/1703.10593, 2017. 37. Jun-Yan Zhu, Richard Zhang, Deepak Pathak, Trevor Darrell, Alexei A. Efros, Oliver Wang, and Eli Shechtman. Toward multimodal image-to-image translation. In NIPS, pages 465–476, 2017.
|