
Detailed Record

Author (Chinese): 鄭光丸
Author (English): Giambi, Manuel
Title (Chinese): 酷貓人工智慧 – 從人類的對應方來學習爵士音樂即興作曲的自動化
Title (English): CoolCatAI – Tackling the Automated Jazz Improvisation Task by Learning from its Human Counterpart
Advisor (Chinese): 蘇豐文
Advisor (English): Soo, Von-Wun
Committee Members (Chinese): 郭柏志, 陳鴻文
Committee Members (English): Kuo, Po-Chih; Chen, Hong-Wen
Degree: Master's
University: National Tsing Hua University
Department: Institute of Information Systems and Applications
Student ID: 109065430
Publication Year (ROC): 111 (2022)
Graduation Academic Year: 110
Language: English
Number of Pages: 92
Keywords (Chinese): 爵士音樂 (jazz music), 人工智慧 (artificial intelligence)
Keywords (English): Jazz, Deep Learning, Machine Learning, Music, Improvisation
Usage Statistics:
  • Recommendations: 0
  • Views: 577
  • Rating: *****
  • Downloads: 0
  • Bookmarks: 0
Abstract:
Creative tasks are at the cutting edge of machine learning research, and although the field is seeing many recent improvements, automated systems are still far from reaching human levels of proficiency and creativity. Advancements in music generation, and in jazz music generation in particular, are slowed by the lack of sizeable, high-quality datasets. In this work, we try to mitigate this problem by curating a large symbolic jazz music dataset that can be used for a number of downstream tasks. This dataset contains improvised melodies (solos), each paired and aligned with its corresponding chord progression and original melody.
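To make the pairing concrete, the sketch below shows one way a single dataset example could be represented in Python. The class and field names are hypothetical illustrations of the structure described above, not the thesis's actual schema or file format.

from dataclasses import dataclass
from typing import List

@dataclass
class NoteEvent:
    pitch: int        # MIDI pitch number, e.g. 60 = middle C
    offset: float     # onset position, in quarter notes from the start
    duration: float   # length, in quarter notes

@dataclass
class ChordEvent:
    symbol: str       # e.g. "Dm7", "G7", "Cmaj7"
    offset: float
    duration: float

@dataclass
class JazzExample:
    song_title: str
    chords: List[ChordEvent]          # the chord progression
    original_melody: List[NoteEvent]  # the lead-sheet melody
    solo: List[NoteEvent]             # the improvised melody, aligned to the same chords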
Furthermore, we design a family of deep learning models (dubbed 'CoolCatAI') to test the hypothesis that learning from the human task we are trying to automate can help us achieve better results. We train these models on the newly created dataset and discuss the results.
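The table of contents below indicates that these models are LSTM networks that combine learned embeddings for note offset, duration, and pitch with a chord encoding (Sections 3.1–3.3 and 6.1). The following is a minimal PyTorch sketch of a model with that general shape; every layer size, vocabulary size, and name here is an assumption for illustration, not the actual CoolCatAI architecture.

import torch
import torch.nn as nn

class MelodyLSTM(nn.Module):
    """Next-note predictor sketch; all sizes below are assumed, not from the thesis."""
    def __init__(self, n_pitches=130, n_durations=32, n_offsets=48,
                 chord_dim=12, hidden_size=256):
        super().__init__()
        # Separate learned embeddings for pitch, duration, and offset tokens
        self.pitch_emb = nn.Embedding(n_pitches, 64)
        self.dur_emb = nn.Embedding(n_durations, 16)
        self.off_emb = nn.Embedding(n_offsets, 16)
        self.lstm = nn.LSTM(64 + 16 + 16 + chord_dim, hidden_size,
                            num_layers=2, batch_first=True)
        self.head = nn.Linear(hidden_size, n_pitches)  # logits over the next pitch token

    def forward(self, pitch, dur, off, chord):
        # pitch/dur/off: (batch, seq) integer tokens; chord: (batch, seq, chord_dim)
        x = torch.cat([self.pitch_emb(pitch), self.dur_emb(dur),
                       self.off_emb(off), chord], dim=-1)
        h, _ = self.lstm(x)   # h: (batch, seq, hidden_size)
        return self.head(h)   # (batch, seq, n_pitches)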
An analysis of the models' learned embeddings indicates that the models have learned fundamental music theory concepts, and an objective evaluation of the generated music shows promising results in metrics pertaining to four areas: melody, rhythm, harmony, and creativity. On most of these metrics, our models surpass previous approaches. Finally, subjective evaluation results show that the perceived quality and novelty of the music generated by CoolCatAI are comparable to those of human-improvised music.
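As an illustration of what a harmony metric in such an evaluation can look like, the sketch below computes a chord-tone ratio: the fraction of solo notes whose pitch class belongs to the chord sounding at their onset. This is one plausible metric of the harmony category, not necessarily one of the exact metrics defined in Section 5.4 of the thesis.

def chord_tone_ratio(notes, chords):
    """notes: list of (midi_pitch, offset) pairs.
    chords: list of (pitch_class_set, start, end) spans in the same time units."""
    hits, total = 0, 0
    for pitch, offset in notes:
        for pitch_classes, start, end in chords:
            if start <= offset < end:
                total += 1
                hits += (pitch % 12) in pitch_classes
                break
    return hits / total if total else 0.0

# A C (chord tone) and an F# (non-chord tone) over four beats of Cmaj7 = {C, E, G, B}
notes = [(60, 0.0), (66, 1.0)]
chords = [({0, 4, 7, 11}, 0.0, 4.0)]
print(chord_tone_ratio(notes, chords))  # 0.5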
Table of Contents:
Abstract (Chinese)
Abstract
Contents
List of Figures
List of Tables
List of Algorithms
1 Introduction
1.1 Thesis Structure
1.2 Motivation
1.3 Understanding Improvisation
1.4 Learning to Improvise
1.5 Improvisation Context
2 Background
2.1 Terminology
2.1.1 Music Generation
2.1.2 Improvisation
2.1.3 Leadsheet
2.1.4 Chord Progression
2.1.5 Melody
2.1.6 Cycle
2.2 Related Work
3 Methodology
3.1 CoolCatAI
3.1.1 Network Architecture
3.1.2 LSTM Networks
3.2 Rhythm Encoding
3.2.1 Time-Step Encoding
3.2.2 Duration Encoding
3.3 Chord Encoding
3.3.1 Compressed Chord Encoding
3.3.2 Fixed Chord Encoding
3.3.3 Extended Chord Encoding
4 Dataset
4.1 Data Curation
4.1.1 Example Structure
4.1.2 Data Sources
4.1.3 Chord Progressions
4.1.4 Duplicate Melody Removal
4.1.5 File Names Standardization
4.1.6 Time Signature Selection
4.1.7 Original and Improvised Melodies Tagging
4.1.8 Melody Extraction
4.1.9 Polyphony Removal
4.1.10 Melody Alignment
4.1.11 Metadata Integration
4.2 Dataset Analysis
4.2.1 Number of Improvised Examples per Song
4.2.2 Number of measures
4.2.3 Number of notes
4.2.4 Note offset
4.2.5 Note duration
4.2.6 Note pitch and pitch class
4.2.7 Chord triad
4.2.8 Song key
5 Experiments
5.1 Baseline
5.2 Training
5.2.1 Input tensors creation
5.2.2 Hyper-parameters
5.2.3 Training termination
5.3 Generation
5.3.1 Generation hyper-parameters
5.4 Objective Evaluation
5.4.1 Melody Metrics
5.4.2 Rhythm Metrics
5.4.3 Harmony Metrics
5.4.4 Creativity Metrics
5.5 Subjective Evaluation
5.5.1 Personal Information
5.5.2 Musical Evaluation
6 Results
6.1 Embedding Analysis
6.1.1 Offset Embedding Analysis
6.1.2 Duration Embedding Analysis
6.1.3 Pitch Embedding Analysis
6.2 Objective Evaluation Results
6.2.1 Melody Metrics
6.2.2 Rhythm Metrics
6.2.3 Harmony Metrics
6.2.4 Creativity Metrics
6.3 Subjective Evaluation Results
6.3.1 Demographics
6.3.2 Score Analysis
6.3.3 Human vs. Computer
7 Conclusion
7.1 Contributions
7.2 Future Work
A Dataset
A.1 Data Sources
Bibliography