Exploring Algorithmic Composition Techniques in Music Generation
Algorithmic composition uses algorithms to create music, mimicking human composers by generating music from explicit rules and structures. This presentation surveys approaches such as DeepBach, MuseGAN, and EMI, covering evolutionary algorithms, machine learning models, and linguistic techniques. It reviews the elements of music, categorizes methods by their rules and models, outlines the technical components of the algorithms, and takes a deeper look at the DeepBach model, which imitates Bach's style using an RNN and Gibbs sampling.
Algorithmic Composition: Some Approaches. Taiwan Evolutionary Intelligence Laboratory, Group Meeting Presentation, 2018/03/13
Outline Introduction Algorithms: DeepBach MuseGAN EMI Conclusion
Introduction Algorithmic composition is the technique of using algorithms to create music (Wikipedia). The goal is to create music the way human composers do, generating it randomly according to a set of rules.
Introduction Elements of music: melody, rhythm, timbre; temporal structure; hierarchical structure. Focus: symbolic music generation (composition).
Categories Human pre-defined rules (prior knowledge); evolutionary algorithms; machine learning (Markov model, RNN, GAN); linguistic models.
Technical Components of the Algorithms Encoding, model, and dataset. The dataset determines the genre of the generated music.
DeepBach Proposed by Sony CSL in 2016. Imitates Bach's musical style using an RNN, which learns the temporal structure of the music.
DeepBach A chorale is represented as a tuple of six lists: (V1, V2, V3, V4, S, F), the four voice lists plus subdivision and fermata metadata.
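The six-list representation above can be sketched in a few lines. This is a minimal illustration, not DeepBach's actual encoding: the hold symbol and the step granularity are assumptions for the example.

```python
# A minimal sketch of the six-list chorale representation described above.
# Each voice list Vi holds one MIDI pitch (or a hold symbol) per time step;
# S marks the subdivision (position within the beat) and F marks fermatas.
HOLD = "__"  # hypothetical "continue previous note" symbol

def make_chorale(num_steps):
    voices = [[HOLD] * num_steps for _ in range(4)]        # V1..V4
    subdivision = [step % 4 for step in range(num_steps)]  # S: beat position
    fermata = [0] * num_steps                              # F: 1 under a fermata
    return (*voices, subdivision, fermata)

V1, V2, V3, V4, S, F = make_chorale(16)
```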
DeepBach Generates music by Gibbs sampling. Gibbs sampling applies when the joint distribution P(X, Y) is unknown or difficult to sample from directly, but the conditional distributions P(X|Y) and P(Y|X) are known.
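A toy example of the idea: drawing from a bivariate normal with correlation rho using only its two conditional distributions. The distribution choice is illustrative and unrelated to music; it just shows the alternating conditional updates.

```python
import random

# Toy Gibbs sampler: sample (x, y) from a correlated bivariate normal
# using only the conditionals, updating one variable at a time.
def gibbs_bivariate_normal(rho, iters, seed=0):
    rng = random.Random(seed)
    x, y = 0.0, 0.0
    samples = []
    sd = (1 - rho**2) ** 0.5
    for _ in range(iters):
        x = rng.gauss(rho * y, sd)  # draw x | y
        y = rng.gauss(rho * x, sd)  # draw y | x
        samples.append((x, y))
    return samples

samples = gibbs_bivariate_normal(rho=0.8, iters=5000)
```

With enough iterations the empirical correlation of the samples approaches rho, even though the joint distribution was never sampled directly.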
DeepBach Generation procedure: the music is initialized randomly; in each iteration, a note is chosen at random and re-sampled from the trained model's conditional distribution.
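The generation loop above can be sketched as follows. The trained RNN is replaced by a uniform stub (`predict_note`), and the pitch range, step count, and iteration count are all illustrative assumptions, not DeepBach's actual settings.

```python
import random

# Sketch of the DeepBach-style generation loop: start from a random
# chorale, then repeatedly pick one note and resample it. The real model
# conditions on the surrounding musical context; a uniform stub stands in.
PITCHES = list(range(60, 73))  # illustrative one-octave range, C4..C5

def predict_note(voices, voice, step):
    # Placeholder for the trained model's conditional distribution.
    return random.choice(PITCHES)

def generate(num_voices=4, num_steps=64, iterations=1000, seed=0):
    random.seed(seed)
    voices = [[random.choice(PITCHES) for _ in range(num_steps)]
              for _ in range(num_voices)]           # random initialization
    for _ in range(iterations):
        v = random.randrange(num_voices)            # pick a voice...
        t = random.randrange(num_steps)             # ...and a time step
        voices[v][t] = predict_note(voices, v, t)   # resample that one note
    return voices

chorale = generate()
```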
DeepBach Dataset: the database of chorale harmonizations by J.S. Bach included in the music21 toolkit. A chorale is a short piece written for a four-part chorus (soprano, alto, tenor, and bass).
MuseGAN Proposed by Dong et al. in 2017 for multi-track polyphonic music generation. Uses a GAN model, with CNNs in the GAN learning the temporal structure of the music.
MuseGAN Each bar is encoded as an 84 × 96 × 5 tensor (pitches × time steps × tracks).
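In code, such a bar is just a binary piano-roll array. The axis order and the lowest-pitch offset below are assumptions for illustration; the paper's implementation may arrange the axes differently.

```python
import numpy as np

# One bar as a binary piano-roll tensor with the dimensions quoted above:
# 84 pitches x 96 time steps x 5 tracks. A True entry means the track
# sounds that pitch at that time step.
bar = np.zeros((84, 96, 5), dtype=bool)

# Illustrative note: middle C (MIDI 60, assuming the 84-pitch range starts
# at MIDI 24) held for a quarter of the bar on track 3.
bar[60 - 24, 0:24, 3] = True
```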
MuseGAN Temporal structure [slide figures: temporal information is fed as input to the generator, which produces a sequence of bars]
MuseGAN Dataset: Lakh MIDI Dataset / Lakh Pianoroll Dataset, with rock music selected.
MuseGAN Several metrics are proposed to measure how well the model learned from the dataset: ratio of empty bars, number of used pitch classes, ratio of qualified notes, drum pattern, and tonal distance (between tracks).
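Two of these metrics are simple to compute from a piano-roll. The sketch below assumes a `(num_bars, time_steps, 84)` binary array and a lowest pitch of MIDI 24; both are assumptions for the example, not the paper's exact evaluation code.

```python
import numpy as np

# Two MuseGAN-style metrics sketched for one track:
# the ratio of empty bars, and the number of distinct pitch classes used.
def empty_bar_ratio(pianoroll):
    # A bar is empty if no note sounds anywhere in it.
    return float(np.mean(pianoroll.sum(axis=(1, 2)) == 0))

def used_pitch_classes(pianoroll, lowest_pitch=24):
    active = np.flatnonzero(pianoroll.sum(axis=(0, 1)))  # active pitch bins
    return len({(p + lowest_pitch) % 12 for p in active})

roll = np.zeros((4, 96, 84), dtype=bool)
roll[0, :, 36] = True   # bar 0 plays a single pitch; bars 1-3 are empty
```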
EMI Experiments in Musical Intelligence. Developed by David Cope since 1987. Learns different styles of music. Non-linear, linguistic-based composition: music is processed like natural language and is not generated in temporal order.
EMI Linguistic model: pattern matching, signatures, augmented transition network (ATN).
ATN Uses finite state machines to parse sentences. N: noun; V: verb; Adj: adjective; Det: determiner (article); PN: proper noun; CAT(X): the category of word X.
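A bare-bones version of the idea: a finite state machine that accepts simple sentences by stepping through word categories. The lexicon and transition table are illustrative toys, not Cope's actual ATN.

```python
# Minimal finite-state sketch of category-driven sentence parsing.
# LEXICON plays the role of CAT(X): it maps each word to its category.
LEXICON = {"the": "Det", "a": "Det", "big": "Adj",
           "dog": "N", "cat": "N", "alice": "PN",
           "runs": "V", "sleeps": "V"}

# state -> {category: next_state}; reaching "END" means the parse succeeded.
TRANSITIONS = {
    "S":  {"Det": "NP", "PN": "VP"},
    "NP": {"Adj": "NP", "N": "VP"},
    "VP": {"V": "END"},
}

def accepts(sentence):
    state = "S"
    for word in sentence.lower().split():
        cat = LEXICON.get(word)                       # CAT(word)
        state = TRANSITIONS.get(state, {}).get(cat)   # follow the arc
        if state is None:
            return False
    return state == "END"
```

A full ATN augments such machines with registers and recursive sub-networks; this sketch shows only the plain finite-state core.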
EMI Pattern matching finds signatures; the ATN then rearranges the signatures into new works in a similar musical style.
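The signature-finding step can be sketched as searching for short patterns that recur across several works. Matching interval sequences rather than absolute pitches makes the search transposition-invariant; this simplification, and the melodies below, are assumptions for the example.

```python
# Crude sketch of signature extraction: find interval patterns that
# appear in more than one work, as a stand-in for EMI's pattern matching.
def intervals(melody):
    return tuple(b - a for a, b in zip(melody, melody[1:]))

def signatures(works, length=3):
    seen = {}
    for w, melody in enumerate(works):
        iv = intervals(melody)
        for i in range(len(iv) - length + 1):
            seen.setdefault(iv[i:i + length], set()).add(w)
    # a signature is a pattern shared by at least two works
    return {pat for pat, ws in seen.items() if len(ws) > 1}

# Two toy melodies with the same contour, a fifth apart.
works = [[60, 62, 64, 65, 67], [67, 69, 71, 72, 74]]
sigs = signatures(works)
```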
Demo DeepBach https://youtu.be/QiBM7-5hA6o MuseGAN https://salu133445.github.io/musegan/results EMI https://youtu.be/-Nb1s-o7dVg (2012) Disclaimer: there is no information on whether these tracks were cherry-picked or edited.
Comments Classical music is easier to learn because its rules are relatively explicit (counterpoint, etc.). Data quality is also important. The benchmarks used in MuseGAN are not realistic measures of musical quality.
Comments RNNs, GANs, and linguistic models all perform well at learning to create music. In my opinion, the linguistic model represents musical structure better and is therefore more reasonable than the other models.
References DeepBach: a Steerable Model for Bach Chorales Generation (Hadjeres and Pachet, 2016). MuseGAN: Multi-track Sequential Generative Adversarial Networks for Symbolic Music Generation and Accompaniment (Dong et al., 2017); slides: https://salu133445.github.io/musegan/pdf/musegan-aaai2018-slides.pdf. Experiments in Musical Intelligence (EMI): Non-Linear Linguistic-Based Composition (Cope, 1989). Computer Modeling of Musical Intelligence in Experiments in Musical Intelligence (Cope, 1992). An Expert System for Computer-Assisted Music Composition (Cope, 1987). Pattern Matching as an Engine for the Computer Simulation of Musical Style (Cope, 1990). Augmented Transition Networks: http://www.fit.vutbr.cz/~rudolfa/grants.php?file=%2Fproj%2F533%2Ffmnl03-atn.pdf&id=533