This repository contains the code for the experiments conducted in the paper Generating music with data: Application of Deep Learning models for symbolic music composition. The paper summarizes my research on applying Deep Learning models to symbolic music composition, conducted at the Universidade de São Paulo.
The repository also includes a summary of the experimental results reported in the paper, as well as a support notebook that gives an overview of AI Music.
Language models based on deep learning have shown promising results for artistic generation tasks, including music generation. However, the evaluation of symbolic music generation models is mostly based on low-level mathematical metrics (e.g. the value of the loss function), due to the inherent difficulty of measuring the musical quality of a given performance and the subjective nature of music.
This work seeks to measure and evaluate musical excerpts generated by deep learning models from a human perspective, limited to the scope of classical piano music generation.
In this assessment, a population of 117 people performed blind tests with human-composed musical excerpts and excerpts generated by artificial intelligence models, including PerformanceRNN (Oore et al., 2018), Music Transformer (Huang et al., 2018), MuseNet (Payne, 2019), and a custom model based on GRUs (Lee, 2020).
The experiments demonstrate that musical excerpts generated by models based on the Transformer neural network architecture (Vaswani et al., 2017) obtained the greatest receptivity within the tested population, surpassing the results of human compositions. They also demonstrate that people with greater musical sensitivity and experience are better able to identify the compositional origin of the excerpts they heard. Comments about some of the tested excerpts from participants with no musical experience, participants with musical experience, and professional musicians are also included in this work.
├── data                        # Data folder structure for the experiments
│   ├── interim                 # Intermediate data that has been transformed
│   ├── primers
│   │   ├── midi                # Example primers in MIDI format
│   │   └── tokens              # Example tokenized primer files
│   ├── processed               # The final, canonical data sets for modeling
│   └── raw                     # The original, immutable MIDI data
├── docs                        # Support documentation, i.e. summary of research results and intro to AI Music resources
├── notebooks                   # Simple exploratory data analysis of the MAESTRO dataset
├── src
│   ├── custom-gpt2             # Source code for the custom GPT2 model, including the whole pipeline
│   ├── custom-gru              # Source code for the custom GRU model, including the whole pipeline
│   ├── maestro                 # Dotnet notebook to extract random samples from the MAESTRO dataset
│   ├── musenet                 # README file presenting how to generate music with MuseNet
│   ├── music-transformers      # Source code for the Music Transformer model
│   └── performance-rnn         # Python notebook presenting how to generate music with Performance RNN
└── test-results                # Results and exploratory data analysis of the blind test performed in the research
You will need to install Conda to facilitate environment management. If you want to use the containerized version of some of the environments, you will also need to install Docker.
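For reference, a minimal environment setup could look like the sketch below. The environment file name, environment name, and Docker image tag are assumptions for illustration; the README inside each model folder is the authoritative source for the exact environment it needs.

```bash
# Create and activate a conda environment from an environment file
# (environment.yml and the environment name are placeholders; use the
# file shipped with the model folder you want to run).
conda env create -f environment.yml
conda activate symbolic-music

# Or, for the containerized environments, build and run a Docker image
# (image name and mount path are illustrative only).
docker build -t symbolic-music .
docker run --rm -it -v "$(pwd)/data:/workspace/data" symbolic-music
```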
It is highly recommended to read the paper Generating music with data: Application of Deep Learning models for symbolic music composition before starting to use this repository. The paper gives an overview of the concepts and experiments conducted in this work.
If you are completely new to AI Music, you can check the intro-to-ai-music notebook located in the documentation section of this repository. This notebook informally explains some concepts around AI Music. If you want a more technical perspective on the architecture of this type of problem, you can check the pipeline diagram, also located in the documentation section of this repository.
If you are mainly interested in the results of the experiments conducted in the paper, you can check the summary-of-research-results.md file located in the documentation section of this repository. This file summarizes the results of the blind test performed by the participants, along with some curiosities around AI Music.
The source code is organized so that each folder represents a model or a set of experiments. Each folder contains a README file with instructions on how to use the code. Most of the models fall into two categories:
- Trainable models: models that can be trained from scratch, including the Custom GRU and the Custom GPT2 models.
- Pretrained models: models that generate music from pre-trained checkpoints, including the Music Transformer and the Performance RNN models (see the Performance RNN sketch below).
The exception to these categories is the MuseNet model, which is not open-sourced by OpenAI but can nevertheless be used to generate music through their web interface.
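As one concrete example of the pre-trained route, Performance RNN can be sampled from a published Magenta checkpoint bundle. The sketch below follows the console script documented in Magenta's Performance RNN README rather than this repository's own pipeline; the bundle path, output directory, and flag values are illustrative, and the notebook in src/performance-rnn remains the authoritative reference here.

```bash
# Generate 10 piano performances (~3000 steps each) from a pre-trained
# Performance RNN bundle using Magenta's console script.
# Bundle path and output directory are placeholders.
performance_rnn_generate \
  --config=performance_with_dynamics \
  --bundle_file=/path/to/performance_with_dynamics.mag \
  --output_dir=/tmp/performance_rnn/generated \
  --num_outputs=10 \
  --num_steps=3000
```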
If you want to train the models from scratch, you will need to download the MAESTRO dataset and place it in the data/raw folder. You can build the dataset from MAESTRO automatically by using make with the following command:
make maestro
Note: Check the Makefile for other dataset automation targets, such as data splitting and data processing.
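A minimal sketch of that setup is shown below. The archive URL and dataset version are assumptions taken from the public MAESTRO release page; check the MAESTRO website and the Makefile for the release the pipelines expect.

```bash
# Download the MIDI-only MAESTRO archive and unpack it into data/raw
# (URL and version are assumptions; adjust to the release you need).
wget https://storage.googleapis.com/magentadata/datasets/maestro/v2.0.0/maestro-v2.0.0-midi.zip
unzip maestro-v2.0.0-midi.zip -d data/raw

# Build the dataset targets defined in the Makefile
make maestro
```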
Feel free to fork or contribute to this repository, and do not hesitate to contact me if you have any questions or suggestions.