MIDI Transformer

Table of Contents

  1. Overview
  2. Features
  3. Installation
  4. Running the Dashboards
  5. Available Dashboards
  6. Project Purpose
  7. How to Train Your Own Model
  8. Important Links
  9. Code Style

Overview

This project aims to model the emotional nuances of piano performances using GPT-2 transformer architectures. By leveraging MIDI format, the project trains models to generate and interpret musical performances, capturing the expressive qualities of piano music.

Features

  • Streamlit Dashboards: Visualize and interact with MIDI data and model outputs.
  • Model Training: Train GPT-2 models to generate piano music sequences.
  • Data Augmentation: Techniques such as pitch shifting and tempo changes to enhance training data.

Installation

To get started, clone the repository and install the required dependencies:

git clone https://github.com/Nospoko/midi-transformer.git
cd midi-transformer
pip install -r requirements.txt
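
Optionally, install into an isolated virtual environment first (standard Python tooling, nothing specific to this project):

python -m venv .venv
source .venv/bin/activate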

Running the Dashboards

Dashboards are built with Streamlit. To run the main dashboard:

PYTHONPATH=. streamlit run --server.port 4002 dashboards/main.py
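
Any of the dashboards listed below can be launched the same way by swapping in its path; the port number is arbitrary:

PYTHONPATH=. streamlit run --server.port 4003 dashboards/augmentation_review.py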

Available Dashboards

  • Main Dashboard: dashboards/main.py
    • Visualize MIDI data and model predictions.
  • GPT Review Dashboard: dashboards/gpt_review.py
    • Review GPT model outputs.
  • MIDI Dataset Review Dashboard: dashboards/midi_dataset_review.py
    • Explore and analyze the MIDI torch dataset.
  • HF Dataset Review Dashboard: dashboards/hf_dataset_review.py
    • Review the Hugging Face datasets defined in the tokenzied_midi_dataset module.
  • Run Evaluation Dashboard: dashboards/run_eval.py
    • Evaluate model performance.
  • Browse Generated Dashboard: dashboards/browse_generated.py
    • Browse MIDI sequences generated with python -m scripts.generate_all.
  • Augmentation Review Dashboard: dashboards/augmentation_review.py
    • Review data augmentation techniques.

Project Purpose

This project explores the intersection of music and machine learning by:

  • Modeling the expressive nuances of piano performances.
  • Developing methods for data augmentation and MIDI data processing.
  • Training transformer models to generate and interpret piano music.

How to Train Your Own Model

Setup

Make sure your environment is set up correctly: following the .env.example pattern, create a .env file with your personal tokens for wandb and Hugging Face.
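
For illustration, a .env file might look roughly like this; the variable names here are assumptions, so use the keys defined in .env.example:

# Illustrative layout only - the exact variable names are defined in .env.example.
WANDB_API_KEY=your_wandb_api_key
HF_TOKEN=your_hugging_face_token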

Train the Model

To train the model, use the following command:

python -m gpt2.train

Use AwesomeTokensDataset

To use the AwesomeTokensDataset, first run:

python -m scripts.train_awesome_tokenizer

This creates a pre-trained tokenizer JSON in the pretrained/awesome_tokenizers directory.
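
Assuming the JSON follows the Hugging Face tokenizers format (an assumption here, and the filename below is illustrative), it can be loaded for a quick sanity check:

from tokenizers import Tokenizer

# Path is illustrative - check pretrained/awesome_tokenizers for the actual filename.
tokenizer = Tokenizer.from_file("pretrained/awesome_tokenizers/awesome_tokenizer.json")
print(tokenizer.get_vocab_size())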

Augmentation

We apply the pitch_shift and change_speed augmentation techniques sequentially (pitch_shift, then change_speed). This results in a dataset approximately four times larger than the original.
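
As a minimal sketch of how sequential augmentation can roughly quadruple a dataset, assuming per-note dicts with pitch and start/end times (pitch_shift and change_speed below are illustrative stand-ins, not the project's implementations):

def pitch_shift(notes, semitones):
    # Transpose every note, clamping to the valid MIDI pitch range 0-127.
    return [{**n, "pitch": min(127, max(0, n["pitch"] + semitones))} for n in notes]

def change_speed(notes, factor):
    # Scale note timings; factor > 1 plays the sequence faster.
    return [{**n, "start": n["start"] / factor, "end": n["end"] / factor} for n in notes]

def augment(sample):
    # Original, pitch-shifted, speed-changed, and both transforms in sequence:
    # one way a roughly fourfold increase can arise.
    shifted = pitch_shift(sample, 2)
    return [sample, shifted, change_speed(sample, 1.05), change_speed(shifted, 1.05)]

notes = [{"pitch": 60, "start": 0.0, "end": 0.5}]
print(len(augment(notes)))  # 4 variants per input sample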


Important Links

Code Style

This repository uses pre-commit hooks to enforce Python formatting and linting (black, flake8, and isort):

pip install pre-commit
pre-commit install

Whenever you run git commit, the files added or modified in that commit are checked and, where possible, auto-corrected. black and isort can modify files locally; if that happens, you have to git add them again. You may also be prompted to fix some issues manually.

To run the hooks against all files without running git commit:

pre-commit run --all-files
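
For reference, a typical .pre-commit-config.yaml wiring up these three tools looks roughly like the following; the repository ships its own config, and the pinned revs below are placeholders:

repos:
  - repo: https://github.com/psf/black
    rev: 24.4.2  # placeholder - use the rev pinned in this repository
    hooks:
      - id: black
  - repo: https://github.com/pycqa/isort
    rev: 5.13.2  # placeholder
    hooks:
      - id: isort
  - repo: https://github.com/pycqa/flake8
    rev: 7.0.0  # placeholder
    hooks:
      - id: flake8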
