This repository contains code for our NeurIPS 2020 paper *Improving Generalization in Reinforcement Learning with Mixture Regularization*. The PPO code is based on train-procgen; the Rainbow code is based on retro-baselines and anyrl-py.
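At its core, mixture regularization (mixreg) trains the agent on convex combinations of observations from different environments, interpolating the associated supervision signals (e.g. rewards) with the same mixup-style coefficient. The following is a minimal sketch of that idea in NumPy; the function name, signature, and Beta parameter are illustrative and not this repo's API.

```python
import numpy as np

def mixreg_batch(obs, rewards, alpha=0.2, rng=None):
    """Mix each sample with a random partner from the same batch (mixreg sketch).

    obs: (N, ...) array of observations; rewards: (N,) array of rewards.
    Returns mixed observations and mixed rewards.
    """
    rng = rng or np.random.default_rng()
    n = obs.shape[0]
    # Mixing coefficient lambda ~ Beta(alpha, alpha), drawn per sample.
    lam = rng.beta(alpha, alpha, size=n)
    # Random pairing of samples within the batch.
    perm = rng.permutation(n)
    # Broadcast lambda over the trailing observation dimensions.
    lam_obs = lam.reshape((n,) + (1,) * (obs.ndim - 1))
    mixed_obs = lam_obs * obs + (1 - lam_obs) * obs[perm]
    # Interpolate the supervision signal with the same coefficients.
    mixed_rew = lam * rewards + (1 - lam) * rewards[perm]
    return mixed_obs, mixed_rew
```

Because each mixed sample is a convex combination, the mixed rewards always stay within the range of the original batch's rewards.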
```shell
# Create and activate the conda environment
conda env create --file py36_cu9_tf112.yml
conda activate py36_cu9_tf112

# Install OpenAI baselines from source
git clone https://github.com/openai/baselines.git
cd baselines
pip install -e .
```
See the README in the experiments folder for instructions on running different experiments. You may also use the scripts in that folder to start training. All results are available on Google Drive.
```bibtex
@misc{wang2020improving,
    title={Improving Generalization in Reinforcement Learning with Mixture Regularization},
    author={Kaixin Wang and Bingyi Kang and Jie Shao and Jiashi Feng},
    year={2020},
    eprint={2010.10814},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}
```