
An Exploration of Optimization Alternatives for Deep Reinforcement Learning

Project for the Deep Learning course - ETH Zurich - Fall 2018

Goal: We analyze different optimization approaches and examine their performance in different DRL applications, aiming to understand why and how they perform differently. In particular, we focus our analysis on gradient-based approaches and (gradient-free) evolution-based optimization methods.

Environment                  | CartPole-v1 | BipedalWalker-v2
Gradient-based optimization  | DQN *       | TD3 **
Gradient-free optimization   | GA *        | GA ***

* feed-forward neural network consisting of 1 hidden layer with 24 units (see the sketch below)

** feed-forward neural networks consisting of 2 hidden layers with 512 and 256 units

*** feed-forward neural networks consisting of 3 hidden layers with 128, 128, and 3 units
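For concreteness, here is a minimal sketch of the CartPole network marked (*) above, assuming Keras; the function name and the activations are our illustrative assumptions, not necessarily the repository's exact code:

from keras.models import Sequential
from keras.layers import Dense

def build_cartpole_network(state_dim=4, n_actions=2):
    # 1 hidden layer with 24 units, as in (*) above
    model = Sequential()
    model.add(Dense(24, activation='relu', input_shape=(state_dim,)))
    model.add(Dense(n_actions, activation='linear'))  # one output per action
    return model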

Evaluation metrics: we compare the algorithms according to their time to convergence, the agent's total reward, the distances between network weights (sketched below), and the Hessian of the reward function.
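As an illustration, the distance between two agents' weights can be computed as the L2 norm of the difference of their flattened parameter vectors; a minimal sketch, assuming Keras models (the helper name is ours, not the repository's):

import numpy as np

def weight_distance(model_a, model_b):
    # flatten every layer's weights into one vector per model, then take the L2 norm
    w_a = np.concatenate([w.ravel() for w in model_a.get_weights()])
    w_b = np.concatenate([w.ravel() for w in model_b.get_weights()])
    return np.linalg.norm(w_a - w_b)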


Getting started

Requirements

Create a virtual environment and install all required packages:

conda create --name deep-learning python=3.6

source activate deep-learning

pip install -r requirements.txt

Configuration file

In config.yml, one can choose which OpenAI Gym environment and optimization algorithm to use (all available options are listed at the top of the file). For example:

environment:
  name: 'CartPole-v1'
  animate: False

algorithm: 'ga' 
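For reference, a minimal sketch of how such a file can be read with PyYAML; the repository's own loader lives in src/config, and the variable names here are illustrative:

import yaml

with open('config.yml') as f:
    config = yaml.safe_load(f)

env_name = config['environment']['name']  # e.g. 'CartPole-v1'
algorithm = config['algorithm']           # e.g. 'ga'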

For each environment, we defined a specific neural network architecture for the evolutionary algorithms in src/config/models.yml.

This file also contains the parameters of the optimization algorithms used to train the agents.
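A hypothetical entry illustrating such a per-environment architecture file, with layer sizes taken from the table above; the actual keys and layout of src/config/models.yml may differ:

CartPole-v1:
  layers: [24]           # 1 hidden layer with 24 units (*)
BipedalWalker-v2:
  layers: [128, 128, 3]  # 3 hidden layers (***)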

Train agents

Before starting training, make sure you have set the desired environment and optimization algorithm in config.yml.

python src/main.py

If you are using a machine without a display, please run the following instead:

xvfb-run -s "-screen 0 1400x900x24" python src/main.py

Results analysis

The analysis of the different DRL optimization algorithms can be found in the results and notebooks folders.

Project directory

.
├── config.yml                # configuration file
├── src
│   ├── config                # configuration loading package
│   ├── A2C                   # A2C package
│   ├── DDPG                  # Deep Deterministic Policy Gradients package
│   ├── TD3                   # TD3 package
│   ├── GA                    # Genetic Algorithm package
│   ├── DQN                   # Deep Q Learning package
│   ├── ES                    # Evolution Strategies package
│   ├── CMA_ES                # Covariance Matrix Adaptation ES package
│   ├── population            # population package for evolutionary algorithms
│   ├── main.py               # main 
│   ├── optimizers.py         # base gradient-free optimizer
│   ├── loss_analysis.py      # functions for loss analysis 
│   ├── visualization.py      # visualization for analysis 
│   └── utils.py              # helper functions
├── notebooks                 # notebooks with results analysis
├── results                   # folder containing training results and analysis plots
├── runs.yml                  # runs log file
└── requirements.txt          # list of all packages used

Note: the code has been tested on macOS and Ubuntu machines, on a Google Cloud Platform virtual machine, and partially on the ETH Leonhard cluster.
