PyTorch implementations of RL (Reinforcement Learning) algorithms with RNN (Recurrent Neural Network) and Experience Replay
Disclaimer: My code is based on the TD3 implementation of openai/spinningup.
This repository experiments with RL combining RNNs and Experience Replay, to better understand how the following techniques and parameters affect performance.
R2D2 incorporated RNNs into distributed reinforcement learning to achieve significant performance improvements on Atari tasks.
In that paper, the authors investigated training RNNs with Experience Replay and proposed the following techniques, which this repository applies to an Actor-Critic algorithm (a minimal code sketch follows the list).
- "Stored state": store the RNN hidden state in the replay buffer at rollout time, and use it to initialize the network during training.
- "Burn-in": unroll the network over a prefix of the stored sequence to produce a start hidden state, and compute the training loss only on the remaining timesteps.
TD3, an Actor-Critic algorithm that uses a replay buffer, is used for the following benchmarks:
- Difference between simply using stacked observations and using an RNN network on a POMDP task
- How the following techniques make a difference
- Stored state
- Burn-in process
- How the parameters of the above techniques affect performance
pip install -e .
Without RNN, using CPU:
python rnnrl/algos/pytorch/td3/td3.py --env Pendulum-v0 --seed=$i --device cpu
With RNN, using GPU:
python rnnrl/algos/pytorch/td3/td3.py --env Pendulum-v0 --seed=$i --device cuda --recurrent
Benchmarks are run in the Pendulum-v0 environment wrapped with PartialObservation.
PartialObservation is a wrapper that lets the policy receive a fresh observation only once every 3 steps, which turns the task into a POMDP. A naive technique to mitigate partial observability is to simply use a stack of recent observations as the observation at each step.
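The following is a rough sketch of such a masking wrapper, assuming the old gym step API used by Pendulum-v0. The class name `MaskedObservation` and the choice to zero out hidden observations are illustrative assumptions and may differ from the actual PartialObservation wrapper in this repository.

```python
import gym
import numpy as np


class MaskedObservation(gym.Wrapper):
    """Hypothetical wrapper: the agent sees a fresh observation only every
    `interval` steps; on the other steps it receives zeros, which makes the
    task partially observable."""

    def __init__(self, env, interval=3):
        super().__init__(env)
        self.interval = interval
        self._t = 0

    def reset(self, **kwargs):
        self._t = 0
        return self.env.reset(**kwargs)

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        self._t += 1
        if self._t % self.interval != 0:
            obs = np.zeros_like(obs)  # hide the observation on masked steps
        return obs, reward, done, info


if __name__ == "__main__":
    env = MaskedObservation(gym.make("Pendulum-v0"), interval=3)
    obs = env.reset()
    for _ in range(5):
        obs, reward, done, _ = env.step(env.action_space.sample())
        print(obs)
```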