An actor-critic model in TensorFlow, trained with K-FAC (Kronecker-factored approximate curvature), as described in "Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation" by Wu et al. Tested on Atari games.
Presentation: https://drive.google.com/open?id=1tMWLk45CWVNBj8werpZe0QDEoMdCJ0MA6zPETjyzFds
Slides: https://docs.google.com/presentation/d/1nWnYXL_4z9sW_tO9mf_XMTHr4O0kfJczrXaqvEaX9Vc/edit?usp=sharing
Plots of relative sample efficiency compared to baselines: